- 16 Feb, 2024 4 commits
-
Raushan Turganbay authored
* fix max_length for inputs_embeds
* make style
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Static Cache: load models with MQA or GQA (#28975)
* fix
* fix tests
* fix tests
* Update src/transformers/generation/utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* more fixes
* make style
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
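A minimal sketch of the generation path the first fix touches, assuming a generic decoder-only checkpoint (gpt2 here is illustrative): when only `inputs_embeds` is passed, `max_length` has to be counted against the embedded prompt.

```python
# Hedged sketch: generating from inputs_embeds, where max_length must
# account for the embedded prompt (model choice is illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
inputs_embeds = model.get_input_embeddings()(input_ids)

# Only embeddings are passed in; consistent max_length handling for this
# path is what the fix above addresses.
output_ids = model.generate(inputs_embeds=inputs_embeds, max_length=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```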
-
Lysandre Debut authored
-
Lysandre Debut authored
* Script & Manual edition
* Update
-
Titus authored
-
- 15 Feb, 2024 8 commits
-
Sadra Barikbin authored
Update utils.py
-
Andrei Panferov authored
removed redundant field
-
amyeroberts authored
* Patch to skip currently failing tests
* Whoops - wrong place
-
Younes Belkada authored
Update modeling_utils.py
-
amyeroberts authored
-
Donggeun Yu authored
* Update ms_deform_attn_cuda.cu
* Update ms_deform_attn_cuda.cuh
* Update modeling_deformable_detr.py
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update modeling_deformable_detr.py
* python utils/check_copies.py --fix_and_overwrite
* Fix dtype mismatch error
* Update test_modeling_deformable_detr.py
* Update test_modeling_deformable_detr.py
* Update modeling_deformable_detr.py
* Update modeling_deformable_detr.py
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Sangbum Daniel Choi authored
* enable gradient checkpointing in DetaObjectDetection
* fix missing part in original DETA
* make style
* make fix-copies
* Revert "make fix-copies"
  This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358.
* remove fix-copies of DetaDecoder
* enable swin gradient checkpointing
* fix gradient checkpointing in donut_swin
* add tests for deta/swin/donut
* Revert "fix gradient checkpointing in donut_swin"
  This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d.
* change supports_gradient_checkpointing pipeline to PreTrainedModel
* Revert "add tests for deta/swin/donut"
  This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b.
* Revert "Revert "fix gradient checkpointing in donut_swin""
  This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f.
* Simple revert
* enable deformable detr gradient checkpointing
* add gradient in encoder
* add cuda_custom_kernel function in MSDA
* make style and fix input of DetaMSDA
* make fix-copies
* remove n_levels in input of DetaMSDA
* minor changes
* refactor custom_cuda_kernel like yoso format
  https://github.com/huggingface/transformers/blob/0507e69d34f8902422eb4977ec066dd6bef179a0/src/transformers/models/yoso/modeling_yoso.py#L53
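A minimal sketch of enabling the feature this commit adds, on one of the models it touches (the checkpoint name is illustrative; any DETA/Swin/DonutSwin checkpoint follows the same pattern):

```python
# Hedged sketch: gradient checkpointing trades extra compute for lower
# activation memory during training.
from transformers import AutoModelForObjectDetection

model = AutoModelForObjectDetection.from_pretrained("jozhang97/deta-swin-large")
model.gradient_checkpointing_enable()  # supported for DETA/Swin after this commit
model.train()  # checkpointing only takes effect in training mode
```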
-
Arthur authored
* wow I was scared!
* fix everything
* nits
* make it BC?
* add todo
* nits
* is_tracing should still be used to pass tracing tests
* nits
* some nits to make sure generation works with static cache uncompiled
* fix sdpa
* fix FA2 for both static and dynamic in a better way?
* style
* fix-copies
* fix fix copies
* fix sequential beam search
* style
* use `keys_to_ignore`
* nit
* correct dtype inference when init
* :( the fix for FA2 is still not optimal to investigate!
* styling
* nits
* nit
* this might work better
* add comment
* Update src/transformers/models/llama/modeling_llama.py
* "position_ids" -> "cache_position"
* style
* nit
* Remove changes that should not be propagated just yet
* Apply suggestions from code review
* Styling
* make sure we raise an error for static cache with FA2 enabled
* move to the bottom of the signature
* style
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
* nit in the name
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
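A hedged sketch of the static-cache generation path this commit hardens (model name illustrative; note the commit also raises an error when FA2 is enabled together with the static cache):

```python
# Sketch only: opt into the static KV cache on a Llama-style model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-hf"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)
model.generation_config.cache_implementation = "static"

inputs = tokenizer("The capital of France is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=8)
```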
-
- 14 Feb, 2024 15 commits
-
Arthur authored
* revert unrelated changes that got in
* style
-
Younes Belkada authored
FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009)
* fix trainer tags
* add test
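A hedged sketch of the fixed call (model choice illustrative; pushing requires a logged-in Hub account, so treat this as a sketch rather than a runnable recipe):

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
trainer = Trainer(model=model, args=TrainingArguments(output_dir="out"))

trainer.push_to_hub()               # no "tags" kwarg: the case this PR fixes
trainer.push_to_hub(tags=["demo"])  # passing tags explicitly still works
```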
-
Jiewen Tan authored
* Initial commit
* Add guards for the global mesh
* Address more comments
* Move the dataloader into integrations/tpu.py
* Fix linters
* Make kwarg more explicit
* Remove the move device logic
* Fix the CI
* Fix linters
* Re-enable checkpointing
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights
* Clarify pretrained import
* Use load_backbone instead
* Add backbone_kwargs to config
* Pass kwargs to constructors
* Fix up
* Input verification
* Add tests
* Tidy up
* Update tests/utils/test_backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
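A hedged sketch of the config-level backbone API these commits extend; DETR is used as one representative consumer, and the exact kwargs shown are illustrative:

```python
from transformers import DetrConfig, DetrForObjectDetection

config = DetrConfig(
    backbone="resnet50",
    use_timm_backbone=True,
    use_pretrained_backbone=True,  # instantiate with pretrained backbone weights
    backbone_kwargs={"out_indices": (1, 2, 3, 4)},  # forwarded to backbone creation
)
model = DetrForObjectDetection(config)
```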
-
JB (Don) authored
* Add tie_weights() to LM heads and set bias in set_output_embeddings()
  The biases were not tied correctly in some LM heads, and this change should fix that.
* Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin
* Adding _tie_weights() to MPNet and Vilt
* Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device
* Rename the test to save_load to match the convention
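A hedged sketch of the pattern being fixed; `LMHead` is an illustrative stand-in for the BERT-style heads this commit touches. The point is that the head's bias must be re-tied whenever the decoder is swapped or resized:

```python
import torch
from torch import nn

class LMHead(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.decoder = nn.Linear(hidden_size, vocab_size)
        self.bias = nn.Parameter(torch.zeros(vocab_size))
        self.decoder.bias = self.bias  # share one bias parameter

    def _tie_weights(self):
        # re-establish the share after set_output_embeddings()/resizing
        self.decoder.bias = self.bias
```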
-
Merve Noyan authored
* Create mask_generation.md
* add h1
* add to toctree
* Update docs/source/en/tasks/mask_generation.md (repeated review suggestions applied)
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
* Update mask_generation.md
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
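A hedged usage example for the task the new doc page covers (the SAM checkpoint and image path are illustrative):

```python
from transformers import pipeline

generator = pipeline("mask-generation", model="facebook/sam-vit-base")
outputs = generator("path/to/image.jpg", points_per_batch=64)
masks = outputs["masks"]    # one binary mask per detected object
scores = outputs["scores"]  # quality score per mask
```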
-
Raushan Turganbay authored
-
Zach Mueller authored
* Introduce AcceleratorConfig dataclass
* Extra second warn
* Move import
* Try moving import under is_accelerate_available
* Quality
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Clean
* Remove to_kwargs
* Change version
* Improve tests by including dispatch and split batches
* Improve reliability
* Update tests/trainer/test_trainer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixup tests and review nits
* Make tests pass
* protect import
* Protect import
* Empty-Commit
* Make training_args.to_dict handle the AcceleratorConfig
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
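A hedged sketch of the new dataclass (import path as introduced by this PR; the field values are illustrative):

```python
from transformers import TrainingArguments
from transformers.trainer_pt_utils import AcceleratorConfig

args = TrainingArguments(
    output_dir="out",
    accelerator_config=AcceleratorConfig(split_batches=True, dispatch_batches=False),
)
```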
-
Huazhong Ji authored
Co-authored-by: unit_test <test@unit.com>
-
amyeroberts authored
[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002)
* Trigger doc build
* Test removing references
* Importable from utils
* Trigger another run on a new commit for testing
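The doc build references these classes through the utils namespace; per the commit, the import below is what this fix makes work:

```python
from transformers.utils import BackboneConfigMixin, BackboneMixin
```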
-
Andrei Panferov authored
* aqlm init
* calibration and dtypes
* docs
* Readme update
* is_aqlm_available
* Simpler link in docs
* Test TODO real reference
* init _import_structure fix
* AqlmConfig autodoc
* integration aqlm
* integrations in tests
* docstring fix
* legacy typing
* Less typings
* More kernels information
* Performance -> Accuracy
* correct tests
* removed multi-gpu test
* Update docs/source/en/quantization.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/utils/quantization_config.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Brought back multi-gpu tests
* Update src/transformers/integrations/aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/aqlm_integration/test_aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
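A hedged sketch of loading an AQLM-quantized checkpoint through the new integration (the checkpoint name is illustrative, and the `aqlm` package must be installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "ISTA-DASLab/Mixtral-8x7b-AQLM-2Bit-1x16-hf"  # illustrative AQLM checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype="auto", device_map="auto")
```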
-
NielsRogge authored
* First draft
* Add CLIPForImageClassification
* Remove scripts
* Fix doctests
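A hedged sketch of the new head (the base checkpoint is illustrative; the classification head itself is freshly initialized, as the usual warning will note):

```python
from transformers import CLIPForImageClassification

model = CLIPForImageClassification.from_pretrained(
    "openai/clip-vit-base-patch32", num_labels=2
)
```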
-
Jonathan Tow authored
* Add `StableLM`
* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
* fix: re-add changes to address comments
* fix(readme): add links to paper
* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
* fix(tests): re-add `@slow` decorator to integration tests
* fix(tests): import slow...
* fix(readme_hd): remove whitespace edit
* fix(tokenizer): auto tokenizer tuple
* skip doctests for `modeling_stablelm`
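A hedged sketch showing that the new architecture resolves through the auto classes (checkpoint name is the reference StableLM model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-3b-4e1t")
model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-3b-4e1t")
```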
-
Younes Belkada authored
* enhance trainer + not support quant methods
* remove all old logic
* add version
-
Younes Belkada authored
ENH: Do not pass warning message in case `quantization_config` is in config but not passed as an arg (#28988)
* Update auto.py
* Update auto.py
* Update src/transformers/quantizers/auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 13 Feb, 2024 5 commits
-
amyeroberts authored
* Update the processing so bbox coords are adjusted for padding
* Just pad masks
* Tidy up, add tests
* Better tests
* Fix yolos and mark as slow for pycocotools
* Fix yolos - return_tensors
* Clarify padding and normalization behaviour
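A hedged sketch of the behaviour being fixed, using DETR's image processor as a representative of the affected family (the image and annotation values are illustrative COCO-format placeholders):

```python
from PIL import Image
from transformers import DetrImageProcessor

image = Image.new("RGB", (640, 480))
annotations = {
    "image_id": 0,
    "annotations": [{"bbox": [10, 10, 50, 50], "category_id": 1, "area": 2500.0, "iscrowd": 0}],
}

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
# With padding enabled, the returned label boxes are adjusted to the
# padded canvas rather than the original image size.
encoding = processor(images=image, annotations=annotations, return_tensors="pt")
```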
-
Aditya Kane authored
* Update configuration_llama.py: fix broken link
* [Nit] Explicit redirection not required
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Joao Gante authored
-
Hiroshi Matsuda authored
* add sudachi_projection option
* Upgrade sudachipy>=0.6.8
* add a test case for sudachi_projection
* Compatible with older versions of SudachiPy
* make fixup
* make style
* error message for unidic download
* revert jumanpp test cases
* format options for sudachi_projection
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* format options for sudachi_split_mode and sudachi_dict_type
* comment
* add tests for full_tokenizer kwargs
* pass projection arg directly
* require_sudachi_projection
* make style
* revert upgrade sudachipy
* check is_sudachi_projection_available()
* revert dependency_version_table and bugfix
* style format
* simply raise ImportError
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* simply raise ImportError
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
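A hedged sketch of the new option; the exact kwarg plumbing shown (via `sudachi_kwargs`) is an assumption based on the existing tokenizer signature, and it requires SudachiPy >= 0.6.8 plus a Sudachi dictionary:

```python
from transformers import BertJapaneseTokenizer

tokenizer = BertJapaneseTokenizer.from_pretrained(
    "cl-tohoku/bert-base-japanese",
    word_tokenizer_type="sudachi",
    sudachi_kwargs={"sudachi_split_mode": "A", "sudachi_projection": "normalized"},
)
```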
-
Arthur authored
* refactor with addedtokens decoder
* style
* get rid of lang code to id
* style
* keep some things for BC
* update tests
* add the mask token at the end of the vocab
* nits
* nits
* fix final tests
* style
* nits
* Update src/transformers/models/nllb/tokenization_nllb_fast.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* nits
* style?
* Update src/transformers/convert_slow_tokenizer.py
* make it a tad bit more custom
* ruff please stop
Co-authored-by: avidale <dale.david@mail.ru>
* Update
Co-authored-by: avidale <dale.david@mail.ru>
* Update
Co-authored-by: avidale <dale.david@mail.ru>
* oupts
* ouft
* nits
* test
* fix the remaining failing tests
* style
* fix failing test
* fix other test
* temp dir + test the raw init
* update test
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
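A hedged sketch of the behaviour the refactor preserves: language codes now live in the added-tokens decoder rather than a separate lang-code-to-id table (checkpoint per the NLLB docs):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M", src_lang="eng_Latn", tgt_lang="fra_Latn"
)
ids = tokenizer("Hello world").input_ids
print(tokenizer.convert_ids_to_tokens(ids))  # source text is prefixed with the eng_Latn token
```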
-
- 12 Feb, 2024 8 commits
-
Klaus Hipp authored
* Translate contributing.md to German
* Fix formatting issues in contributing.md
* Address review comments
* Fix capitalization
-
NielsRogge authored
Add video section
-
Klaus Hipp authored
Add language identifiers to code blocks
-
Yunxuan Xiao authored
clean up remaining tmp checkpoint dir
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
-
JB (Don) authored
Continue to initialize tied output_embeddings if it has a bias term
The bias term is not tied, and so will need to be initialized accordingly.
-
Alexey Fadeev authored
Updated datasets requirements. Need a package version >= 2.14.0
-
Joao Gante authored
-
cmahmut authored
updated docstring with vqa alias
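A hedged sketch of the alias the docstring now documents: "vqa" resolves to the visual-question-answering pipeline.

```python
from transformers import pipeline

vqa = pipeline("vqa")  # equivalent to pipeline("visual-question-answering")
```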
-