- 13 Jan, 2025 8 commits
-
-
Ahmed Almaghz authored
* Create token_classification.md
* Update token_classification.md
* Update docs/source/ar/tasks/token_classification.md (applied repeatedly across review rounds)
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
-
Fanli Lin authored
* initial commit
* remove unrelated files
* further removals
* Update test_trainer.py
* fix style
-
Arthur authored
* rename test to only compile forward
* style fixes for Emu
-
Raushan Turganbay authored
* fix
* fix test
* add tests
* add more tests
* fix tests
* supposed to be a torch.dtype test
* handle BC and make fp32 default
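The last bullet describes accepting both string and `torch.dtype` values for backward compatibility while defaulting to fp32. A minimal sketch of that pattern (the helper name is hypothetical, not from the PR):

```python
import torch

def resolve_dtype(dtype=None) -> torch.dtype:
    """Hypothetical helper: normalize a user-supplied dtype argument."""
    if dtype is None:
        return torch.float32  # fp32 stays the default
    if isinstance(dtype, str):
        dtype = getattr(torch, dtype)  # BC: accept "float16", "bfloat16", ...
    if not isinstance(dtype, torch.dtype):
        raise ValueError(f"Expected a torch.dtype or string, got {type(dtype)}")
    return dtype
```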
-
Arthur authored
bias should be True
-
Sai-Suraj-27 authored
* Removed duplicate class field definition
* Removed duplicate code in try-except block
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
-
jiqing-feng authored
Fix compile error

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
Cyril Vallez authored
fix rope device
-
- 10 Jan, 2025 15 commits
-
-
Matt authored
* Update codeowners with individual model owners
* rip yoach
* add comment
* Replace - with _
* Add @qubvel for zero-shot object-detection
* Update CODEOWNERS (applied repeatedly across review rounds)
* Add yoni for omdet-turbo
* Refactor / comment the CODEOWNERS file
* Capture modular files as well
* Add dummies without owner
* More cleanup
* Set Niels on a few more models that he added
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix (repeated)
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Zach Mueller authored
* Bookmark
* Add warning
-
Taha Akbari authored
* Added an error when the sequence length exceeds max_position_embeddings (see the sketch below)
* Fixed formatting and several follow-up bugs
* Changed copies to match
* Applied review suggestions; removed redundant code
* Fixed unwanted changes
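A minimal sketch of the guard described above (function and argument names are assumed for illustration, not taken from the PR):

```python
import torch

def check_sequence_length(input_ids: torch.Tensor, max_position_embeddings: int) -> None:
    seq_length = input_ids.shape[-1]
    if seq_length > max_position_embeddings:
        raise ValueError(
            f"Sequence length ({seq_length}) is bigger than max_position_embeddings "
            f"({max_position_embeddings}); inputs this long cannot be encoded."
        )
```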
-
Zach Mueller authored
* Use existing APIs to make tempdir folders
* Fixup deepspeed too
* output_dir -> tmp_dir
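The "existing APIs" here are presumably the standard library's tempfile utilities; a sketch of the test pattern under that assumption:

```python
import tempfile

from transformers import TrainingArguments

# output_dir -> tmp_dir: the directory is created and cleaned up automatically
with tempfile.TemporaryDirectory() as tmp_dir:
    args = TrainingArguments(output_dir=tmp_dir)
```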
-
Yih-Dar authored
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Arthur Zucker authored
-
Raushan Turganbay authored
* model can convert to HF and be loaded back
* works in single-batch generation but hallucinates; use the image tokens
* add image generation; now it works
* add tests and slow tests; add modular, though it doesn't port the docstring
* skip some tests; fix copies, tests, repo consistency, and style across several iterations
* docs
* remove file; address comments; update after the new modular
* add one more condition in check attributes
* decompose down/up/mid blocks
* allow static cache generation in VLMs (see the sketch below)
* Update docs/source/en/model_doc/emu3.md (applied repeatedly across review rounds)
* fix VAE upsampling
* Update src/transformers/models/emu3/modular_emu3.py
* state overwritten stuff explicitly
* add the flag for flex attn
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
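"Static cache generation" refers to generate()'s public cache_implementation option; a hedged usage sketch (the checkpoint id is a placeholder, and the emu3-specific classes are not shown):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("org/some-vlm-checkpoint")  # placeholder id
model = AutoModelForCausalLM.from_pretrained("org/some-vlm-checkpoint")

inputs = tokenizer("a photo of", return_tensors="pt")
# static cache: a fixed-size KV cache, the torch.compile-friendly decoding path
out = model.generate(**inputs, max_new_tokens=16, cache_implementation="static")
```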
-
Cyril Vallez authored
* fix flex
* add test
* style
-
Arthur Zucker authored
-
Raushan Turganbay authored
* update chat template; style; fix tests
* Update src/transformers/image_utils.py (applied repeatedly across review rounds)
* type hints + docs
* remove unnecessary warnings
* allow users to pass backend and num frames (see the sketch below)
* Update docs/source/en/chat_templating.md (applied repeatedly across review rounds)
* Update src/transformers/processing_utils.py
* typo fix; code style
* address comments; align with the "pipeline" template
* unpack for all kwargs
* fix wrong conflict resolution from rebasing
* update docs
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
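A hedged sketch of the new video knobs (the message layout follows the chat-templating docs; treat the exact parameter names, num_frames and video_load_backend, as assumptions drawn from this PR's description):

```python
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("org/some-video-llm")  # placeholder id

messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "path": "clip.mp4"},
            {"type": "text", "text": "What happens in this video?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    num_frames=8,                 # how many frames to sample from the video
    video_load_backend="opencv",  # which decoding backend to use
)
```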
-
eustlb authored
* config draft; full encoder and decoder forward
* fix SDPA and FA2
* moonshine model and model forward
* fix attention with past_key_values
* add MoonshineForConditionalGeneration
* fix cache handling and causality for cross-attention; no causal attention mask for the encoder
* model addition (imports etc.); add rope_theta; model doc
* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py and src/transformers/models/auto/configuration_auto.py
* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES
* update modular; make fix-copies; fix check_modular_conversion
* copied from -> imports; integrate attention refactor; handle a modular edge case
* remove encoder; move convolution parameters into the config; run modular_model_converter
* Update docs/source/en/model_doc/moonshine.md; add MoonshineModelTest and integration tests
* name conversion update (up_proj -> fc1 etc.); update config, MLP, attention, encoder/decoder layers, and convolution parameters
* remove INPUTS_DOCSTRING; update conditional generation and pretrained model
* update doc and license; split config in file; two classes for MLP; attention from GLM; from GlmRotaryEmbedding
* apply arthur's review suggestions
* auto feature extractor; convert modular; unsplit config; use correct checkpoint; wrap generate; update tests; fix typos
---------
Co-authored-by: Joshua Lochner <admin@xenova.com>
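A hedged usage sketch for the new class named above (the checkpoint id, processor pairing, and call convention are assumptions, not taken from the PR):

```python
import numpy as np
from transformers import AutoProcessor, MoonshineForConditionalGeneration

checkpoint = "UsefulSensors/moonshine-tiny"  # placeholder id
processor = AutoProcessor.from_pretrained(checkpoint)
model = MoonshineForConditionalGeneration.from_pretrained(checkpoint)

audio = np.zeros(16000, dtype=np.float32)  # 1 second of dummy 16 kHz audio
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=32)
print(processor.batch_decode(generated, skip_special_tokens=True))
```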
-
Yih-Dar authored
* fix 1
* fix 1
* comment
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Tom Aarsen authored
* Introduce 5 integration tests for the 4 model classes + torch export
* ModernBert: reuse GemmaRotaryEmbedding via modular
* Revert #35589, keep rope_kwargs; rely on them in modular_modernbert
* Revert "Revert #35589, keep rope_kwargs; rely on them in modular_modernbert" (reverts commit 11b44b9ee83e199cbfb7c5ba2d11f7a7fdbba2d3)
* Don't set rope_kwargs; override the `self.rope_init_fn` call instead
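For context, a sketch of the rope_init_fn convention the last bullet refers to (simplified from the library's rotary-embedding pattern; treat the details as assumptions about this PR):

```python
import torch.nn as nn
from transformers.modeling_rope_utils import ROPE_INIT_FUNCTIONS

class SketchRotaryEmbedding(nn.Module):
    def __init__(self, config, device=None):
        super().__init__()
        rope_type = "default"
        if getattr(config, "rope_scaling", None):
            rope_type = config.rope_scaling.get("rope_type", "default")
        # the init function can be overridden here instead of threading
        # rope_kwargs through the constructor
        self.rope_init_fn = ROPE_INIT_FUNCTIONS[rope_type]
        inv_freq, self.attention_scaling = self.rope_init_fn(config, device)
        self.register_buffer("inv_freq", inv_freq, persistent=False)
```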
-
- 09 Jan, 2025 17 commits
-
-
Zach Mueller authored
Add sdpa to diffllama
-
Benjamin Warner authored
* bug fixes
* organize imports
* wrap cpu warning in reference_compile
* Avoid needing repad_logits_with_grad, always repad with grads when training (I'm not 100% sure the conditional with "or labels is None" makes sense though; perhaps we can remove that?)
* Revert "Avoid needing repad_logits_with_grad, always repad with grads when training" (reverts commit cedcb4e89bcea199a1135a0933e71f534b656239)
* Fix grammar: keep -> keeps
* Propagate grammar fix with modular_model_converter
---------
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
-
Arthur authored
* add `_supports_flex_attn = True`
* fix repo consistency
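The flag is a class attribute on the model's PreTrainedModel subclass; a minimal sketch of the declaration pattern (the class name is hypothetical):

```python
from transformers import PreTrainedModel

class SomeModelPreTrainedModel(PreTrainedModel):
    # capability flags checked when selecting the attention implementation
    _supports_flex_attn = True
    _supports_sdpa = True
```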
-
胡译文 authored
* universal checkpoint
* Update docs/source/en/deepspeed.md (applied repeatedly across review rounds)
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
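DeepSpeed's universal checkpoint format lets a run resume on a different parallelism layout; assuming the docs addition covers the standard switch, a sketch of the config entry expressed as a Python dict:

```python
# Illustrative DeepSpeed config fragment; load_universal is the relevant key
ds_config = {
    "zero_optimization": {"stage": 2},
    # resume training from a checkpoint converted to the universal format
    "checkpoint": {"load_universal": True},
}
```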
-
Cyril Vallez authored
* refactor/fix cohere2
* add kwargs
* tests
* remove func and import it
-
Tom Aarsen authored
[`tokenizers`] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer (#35593)
* Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer in PreTrainedTokenizerFast, rather than relying on subclasses to take care of this
* Simplify setting self.add_prefix_space; ensure the pre-tokenizer exists
* Wrap in try-except to catch "Custom PreTokenizer cannot be serialized", raised at https://github.com/huggingface/tokenizers/blob/862d1a346a99183017b1eb5ad1aa3133b466784f/bindings/python/src/pre_tokenizers.rs#L672 and triggered by the RoFormer tests, since RoFormerTokenizerFast uses a custom PreTokenizer
* Propagate add_prefix_space in T5TokenizerFast to the superclass
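A sketch of the propagation pattern (adapted from the per-tokenizer logic that previously lived in subclasses; illustrative, not the exact new PreTrainedTokenizerFast code):

```python
import json
from tokenizers import pre_tokenizers

def propagate_add_prefix_space(backend_tokenizer, add_prefix_space: bool) -> None:
    try:
        state = json.loads(backend_tokenizer.pre_tokenizer.__getstate__())
    except Exception:
        return  # e.g. "Custom PreTokenizer cannot be serialized" (RoFormer)
    if state.get("add_prefix_space") != add_prefix_space:
        # rebuild the backend pre-tokenizer with the requested setting
        pre_tok_class = getattr(pre_tokenizers, state.pop("type"))
        state["add_prefix_space"] = add_prefix_space
        backend_tokenizer.pre_tokenizer = pre_tok_class(**state)
```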
-
Cyril Vallez authored
* look-ahead negation
* re-add examples by default
* Fix the bug in topological sort (a generic reference implementation is sketched below)
* Update create_dependency_mapping.py
* start adding tests; finalize test; more tests
* style
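For reference, a generic topological sort of a dependency mapping (Kahn's algorithm; illustrative only, unrelated to the specific bug fixed here):

```python
from collections import deque

def topological_sort(deps: dict[str, set[str]]) -> list[str]:
    """Order nodes so every dependency precedes its dependents."""
    indegree = {node: 0 for node in deps}
    for parents in deps.values():
        for parent in parents:
            indegree.setdefault(parent, 0)
    for node, parents in deps.items():
        indegree[node] = len(parents)
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child, parents in deps.items():
            if node in parents:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    if len(order) != len(indegree):
        raise ValueError("cycle detected in dependency mapping")
    return order
```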
-
Amit Luhar authored
-
Pablo Montalvo authored
* don't know why this keeps popping up?
* remove unused rope_kwargs
-
Yih-Dar authored
* fix
* Update tests/models/switch_transformers/test_modeling_switch_transformers.py
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Arthur authored
* fix if/else issues
* add a test
* fix the test
* style
-
Mehant Kammakomati authored
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
-
Saif Rehman Nasir authored
* [Idefics3] Move image features to same device as input embeds
* Update src/transformers/models/idefics3/modeling_idefics3.py
* make style
---------
Co-authored-by: Saif Rehman Nasir <shyshin@github.com>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
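The gist of the fix, as a tiny sketch (function and variable names are assumed; Idefics3's actual merging logic is more involved):

```python
import torch

def move_to_embeds_device(inputs_embeds: torch.Tensor, image_features: torch.Tensor) -> torch.Tensor:
    # With device_map/offloading, vision outputs can land on a different
    # device than the text embeddings; align them before merging.
    return image_features.to(inputs_embeds.device)
```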
-
Jack Morris authored
* update modular_modernbert -- add inputs_embeds param to ModernBertModel
* Fix implementation issues; extend to other classes; docstring. inputs_embeds shouldn't fully replace `self.embeddings(input_ids)`, because that call also applies layer normalization and dropout; both input_ids and inputs_embeds are now passed to ModernBertEmbeddings, much like how BertEmbeddings is implemented (see the sketch below). Also added `inputs_embeds` to the docstring, propagated the changes to the other model classes, introduced an error if input_ids and inputs_embeds are both or neither provided, and fixed an issue with the device being based solely on input_ids when an attention_mask is present.
* Propagate inputs_embeds to ModernBertForMaskedLM correctly; also reintroduce the inputs_embeds test
---------
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
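A simplified sketch of the embeddings pattern described above (not the exact ModernBertEmbeddings code): both input paths share the same LayerNorm and dropout, and exactly one of the two inputs must be given.

```python
import torch
import torch.nn as nn

class SketchEmbeddings(nn.Module):
    def __init__(self, vocab_size: int, hidden_size: int, dropout: float = 0.1):
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, hidden_size)
        self.norm = nn.LayerNorm(hidden_size)
        self.drop = nn.Dropout(dropout)

    def forward(self, input_ids=None, inputs_embeds=None):
        if (input_ids is None) == (inputs_embeds is None):
            raise ValueError("Specify exactly one of input_ids or inputs_embeds")
        hidden = inputs_embeds if inputs_embeds is not None else self.tok_embeddings(input_ids)
        # norm + dropout apply to BOTH paths, so inputs_embeds is not a full
        # replacement for self.embeddings(input_ids)
        return self.drop(self.norm(hidden))
```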
-
Yih-Dar authored
* yes!
* oh no!!! (repeated)
* style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Chander G authored
* set up loss_type in config at model init time; ensures no additional graph break is introduced when torch.compile'd (fixes #34615)
* look up the loss mapping at init time instead of manual setup
* remove the redundant lookup at loss_function time
* override loss_type at init time
---------
Signed-off-by: ChanderG <mail@chandergovind.org>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
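A sketch of the idea (LOSS_MAPPING is the library's loss-name-to-function table; the surrounding class is illustrative): resolving the loss once in __init__ means forward contains no string-keyed lookup for torch.compile to trace, avoiding the graph break.

```python
from transformers.loss.loss_utils import LOSS_MAPPING

class InitTimeLossMixin:
    def __init__(self, config):
        # resolved once here rather than on every forward call
        loss_type = getattr(config, "loss_type", None) or "ForCausalLM"
        self.loss_function = LOSS_MAPPING[loss_type]
```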
-
Cyril Vallez authored
re-add missing __all__
-