1. 15 Apr, 2025 7 commits
    • Fix missing return type for MLCD docs (#37527) · 356b3cd7
      Pavel Iakubovskii authored
      * Fix missing return type for docs
      
      * trigger
    • fix: Restore explicit error surfacing for unexpected hub exceptions (#37525) · 0ad3710d
      Manuel de Prada Corral authored
      
      * fix: Restore explicit error surfacing for unexpected hub exceptions
      
      Prior to PR #36033, unexpected exceptions (e.g., ModuleNotFoundError) during hub model loading were not swallowed silently. They either matched specific except blocks or were raised.
      
      After #36033, a catch-all except Exception block was introduced without a fallback else, causing unknown errors to be silently ignored and leading to misleading downstream behavior.
      
      This commit adds an `else: raise e` to ensure only explicitly handled exceptions are suppressed. All others are surfaced, restoring pre-4.50 behavior and aiding in debugging and dependency visibility.
      
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
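A minimal sketch of the control flow this fix restores; the helper names are illustrative, not the actual `transformers` internals:

```python
def _fetch_and_import(repo_id: str):
    # Stand-in for the real hub download/import logic.
    raise ModuleNotFoundError(f"optional dependency missing for {repo_id}")

def load_remote_model(repo_id: str):
    try:
        return _fetch_and_import(repo_id)
    except Exception as e:
        if isinstance(e, OSError):
            # Explicitly handled case, e.g. fall back to cached files.
            return None
        else:
            # The fix: re-raise anything not explicitly handled (such as
            # ModuleNotFoundError) instead of silently swallowing it.
            raise e

load_remote_model("some/repo")  # now surfaces the ModuleNotFoundError
```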
    • Add Fast Yolos Processor (#37292) · f6c79f76
      Parteek authored
      
      * Add Fast Yolos Processor
      
      * Update modular file
      
      * Fix copies
      
      ---------
      
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
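A brief usage sketch: fast image processors are opted into with `use_fast=True`, which resolves to the new torchvision-backed class (the checkpoint id is a real YOLOS checkpoint; the exact resolution behavior is an assumption):

```python
from PIL import Image
from transformers import AutoImageProcessor

# use_fast=True selects the torchvision-backed fast variant
# (YolosImageProcessorFast) when one is available.
processor = AutoImageProcessor.from_pretrained("hustvl/yolos-tiny", use_fast=True)

image = Image.new("RGB", (640, 480))  # dummy image for illustration
inputs = processor(images=image, return_tensors="pt")
print(inputs["pixel_values"].shape)
```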
    • Llama4: remove redundant transpose of router_logits (#37468) · ecaeee66
      Pavel Belevich authored
      * Llama4: remove redundant transpose of router_logits
      
      * Fix formatting
    • Add MLCD model (#36182) · 6f7ea1cf
      Huajie Tan authored
      * Add MLCD model
      
      * Update codes for auto-mapping
      
      * Add test scripts for MLCD
      
      * Update doc for MLCD model
      
      * Fix import error
      
      * Fix import error
      
      * Fix CI error for attention_outputs
      
      * Fix code style for CI
      
      * Fix code style for CI
      
      * Fix code style for CI
      
      * Fix code style for CI
      
      * Fix code style for CI
      
      * Fix CI error for initialization
      
      * Fix code style for CI
      
      * Fix code style for CI
      
      * Reformat codes and docs for CI test
      
      * Reformat codes and docs for CI test
      
      * Remove unused attributes for CI test
      
      * Fix style for CI test
      
      * List MLCD in flash_attn doc
      
      * Fix: typos, modulars, refactors from suggestions
      
      * Refactoring convert_mlcd_weights_to_hf.py from suggestions
      
      * Fix: docs conflicts
      
      * Fix error for CI test
      
      * Fix style for CI test
      
      * Add integration test for MLCD
      
      * Refactoring by class inheritance
      
      * Fix: refactor attention interface, adjust codes
      
      * Fix: merging conflicts
      
      * Fix: merging conflicts
      
      * Fix: style for CI test
      
      * Fix: style for CI test
      
      * Fix: set test_resize_embeddings to be False
      
      * Fix: initializer for CI test
      
      * Fix: conflicts, CI test, warning and refactoring
      
      * Fix: merging conflicts
      
      * Refactor
      
      * Update docs
      
      * Fix mistakes
      
      * Remove unused args and fix multi-gpu error
      
      * Revert position_embeddings
      
      * Solve conflicts
      
      * Solve conflicts
      
      * Remove dummy
      
      * Update _init_weights
      
      * Update _init_weights
      
      * Update _init_weights for CI test
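A hedged usage sketch for the new vision model; the checkpoint id is an assumption, and the class name follows the PR's naming:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, MLCDVisionModel

ckpt = "DeepGlint-AI/mlcd-vit-large-patch14-336"  # assumed checkpoint id
processor = AutoImageProcessor.from_pretrained(ckpt)
model = MLCDVisionModel.from_pretrained(ckpt)

image = Image.new("RGB", (336, 336))  # dummy image for illustration
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, num_patches + 1, hidden_size)
```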
    • Change default value of `attn_temperature_tuning` (#37501) · d6ac923a
      AinL authored
      fix: change default value of `attn_temperature_tuning`
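Code that depended on the old default may want to pin the flag explicitly; a sketch, assuming the flag lives on the Llama4 text config and using an illustrative checkpoint id:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Llama-4-Scout-17B-16E-Instruct")
# Pin the flag explicitly rather than relying on the changed default;
# see #37501 for the new value.
config.text_config.attn_temperature_tuning = True
```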
    • Detect and use device context manager or global device in `from_pretrained` (#37216) · c8e0e603
      Cyril Vallez authored
      * Update modeling_utils.py
      
      * improve
      
      * Update modeling_utils.py
      
      * Update test_modeling_common.py
      
      * Update test_modeling_timm_backbone.py
      
      * Update test_modeling_common.py
      
      * Update test_modeling_common.py
      
      * Update test_modeling_common.py
      
      * Update test_modeling_common.py
      
      * CIs
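In practice this means `from_pretrained` materializes weights on an ambient device instead of always loading to CPU; a brief sketch (the model id is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# An enclosing device context manager is now detected and used.
with torch.device("cuda"):
    model = AutoModelForCausalLM.from_pretrained("gpt2")
print(next(model.parameters()).device)  # cuda:0

# The global default device is honored as well.
torch.set_default_device("cuda")
model = AutoModelForCausalLM.from_pretrained("gpt2")
```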
  2. 14 Apr, 2025 24 commits
  3. 11 Apr, 2025 9 commits
    • Fix typing issues with SigLip2 (#37356) · 953196a4
      Eric Wiener authored
      
      * Fix issues
      
      * Fix comment
      
      ---------
      
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
    • [agents] remove agents 🧹 (#37368) · aaf129cd
      Joao Gante authored
    • Delete hubconf.py (#37455) · 69e6ddf2
      Matt authored
      * Delete hubconf.py
      
      * Trigger tests
    • Add Granite Speech Support (#36801) · 623d395a
      Alex Brooks authored
      
      * First pass at speech granite
      
      Add encoder / projector, rename things
      
      * Combine into one model file with causal lm outputs for forward
      
      * Add loss calc
      
      * Fix config loading
      
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
      
      * Split new / old loading logic
      
      * Use transformers integration for loading peft adapters
      
      * Add generation wrapper for selective lora enablement
      
      * Add note for qformer encoder automodel
      
      * Guard torch/audio imports in feature extractor
      
      * Handle granite speech autoclasses
      
      * Handle optional deps in package structure for granite speech
      
      * Add granite pretrained model def for init
      
      * Add dummy objects for torch/torchaudio
      
      * Add tests for granite speech processor
      
      * Minor formatting fixes and refactoring
      
      * Add options for falling back to config in forward
      
      * Tentative model docstrings for granite speech
      
      * Fix config type
      
      * Remove legacy load
      
      * Allow non-lora variants for granite speech
      
      * Override weight tying for llm
      
      * Use text config instead of llm config
      
      * Add output embeddings getter to fix weight tying
      
      * Fix relative imports
      
* computing the number of audio features based on the raw audio sequence.
      
      * collating audio inputs, and keeping the original lengths.
      
* asserted we have text; otherwise we can't specify the audio special token.
      
* asserting the number of audio symbols/audios match correctly;
running _get_validated_audios only when audio is present
      
      * indentation bugfix + supporting different feature lengths when expanding audio.
      
      * redundant, done in _get_validated_text
      
      * adapting the tests:
      - we must have text (not either audio or text)
- _get_num_audio_features takes a list of raw lengths, provided it instead.
      
      * Minor cleanup, remove unused import
      
      * Add more tests for batch feature processing
      
      * Allow setting offset in rel position embeddings
      
      * Add config option for warning if peft is not installed w/ lora
      
      * Port blip2 qformer code into granite speech
      
      * Add sad test for numpy arr processing
      
      * Allow numpy arrays / tuples in granite speech processor
      
      * Fix config type for projector
      
      * - pad instead of creating a zeros tensor, to keep the original dtype/device (support bfloat16)
      - cast input_features to the model dtype (support bfloat16)
      
      * merge Blip2QFormerConfig to GraniteSpeechProjectorConfig
      
      * prevent a crash when re-saving/loading the model (line 109)
      
      * consider additional edge cases during preprocessing.
      
      * consider additional edge cases during preprocessing.
      
      * add features mask for batched inference (bugfix)
      
      * Minor refactor, remove multiaudio processor tests
      
      * Add set input/output embeddings for granite speech
      
      * Fix feature dim check in processor test
      
      * Pop input features in embed test for granite speech
      
      * Small fixes for test edge cases
      
      Add granite speech to seq2seq causal lm mapping names
      
      * Add small tests for granite speech model
      
      * Fix data parallelism test
      
      * Standardize model class names
      
      * Fix check for copies
      
      * Fix misaligned init check
      
      * Skip granite speech in checkpoint check
      
      * Use default for tie_word_embeddings in granite speech
      
      * Fix non documentation granite speech repo issues
      
      * Fix comments and docstring checks
      
      * Add placeholder docs for granite speech
      
      * Fix test naming collision
      
      * Code formatting
      
      * Rerun torch dummy obj regen
      
      * Fix save pretrained for granite speech
      
      * Import sorting
      
      * Fix tests typo
      
      * Remove offset hack
      
      * Pass args through encoder config
      
      * Remove unused prune heads from blip2
      
      * removing einsum. replaced with explicit multiplication (relative positional encodings) and sdpa attention.
      
      * remove Sequential from ConformerFeedForward and ConformerConvModule. + fix for sdpa attention
      
      * remove GraniteSpeechConformerScale
      
      * rename to hidden_states
      
* rename conformer layers to self.layers, remove the first linear from the list to keep the list homogeneous.
      
      * move pre-norm to the attention/feedforward blocks (avoid complex module wrapping)
      
      * adding pre_norm into forward
      
      * feature extractor refactoring to resemble how it's done in phi4multimodal.
      
      * rename feature_extractor to audio_processor
      
      * bugfix: input_feature_mask fix to get the exact number tokens.
      
      * Fix pytest decorator in processor test
      
      * Add (disabled) integration tests for granite speech
      
      * Fix handling of optional feature masking
      
* Loosen validation in processing for vLLM compatibility
      
      * Formatting fixes
      
      * Update init structure to mirror llama
      
      * Make granite speech projector generic
      
      * Update test config to reflect generic projector
      
      * Formatting fixes
      
      * Fix typos, add license
      
      * Fix undefined var in input processing
      
      * Cleanup and expose ctc encoder
      
      * Add missing config docstrings
      
      * Better var names, type hints, etc
      
      * Set attn context size in init
      
      * Add max pos emb to encoder config
      
      * Cleanup feature extractor
      
      * Add granite speech architecture details
      
      * Remove granite speech qformer ref
      
      * Add paper link, explicit calc for qkv
      
      * Calculate padding directly in depthwise conv1d init
      
      * Raise value error instead of asserting
      
      * Reorder class defs (classes used at top)
      
      * Precompute relpos distances
      
      * Run formatting
      
      * Pass attention distances through forward
      
      * Apply suggestions from code review
      
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
      
      * Add todo for using common batch feature extraction
      
      * Rename audios/features
      
      * Ensure chat template may be provided to processor
      
      * Move granite speech docs to audio models
      
      * Add todos for input proc refactoring
      
      * Fix import order
      
      * Guard torch import
      
      * Use relative imports
      
      * Require torch backend for processor in granite speech
      
      * Add backend guards in feature extractor
      
      ---------
      
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Co-authored-by: Avihu Dekel <avihu.dekel@ibm.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
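A hedged end-to-end sketch; the checkpoint id, audio token, and processor argument names are assumptions:

```python
import torch
from transformers import AutoProcessor, GraniteSpeechForConditionalGeneration

ckpt = "ibm-granite/granite-speech-3.3-2b"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(ckpt)
model = GraniteSpeechForConditionalGeneration.from_pretrained(ckpt)

wav = torch.zeros(1, 16000)  # one second of silence at 16 kHz as stand-in audio
prompt = "<|audio|>can you transcribe the speech into written format?"
inputs = processor(text=prompt, audio=wav, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True))
```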
    • 435f88f1
      Mehant Kammakomati authored
    • Add XPU case to is_torch_bf16_gpu_available (#37132) · 954f31cd
      cyyever authored
      
      * Add xpu case to is_torch_bf16_gpu_available
      
Signed-off-by: cyy <cyyever@outlook.com>
      
      * Refine error messages
      
Signed-off-by: cyy <cyyever@outlook.com>
      
      ---------
      
Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
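The shape of the check, as a hedged sketch (the real helper lives in `transformers`; the assumption that available XPU devices support bf16 mirrors the PR's intent):

```python
import torch

def bf16_gpu_available() -> bool:
    # CUDA path: PyTorch exposes an explicit capability check.
    if torch.cuda.is_available():
        return torch.cuda.is_bf16_supported()
    # XPU path (the addition sketched here): assume bf16 support on
    # available Intel XPU devices.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return True
    return False

print(bf16_gpu_available())
```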
    • Add weights_only=True to torch.load (#37062) · 28eae8b4
      cyyever authored
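For context, `weights_only=True` restricts `torch.load` to a safe set of types; a one-line sketch with an illustrative path:

```python
import torch

# weights_only=True limits unpickling to tensors and primitive containers,
# preventing arbitrary code execution from an untrusted checkpoint file.
state_dict = torch.load("checkpoint.pth", weights_only=True)
```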
    • 🚨🚨 Allow saving and loading multiple "raw" chat template files (#36588) · bf46e448
      Matt authored
      
      * Add saving in the new format (but no loading yet!)
      
      * Add saving in the new format (but no loading yet!)
      
      * A new approach to template files!
      
      * make fixup
      
      * make fixup, set correct dir
      
      * Some progress but need to rework for cached_file
      
      * Rework loading handling again
      
      * Small fixes
      
      * Looks like it's working now!
      
      * make fixup
      
      * Working!
      
      * make fixup
      
      * make fixup
      
      * Add TODO so I don't miss it
      
      * Cleaner control flow with one less indent
      
      * Copy the new logic to processing_utils as well
      
      * Proper support for dicts of templates
      
      * make fixup
      
      * define the file/dir names in a single place
      
      * Update the processor chat template reload test as well
      
      * Add processor loading of multiple templates
      
      * Flatten correctly to match tokenizers
      
      * Better support when files are empty sometimes
      
      * Stop creating those empty templates
      
      * Revert changes now we don't have empty templates
      
      * Revert changes now we don't have empty templates
      
      * Don't support separate template files on the legacy path
      
      * Rework/simplify loading code
      
      * Make sure it's always a chat_template key in chat_template.json
      
      * Update processor handling of multiple templates
      
      * Add a full save-loading test to the tokenizer tests as well
      
      * Correct un-flattening
      
      * New test was incorrect
      
      * Correct error/offline handling
      
      * Better exception handling
      
      * More error handling cleanup
      
      * Add skips for test failing on main
      
      * Reorder to fix errors
      
      * make fixup
      
      * clarify legacy processor file docs and location
      
      * Update src/transformers/processing_utils.py
      
Co-authored-by: Lucain <lucainp@gmail.com>
      
      * Update src/transformers/processing_utils.py
      
Co-authored-by: Lucain <lucainp@gmail.com>
      
      * Update src/transformers/processing_utils.py
      
Co-authored-by: Lucain <lucainp@gmail.com>
      
      * Update src/transformers/processing_utils.py
      
Co-authored-by: Lucain <lucainp@gmail.com>
      
      * Rename to _jinja and _legacy
      
      * Stop saving multiple templates in the legacy format
      
      * Cleanup the processing code
      
      * Cleanup the processing code more
      
      * make fixup
      
      * make fixup
      
      * correct reformatting
      
      * Use correct dir name
      
      * Fix import location
      
      * Use save_jinja_files instead of save_raw_chat_template_files
      
      * Correct the test for saving multiple processor templates
      
      * Fix type hint
      
      * Update src/transformers/utils/hub.py
      
Co-authored-by: Julien Chaumond <julien@huggingface.co>
      
      * Patch llava_onevision test
      
      * Update src/transformers/processing_utils.py
      
Co-authored-by: Julien Chaumond <julien@huggingface.co>
      
      * Update src/transformers/tokenization_utils_base.py
      
Co-authored-by: Julien Chaumond <julien@huggingface.co>
      
      * Refactor chat template saving out into a separate function
      
      * Update tests for the new default
      
      * Don't do chat template saving logic when chat template isn't there
      
      * Ensure save_jinja_files is propagated to tokenizer correctly
      
      * Trigger tests
      
      * Update more tests to new default
      
      * Trigger tests
      
      ---------
      
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
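A usage sketch of the new saving path; `save_jinja_files` is the kwarg named in the commits above, and the exact default behavior is per the PR:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.chat_template = (
    "{% for m in messages %}{{ m['role'] }}: {{ m['content'] }}\n{% endfor %}"
)

# Writes the template as a standalone chat_template.jinja file next to
# tokenizer_config.json instead of embedding it as a JSON string.
tok.save_pretrained("./my-tokenizer", save_jinja_files=True)
```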
    • Disable kernels for quantization (#37446) · 89787474
      Mohamed Mekkouri authored
      fix