- 05 Feb, 2025 4 commits
-
-
ROZBEH authored
handle cases where it is a list
-
Liangliang Ma authored
* add xpu for unmask * change modular for generated matching * add latest modeling for helium
-
ManukyanD authored
* Fix synced multi-GPU generation * fix copies --------- Co-authored-by: Davit Manukyan <ManukyanD> Co-authored-by:
Raushan Turganbay <raushan@huggingface.co>
-
ManukyanD authored
* Fix Gemma2 synced multi-GPU generation * Fix import ordering in modular_gemma2.py
-
- 04 Feb, 2025 16 commits
-
-
Yoni Gozlan authored
* add init and base image processing functions
* add add_fast_image_processor to transformers-cli
* add working fast image processor clip
* add fast image processor to doc, working tests
* remove "to be implemented" SigLip
* fix unprotected import
* fix unprotected vision import
* update ViTImageProcessorFast
* increase threshold for slow/fast equivalence
* add fast img blip
* add fast class in tests with cli
* improve cli
* add fast image processor convnext
* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision
* add device kwarg to ImagesKwargs for fast processing on cuda
* cleanup
* fix unprotected import
* group images by sizes and add batch processing
* Add batch equivalence tests, skip when center_crop is used
* cleanup
* update init and cli
* fix-copies
* refactor convnext, cleanup base
* fix
* remove patching mixins, add piped torchvision transforms for ViT
* fix unbatched processing
* fix f-strings
* protect imports
* change llava onevision to class transforms (test)
* fix convnext
* improve formatting (following Pavel review)
* fix handling device arg
* improve cli
* fix
* fix inits
* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs
* uniformize qwen2_vl fast
* fix docstrings
* add fast image processor llava
* remove min_pixels max_pixels from accepted size
* nit
* nit
* refactor fast image processors docstrings
* cleanup and remove fast class transforms
* update add fast image processor transformers cli
* cleanup docstring
* uniformize pixtral fast and make _process_image explicit
* fix prepare image structure llava next/onevision
* Use typed kwargs instead of explicit args
* nit fix import Unpack
* clearly separate pops and gets in base preprocess. Use explicit typed kwargs
* make qwen2_vl preprocess arguments hashable
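A minimal usage sketch of the fast (torchvision-backed) image processors and the new `device` kwarg described above; the checkpoint name and exact kwargs are illustrative assumptions, not taken verbatim from the PR.

```python
# Hypothetical sketch: load a fast image processor and run preprocessing on GPU.
# Checkpoint name is illustrative; `device` is the kwarg this PR adds for CUDA preprocessing.
import torch
from PIL import Image
from transformers import AutoImageProcessor

image = Image.new("RGB", (224, 224))  # placeholder image

# use_fast=True selects the torchvision-backed *ImageProcessorFast class when one exists
processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32", use_fast=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
inputs = processor(images=image, device=device, return_tensors="pt")
print(inputs["pixel_values"].shape, inputs["pixel_values"].device)
```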
-
David authored
* initial commit
* encoder+decoder layer changes WIP
* architecture checks
* working version of detection + segmentation
* fix modeling outputs
* fix return dict + output att/hs
* found the position embedding masking bug
* pre-training version
* added image processors
* typo in init.py
* iterupdate set to false
* fixed num_labels in class_output linear layer bias init
* multihead attention shape fixes
* test improvements
* test update
* dab-detr model_doc update
* dab-detr model_doc update2
* test fix: test_retain_grad_hidden_states_attentions
* config file clean and renaming variables
* config file clean and renaming variables fix
* updated convert_to_hf file
* small fixes
* style and quality checks
* return_dict fix
* Merge branch main into add_dab_detr
* small comment fix
* skip test_inputs_embeds test
* image processor updates + image processor test updates
* check copies test fix update
* updates for check_copies.py test
* updates for check_copies.py test2
* tied weights fix
* fixed image processing tests and fixed shared weights issues
* added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py
* delete prints from test file
* SafeTensor modification to solve HF Trainer issue
* removing the safetensor modifications
* make fix copies and hf upload has been added.
* fixed index.md
* fixed repo consistency
* style fix and dabdetrimageprocessor docstring update
* requested modifications after the first review
* Update src/transformers/models/dab_detr/image_processing_dab_detr.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* repo consistency has been fixed
* update copied NestedTensor function after main merge
* Update src/transformers/models/dab_detr/modeling_dab_detr.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* temp commit
* temp commit2
* temp commit 3
* unit tests are fixed
* fixed repo consistency
* updated expected_boxes variable values based on related notebook results in DABDETRIntegrationTests file.
* temporary config modifications and repo consistency fixes
* Put dilation parameter back to config
* pattern embeddings have been added to the rename_keys method
* add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES
* delete FeatureExtractor part from docs.md
* requested modifications in modeling_dab_detr.py
* [run_slow] dab_detr
* deleted last segmentation code part, updated conversion script and changed the hf path in test files
* temp commit of requested modifications
* temp commit of requested modifications 2
* updated config file, resolved codepaths and refactored conversion script
* updated decoder layer block types and refactored conversion script
* style and quality update
* small modifications based on the request
* attentions are refactored
* removed loss functions from modeling file, added loss function to loss_utils, tried to move the MLP layer generation to config but it failed
* deleted imageprocessor
* fixed conversion script + quality and style
* fixed config_att
* [run_slow] dab_detr
* changing model path in conversion file and in test file
* fix Decoder variable naming
* testing the old loss function
* switched back to the new loss function and testing with the old attention functions
* switched back to the new last good result modeling file
* moved back to the version when I asked the review
* missing new line at the end of the file
* old version test
* turn back to newest model version but change image processor
* style fix
* style fix after merge main
* [run_slow] dab_detr
* [run_slow] dab_detr
* added device and type for head bias data part
* [run_slow] dab_detr
* fixed model head bias data fill
* changed test_inference_object_detection_head assertTrues to torch test assert_close
* fixes part 1
* quality update
* self.bbox_embed in decoder has been restored
* changed assertTrue torch allclose methods to torch testing assert_close
* modelcard markdown file has been updated
* deleted intermediate list from decoder module
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
-
Yih-Dar authored
* update * update * update * dev-ci * more changes * fix * fix * fix --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
update docker files Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Luc Georges authored
-
Yih-Dar authored
fix Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Marc Sun authored
* add supports_quant_method check * fix * add test and fix suggestions * change logic slightly --------- Co-authored-by:
Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
-
Yih-Dar authored
* quantization CI on PRs * fix * fix * add 2 members --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
pglorio authored
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position position_ids use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys" This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* re-convert from modular
* extended Zamba2RMSNormGated to n_groups>1
* removed einops import
* set _supports_sdpa = True
* add use_mem_eff_path flag for fused mamba2 fwd
* added docstring for use_mem_eff_path flag
---------
Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Sumit Vij authored
* Fix device mismatch error in whisper feature extraction * Set default device * Address code review feedback --------- Co-authored-by:
eustlb <94853470+eustlb@users.noreply.github.com>
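For context, a small sketch of running the Whisper feature extractor on a specific device; the `device` argument shown here is assumed from the device-mismatch fix above and existing extractor behaviour, not quoted from the commit itself.

```python
# Sketch (assumed API): extract Whisper log-mel features on GPU when available.
import numpy as np
import torch
from transformers import WhisperFeatureExtractor

feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-tiny")
audio = np.zeros(16000, dtype=np.float32)  # 1 second of silence at 16 kHz

device = "cuda" if torch.cuda.is_available() else "cpu"
features = feature_extractor(audio, sampling_rate=16000, return_tensors="pt", device=device)
print(features["input_features"].shape)
```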
-
Cyril Vallez authored
* start a nice modular * Update modular_gpt_neox.py * Update modular_gpt_neox.py * Update modular_gpt_neox.py * Update modular_gpt_neox.py * update * Update modular_gpt_neox.py * convert * fix attribute * fix attrs * oups * fix * fix * fix * fix * fix * fix order to pass test (see with accelerate team) * trigger CIs * modular * update * up * Update test_modeling_gpt_neox.py * Update test_modeling_gpt_neox.py * trigger CIs * correctly pass arg * simplify * remove key warning * update tp -> it's compatible since the view is before * trigger CIs
-
Cyril Vallez authored
* Update convert_mistral_weights_to_hf.py * Update convert_mistral_weights_to_hf.py * update * style * move it to integrations * style * trigger CIs * trigger CIs
-
Ryoo Kwangrok authored
* layernorm_decay_fix * W293 fix * ruff format fix * black format * ruff format * erase last layer * add test_get_parameter_names_rmsnorm * rmsnorm fix
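The fix above concerns which parameters receive weight decay (RMSNorm weights, like LayerNorm weights, should not). A rough illustration of that pattern, not the Trainer's exact code:

```python
# Generic sketch: split parameters into decay / no-decay groups,
# excluding norm-layer weights and all biases from weight decay.
import torch
import torch.nn as nn

NORM_TYPES = (nn.LayerNorm,)  # an RMSNorm class would be added here when the model defines one

def decay_parameter_names(model: nn.Module) -> list[str]:
    no_decay = set()
    for module_name, module in model.named_modules():
        if isinstance(module, NORM_TYPES):
            no_decay.update(f"{module_name}.{n}" if module_name else n for n, _ in module.named_parameters())
    return [n for n, _ in model.named_parameters() if n not in no_decay and not n.endswith("bias")]

model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8))
decay_names = decay_parameter_names(model)
optimizer = torch.optim.AdamW(
    [
        {"params": [p for n, p in model.named_parameters() if n in decay_names], "weight_decay": 0.01},
        {"params": [p for n, p in model.named_parameters() if n not in decay_names], "weight_decay": 0.0},
    ],
    lr=1e-3,
)
```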
-
Dmitry Tarasov authored
apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582)
* apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag
* test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask
* test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token
* test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right
---------
Co-authored-by: Eduard Allakhverdov <goncharova@airi.net>
Co-authored-by: d.tarasov <d.tarasov@airi.net>
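A hedged usage sketch of the behaviour being made consistent here; the checkpoint is illustrative, and the assistant-tokens mask is only meaningful for chat templates that mark assistant turns with `{% generation %}` blocks.

```python
# Sketch: request tensors and the assistant-tokens mask together (illustrative checkpoint).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
messages = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello, how can I help?"},
]

outputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,                   # needed so the mask is returned alongside input_ids
    return_assistant_tokens_mask=True,  # 1 for tokens produced by assistant turns
    return_tensors="pt",                # the combination this PR makes behave consistently
)
print(outputs["input_ids"].shape, outputs["assistant_masks"])
```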
-
Pavel Iakubovskii authored
Updates type().is_cuda() -> .is_cuda(); .data<> -> .data_ptr<>
-
Raushan Turganbay authored
* fix rope delats calculation * add test * style
-
- 03 Feb, 2025 3 commits
-
-
Alex Brooks authored
* Update granite vision model path
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
* Enable granite vision test
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
---------
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
-
Gar authored
* refine all resize_token_embedding() * ruff format * hotfix
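For reference, a small sketch of the API this PR refines; the base checkpoint and added token below are placeholders.

```python
# Sketch: grow the embedding matrix after adding tokens (illustrative model and token).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tokenizer.add_tokens(["<custom_tag>"])
# pad_to_multiple_of is optional and mainly helps tensor-core efficiency
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
print(model.get_input_embeddings().weight.shape)
```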
-
Arthur authored
* update test for now * up * cleanup * update todo
-
- 31 Jan, 2025 4 commits
-
-
Yih-Dar authored
use torch 2.6 for CI Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Yoni Gozlan authored
* init modular got_ocr2 * Get correct got_ocr architecture * add processing * run modular with processing * add working inference * apply modular * Refactor and fix style * Refactor, cleanup, fix style * fix init order * Fix docs * add base modeling tests * fix style and consistency * rename doc file * fix repo consistency * fix inference with box * add image processing and support for crop_to_multi_page * Fix batch inference * add tests * fixup * fix slow test * fix docstrings * Add model doc * update to new init * fix input autocast pixel_values dtype * update doc * move doc to multimodal * Reformat crop_image_to_patches and add docstrings * Fix example in forward docstring * Address Pablo review * [run slow] got_ocr2 * remove defaults defined twice * apply modular * add torch_device to integration tests * update modular * follow-up Pavel review * add device variable in doc * fix doc multi-page * Force eager attention for vision encoder to avoid attn implementation conflict * revert qwen2vl doc changes * use Qwen2ForCausalLM instead of Qwen2Model * make fixup * refactor gotocr2 to llava style * uniformize function names and reduce checks * final nits * fix pixel_values dtype error * change checkpoint names * fix modular
-
Joao Gante authored
moshi cant compile
-
eustlb authored
compute head_dim_padding at init
-
- 30 Jan, 2025 10 commits
-
-
Yoni Gozlan authored
* move make_flat_list_of_images and make_batched_videos to image_utils * remove unnecessary is_vision_available * move make_nested_list_of_images to image_utils * fix fast pixtral image processor * fix import mllama * fix make_nested_list_of_images * add tests * convert 4d arrays/tensors to list * add test_make_batched_videos * add support nested batch of videos * fix image processing qwen2vl
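A rough standalone sketch of the kind of normalisation helpers like make_flat_list_of_images perform (flattening a possibly nested batch of images into one flat list); it is not the library's implementation, only an illustration of the idea.

```python
# Illustrative only: flatten a single image, a list of images, or a list of lists of images.
from PIL import Image

def flatten_images(images):
    if isinstance(images, Image.Image):
        return [images]
    flat = []
    for item in images:
        # a nested entry is itself a list of images (one inner list per sample)
        flat.extend(item if isinstance(item, (list, tuple)) else [item])
    return flat

batch = [[Image.new("RGB", (4, 4)), Image.new("RGB", (4, 4))], [Image.new("RGB", (4, 4))]]
print(len(flatten_images(batch)))  # 3
```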
-
Marcel authored
* Handle empty change indices in RLE conversion for masks
* [test] Add unit tests for RLE encoding of masks in SamProcessor
* [test] Update RLE conversion tests to use TensorFlow implementation
* [test] Fix formatting in SamProcessorTest according to check_code_quality action
* [test] Fix formatting in SamProcessorTest according to check_code_quality
* [test] Refactored rle test cases into one test and used tf tensors in tf test cases
* [test] Fix: removed self parameter from refactored methods
* [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow
* [test] Added descriptions to individual run-length encoding tests for PyTorch and TensorFlow.
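The fix concerns run-length encoding of binary masks when there are no 0→1 / 1→0 transitions (all zeros or all ones). A generic sketch of that edge case, not the SamProcessor code itself:

```python
# Generic COCO-style RLE sketch: counts alternate runs of 0s and 1s, starting with zeros.
# The edge case fixed above is a mask with no transitions at all (all 0s or all 1s).
import numpy as np

def mask_to_rle(mask: np.ndarray) -> dict:
    flat = mask.flatten(order="F")  # column-major, as COCO RLE expects
    change_indices = np.nonzero(flat[1:] != flat[:-1])[0] + 1
    if change_indices.size == 0:
        # uniform mask: one run, preceded by a zero-length run of 0s if the mask is all 1s
        counts = [0, flat.size] if flat[0] == 1 else [flat.size]
    else:
        boundaries = np.concatenate([[0], change_indices, [flat.size]])
        counts = np.diff(boundaries).tolist()
        if flat[0] == 1:
            counts = [0] + counts  # RLE counts must start with a run of zeros
    return {"size": list(mask.shape), "counts": counts}

print(mask_to_rle(np.zeros((2, 3), dtype=np.uint8)))  # {'size': [2, 3], 'counts': [6]}
print(mask_to_rle(np.ones((2, 3), dtype=np.uint8)))   # {'size': [2, 3], 'counts': [0, 6]}
```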
-
Yih-Dar authored
fix Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Nat Jeffries authored
* Add support for attention masking in moonshine. Tested against Open ASR Leaderboard with batch size 256.
* Update comments and ensure attention masks are passed everywhere. Perform attention mask downsampling inside of moonshine forward call.
* Hide padding behind conditional. Fix encoder/decoder masking. - Correctly pipe encoder attention mask into decoder - Add correct scaling factor if one is not already provided. - Fix formatting with ruff
* Add auto generated modeling_moonshine file.
* Update formatting in generated model file.
* Address review comments.
* Fix typo.
* Add `pad_head_dim_to_multiple_of` to moonshine config.
* Correct args order for MoonshineConfig.
* Update configuration moonshine too.
* Update src/transformers/models/moonshine/modular_moonshine.py
* Update src/transformers/models/moonshine/configuration_moonshine.py
---------
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
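Moonshine's encoder downsamples raw audio with strided convolutions, so a sample-level padding mask has to be shrunk to the encoder's output length. A hedged, generic sketch of that idea; the kernel and stride values are placeholders, not Moonshine's actual configuration:

```python
# Generic sketch: shrink a sample-level padding mask to the post-convolution sequence length.
import torch

def downsample_attention_mask(mask: torch.Tensor, kernel_sizes, strides) -> torch.Tensor:
    # mask: (batch, num_samples) with 1 for real audio and 0 for padding
    lengths = mask.sum(dim=-1)
    for k, s in zip(kernel_sizes, strides):
        lengths = torch.div(lengths - k, s, rounding_mode="floor") + 1  # standard conv length formula
    max_len = int(lengths.max())
    return (torch.arange(max_len, device=mask.device)[None, :] < lengths[:, None]).long()

mask = torch.tensor([[1] * 16000 + [0] * 4000, [1] * 20000])
print(downsample_attention_mask(mask, kernel_sizes=(127, 7, 3), strides=(64, 3, 2)).shape)
```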
-
Yih-Dar authored
* fix * remove is_flaky * fix --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Matt authored
* p_mask back to being a list * Remove breakpoint
-
Raushan Turganbay authored
* fix * remove overriden method * small change
-
Raushan Turganbay authored
* initial POC * - batch mix feature * fix tests * fix tests * make style * do not skip and instead fix tests * update * return back the test * correct text with the correct ckpt
-
Joao Gante authored
fix tests
-
Ilyas Moutawwakil authored
* fix is_causal being a tensor * convert in sdpa attention only when jit tracing
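A minimal sketch of the pattern this fix applies: during `torch.jit.trace`, shape comparisons yield 0-dim tensors, so the `is_causal` flag handed to SDPA has to be materialised back into a Python bool. The snippet illustrates the idea and is not the exact library code.

```python
# Sketch: ensure SDPA's is_causal argument is a real bool, even under jit tracing,
# where a comparison like query.shape[2] > 1 can produce a tensor instead of a bool.
import torch
import torch.nn.functional as F

def sdpa(query, key, value, attention_mask=None, is_causal=None):
    if is_causal is None:
        is_causal = query.shape[2] > 1 and attention_mask is None
    # Under tracing the comparison above can be a tensor; SDPA requires a plain bool.
    if torch.jit.is_tracing() and isinstance(is_causal, torch.Tensor):
        is_causal = is_causal.item()
    return F.scaled_dot_product_attention(query, key, value, attn_mask=attention_mask, is_causal=is_causal)

q = k = v = torch.randn(1, 2, 5, 8)
print(sdpa(q, k, v).shape)  # torch.Size([1, 2, 5, 8])
```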
-
- 29 Jan, 2025 3 commits
-
-
Wing Lian authored
-
Joao Gante authored
* move max time tests to their right place * move test to the right place
-
Boris Malashenko authored
There should be a dot after pip install .
-