- 06 Feb, 2025 13 commits
-
-
Matt authored
* Fix StopStringCriteria to handle tokens above len(tokenizer)

  This fixes #35244 by clipping token IDs to be within the tokenizer's vocabulary size before performing the embedding lookup, preventing index errors when model.config.vocab_size > len(tokenizer). The fix:
  1. Adds a clamp operation to ensure token IDs are within bounds
  2. Adds a test case to verify the behavior
* Use self.stop_strings instead of stop_strings
* Handle clipping correctly
* make fixup
* Update test to the new embedding vecs
* Use much bigger values in the mismatch test
* Typo fix
* Slight simplification

Co-authored-by: openhands <openhands@all-hands.dev>
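A minimal sketch of the clamping idea this fix describes (illustrative, not the actual `StopStringCriteria` code):

```python
import torch

# Tiny stand-in "embedding table" sized to the tokenizer vocabulary.
vocab_size = 8                                 # stands in for len(tokenizer)
embeddings = torch.randn(vocab_size, 4)

token_ids = torch.tensor([1, 5, 11, 9])        # 11 and 9 exceed the vocab
clipped = token_ids.clamp(max=vocab_size - 1)  # clip IDs into bounds
vecs = embeddings[clipped]                     # safe lookup, no IndexError
```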
-
Zach Mueller authored
* Save state
* Make a failing test
* Better test
* mpt -> done, many more to go
* Rm extraneous
* Bamba
* Bert
* big_bird
* biogpt
* bloom
* codegen
* ctrl
* data2vec
* dbrx
* Through up to Dbrx
* electra
* ernie
* falcon
* Fuyu/persimmon
* Include noop kwargs to base models
* Rebase
* Skip musicgen
* Refactor/skip mllama
* Revert makefile
* Rm file
* Fix PT failing, need to modify rest of loss funcs to not resize
* Propagate some
* Continue
* More
* More options
* Mostly fixed
* Proved that it's the same
* Bloom is good
* Make ability to override loss func possible
* Fixup
* Clean
* Fix xglm
* Quality tests
* Skip OCR2
* Make specific loss for xglm
* Make order the same/line up 1:1
* xglm
* Skip fx output loss bloom model
* Didn't pass in pad_token_id
* Fix quality
-
湛露先生 authored
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
-
Zach Mueller authored
* Nail in edge case of torch dtype
* Rm unused func
* Apply suggestions from code review

  Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Refactor tests to only mock what we need, don't introduce injection functions
* SetUp/TearDown
* Do super

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
-
SilverSoldier authored
Save checkpoint to a temporary folder first, since partial/missing files left behind by failures throw an error during load
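A hedged sketch of the pattern described here, with illustrative names rather than the Trainer's actual internals:

```python
import os
import shutil
import tempfile

def save_checkpoint_atomically(save_fn, output_dir):
    # Stage next to the destination so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(dir=os.path.dirname(output_dir) or ".")
    try:
        save_fn(staging)                 # a crash here leaves only the staging dir
        # Publish the checkpoint only once fully written (assumes output_dir
        # does not already exist).
        os.replace(staging, output_dir)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise
```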
-
Raushan Turganbay authored
* fix paligemma
* nit
* use `kwargs` in models that can load any LM
-
Yih-Dar authored
* update
* update
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix 1
* fix 2
* fix modular
* simplify at the same time

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
-
Pavel Gein authored
Fix usage of unpad_input

See https://github.com/huggingface/transformers/issues/35899. In [this commit](https://github.com/Dao-AILab/flash-attention/commit/cdbbe844b1c0bcba3362e1f8c8af4d6f6d0bf300) the return type of `unpad_input` was changed. The code now supports both older and newer versions.

Co-authored-by: Pavel Gein <pavel.gein@gmail.com>
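A hedged sketch of version-tolerant unpacking (dummy tensors for illustration; requires flash-attn to be installed):

```python
import torch
from flash_attn.bert_padding import unpad_input

hidden_states = torch.randn(2, 5, 16)              # (batch, seq, hidden)
attention_mask = torch.tensor([[1, 1, 1, 0, 0],
                               [1, 1, 1, 1, 0]])

ret = unpad_input(hidden_states, attention_mask)
# Older flash-attn returns 4 values; newer releases append a 5th.
# Unpacking by position keeps both versions working.
hidden_states, indices, cu_seqlens, max_seqlen_in_batch = ret[:4]
```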
-
Yaswanth Gali authored
* Iterative generation using input embeds
* ruff fix
* Added test case
* Updated comment
* ♻️ Refactored test case
* Skip test for these models
* Continue generation using input embeds and cache
* Skip generate_continue_from_embeds test
* Refactor `prepare_input_for_generation` func
* Continue generation using input embeds and cache
* Modular changes fix
* Overwrite 'prepare_inputs_for_generation' function
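A hedged usage sketch of the feature exercised here (checkpoint name illustrative): `generate` driven from `inputs_embeds` instead of `input_ids`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The capital of France is", return_tensors="pt")
embeds = model.get_input_embeddings()(inputs["input_ids"])

# Generate from embeddings rather than token IDs.
out = model.generate(
    inputs_embeds=embeds,
    attention_mask=inputs["attention_mask"],
    max_new_tokens=10,
)
```
-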
Ye Liu authored
* Add `Qwen2VLImageProcessorFast` into `Qwen2VLProcessor`
* Use `AutoImageProcessor` instead

  Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
- 05 Feb, 2025 10 commits
-
-
Sambhav Dixit authored
* added condition for top_k doc mismatch fix
* initialization of test file for top_k changes
* added test for returning all labels
* added test for few labels
* tests/test_audio_classification_top_k.py
* final fix
* ruff fix

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
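A hedged usage sketch of the documented behaviour this aligns with (checkpoint and file names illustrative): `top_k=None` should return scores for every label.

```python
from transformers import pipeline

clf = pipeline("audio-classification", model="superb/wav2vec2-base-superb-ks")
preds = clf("sample.wav", top_k=None)  # with the fix: one score per label
```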
-
Matt authored
* Fix how we compute the final non-padding token for Gemma (and probably other models)
* .size() -> .shape[]
* Propagating changes to other models
* Propagating changes to other models
* Change it for all ForSequenceClassification models
* Fix batch dim
* More TF fixes
* Copy the TF fix around as well
* Correct layer name for TFCTRL
* Cleaner .to()
* Clean up the nested if-else
* Use argmax() instead of .max().values
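A hedged sketch of the computation being fixed (illustrative, not the exact patch): find the final non-padding token per row with `argmax`.

```python
import torch

input_ids = torch.tensor([[15, 16, 17, 0, 0],
                          [18, 19,  0, 0, 0]])
pad_token_id = 0

non_pad_mask = (input_ids != pad_token_id).int()
token_indices = torch.arange(input_ids.shape[-1])
# argmax picks the highest position that is still non-padding.
last_non_pad = (token_indices * non_pad_mask).argmax(-1)
print(last_non_pad)  # tensor([2, 1])
```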
-
Fanli Lin authored
make device-agnostic
-
Fanli Lin authored
* fix doc
* update model
-
Fanli Lin authored
* change cuda to DEVICE
* Update docs/source/en/llm_tutorial.md

  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Stas Bekman authored
DeepSpeed GitHub repo move
-
ROZBEH authored
handle cases where it is a list
-
Liangliang Ma authored
* add xpu for unmask
* change modular for generated matching
* add latest modeling for helium
-
ManukyanD authored
* Fix synced multi-GPU generation
* fix copies

Co-authored-by: Davit Manukyan <ManukyanD>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
-
ManukyanD authored
* Fix Gemma2 synced multi-GPU generation
* Fix import ordering in modular_gemma2.py
-
- 04 Feb, 2025 16 commits
-
-
Yoni Gozlan authored
* add init and base image processing functions
* add add_fast_image_processor to transformers-cli
* add working fast image processor clip
* add fast image processor to doc, working tests
* remove "to be implemented" SigLip
* fix unprotected import
* fix unprotected vision import
* update ViTImageProcessorFast
* increase threshold slow fast equivalence
* add fast img blip
* add fast class in tests with cli
* improve cli
* add fast image processor convnext
* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision
* add device kwarg to ImagesKwargs for fast processing on cuda
* cleanup
* fix unprotected import
* group images by sizes and add batch processing
* Add batch equivalence tests, skip when center_crop is used
* cleanup
* update init and cli
* fix-copies
* refactor convnext, cleanup base
* fix
* remove patching mixins, add piped torchvision transforms for ViT
* fix unbatched processing
* fix f strings
* protect imports
* change llava onevision to class transforms (test)
* fix convnext
* improve formatting (following Pavel review)
* fix handling device arg
* improve cli
* fix
* fix inits
* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs
* uniformize qwen2_vl fast
* fix docstrings
* add add fast image processor llava
* remove min_pixels max_pixels from accepted size
* nit
* nit
* refactor fast image processors docstrings
* cleanup and remove fast class transforms
* update add fast image processor transformers cli
* cleanup docstring
* uniformize pixtral fast and make _process_image explicit
* fix prepare image structure llava next/onevision
* Use typed kwargs instead of explicit args
* nit fix import Unpack
* clearly separate pops and gets in base preprocess. Use explicit typed kwargs
* make qwen2_vl preprocess arguments hashable
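A hedged usage sketch of what this enables (checkpoint illustrative): `use_fast=True` selects the torchvision-backed class, and the new `device` kwarg moves preprocessing to an accelerator.

```python
from PIL import Image
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32", use_fast=True
)
image = Image.new("RGB", (224, 224))  # stand-in for a real image
# device is the ImagesKwargs entry added here; use "cuda" on GPU machines.
batch = processor(images=image, device="cpu", return_tensors="pt")
```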
-
David authored
* initial commit
* encoder+decoder layer changes WIP
* architecture checks
* working version of detection + segmentation
* fix modeling outputs
* fix return dict + output att/hs
* found the position embedding masking bug
* pre-training version
* added image processors
* typo in init.py
* iterupdate set to false
* fixed num_labels in class_output linear layer bias init
* multihead attention shape fixes
* test improvements
* test update
* dab-detr model_doc update
* dab-detr model_doc update2
* test fix: test_retain_grad_hidden_states_attentions
* config file clean and renaming variables
* config file clean and renaming variables fix
* updated convert_to_hf file
* small fixes
* style and quality checks
* return_dict fix
* Merge branch main into add_dab_detr
* small comment fix
* skip test_inputs_embeds test
* image processor updates + image processor test updates
* check copies test fix update
* updates for check_copies.py test
* updates for check_copies.py test2
* tied weights fix
* fixed image processing tests and fixed shared weights issues
* added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py
* delete prints from test file
* SafeTensor modification to solve HF Trainer issue
* removing the safetensor modifications
* make fix copies and hf upload has been added
* fixed index.md
* fixed repo consistency
* style fix and DabDetrImageProcessor docstring update
* requested modifications after the first review
* Update src/transformers/models/dab_detr/image_processing_dab_detr.py

  Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* repo consistency has been fixed
* update copied NestedTensor function after main merge
* Update src/transformers/models/dab_detr/modeling_dab_detr.py

  Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* temp commit
* temp commit2
* temp commit 3
* unit tests are fixed
* fixed repo consistency
* updated expected_boxes variable values based on related notebook results in DABDETRIntegrationTests file
* temporary config modifications and repo consistency fixes
* Put dilation parameter back to config
* pattern embeddings have been added to the rename_keys method
* add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES
* delete FeatureExtractor part from docs.md
* requested modifications in modeling_dab_detr.py
* [run_slow] dab_detr
* deleted last segmentation code part, updated conversion script and changed the hf path in test files
* temp commit of requested modifications
* temp commit of requested modifications 2
* updated config file, resolved codepaths and refactored conversion script
* updated decoder layer block types and refactored conversion script
* style and quality update
* small modifications based on the request
* attentions are refactored
* removed loss functions from modeling file, added loss function to loss_utils, tried to move the MLP layer generation to config but it failed
* deleted image processor
* fixed conversion script + quality and style
* fixed config_att
* [run_slow] dab_detr
* changing model path in conversion file and in test file
* fix Decoder variable naming
* testing the old loss function
* switched back to the new loss function and testing with the old attention functions
* switched back to the new last good result modeling file
* moved back to the version when I asked the review
* missing new line at the end of the file
* old version test
* turn back to newest model version but change image processor
* style fix
* style fix after merge main
* [run_slow] dab_detr
* [run_slow] dab_detr
* added device and type for head bias data part
* [run_slow] dab_detr
* fixed model head bias data fill
* changed test_inference_object_detection_head assertTrues to torch test assert_close
* fixes part 1
* quality update
* self.bbox_embed in decoder has been restored
* changed assertTrue torch allclose methods to torch.testing assert_close
* model card markdown file has been updated
* deleted intermediate list from decoder module

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
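A hedged usage sketch for the new model (checkpoint name assumed for illustration):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

ckpt = "IDEA-Research/dab-detr-resnet-50"  # assumed checkpoint name
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModelForObjectDetection.from_pretrained(ckpt)

image = Image.new("RGB", (640, 480))  # stand-in for a real image
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
results = processor.post_process_object_detection(
    outputs, target_sizes=[(480, 640)], threshold=0.5
)
```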
-
Yih-Dar authored
* update
* update
* update
* dev-ci
* more changes
* fix
* fix
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
update docker files

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Luc Georges authored
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Marc Sun authored
* add supports_quant_method check
* fix
* add test and fix suggestions
* change logic slightly

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
-
Yih-Dar authored
* quantization CI on PRs
* fix
* fix
* add 2 members

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
pglorio authored
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position position_ids use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md

  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys"

  This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md

  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* re-convert from modular
* extended Zamba2RMSNormGated to n_groups>1
* removed einops import
* set _supports_sdpa = True
* add use_mem_eff_path flag for fused mamba2 fwd
* added docstring for use_mem_eff_path flag

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
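A hedged usage sketch for the newly registered architecture (checkpoint name assumed for illustration):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "Zyphra/Zamba2-1.2B"  # assumed checkpoint name
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)

inputs = tok("Hybrid mamba-attention models", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```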
-
Sumit Vij authored
* Fix device mismatch error in whisper feature extraction
* Set default device
* Address code review feedback

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
-
Cyril Vallez authored
* start a nice modular
* Update modular_gpt_neox.py
* Update modular_gpt_neox.py
* Update modular_gpt_neox.py
* Update modular_gpt_neox.py
* update
* Update modular_gpt_neox.py
* convert
* fix attribute
* fix attrs
* oups
* fix
* fix
* fix
* fix
* fix
* fix order to pass test (see with accelerate team)
* trigger CIs
* modular
* update
* up
* Update test_modeling_gpt_neox.py
* Update test_modeling_gpt_neox.py
* trigger CIs
* correctly pass arg
* simplify
* remove key warning
* update tp -> it's compatible since the view is before
* trigger CIs
-
Cyril Vallez authored
* Update convert_mistral_weights_to_hf.py
* Update convert_mistral_weights_to_hf.py
* update
* style
* move it to integrations
* style
* trigger CIs
* trigger CIs
-
Ryoo Kwangrok authored
* layernorm_decay_fix
* W293 fix
* ruff format fix
* black format
* ruff format
* erase last layer
* add test_get_parameter_names_rmsnorm
* rmsnorm fix
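A hedged sketch of the idea behind the fix (illustrative helper, not the Trainer's actual `get_parameter_names`): parameters inside normalization layers, including RMSNorm variants, are excluded from weight decay.

```python
import torch.nn as nn

def decay_parameter_names(model, norm_types=(nn.LayerNorm, nn.RMSNorm)):
    # nn.RMSNorm needs a recent PyTorch; the fix also matches custom variants.
    norm_prefixes = {
        name for name, mod in model.named_modules() if isinstance(mod, norm_types)
    }
    return [
        name
        for name, _ in model.named_parameters()
        if not name.endswith("bias")
        and not any(name.startswith(p + ".") for p in norm_prefixes)
    ]
```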
-
Dmitry Tarasov authored
apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582)

* apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag
* test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask
* test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token
* test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right

Co-authored-by: Eduard Allakhverdov <goncharova@airi.net>
Co-authored-by: d.tarasov <d.tarasov@airi.net>
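A hedged usage sketch of the now-consistent combination (model name illustrative; the mask also requires a chat template with `{% generation %}` markers):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
chat = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
out = tok.apply_chat_template(
    chat,
    return_dict=True,
    return_assistant_tokens_mask=True,  # needs {% generation %} in the template
    return_tensors="pt",                # the mask now comes back as a tensor too
)
```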
-
Pavel Iakubovskii authored
Updates `type().is_cuda()` -> `.is_cuda()`; `.data<>` -> `.data_ptr<>`
-
Raushan Turganbay authored
* fix rope deltas calculation
* add test
* style
-
- 03 Feb, 2025 1 commit
-
-
Alex Brooks authored
* Update granite vision model path

  Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
* Enable granite vision test

  Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
-