Commits · 045c02f2090969fe4bea88749fee95fb3aa9f3b3 · 某某某 / transformers-new

23 Jan, 2025 8 commits

[DOC] Fix contamination and missing paragraph in translation (#35851) · 045c02f2
Yosshi999 authored 5 months ago
```
Fix contamination and missing paragraph in translation
```

Granite Vision Support (#35579) · 71cc8161

Alex Brooks authored 5 months ago


* Add multimodal granite support

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

Support multiple image feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Remove failing validation for visual encoders with no cls

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Update llava based models / configs to support list of feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Add tests for multiple feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Use conditional instead of except for misaligned feature shapes

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* crop cls from each hidden state

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Fix formatting

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Support single vision feature int in vipllava

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Fix typo in vision feature selection strategy validation

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add tentative integration test for granite vision models

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add granite vision docs

Replace multimodal granite refs with granite vision

Add granite vision / llava next alias

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Use image url in granitevision example

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>


Fix more CI tests (#35661) · 8f1509a9
Arthur authored 5 months ago
```
add tooslow for the fat ones
```

Fix uploading processors/tokenizers to WandB on train end (#35701) · 0a950e0b

Jack Roberts authored 5 months ago

* rename tokenizer to processing_class in WandbCallback.on_train_end

* rename tokenizer to processing_class in ClearMLCallback and DVCLiveCallback
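The messages above only name the rename. One plausible reading of why it matters, stated here as an assumption rather than a description of the actual patch, is that a hook which still declares a `tokenizer` keyword never sees an object passed under the name `processing_class`, because the value lands in `**kwargs` instead. The functions below are hypothetical stand-ins, not the real callback methods.

```python
# Hypothetical stand-ins for a callback hook before and after the rename.
def old_on_train_end(tokenizer=None, **kwargs):
    # The caller's `processing_class` is swallowed by **kwargs,
    # so `tokenizer` keeps its default of None.
    return tokenizer


def new_on_train_end(processing_class=None, **kwargs):
    # The keyword name now matches what the caller passes.
    return processing_class


# Assumed caller behaviour: the object is passed as `processing_class`.
caller_kwargs = {"processing_class": "my-processor"}

old_result = old_on_train_end(**caller_kwargs)
new_result = new_on_train_end(**caller_kwargs)
```

Under that assumption, `old_result` is `None`, so nothing would be uploaded, while `new_result` is the object the caller supplied.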


Fix GA loss for Deepspeed (#35808) · 4ec425ff

張庭瑜 authored 5 months ago

* Fix GA loss for Deepspeed

* Turn off loss scaling in DeepSpeed engine by scale_wrt_gas

* Add comment linking to PR
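The commit message does not spell out the failure mode. One plausible reading, offered as an assumption and not as a description of the patch, is that the loss was divided by the number of gradient accumulation (GA) steps twice: once by the training loop and once more by the DeepSpeed engine. The pure-Python sketch below uses hypothetical names throughout and only illustrates the arithmetic of such double scaling.

```python
def accumulated_loss(micro_losses, trainer_scales, engine_scales):
    """Sum micro-batch losses, optionally dividing each one by the GA step count."""
    gas = len(micro_losses)  # number of gradient accumulation steps
    total = 0.0
    for loss in micro_losses:
        if trainer_scales:
            loss = loss / gas  # scaling applied by the training loop
        if engine_scales:
            loss = loss / gas  # scaling applied again by the engine
        total += loss
    return total


micro_losses = [2.0, 4.0, 6.0, 8.0]

# Scaling once reproduces the mean over the micro-batches.
single = accumulated_loss(micro_losses, trainer_scales=True, engine_scales=False)

# Scaling twice shrinks the result by a further factor of the GA step count.
double = accumulated_loss(micro_losses, trainer_scales=True, engine_scales=True)
```

With four accumulation steps, `single` equals the plain mean of the micro-batch losses, while `double` is four times smaller. Turning off one of the two scalings, as the second bullet describes for the engine side, would remove that discrepancy under this reading.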


add qwen2.5vl (#35569) · f3f6c865

ShuaiBai623 authored 5 months ago


* add qwen2.5vl

* fix

* pass check table

* add modular file

* fix style

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* padd copy check

* use modular

* fix

* fix

* fix

* update flashatt2&sdpa support_list

* Update docs/source/en/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update config

* update

* fix hf path

* rename Qwen2_5_VLVideosKwargs

* fix

* fix

* update

* executed modular

* rollback init

* fix

* formatted

* simpler init

* fix

* fix

* fix

* fix

* fix

* update docs

* fix

* fix

* update Qwen2VLRotaryEmbedding for yarn

* fix

---------

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: gewenbin0992 <gewenbin292@163.com>
Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>


[Backend support] Allow `num_logits_to_keep` as Tensor + add flag (#35757) · d3af76df
Cyril Vallez authored 5 months ago
```
* support

* Update modeling_utils.py

* style

* most models

* Other models

* fix-copies

* tests + generation utils
```
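The title indicates that `num_logits_to_keep` can be either an integer or a tensor. A minimal sketch of how one argument could cover both cases, using plain Python lists in place of tensors: `select_positions` is a hypothetical helper, not the library's code, and treating `0` as "keep every position" is an assumption about the intended convention.

```python
def select_positions(hidden_states, num_logits_to_keep=0):
    """Pick the sequence positions for which logits would be computed."""
    if isinstance(num_logits_to_keep, int):
        # An int keeps the last k positions. Because -0 == 0,
        # slice(-0, None) is slice(0, None), so 0 keeps everything.
        return hidden_states[slice(-num_logits_to_keep, None)]
    # Otherwise treat the argument as explicit position indices
    # (a 1D tensor in a real model, a list in this sketch).
    return [hidden_states[i] for i in num_logits_to_keep]


states = ["h0", "h1", "h2", "h3"]

last_only = select_positions(states, 1)
everything = select_positions(states, 0)
picked = select_positions(states, [0, 2])
```

The same call site handles "last token only" during generation and arbitrary positions when an index collection is passed.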

[`tests`] remove some flash attention class tests (#35817) · 8736e91a
Arthur authored 5 months ago
```
remove class from tests
```

22 Jan, 2025 12 commits
- Fix NoneType type as it requires py>=3.10 (#35843) · 2c3a44f9
  Marc Sun authored 5 months ago
```
fix type
```
- Add PyTorch version check for FA backend on AMD GPUs (#35813) · fdcc62c8
  Mohit Sharma authored 5 months ago
```
Disable FA backend for SDPA on AMD GPUs (PyTorch < 2.4.1)
```
- Fix compatibility issues when using auto_gptq with these older versions (#35830) · 3b977058
  LRL-ModelCloud authored 5 months ago
```
In optimum versions below 1.23.99, the convert_model method only accepts a single nn.Module model parameter.
```
- [chat] docs fix (#35840) · 62bd8394
  Joao Gante authored 5 months ago
```
docs fix
```
- Fix `head_dim` in config extracted from Gemma2 GGUF model (#35818) · 487e2f63
  Isotr0py authored 5 months ago
```
fix gemma2 head dim

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
```
- [Chat] Add Chat from TRL 🐈 (#35714) · b3d67224
  Joao Gante authored 5 months ago
```
* tmp commit

* add working chat

* add docs

* docs 2

* use auto dtype by default
```
- Fix: Nemotron tokenizer for GGUF format (#35836) · a7738f5a
  Mohamed Mekkouri authored 5 months ago
```
fix nemotron gguf
```
- [pipeline] missing import regarding assisted generation (#35752) · ec28957f
  Joao Gante authored 5 months ago
```
missing import
```
- [gpt2] fix generation tests (#35822) · 36c9181f
  Joao Gante authored 5 months ago
```
fix gpt2 generation tests
```
- Hotfix: missing `working-directory` in `self-comment-ci.yml` (#35833) · f439e28d
  Yih-Dar authored 5 months ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Init cache on meta device (#35164) · 373e50e9
  Raushan Turganbay authored 5 months ago
```
* init cache on meta device

* offloaded static + enable tests

* tests weren't running before  :(

* update

* fix mamba

* fix copies

* update

* address comments and fix tests

* fix copies

* Update src/transformers/cache_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update

* mamba fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
```
- Another security patch for `self-comment-ci.yml` (#35816) · 870e2c8e
  Yih-Dar authored 5 months ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
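A note on the `NoneType` fix (#35843) in the 22 Jan list: `types.NoneType` can only be imported on Python 3.10 and later. A version-independent way to obtain the same object is `type(None)`. The sketch below illustrates that general technique; it is not taken from the patch, and `is_none_or_int` is a hypothetical helper.

```python
import sys

# type(None) is available on every Python 3 version.
NoneType = type(None)

if sys.version_info >= (3, 10):
    # On newer interpreters the stdlib name refers to the same object.
    import types

    assert types.NoneType is NoneType


def is_none_or_int(value):
    """Example check that needs NoneType as a real class."""
    return isinstance(value, (int, NoneType))
```

Code written this way imports cleanly on interpreters older than 3.10, which is the compatibility concern named in the commit title.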

21 Jan, 2025 20 commits

Remove pyav pin to allow python 3.11 to be used (#35823) · f4f33a20

CalOmnie authored 5 months ago


* Remove pyav pin to allow python 3.11 to be used

* Run make fixup

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>


Remove old `benchmark` code (#35730) · 90b46e98

Joao Gante authored 5 months ago

* remove traces of the old deprecated benchmarks

* also remove old tf benchmark example, which uses deleted code

* run doc builder


[Mimi] update test expected values for t4 runners (#35696) · 870eb7b4
eustlb authored 5 months ago
```
update values for t4
```

Improve modular documentation (#35737) · 8ac851b0

Cyril Vallez authored 5 months ago

* start a nice doc

* keep improving the doc

* Finalize doc

* Update modular_transformers.md

* apply suggestion


add Qwen2-VL image processor fast (#35733) · 107f9f51

Yoni Gozlan authored 5 months ago

* add qwen2_vl image processor fast

* add device to ImagesKwargs

* remove automatic fix copies

* fix fast_is_faster_than_slow

* remove unnecessary import


move fastspeech to audio models (#35788) · 3df90103
eustlb authored 5 months ago


[i18n-ar] Translated file: `docs/source/ar/tasks/masked_language_modeling.md` into Arabic (#35198) · 741d5523

Ahmed Almaghz authored 5 months ago


* Add Arabic translation: masked_language_modeling.md

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

* Add language_modeling.md

* Add Sequence_classifiation.md

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>


Optimized set_initialized_submodules. (#35493) · 568941bf
v2ray authored 5 months ago

Remove deprecated `get_cached_models` (#35809) · 7051c5fc
Lucain authored 5 months ago
```
* Remove deprecated get_cached_models

* imports
```

Fixed typo in autoawq version number in an error message for IPEX backend requirements. (#35815) · 97fbaf08
InfroLab authored 5 months ago
```
Fixed typo in version number for IPEX backend required minimal autoawq version
```

Fix: BLOOM tie_word_embeddings in GGUF (#35812) · dbd84741
Mohamed Mekkouri authored 5 months ago
```
* fix bloom ggml

* fix falcon output

* make style
```

Auto-add `timm` tag to timm-wrapper models. (#35794) · 678bd7f1

Pedro Cuenca authored 5 months ago

Works for fine-tuned or exported models:

```py
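from transformers import AutoModelForImageClassification

# Load a timm checkpoint through the transformers wrapper
checkpoint = "timm/vit_base_patch16_224.augreg2_in21k_ft_in1k"
model = AutoModelForImageClassification.from_pretrained(checkpoint)

# Push to the Hub; the `timm` tag is now added automatically
model.push_to_hub("pcuenq/tw1")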
```

The uploaded model will now show snippets for both the timm and the
transformers libraries.


Support adamw_torch_8bit (#34993) · dc10f790
fzyzcjy authored 5 months ago
```
* var

* more

* test
```

add a new flax example for Bert model inference (#34794) · f82b19cb

Louie Tsai authored 5 months ago


* add a new example for flax inference cases

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix for "make fixup"

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


[Doc] Adding blog post to model doc for `TimmWrapper` (#35744) · edbabf6b

Aritra Roy Gosthipaty authored 5 months ago


* adding blog post to model doc

* Update docs/source/en/model_doc/timm_wrapper.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* review suggestions

* review suggestions

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


Byebye `test_batching_equivalence`'s flakiness (#35729) · fd8d61fd

Yih-Dar authored 5 months ago


* fix

* fix

* skip

* better error message

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Add LlavaImageProcessor (#33191) · 78f5ee02

NielsRogge authored 5 months ago


* First draft

* Add equivalence test

* Update docstrings

* Add tests

* Use numpy

* Fix tests

* Improve variable names

* Improve docstring

* Add link

* Remove script

* Add copied from

* Address comment

* Add note in docs

* Add docstring, data format

* Improve test

* Add test

* update

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* loop once only

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>


Update AMD Docker image (#35804) · 8e4cedd9
ivarflakstad authored 5 months ago


Fix "test_chat_template_dict" in video LLMs (#35660) · 705aeaaa

Raushan Turganbay authored 5 months ago


* fix "test_chat_template_dict" in llava_onevision

* Update src/transformers/models/llava_next_video/processing_llava_next_video.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* get one video called once

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>


Deterministic sorting in modular converter when adding new functions (#35795) · e867b974
Cyril Vallez authored 5 months ago
```
deterministic sort
```
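The message gives no detail, so what follows is a generic illustration and not the converter's code. In Python, iteration order over a set of strings can differ between interpreter runs because string hashing is randomized by default, so output generated from a set is only reproducible if it is sorted first. `new_functions` and `emit_in_stable_order` are hypothetical names.

```python
def emit_in_stable_order(new_functions):
    """Return function names in an order that does not depend on set iteration."""
    # sorted() accepts any iterable and returns a list ordered by the
    # names themselves, independent of how the set happens to be laid out.
    return sorted(new_functions)


# A set has no guaranteed iteration order for strings across runs.
new_functions = {"rotate_half", "apply_rotary_pos_emb", "repeat_kv"}

ordered = emit_in_stable_order(new_functions)
```

Sorting before writing keeps generated files byte-identical across runs, which avoids spurious diffs when the generator is re-executed.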