- 13 Feb, 2025 11 commits
-
Lysandre Debut authored
* Helium documentation fixes
* Update helium.md
* Update helium.md
* Update helium.md
-
Thomas Bauwens authored
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
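For context, a minimal usage sketch of the newly exported collator, assuming it mirrors the docs' implementation this commit consolidates; the checkpoint name and toy features below are illustrative, not taken from the PR.

```python
# Hedged sketch: using DataCollatorForMultipleChoice now that it is part of the
# public import structure. Checkpoint name and toy features are illustrative.
from transformers import AutoTokenizer, DataCollatorForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)

# Each example carries one tokenized sequence per answer choice; the collator
# pads and stacks them into (batch_size, num_choices, seq_len) tensors.
features = [
    {
        "input_ids": [[101, 2023, 102], [101, 2008, 2003, 102]],  # two choices
        "attention_mask": [[1, 1, 1], [1, 1, 1, 1]],
        "label": 0,
    }
]
batch = collator(features)
print(batch["input_ids"].shape)  # expected: torch.Size([1, 2, 4])
```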
-
CL-ModelCloud authored
* Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* add tokenizer class type test
* code review
* code opt
* fix bug
* Update test_tokenization_fast.py
* ruff check
* make style
* code opt
* Update test_tokenization_fast.py

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
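A quick way to see what this fix protects, as a sketch; the checkpoint, output directory, and expected class string are illustrative.

```python
# Sketch: after save_pretrained, tokenizer_config.json should record the tokenizer class
# so the tokenizer reloads as the same type. Names below are illustrative.
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokenizer.save_pretrained("saved_tokenizer")

with open("saved_tokenizer/tokenizer_config.json") as f:
    config = json.load(f)
print(config["tokenizer_class"])  # e.g. "BertTokenizer" for this checkpoint
```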
-
Marco Edward Gorelli authored
-
gewenbin0992 authored
* qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1
* fix
* fix
* fix
* fix
* add tests
* fix test bugs
* fix
* fix failed tests
* fix
-
Pavel Iakubovskii authored
* Trigger tests
* [run-slow] beit, detr, dinov2, vit, textnet
* Fix BEiT interpolate_pos_encoding
* Fix DETR test
* Update DINOv2 test
* Fix textnet
* Fix vit
* Fix DPT
* fix data2vec test
* Fix textnet test
* Update interpolation check
* Fix ZoeDepth tests
* Update interpolate embeddings for BEiT
* Apply suggestions from code review
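The feature these tests exercise, sketched with ViT; the checkpoint and the 384×384 input size are illustrative.

```python
# Sketch: feed images at a resolution other than the pretraining size and let the model
# interpolate its position embeddings. Checkpoint and input size are illustrative.
import torch
from transformers import ViTModel

model = ViTModel.from_pretrained("google/vit-base-patch16-224")
pixel_values = torch.randn(1, 3, 384, 384)  # larger than the 224x224 pretraining size

with torch.no_grad():
    outputs = model(pixel_values, interpolate_pos_encoding=True)
print(outputs.last_hidden_state.shape)  # (1, 577, 768): 24*24 patches + CLS token
```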
-
Lucain authored
-
Nerogar authored
fix gemma2 dtype issue when storing weights in float16 precision
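For context, a hedged sketch of the setting the fix targets; the checkpoint name is illustrative.

```python
# Sketch: loading Gemma 2 with weights stored in float16, the configuration in which
# the dtype issue appeared. Checkpoint name is illustrative.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    torch_dtype=torch.float16,  # half-precision weight storage
)
print(next(model.parameters()).dtype)  # torch.float16
```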
-
Ben Schneider authored
* update env command to log deepspeed version
* suppress deepspeed import logging
* Add reminder to include configs to repro description in bug report.
* make fixup
* [WIP] update import utils for deepspeed
* Change to using is_deepspeed_available() from integrations.
* make fixup
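The availability check this commit switches to, sketched outside the env command; the version-logging code around it is illustrative.

```python
# Sketch of the check the env command now relies on; the printing around it is illustrative.
from transformers.integrations import is_deepspeed_available

if is_deepspeed_available():
    import deepspeed
    print("DeepSpeed version:", deepspeed.__version__)
else:
    print("DeepSpeed version: not installed")
```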
-
Sambhav Dixit authored
* change order of unmasking of tokens
* library import
* class setup
* test function
* refactor
* add commit message
* test modified
* explicit initialisation of weights + made model smaller
* removed separate testing file
* fixup
* fixup core
* test attention mask with token types
* tests fixup
* removed PaliGemmaAttentionMaskTest class

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
-
Benjamin Badger authored
* pixel input assignment revoked
* double send
* Update src/transformers/models/mllama/modeling_mllama.py
  Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
-
- 12 Feb, 2025 20 commits
-
ivarflakstad authored
Add git lfs to AMD docker image
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix
* fix
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Zach Mueller authored
* Add more rigorous non-slow grad accum tests
* Further nits
* Re-add space
* Readability
* Use tinystories instead
* Revert transformer diff
* tweak thresholds
-
Ke Wen authored
Update doc about models' TP support
-
hsilva664 authored
* Adding option to save/reload scaler
* Removing duplicate variable
* Adding save/reload test
* Small fixes on deterministic algorithm call
* Moving LLM test to another file to isolate its environment
* Moving back to old file and using subprocess to run test isolated
* Reverting back accidental change
* Reverting back accidental change
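The mechanism behind the new save/reload option, as a standalone sketch rather than the Trainer's actual checkpoint code; the file path is illustrative.

```python
# Sketch: a GradScaler's state can be persisted and restored with state_dict()/load_state_dict(),
# which is what saving/reloading the scaler across checkpoints relies on. Path is illustrative.
import torch

scaler = torch.cuda.amp.GradScaler()
# ... training steps using scaler.scale(loss).backward(), scaler.step(optimizer), scaler.update() ...

torch.save(scaler.state_dict(), "scaler.pt")      # save next to the checkpoint
scaler.load_state_dict(torch.load("scaler.pt"))   # restore when resuming
```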
-
kang sheng authored
* Fix multi gpu loss sync condition, add doc and test
* rename function and class
* loss should not scale during inference
* fix typo
-
zhuHQ authored
* Added APOLLO optimizer integration
* fix comment
* Remove redundancy: Modularize low-rank optimizer construction
* Remove redundancy: Remove useless comment
* Fix comment: Add typing
* Fix comment: Rewrite apollo desc
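A hedged sketch of opting into the integration through TrainingArguments; the optim string "apollo_adamw" and the output directory are assumptions for illustration, not confirmed by the commit message.

```python
# Hedged sketch: selecting the APOLLO optimizer via TrainingArguments.
# The optim string "apollo_adamw" is an assumed name, not confirmed by the commit message.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="apollo_adamw",  # assumed APOLLO optimizer key
    learning_rate=1e-4,
)
```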
-
Dmitry Rogozhkin authored
* multi-gpu: fix inputs_embeds + position_embeds
  Fixing the following error in a few models:
  ```
  > hidden_states = inputs_embeds + pos_embeds
  E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
  ```
  Fixes: #35762
* multi-gpu: fix tensor device placements for various models
  Fixes: #35762
* Apply make fix-copies

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
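The general pattern behind this class of fix, as a sketch rather than the exact model code.

```python
# Sketch: under multi-GPU model parallelism, inputs_embeds and position embeddings can land
# on different devices; aligning them before the add avoids the RuntimeError quoted above.
import torch

def add_pos_embeds(inputs_embeds: torch.Tensor, pos_embeds: torch.Tensor) -> torch.Tensor:
    pos_embeds = pos_embeds.to(inputs_embeds.device)  # move onto the embeddings' device
    return inputs_embeds + pos_embeds
```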
-
Lucain authored
* Remove cache migration script
* remove dummy move_cache
-
dependabot[bot] authored
Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142)

Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bump transformers in /examples/research_projects/vqgan-clip

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Leon Engländer authored
Replace In-Place Operations for Deberta and Deberta-V2
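A hedged illustration of the kind of change described, not the DeBERTa source itself: an in-place tensor op replaced by its out-of-place counterpart, which is friendlier to autograd and tracing/export.

```python
# Illustrative only: out-of-place masked_fill instead of in-place masked_fill_.
import torch

x = torch.randn(4, 8, requires_grad=True)
mask = torch.zeros(4, 8, dtype=torch.bool)

# In-place style (the kind of op being replaced):
# x.masked_fill_(mask, 0.0)

# Out-of-place style (the replacement):
y = x.masked_fill(mask, 0.0)
print(y.shape)
```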
-
Joao Gante authored
rm deprecated/inoperative commands
-
Raushan Turganbay authored
* fix cached tests
* fix some tests
* fix pix2struct
* fix
-
Sambhav Dixit authored
* Reload transformers fix from cache
* add imports
* add test fn for clearing import cache
* ruff fix to core import logic
* ruff fix to test file
* fixup for imports
* fixup for test
* lru restore
* test check
* fix style changes
* added documentation for use case
* fixing

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
-
Joao Gante authored
* remove redundant test
* delete another test
* revert default max_length
* (wrong place, moving)
-
MilkClouds authored
* feat: added warning to Trainer when label_names is not specified for PeftModel
* Update trainer.py
* feat: peft detect with `_is_peft_model`
* Update src/transformers/trainer.py
  Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Applied formatting in trainer.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
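What the new warning asks for, sketched with TrainingArguments; the label key and output directory are illustrative.

```python
# Sketch: when the model passed to Trainer is a PeftModel, label_names is not inferred
# from the base model's signature, so set it explicitly. Values are illustrative.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    label_names=["labels"],  # explicit labels for the PEFT-wrapped model
)
# Trainer(model=peft_model, args=args, ...) would then avoid the new warning.
```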
-
nhamanasu authored
* add RAdamScheduleFree optimizer
* revert schedulefree version to the minimum requirement
* refine is_schedulefree_available so that it can take min_version
* refine documents

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Harry Mellor authored
* Add `base_model_pp_plan` to `PretrainedConfig`
* Add `_pp_plan` to `PreTrainedModel`
* Add both to Llama for testing
* Fix type error
* Update to suggested schema
* `_pp_plan` keys are not patterns
* Simplify schema
* Fix typing error
* Update input name for Llama
* Add pp plan to Aria
* Add pp plan to Bamba
* Add pp plan to Cohere 1 & 2
* Add pp plan to diffllama and emu3
* Add pp plan to Gemma 1 & 2
* Add pp plan to GLM and GPT NeoX
* Add pp plan to Granite and Helium
* Add pp plan to Mistral and Mixtral
* Add pp plan to OLMo 1 & 2
* Add pp plan to Phi and Phi 3
* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL
* Add pp plan for Starcoder 2
* Add enum for accessing inputs and outputs
* Update type hints to use tuples
* Change outer list to tuple

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
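A purely illustrative sketch of what a per-module pipeline-parallel plan could look like under this change, mapping module names to (inputs, outputs) tuples; the keys, argument names, and exact schema here are assumptions, not the merged schema.

```python
# Hypothetical illustration only; not the schema merged in this PR.
base_model_pp_plan = {
    "embed_tokens": (("input_ids",), ("inputs_embeds",)),
    "layers": (("hidden_states", "attention_mask"), ("hidden_states",)),
    "norm": (("hidden_states",), ("hidden_states",)),
}
print(base_model_pp_plan["layers"])
```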
-
- 11 Feb, 2025 9 commits
-
Fanli Lin authored
* update awq doc
* Update docs/source/en/quantization/awq.md
* Update docs/source/en/quantization/awq.md
* Update docs/source/en/quantization/awq.md
* Update docs/source/en/quantization/awq.md
* add note for inference

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Fanli Lin authored
fix
-
Sambhav Dixit authored
* make output_dir optional
* initiated a basic testing module to validate and verify the changes
* Test output_dir default to 'tmp_trainer' when unspecified.
* test existing functionality of output_dir.
* test that output dir only created when needed
* final check
* added doc string and changed the tmp_trainer to trainer_output
* make style fixes to test file.
* another round of fixup

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
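The new behavior, sketched; the default directory name is taken from the commit's "trainer_output" wording and should be treated as indicative.

```python
# Sketch: TrainingArguments can now be constructed without output_dir.
from transformers import TrainingArguments

args = TrainingArguments()   # no output_dir required any more
print(args.output_dir)       # expected default, per the commit: "trainer_output"
```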
-
Arthur authored
-
Pablo Montalvo authored
* make explicit gpu dep
* [run-slow] bamba
-
Hicham Tala authored
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
* Remove deprecated warnings and eliminate `max_size` usage
* Test use `int` as argument for `size`
  Add a test to ensure test can pass successfully and backward compatibility
* The test pipelines still use `max_size`
  Remove `max_size` from test pipelines and replace by `size` by a `Dict` with `'shortest_edge'` `'longest_edge'` as keys
* Reformatting
* Reformatting
* Revert "Reformatting"
  This reverts commit c3040acee75440357cffd1f60c9d29ff5b2744b8.
* Revert "Reformatting"
  This reverts commit ac4522e5c9a02d2d0c298295026db68ea26453df.
* Revert "The test pipelines still use `max_size`"
  This reverts commit eaed96f041ffc32459536e1524d87f7a12ddee29.
* Revert "Test use `int` as argument for `size`"
  This reverts commit 1925ee38c7c5eabb11832316712df1d4ba8043d0.
* Revert "Remove deprecated warnings and eliminate `max_size` usage"
  This reverts commit d8e7e6ff9025931468fc1f3827cda1fa391003d5.
* Change version `4.26` to "a future version"
* Reformatting
* Revert "Change version `4.26` to "a future version""
  This reverts commit 2b53f9e4
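The size-dict pattern the tests move to, in place of the removed max_size, sketched with an assumed DETR checkpoint and illustrative values.

```python
# Sketch: pass size as a dict with "shortest_edge"/"longest_edge" instead of max_size.
# Checkpoint, image, and edge values are illustrative.
import numpy as np
from PIL import Image
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50")
image = Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8))  # placeholder image

inputs = processor(
    images=image,
    size={"shortest_edge": 800, "longest_edge": 1333},  # replaces the deprecated max_size
    return_tensors="pt",
)
print(inputs["pixel_values"].shape)
```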
-
湛露先生 authored
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
-
Maxim Evtush authored
* Update tools.py
* Update text_generation.py
* Update question_answering.py
-
Pavel Iakubovskii authored
* Add is_torch_greater_or_equal test decorator
* Add common test for torch.export
* Fix bit
* Fix focalnet
* Fix imagegpt
* Fix seggpt
* Fix swin2sr
* Enable torch.export test for vision models
* Enable test for video models
* Remove json
* Enable for hiera
* Enable for ijepa
* Fix detr
* Fix conditional_detr
* Fix maskformer
* Enable test maskformer
* Fix test for deformable detr
* Fix custom kernels for export in rt-detr and deformable-detr
* Enable test for all DPT
* Remove custom test for deformable detr
* Simplify test to use only kwargs for export
* Add comment
* Move compile_compatible_method_lru_cache to utils
* Fix beit export
* Fix deformable detr
* Fix copies data2vec<->beit
* Fix typos, update test to work with dict
* Add seed to the test
* Enable test for vit_mae
* Fix beit tests
* [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr
* Add vitpose test
* Add textnet test
* Add dinov2 with registers
* Update tests/test_modeling_common.py
* Switch to torch.testing.assert_close
* Fix maskformer
* Remove save-load from test
* Add dab_detr
* Add depth_pro
* Fix and test RT-DETRv2
* Fix dab_detr
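The kind of check the new common test performs, sketched with ViT; the checkpoint and input size are illustrative, and the export call assumes a torch version that ships torch.export.

```python
# Sketch: export a vision model with torch.export using kwargs only (as the simplified test does).
# Checkpoint and input shape are illustrative.
import torch
from transformers import ViTModel

model = ViTModel.from_pretrained("google/vit-base-patch16-224").eval()
pixel_values = torch.randn(1, 3, 224, 224)

exported = torch.export.export(model, args=(), kwargs={"pixel_values": pixel_values})
print(exported.graph_signature)  # inspect the exported program's inputs/outputs
```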
-