- 24 Mar, 2025 20 commits
-
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
gautham authored
* Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests. Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user.
* formatting issues
* Used better way to generate seed in TF. Made tests more consistent.
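The value of a user-supplied seed and non-hardcoded probabilities can be illustrated with a minimal, self-contained sketch (plain `random`, not the actual `DataCollatorForWholeWordMask` implementation; `whole_word_mask` and its parameters are illustrative):

```python
import random

def whole_word_mask(words, mask_probability=0.15, seed=None):
    # A user-supplied seed makes mask selection reproducible, and the
    # probability is a parameter rather than a hardcoded constant.
    rng = random.Random(seed)
    return ["[MASK]" if rng.random() < mask_probability else w for w in words]

words = "the quick brown fox jumps over the lazy dog".split()
# Same seed, same mask pattern:
assert whole_word_mask(words, seed=0) == whole_word_mask(words, seed=0)
```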
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Yih-Dar authored
* fix
* fix

---------

Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Pavel Iakubovskii authored
* Fix pytorch path for DeformableAttention
* Apply for GroundingDino
-
Arthur Zucker authored
-
cyyever authored
Use pyupgrade --py39-plus to improve code
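As an illustration of the kind of rewrite `pyupgrade --py39-plus` performs, PEP 585 builtin generics replace `typing.List`/`typing.Dict` (a generic example, not a diff from this repository):

```python
# After pyupgrade --py39-plus: builtin generics (PEP 585) instead of
# typing.List / typing.Dict, so no `from typing import ...` is needed.
def count_tokens(texts: list[str]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for text in texts:
        for tok in text.split():
            counts[tok] = counts.get(tok, 0) + 1
    return counts

print(count_tokens(["a b a"]))  # {'a': 2, 'b': 1}
```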
-
Ethan Knights authored
* Update trainer_pt_utils.py
* update docstrings trainer_pt_utils.py for consistency
* Update src/transformers/trainer_pt_utils.py

---------

Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com>
-
omahs authored
* fix typos
* fix typos
* fix typos
* fix typos
-
Yih-Dar authored
* fix
* fix
* fix
* fix

---------

Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Mohamed Mekkouri authored
fix
-
Raushan Turganbay authored
* [chameleon] fix num image token check
* embed after merging image token
* skip this also
* mistral require_read_token
-
Dmitry Rogozhkin authored
tests: fix asyncio.wait() usage for python>=3.7

Passing coroutines directly to `asyncio.wait()` is deprecated since Python 3.8 and removed starting from Python 3.11. Instead, it's required to explicitly wrap each coroutine in a task with `asyncio.create_task()`, which first appeared in Python 3.7.

We step into this issue running the following Transformers tests on a system with Python 3.11 or later (for example, Ubuntu 24.04 has Python 3.12):

* `tests/trainer/test_trainer_distributed.py`
* `tests/extended/test_trainer_ext.py`

The error will be:
```
src/transformers/testing_utils.py:2380: in execute_subprocess_async
    result = loop.run_until_complete(
/usr/lib/python3.12/asyncio/base_events.py:687: in run_until_complete
    return future.result()
src/transformers/testing_utils.py:2368: in _stream_subprocess
    await asyncio.wait(
...
E TypeError: Passing coroutines is forbidden, use tasks explicitly.
```

See: https://docs.python.org/3.10/library/asyncio-task.html#asyncio.wait
See: https://docs.python.org/3.7/library/asyncio-task.html#asyncio.create_task

Signed-off-by:
Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
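The required change can be sketched in a minimal standalone form (illustrative coroutine, not the Transformers `_stream_subprocess` code):

```python
import asyncio

async def _work(x):
    return x * 2

async def main():
    # On Python >= 3.11, passing coroutines directly to asyncio.wait()
    # raises TypeError; wrap each one in a Task first.
    tasks = [asyncio.create_task(_work(i)) for i in range(3)]
    done, _pending = await asyncio.wait(tasks)
    return sorted(t.result() for t in done)

print(asyncio.run(main()))  # [0, 2, 4]
```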
-
XinyuanTong authored
[fix] Update optional keys in _validate_yarn_parameters to include original_max_position_embeddings
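The shape of such a required-vs-optional key validation can be sketched as follows (key names are illustrative of `rope_scaling`-style configs; this is not the actual `_validate_yarn_parameters`):

```python
# Hypothetical sketch: required vs. optional rope_scaling keys, with
# `original_max_position_embeddings` accepted as an optional key.
REQUIRED_KEYS = {"rope_type", "factor"}
OPTIONAL_KEYS = {"attention_factor", "beta_fast", "beta_slow",
                 "original_max_position_embeddings"}

def validate_yarn_parameters(rope_scaling):
    received = set(rope_scaling)
    missing = REQUIRED_KEYS - received
    unexpected = received - REQUIRED_KEYS - OPTIONAL_KEYS
    if missing or unexpected:
        raise KeyError(f"missing={sorted(missing)}, unexpected={sorted(unexpected)}")
```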
-
Raushan Turganbay authored
fix
-
AbdelKarim ELJANDOUBI authored
* fix Gemma3 Config
* fix config in modular gemma3
-
- 21 Mar, 2025 20 commits
-
-
Aritra Roy Gosthipaty authored
* Update installation.md
* Update README.md
-
Steven Liu authored
* initial
* fix
* fix
* update
* fix
* fixes
* quantization
* attention mask visualizer
* multimodal
* small changes
* fix code samples
-
Yoni Gozlan authored
* process flattened images in fast image proc
* process flattened images in slow proc and add tests
* remove print
* add unbalanced batch test fast image proc
* fix integration tests
-
Cyril Vallez authored
* better regex everywhere
* fix
* Update test_modeling_instructblip.py
* BC with explanations this time otherwise it makes no sense at all
* Update test_modeling_instructblip.py
* style
* CIs
* update _keep_in_fp32_modules in blip2
* Update modeling_utils.py
* Update modeling_utils.py
* style
* CIs
* add check
* trigger CIs
* Update modeling_utils.py
* trigger CIs
-
Sukriti Sharma authored
* move loss to generation class
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* code cleanup
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* test for resize and loss computation
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix tests
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix: test for resize and loss
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix resize embedding mllama test
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* review changes
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by:
Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
-
Arthur Zucker authored
-
Raushan Turganbay authored
* fix
* this wasn't supposed to be here, revert
* refine tests a bit more
-
Pablo Montalvo authored
fix attention mask dtype + outputs type
-
Daniël de Kok authored
* Use `deformable_detr` kernel from the Hub

Remove the `deformable_detr` kernel from `kernels/` and use the pre-built kernel from the Hub instead.

* Add license header

* Add `kernels` as an extra `hub-kernels`

Also add it to `testing`, so that the kernel replacement gets tested when using CUDA in CI.
-
Pablo Montalvo authored
tests expect greedy decoding
-
Pablo Montalvo authored
🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing (#36859)

* supersede paligemma forward to shift pos id indexing
* fix prepare_inputs_ as well
* fix modular error

---------

Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Arthur Zucker authored
-
Joao Gante authored
-
sebbaur authored
* Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output
* Add documentation and allow functions as activations (instead of just string)
* formatting change
* Use ACT2FN
* Formatting change
* Formatting changes
* force pooler_act to be string
* force pooler_act to be string
* Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy
* Making the same change in ijepa to make check_modular_conversion happy
* Add IJepaConfig to make CI happy
* rename pooler_size to pooler_output_size as defined in the config
* typo
* revert change to ignore variable
* Ran utils/check_docstrings.py --fix_and_overwrite
* revert unrelated change
* remove redundant defaults
* rename self.act -> self.activation
* tanh activation function in mapping
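A minimal sketch of the configurable-pooler idea (simplified `ACT2FN`-style string-to-callable mapping and made-up class names, not the actual ViT code):

```python
import math

# Simplified stand-in for Transformers' ACT2FN mapping; the real mapping
# covers many more activation functions.
ACT2FN = {"tanh": math.tanh, "identity": lambda x: x}

class PoolerConfig:
    # pooler_act is a string looked up in ACT2FN; pooler_output_size
    # controls the number of output channels.
    def __init__(self, pooler_act="tanh", pooler_output_size=4):
        self.pooler_act = pooler_act
        self.pooler_output_size = pooler_output_size

class Pooler:
    def __init__(self, config):
        self.activation = ACT2FN[config.pooler_act]
        self.output_size = config.pooler_output_size

    def __call__(self, features):
        # Reduce (here: truncate) to output_size, then apply the activation.
        return [self.activation(v) for v in features[: self.output_size]]
```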
-
Afanti authored
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* fix: format codes
* chore: fix copy mismatch issue
* fix: format codes
* chore: fix copy mismatch issue
* chore: fix copy mismatch issue
* chore: fix copy mismatch issue
* chore: restore previous words
* chore: revert unexpected changes
-
regisss authored
-
Benjamin Bossan authored
The _fsdp_qlora_plugin_updates checks for LoraConfig, but other PEFT methods can also support quantized models, e.g. VeRA. Therefore, the isinstance check is now looking for PeftConfig in general.

Moreover, the fsdp_plugin variable may be undefined in the 2nd if condition, leading to an `UnboundLocalError`. This is fixed by not assigning the variable at all.

I checked for tests that may need updating but only found test_fsdp_config_transformers_auto_wrap associated with this change. AFAICT, this test does not cover the changed code, since the test does not start the training loop. Therefore, I haven't updated any tests. LMK if/how this fix should be tested.

Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com>
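Both issues can be reduced to small sketches (hypothetical class and function names, not the real Transformers/PEFT code):

```python
# Hypothetical config hierarchy mirroring the description above.
class PeftConfig: ...
class LoraConfig(PeftConfig): ...
class VeraConfig(PeftConfig): ...  # another PEFT method supporting quantized models

def supports_quantized_peft(config):
    # Checking against the base PeftConfig instead of only LoraConfig
    # lets methods such as VeRA pass the check too.
    return isinstance(config, PeftConfig)

def buggy_branch():
    is_lora = False
    if is_lora:
        fsdp_plugin = object()  # only assigned on this branch
    # Referencing the name when the branch didn't run raises UnboundLocalError.
    return fsdp_plugin is not None
```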
-
Joao Gante authored
* no image
* test
* revert jax version updates
* make fixup
* update autodoc path for model_addition_debugger
* shieldgemma2
* add missing pages to toctree
-
Raushan Turganbay authored
* fix mllama
* update test
* fix test
-