Commits · run_amd_scheduled_ci_caller_testing1 · zhusg / transformers-new

06 Dec, 2024 1 commit

Revert deletion of self-push-amd.yml for now · 6054220b
Ivar Flakstad authored 7 months ago

05 Dec, 2024 12 commits

Merge branch 'main' into secure-amd-ci · 2a938385
ivarflakstad authored 7 months ago

Adaptive dynamic number of speculative tokens (#34156) · e27465c8

Jonathan Mamou authored 7 months ago


* initial commit

* update strategy

* add tradeoff FPR TPR with cost

* all probs

* fix

* fix

* fix style

* Update src/transformers/generation/configuration_utils.py

shorter docstring

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* import guard

* fix style

* add is_sklearn_available condition

* vectorizing to flatten the for-loop

* fix style

* disable adaptation for UAG

* update doc

* add TestAssistedCandidateGeneratorUpdateStrategy

* fix style

* protect import

* fix style

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

Fix flaky Hub CI (`test_trainer.py`) (#35062) · b0a51e5c

Yih-Dar authored 7 months ago


* fix

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* check

* check

* check

* check

* check

* check

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* check

* check

* check

* Final space

* Final adjustment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>

[`trainer`] fix the GA `model_accepts_loss_kwargs` (#34915) · a928d9c1
Arthur authored 7 months ago
```
* fix

* style

* values

* fix
```
BLIP: this is correct now (#35081) · e682c17e
Raushan Turganbay authored 7 months ago
```
this is correct now
```

Add I-JEPA (#33125) · 50189e36

João Marcelo authored 7 months ago

* first draft

* add IJepaEmbeddings class

* fix copy-from for IJepa model

* add weight conversion script

* update attention class names in IJepa model

* style changes

* Add push_to_hub option to convert_ijepa_checkpoint function

* add initial tests for I-JEPA

* minor style changes to conversion script

* make fixup related

* rename conversion script

* Add I-JEPA to sdpa docs

* minor fixes

* adjust conversion script

* update conversion script

* adjust sdpa docs

* [run_slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* formatting issues

* adjust modeling to modular code

* add IJepaModel to objects to ignore in docstring checks

* [run-slow] ijepa

* fix formatting issues

* add usage instruction snippet to docs

* change pos encoding, add checkpoint for doc

* add verify logits for all models

* [run-slow] ijepa

* update docs to include image feature extraction instructions

* remove pooling layer from IJepaModel in image classification class

* [run-slow] ijepa

* remove pooling layer from IJepaModel constructor

* update docs

* [run-slow] ijepa

* [run-slow] ijepa

* small changes

* [run-slow] ijepa

* style adjustments

* update copyright in init file

* adjust modular ijepa

* [run-slow] ijepa

Deprecate quanto and switch to optimum-quanto (#35001) · 95a855e2
Mohamed Mekkouri authored 7 months ago
```
* deprecate quanto

* fix style
```

Fix `tie_word_embeddings` handling for GGUF models (#35085) · 482cb28a

Isotr0py authored 7 months ago


* fix tie_word_embeddings

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>

Update Mistral conversion script (#34829) · 35447054

Cyril Vallez authored 7 months ago

* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

[`tokenizers`] bump to 0.21 (#34972) · 93f87d3c
Arthur authored 7 months ago
```
bump to 0.21
```

[Whisper] Fix whisper tokenizer (#34537) · 54aae121

eustlb authored 7 months ago


* handle single timestamp ending

* include last timestamp token

* handle single timestamp ending

* avoid floating-point arithmetic limitations

* ensure float64 operations

* new test

* make fixup

* make copies

* handle edge case double tokens ending with different tokens

* handle single timestamp ending

* make fixup

* handle conditioning on prev segments

* fix

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* [run-slow] whisper

* don't call item() to avoid unnecessary sync

* fix

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>

Informative (#35059) · beb2c66e

Yih-Dar authored 7 months ago


* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

04 Dec, 2024 7 commits

[docs] Increase visibility of torch_dtype="auto" (#35067) · 1ed1de2f
Steven Liu authored 7 months ago
```
* auto-dtype

* feedback
```

[docs] add a comment that offloading requires CUDA GPU (#35055) · baa3b221

Fanli Lin authored 7 months ago


* add comment to offloading

* Update docs/source/en/kv_cache.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

Support for easier multimodal use of modular (#35056) · 1da1e0d7

Cyril Vallez authored 7 months ago

* update modular and add examples

* style

* improve example comments

* style

* fix small logic issue for imports

* fix relative order issue when files do not make sense

* Improve comments

* trigger CIs

[`GPTNeoX`] Flex Attention + Refactor (#34896) · 46df8599

Anton Vlasjuk authored 7 months ago


* gpt neox flex attention + refactor

* some formatting

* small fix on dropout

* add assertion on flex attn test

* flaky ci :(

* add head mask support

* style

* handle dtype, replace torch where

* fixup flex with output attns

* code review and several other fixes

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

* remove unnecessary comment

* remove incorrect comment

* make flex attn check more agnostic to versions and centralized

* change peft input dtype check to value since q and k could be affected by other stuff like RoPE

* i forgor

* flaky

* code review and small fixes

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Add PyTorch Tensor Parallel support for Qwen2, Qwen2Moe, Starcoder2 (#35007) · accb7204

Vladislav Bronzov authored 7 months ago


* add base tp plan for qwen2 and qwen2moe

* add parallel tp for starcoder2

* fix modular conversion

* add infer dim for qkv states

* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Use hf-workflows for both push and scheduled AMD CI · 32b37186
Ivar Flakstad authored 7 months ago

Fix `pad_token_tensor` is None in warning (#34005) · c7a109ec
Tianshu Wang authored 7 months ago
```
Fix pad_token_tensor is None in warning
```

03 Dec, 2024 9 commits

[docs] use device-agnostic API instead of hard-coded cuda (#35048) · 329f5dbf
Fanli Lin authored 7 months ago
```
replace cuda
```

[docs] use device-agnostic instead of `cuda` (#35047) · b8cdc262

Fanli Lin authored 7 months ago

* fix on xpu

* [run_all]

* add the missing import for Image lib

* add more devices in comment

* bug fix

* replace cuda

Translate community.md into Chinese (#35013) · 346597b6

wwwbai authored 7 months ago


* community translation

* Update docs/source/zh/community.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>

[docs] fix example code bug (#35054) · 3deaa817
Fanli Lin authored 7 months ago
```
fix code bug
```

fix speecht5 failure issue in test_peft_gradient_checkpointing_enable… (#34454) · 125de416

Wang, Yi authored 7 months ago


* fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* [run-slow] speecht5

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Matt <rocketknight1@gmail.com>

Use AMD CI workflow defined in hf-workflows · d38ed44e
Ivar Flakstad authored 7 months ago

Fix `BertGeneration` (#35043) · 7a7f2769
Yih-Dar authored 7 months ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
Add token cost + runtime monitoring to Agent and HfEngine children (#34548) · 901f5045
Aymeric Roucher authored 7 months ago
```
* Add monitoring to Agent and HfEngine children
```

Automatic compilation in generate: do not rely on inner function (#34923) · ee37bf0d

Cyril Vallez authored 7 months ago

* compiled forward in PreTrainedModel

* update

* style

* update name

* trigger CIs

* Add way to use custom compile args

* style

* switch parameterization to generation_config

* Add to inits

* Update configuration_utils.py

* inits

* style

* docs

* style

* Update configuration_utils.py

* back without dataclass for repo consistency

* Update configuration_utils.py

* style

* style

* style once again

* add config serialization

* update

* true dataclass

* trigger CIs

* merge compile methods + remove serialization of compile config

02 Dec, 2024 11 commits

Translate bertology.md into Chinese (#34908) · f9c7e602

wwwbai authored 7 months ago


* bertology translation

* Update docs/source/zh/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: blueingman <15329507600@163.com>
Co-authored-by: Isotr0py <2037008807@qq.com>

[docs] add the missing import for Image and bug fix (#34776) · 527dc04e
Fanli Lin authored 7 months ago
```
* add the missing import for Image lib

* add more devices in comment

* bug fix
```
[i18n-ar] Translated file : `docs/source/ar/notebooks.md` into Arabic (#33049) · 4955e4e6
Ahmed Almaghz authored 7 months ago
```
* Add docs/source/ar/notebooks.md to Add_docs_source_ar_notebooks.md

* Update notebooks.md

* Update _toctree.yml
```
add docstring example for compute_loss_func (#35020) · f0dec874
secrettoad authored 7 months ago

Multiple typo fixes in Tutorials docs (#35035) · 31299670

Henry Hyeonmok Ko authored 7 months ago

* Fixed typo in multi gpu docs and OLMoE version

* Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction

* Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs

Fix `test_eager_matches_sdpa_inference` for `XPU` backend (#34889) · 31830474

Dmitry Rogozhkin authored 7 months ago


* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Fix test_eager_matches_sdpa_inference for XPU backend

As of PyTorch 2.5 XPU backend supports only torch.nn.attention.SDPBackend.MATH
which is implemented on PyTorch level using aten operators and is device
agnostic with respect to implementation of each aten operator. Thus, we can
reuse CUDA (or CPU) MATH weights for XPU.

Fixes: #34888
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

Add type hints for forward functions in Gemma2 (#35034) · f41d5d8f
Jacky Lee authored 7 months ago
```
* feat: add gemma2 type hints

* fix: mask is optional
```
Typo in warning switching to optimum-quanto (#35028) · 7b5f76e3
Bojun Feng authored 7 months ago
```
fix typos
```
Optimize memory usage of mllama encoder (#34930) · c24c79eb
milesial authored 7 months ago
```
mllama encoder memory optimization
```
fix variable undefined bug when return_tensors is not specified in llava processing (#34953) · 9ab8c5b5
Weize Chen authored 7 months ago
```
* fix variable undefined bug when return_tensors is not specified in llava processor

* improve readability
```
Only cast `cu_seqlens` when tracing (#35016) · 3480cbb9
Joshua Lochner authored 7 months ago
```
* Only cast `cu_seqlens` when tracing

* Formatting
```