- 28 Jan, 2025 5 commits

Celina Hanouti authored

Raushan Turganbay authored
* fix dtype as dict for some models + add test
* add comment in tests
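
A minimal sketch of what the dict-valued dtype looks like in practice for a composite (vision + text) model; the checkpoint name and sub-config keys here are illustrative examples, not taken from this commit:

```python
# Sketch: composite checkpoints can take one dtype per sub-config;
# "" acts as the fallback for weights outside any named sub-config.
import torch
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",  # example composite checkpoint
    torch_dtype={"text_config": torch.bfloat16, "vision_config": torch.float16, "": torch.float32},
)
```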

Cyril Vallez authored
* Add some tp plans!
* More tp plans!
* Add it in the comment
* style
* Update configuration_mixtral.py
* Update configuration_phi.py
* update the layout according to special archs
* fix mixtral
* style
* trigger CIs
* trigger CIs
* CIs
* olmo2

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
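
For context, a tensor-parallel plan in these configs maps module-name patterns to partitioning styles. The sketch below mirrors the Llama-style plan and is illustrative rather than copied from any one of the updated configs:

```python
# Illustrative base_model_tp_plan of the kind added here: input projections
# of attention/MLP are split column-wise, output projections row-wise, so
# each rank holds a shard of every layer and activations stay sharded.
base_model_tp_plan = {
    "layers.*.self_attn.q_proj": "colwise",
    "layers.*.self_attn.k_proj": "colwise",
    "layers.*.self_attn.v_proj": "colwise",
    "layers.*.self_attn.o_proj": "rowwise",
    "layers.*.mlp.gate_proj": "colwise",
    "layers.*.mlp.up_proj": "colwise",
    "layers.*.mlp.down_proj": "rowwise",
}
```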

ivarflakstad authored
* Use rocm6.2, as rocm6.3 only has nightly PyTorch wheels at the moment
* Use stable wheel index for torch libs

Yih-Dar authored
* use masked_fill
* remove comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
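
For reference, `masked_fill` replaces boolean-indexed assignment with a single fused op; a minimal, self-contained illustration (not the actual patched code):

```python
import torch

scores = torch.randn(2, 4)
mask = torch.tensor([[True, False, False, True],
                     [False, True, False, False]])

# Instead of `scores[mask] = float("-inf")`, which materializes index
# tensors, masked_fill applies the fill value in one pass:
scores = scores.masked_fill(mask, float("-inf"))
```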

- 27 Jan, 2025 11 commits

Steven Liu authored
fix code block

Matt authored
* close zamba2 code block
* Add Zamba2 to toctree

Matt authored
* Fix the config class comparison when repeatedly saving and loading remote code models
* once again you have committed your debug breakpoint

Steven Liu authored
uv install

CalOmnie authored
* Fix typing in audio_utils.chroma_filter_bank
* Apply make style

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>

Isotr0py authored
* clean up ggml test
* port remaining tests
* further cleanup
* format
* fix broken tests
* update comment
* fix
* reorganize tests
* k-quants use qwen2.5-0.5B
* move ggml tokenization test
* remove dead code
* add assert for serialization test
* use str for parameterize

Signed-off-by: Isotr0py <2037008807@qq.com>

Ross Wightman authored
Fix the image-classification pipeline: the single-label and multi-label probability squashing functions (sigmoid vs. softmax) were swapped (#35848)
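
The distinction the fix restores, in a minimal sketch: softmax normalizes across mutually exclusive classes, while sigmoid scores each class independently.

```python
import torch

logits = torch.tensor([2.0, 0.5, -1.0])

# single-label: classes are mutually exclusive, probabilities sum to 1
single_label_probs = logits.softmax(dim=-1)

# multi-label: each class is an independent yes/no; scores need not sum to 1
multi_label_probs = logits.sigmoid()
```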

Mikhail Moskovchenko authored
* Added `segmentation_maps` support for DPT image processor
* Added tests for DPT image processor
* Moved preprocessing into separate functions
* Added # Copied from statements
* Fixed # Copied from statements
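
A sketch of the new call shape, with dummy arrays standing in for a real image/mask pair:

```python
import numpy as np
from transformers import DPTImageProcessor

processor = DPTImageProcessor()
image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # dummy RGB image
seg_map = np.random.randint(0, 21, (480, 640), dtype=np.uint8)    # dummy class-id mask

# The segmentation map is resized alongside the image but, being class ids,
# is not rescaled or normalized like pixel values.
inputs = processor(images=image, segmentation_maps=seg_map, return_tensors="pt")
```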

ivarflakstad authored

pglorio authored
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position, position_ids, use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys" (reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5)
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md
* re-convert from modular

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>

Sugendran Ganess authored
Have the DETR examples default to using the fast image processor
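
That is, the examples now opt into the torchvision-backed processor class; a sketch of the flag involved:

```python
from transformers import AutoImageProcessor

# use_fast=True selects the torchvision-backed "fast" processor
# (e.g. DetrImageProcessorFast) when one exists for the checkpoint.
processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50", use_fast=True)
```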

- 26 Jan, 2025 1 commit

Steven Liu authored
doctest fixes

- 24 Jan, 2025 4 commits

Yih-Dar authored
my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fanli Lin authored
add xpu device

Arthur authored
* use torch.testing.assert_close instead to get more details about errors in CIs
* fix
* style
* test_all
* revert for IBert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strict
* ok I won't be strict
* skip and be done
* up
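
The motivation, sketched: `torch.testing.assert_close` reports the number of mismatched elements and the greatest absolute/relative differences, whereas a bare `allclose` assert only says True or False.

```python
import torch

expected = torch.tensor([1.0, 2.0, 3.0])
actual = torch.tensor([1.0, 2.0, 3.1])

# Fails with just "AssertionError", no diagnostics:
# assert torch.allclose(actual, expected)

# Raises AssertionError with mismatch count, max abs/rel difference,
# and the tolerances used -- far easier to debug from a CI log:
torch.testing.assert_close(actual, expected, rtol=1e-3, atol=1e-3)
```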

Suyuchen Wang authored
* Fix Llava OneVision's token padding
* Fix Llava Next and Llava Next Video's token unpadding for consistency

- 23 Jan, 2025 11 commits

CalOmnie authored
* Fix test_pipelines_video_classification that was always failing
* Update video pipeline docstring to reflect actual return type

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>

baoyf4244 authored
Fix apply_chat_template() so that `padding` accepts bool, str, or PaddingStrategy, and fix the docstring of pad()
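
After the fix, `padding` mirrors what `pad()` accepts; a hedged sketch (the checkpoint is only an example):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")  # example chat checkpoint
messages = [{"role": "user", "content": "Hello!"}]

# padding may now be a bool, a string like "max_length"/"longest",
# or a PaddingStrategy enum value, exactly as in tokenizer.pad():
ids = tok.apply_chat_template(messages, tokenize=True, padding="max_length", max_length=64)
```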

SilverSoldier authored

Yosshi999 authored
Fix contamination and missing paragraph in translation

Alex Brooks authored
* Add multimodal granite support
* Support multiple image feature layers
* Remove failing validation for visual encoders with no cls
* Update llava based models / configs to support list of feature layers
* Add tests for multiple feature layers
* Use conditional instead of except for misaligned feature shapes
* crop cls from each hidden state
* Fix formatting
* Support single vision feature int in vipllava
* Fix typo in vision feature selection strategy validation
* Add tentative integration test for granite vision models
* Add granite vision docs; replace multimodal granite refs with granite vision; add granite vision / llava next alias
* Use image url in granitevision example

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

Arthur authored
add tooslow for the fat ones

Jack Roberts authored
* rename tokenizer to processing_class in WandbCallback.on_train_end
* rename tokenizer to processing_class in ClearMLCallback and DVCLiveCallback
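
For context, these callbacks follow the Trainer rename where `tokenizer` became the more general `processing_class`. A minimal sketch of the call site, assuming a tiny example checkpoint (`hf-internal-testing/tiny-random-bert` here is an assumption, not from this commit):

```python
from datasets import Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments

tok = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-bert")
model = AutoModelForSequenceClassification.from_pretrained("hf-internal-testing/tiny-random-bert")
ds = Dataset.from_dict({"text": ["a", "b"], "label": [0, 1]}).map(
    lambda x: tok(x["text"], truncation=True), batched=True
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", report_to="none"),  # "wandb" would enable WandbCallback
    train_dataset=ds,
    processing_class=tok,  # was `tokenizer=tok`; the callbacks now read this name
)
```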

張庭瑜 authored
* Fix GA loss for DeepSpeed
* Turn off loss scaling in DeepSpeed engine via scale_wrt_gas
* Add comment linking to PR

ShuaiBai623 authored
* add qwen2.5vl
* fix
* pass check table
* add modular file
* fix style
* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
* padd copy check
* use modular
* fix
* fix
* fix
* update flashatt2&sdpa support_list
* Update docs/source/en/_toctree.yml
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py
* update config
* update
* fix hf path
* rename Qwen2_5_VLVideosKwargs
* fix
* fix
* update
* executed modular
* rollback init
* fix
* formatted
* simpler init
* fix
* fix
* fix
* fix
* fix
* update docs
* fix
* fix
* update Qwen2VLRotaryEmbedding for yarn
* fix

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: gewenbin0992 <gewenbin292@163.com>
Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>

Cyril Vallez authored
* support
* Update modeling_utils.py
* style
* most models
* Other models
* fix-copies
* tests + generation utils

Arthur authored
remove class from tests

- 22 Jan, 2025 8 commits

Marc Sun authored
fix type

Mohit Sharma authored
Disable the flash-attention backend for SDPA on AMD GPUs (PyTorch < 2.4.1)
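
The shape of the guard, sketched under the assumption that it keys off the HIP build and the torch version (the actual patch lives in the SDPA dispatch path and may differ):

```python
import torch
from packaging import version

# On ROCm builds older than 2.4.1, avoid the flash-attention SDPA backend
# and let SDPA fall back to the mem-efficient/math kernels instead.
if torch.version.hip is not None and version.parse(torch.__version__) < version.parse("2.4.1"):
    torch.backends.cuda.enable_flash_sdp(False)
```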

LRL-ModelCloud authored
For optimum versions below 1.23.99, the convert_model method only accepts a single nn.Module model parameter.

Joao Gante authored
docs fix

Isotr0py authored
fix gemma2 head dim

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

Joao Gante authored
* tmp commit
* add working chat
* add docs
* docs 2
* use auto dtype by default

Mohamed Mekkouri authored
fix nemotron gguf

Joao Gante authored
missing import