Commits · e6f4a4ebbf970c12fe475be79a039f943c28f975 · 某某某 / transformers-new

31 Jan, 2025 1 commit
[Moonshine] compute head_dim_padding at init (#35984) · e6f4a4eb
eustlb authored 5 months ago
```
compute head_dim_padding at init
```
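The commit title only says that the padding is computed at init. As a rough illustration, the sketch below rounds the head dimension up to the next multiple and stores the difference once in the constructor. The class `AttentionSketch`, the helper `compute_head_dim_padding`, and the round-up formula are assumptions made for this example, not code taken from the commit; only the option name `pad_head_dim_to_multiple_of` appears elsewhere in this log (#35922).

```python
from typing import Optional


def compute_head_dim_padding(head_dim: int, pad_to_multiple_of: Optional[int]) -> int:
    """Return how many extra channels are needed to reach the next multiple.

    Hypothetical helper for illustration only.
    """
    if pad_to_multiple_of is None:
        return 0
    if pad_to_multiple_of <= 0:
        raise ValueError("pad_to_multiple_of must be a positive integer")
    # Round head_dim up to the nearest multiple (ceiling division).
    target_head_dim = pad_to_multiple_of * (
        (head_dim + pad_to_multiple_of - 1) // pad_to_multiple_of
    )
    return target_head_dim - head_dim


class AttentionSketch:
    """Hypothetical stand-in for an attention module; not the library class."""

    def __init__(self, hidden_size: int, num_heads: int,
                 pad_head_dim_to_multiple_of: Optional[int] = None):
        self.head_dim = hidden_size // num_heads
        # Computed once here, so the forward pass can reuse the stored value.
        self.head_dim_padding = compute_head_dim_padding(
            self.head_dim, pad_head_dim_to_multiple_of
        )
```

Computing the value in the constructor keeps the per-call path free of repeated arithmetic and branching, which is presumably the motivation for the change.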
30 Jan, 2025 10 commits

Add support for nested images to LLava and VipLLava (#35558) · d7188ba6

Yoni Gozlan authored 5 months ago

* move make_flat_list_of_images and make_batched_videos to image_utils

* remove unnecessary is_vision_available

* move make_nested_list_of_images to image_utils

* fix fast pixtral image processor

* fix import mllama

* fix make_nested_list_of_images

* add tests

* convert 4d arrays/tensors to list

* add test_make_batched_videos

* add support nested batch of videos

* fix image processing qwen2vl
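To make the flat versus nested distinction in the bullets above concrete, here is a minimal sketch of the two shapes. The helpers `flatten_images` and `nest_images` are hypothetical simplifications written for this example; they are not the `make_flat_list_of_images` and `make_nested_list_of_images` functions from `image_utils`, and the rule "anything that is not a list or tuple is one image" is an assumption.

```python
from typing import Any, List

Image = Any  # stand-in for a PIL image, NumPy array, or tensor


def is_image(obj: Any) -> bool:
    # Assumption for this sketch: anything that is not a list/tuple is one image.
    return not isinstance(obj, (list, tuple))


def flatten_images(images: Any) -> List[Image]:
    """Collapse a single image or arbitrarily nested lists into one flat list."""
    if is_image(images):
        return [images]
    flat: List[Image] = []
    for item in images:
        flat.extend(flatten_images(item))
    return flat


def nest_images(images: Any) -> List[List[Image]]:
    """Return a list of samples, each sample being a list of images."""
    if is_image(images):
        return [[images]]
    if all(is_image(item) for item in images):
        # A flat list is treated as the images of a single sample.
        return [list(images)]
    return [flatten_images(sample) for sample in images]
```

For example, `nest_images(["a", "b"])` yields one sample with two images, while `nest_images([["a"], ["b", "c"]])` keeps the two samples separate.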


Handle empty change indices in SAM's mask to rle conversion (#35665) · e4227eb4

Marcel authored 5 months ago

* Handle empty change indices in RLE conversion for masks

* [test] Add unit tests for RLE encoding of masks in SamProcessor

* [test] Update RLE conversion tests to use TensorFlow implementation

* [test] Fix formatting in SamProcessorTest according to check_code_quality action

* [test] Fix formatting in SamProcessorTest according to check_code_quality

* [test] Refactored rle test cases into one test and used tf tensors in tf test cases

* [test] Fix: removed self parameter from refactored methods

* [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow

* [test] Added description to individual run-length encoding tests for PyTorch and TensorFlow.
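The edge case named in this commit is a mask with no value changes at all, that is, one that is entirely 0 or entirely 1. The sketch below is a plain-Python illustration of uncompressed run-length encoding with that case spelled out. The function name `mask_to_rle`, the column-major flattening, and the output layout (`size` and `counts`, with counts starting at the run of zeros) follow the common COCO-style convention and are assumptions for this example, not the code from the commit.

```python
from typing import Dict, List


def mask_to_rle(mask: List[List[int]]) -> Dict[str, List[int]]:
    """Encode a non-empty 2D binary mask as uncompressed run-length counts."""
    height, width = len(mask), len(mask[0])
    # Flatten column by column (column-major order).
    flat = [mask[r][c] for c in range(width) for r in range(height)]
    # Indices where the pixel value differs from the previous pixel.
    change_idxs = [i for i in range(1, len(flat)) if flat[i] != flat[i - 1]]

    if not change_idxs:
        # Uniform mask: a single run. Counts always start with the zeros run,
        # so an all-ones mask gets a leading zero-length run.
        counts = [len(flat)] if flat[0] == 0 else [0, len(flat)]
        return {"size": [height, width], "counts": counts}

    boundaries = [0] + change_idxs + [len(flat)]
    counts = [end - start for start, end in zip(boundaries, boundaries[1:])]
    if flat[0] == 1:
        counts = [0] + counts
    return {"size": [height, width], "counts": counts}
```

In this list-based version the general path would produce the same result for a uniform mask; the explicit branch is kept to mirror the case the commit addresses, where tensor indexing on an empty set of change indices is the likely source of trouble.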


not to use A100 for `benchmark.yml` (#35974) · 47bd4296
Yih-Dar authored 5 months ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Support batching for UsefulSensors Moonshine (#35922) · 693328f2

Nat Jeffries authored 5 months ago


* Add support for attention masking in moonshine.

Tested against Open ASR Leaderboard with batch size 256.

* Update comments and ensure attention masks are passed everywhere.

Perform attention mask downsampling inside of moonshine forward call.

* Hide padding behind conditional. Fix encoder/decoder masking.

- Correctly pipe encoder attention mask into decoder
- Add correct scaling factor if one is not already provided.
- Fix formatting with ruff

* Add auto generated modeling_moonshine file.

* Update formatting in generated model file.

* Address review comments.

* Fix typo.

* Add `pad_head_dim_to_multiple_of` to moonshine config.

* Correct args order for MoonshineConfig.

* Update configuration moonshine too.

* Update src/transformers/models/moonshine/modular_moonshine.py

* Update src/transformers/models/moonshine/configuration_moonshine.py

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>


Less flaky for `TimmBackboneModelTest::test_batching_equivalence` (#35971) · 57576818
Yih-Dar authored 5 months ago
```
* fix

* remove is_flaky

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
Revert p_mask to a list in DQA pipeline (#35964) · e320d554
Matt authored 5 months ago
```
* p_mask back to being a list

* Remove breakpoint
```
Whisper: fix static cache CI (#35852) · 365fecb4
Raushan Turganbay authored 5 months ago
```
* fix

* remove overridden method

* small change
```

Pixtral: vectorize patch embeddings and enable tests (#35122) · 9725e5be

Raushan Turganbay authored 5 months ago

* initial POC

* - batch mix feature

* fix tests

* fix tests

* make style

* do not skip and instead fix tests

* update

* return back the test

* correct text with the correct ckpt


[bart] minor test fixes (#35965) · 8bc4c89e
Joao Gante authored 5 months ago
```
fix tests
```
Fix is_causal being a tensor (#35791) · 19f2ec80
Ilyas Moutawwakil authored 5 months ago
```
* fix is_causal being a tensor

* convert in sdpa attention only when jit tracing
```

29 Jan, 2025 11 commits

fix iterator overflow when gradient accumulation is 1 (#35960) · 7547f55e
Wing Lian authored 5 months ago

[generate] move max time tests (#35962) · 4d3b1076
Joao Gante authored 5 months ago
```
* move max time tests to their right place

* move test to the right place
```
Update README.md (#35958) · 4d1d4896
Boris Malashenko authored 5 months ago
```
There should be a dot after pip install .
```

[tests] further fix `Tester object has no attribute '_testMethodName'` (#35781) · f0ae65c1

Fanli Lin authored 5 months ago


* bug fix

* update with more cases

* more entries

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


update docker file `transformers-pytorch-deepspeed-latest-gpu` (#35940) · ec7790f0
Yih-Dar authored 5 months ago
```
update docker file for deepspeed

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Trainer Refactor: Part 1 (#35567) · 5d257111

Zach Mueller authored 5 months ago


* start

* So far: 30%

* Small fix

* Continuing update

* Continuing

* Forgot to check if not None

* Continuing refactor

* Fix if else

* Fix ref

* Should make tests pass

* Keep grad norm same

* Document

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Err instead of info for logging RNG state error

* Separate out to func

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>


Output dicts support in text generation pipeline (#35092) · 23d782ea

Jonas Rohw authored 5 months ago


* Support for generate_argument: return_dict_in_generate=True, instead of returning an error

* fix: call test with return_dict_in_generate=True

* fix: Only import torch if it is present

* update: Encapsulate output_dict changes

* fix: added back original comments

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>


Fix flaky `test_assisted_decoding_matches_greedy_search` (#35951) · cf904048
Yih-Dar authored 5 months ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
Update `squad_convert_example_to_features` to work with numpy v2 (#35955) · 692afa10
Yih-Dar authored 5 months ago
```
* Fix

* Fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
Update `unwrap_and_save_reload_schedule` to use `weights_only=False` (#35952) · c600e89f
Yih-Dar authored 5 months ago
```
* fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
fix `test_generated_length_assisted_generation` (#34935) · 42c8ccfd
Nadav Timor authored 5 months ago
```
fix test_generated_length_assisted_generation
```

28 Jan, 2025 11 commits

use torch constraints to check if covariance is positive definite during mean resizing. (#35693) · ec7afad6
Mohamed Abu El-Nasr authored 5 months ago
```
* use torch constraints to check for psd

* small nit

* Small change

* Small change for the ci

* nit
```
Remove INC notebook reference in documentation (#35936) · 61cbb723
Ella Charlaix authored 5 months ago
```
remove INC notebook in documentation
```
fix(FA): QKV not being cast to target_dtype for FA with dpo lora (#35834) · 478c4f2d
NanoCode012 authored 5 months ago
```
fix(FA): QKV not being cast to target_dtype due to dtype check
```
Test: generate with `torch.compile(model.forward)` as a fast test (#34544) · ece8c424
Joao Gante authored 5 months ago


Fix TP initialization (#35860) · f48ecd76

Cyril Vallez authored 5 months ago

* fix tp

* Update modeling_utils.py

* style

* style

* Update test_tp.py

* Update test_tp.py

* style

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py


Qwen-2-5-VL: fix CI (#35935) · f85ba204
Raushan Turganbay authored 5 months ago
```
fix
```

Fix mask slicing for models with HybridCache (#35681) · 3f860dba

Cyril Vallez authored 5 months ago

* correctly slice

* check mask

* Update modular_gemma2.py

* fix

* add tests

* fix typo

* finally fix mask slicing

* Finally correctly slice in all cases!!

* add test for all attention functions

* small fix in tests

* trick around dynamo tracing issue

* last update

* more robust

* kwargs propagation

* make it explicit for checkpointing

* apply modular


Fix: loading DBRX back from saved path (#35728) · b764c20b
Raushan Turganbay authored 5 months ago
```
* fix dtype as dict for some models + add test

* add comment in tests
```

Add default TP plan for all models with backend support (#35870) · 3613f568

Cyril Vallez authored 5 months ago


* Add some tp plans!

* More tp plans!

* Add it in the comment

* style

* Update configuration_mixtral.py

* Update configuration_phi.py

* update the layout according to special archs

* fix mixtral

* style

* trigger CIs

* trigger CIs

* CIs

* olmo2

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Use rocm6.2 for AMD images (#35930) · 96625d85

ivarflakstad authored 5 months ago

* Use rocm6.2 as rocm6.3 only has nightly pytorch wheels atm

* Use stable wheel index for torch libs


Remove `_supports_static_cache = True` for some model classes (#34975) · bf16a182
Yih-Dar authored 5 months ago
```
* use mask_fill

* remove comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

27 Jan, 2025 7 commits

[docs] Fix Zamba2 (#35916) · 86d75646
Steven Liu authored 5 months ago
```
fix code block
```
Close Zamba2Config code block (#35914) · 414658f9
Matt authored 5 months ago
```
* close zamba2 code block

* Add Zamba2 to toctree
```

Fix the config class comparison for remote code models (#35592) · 63e9c941

Matt authored 5 months ago

* Fix the config class comparison when repeatedly saving and loading remote code models

* once again you have committed your debug breakpoint


[docs] uv install (#35821) · c550a1c6
Steven Liu authored 5 months ago
```
uv install
```

Fix typing in audio_utils.chroma_filter_bank (#35888) · cd6591bf

CalOmnie authored 5 months ago


* Fix typing in audio_utils.chroma_filter_bank

* Apply make style

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>


Split and clean up GGUF quantization tests (#35502) · e57b4599

Isotr0py authored 5 months ago


* clean up ggml test

Signed-off-by: Isotr0py <2037008807@qq.com>

* port remaining tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* further cleanup

Signed-off-by: Isotr0py <2037008807@qq.com>

* format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix broken tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* update comment

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* reorganize tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* k-quants use qwen2.5-0.5B

Signed-off-by: Isotr0py <2037008807@qq.com>

* move ggml tokenization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove dead code

Signed-off-by: Isotr0py <2037008807@qq.com>

* add assert for serialization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* use str for parameterize

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>


🚨🚨🚨 image-classification pipeline single-label and multi-label prob type squashing fns (sigmoid vs softmax) are backwards (#35848) · 5c576f5a

Ross Wightman authored 5 months ago

single-label and multi-label prob type squashing fns (sigmoid vs softmax) were backwards for image-classification pipeline
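For readers unfamiliar with the distinction, the pairing this fix restores can be sketched as follows: softmax for single-label problems, where the class probabilities must sum to 1, and an element-wise sigmoid for multi-label problems, where each class is scored independently. The function `apply_squashing` is a hypothetical helper written for this example, not the pipeline's code, and the `problem_type` strings are assumed to match the config values used by the library.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    # Subtract the max for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def apply_squashing(logits: List[float], problem_type: str) -> List[float]:
    """Hypothetical helper: pick the squashing function from the problem type."""
    if problem_type == "single_label_classification":
        return softmax(logits)  # mutually exclusive classes
    if problem_type == "multi_label_classification":
        return [sigmoid(x) for x in logits]  # independent per-class scores
    raise ValueError(f"Unsupported problem_type: {problem_type!r}")
```

With the two swapped, single-label scores would no longer sum to 1 and multi-label scores would be forced to compete with each other, which is why the change is flagged as breaking.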
