- 24 Mar, 2025 20 commits
-
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
gautham authored
* Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests. Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user.
* formatting issues
* Used better way to generate seed in TF. Made tests more consistent.
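The value of a user-supplied seed and non-hardcoded probabilities can be illustrated with a minimal, self-contained sketch (plain `random`, not the actual `DataCollatorForWholeWordMask` implementation; `whole_word_mask` and its parameters are illustrative):

```python
import random

def whole_word_mask(words, mask_probability=0.15, seed=None):
    # A user-supplied seed makes mask selection reproducible, and the
    # probability is a parameter rather than a hardcoded constant.
    rng = random.Random(seed)
    return ["[MASK]" if rng.random() < mask_probability else w for w in words]

words = "the quick brown fox jumps over the lazy dog".split()
# Same seed, same mask pattern:
assert whole_word_mask(words, seed=0) == whole_word_mask(words, seed=0)
```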
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Yih-Dar authored
* fix
* fix

---------

Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Pavel Iakubovskii authored
* Fix pytorch path for DeformableAttention
* Apply for GroundingDino
-
Arthur Zucker authored
-
cyyever authored
Use pyupgrade --py39-plus to improve code
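As an illustration of the kind of rewrite `pyupgrade --py39-plus` performs, PEP 585 builtin generics replace `typing.List`/`typing.Dict` (a generic example, not a diff from this repository):

```python
# After pyupgrade --py39-plus: builtin generics (PEP 585) instead of
# typing.List / typing.Dict, so no `from typing import ...` is needed.
def count_tokens(texts: list[str]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for text in texts:
        for tok in text.split():
            counts[tok] = counts.get(tok, 0) + 1
    return counts

print(count_tokens(["a b a"]))  # {'a': 2, 'b': 1}
```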
-
Ethan Knights authored
* Update trainer_pt_utils.py
* update docstrings trainer_pt_utils.py for consistency
* Update src/transformers/trainer_pt_utils.py

---------

Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com>
-
omahs authored
* fix typos
* fix typos
* fix typos
* fix typos
-
Yih-Dar authored
* fix
* fix
* fix
* fix

---------

Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Mohamed Mekkouri authored
fix
-
Raushan Turganbay authored
* [chameleon] fix num image token check
* embed after merging image token
* skip this also
* mistral require_read_token
-
Dmitry Rogozhkin authored
tests: fix asyncio.wait() usage for python>=3.7

Passing coroutines directly to `asyncio.wait()` is deprecated since Python 3.8 and removed starting from Python 3.11. Instead, it's required to explicitly wrap each coroutine in a task with `asyncio.create_task()`, which first appeared in Python 3.7.

We step into this issue running the following Transformers tests on a system with Python 3.11 or later (for example, Ubuntu 24.04 has Python 3.12):

* `tests/trainer/test_trainer_distributed.py`
* `tests/extended/test_trainer_ext.py`

The error will be:
```
src/transformers/testing_utils.py:2380: in execute_subprocess_async
    result = loop.run_until_complete(
/usr/lib/python3.12/asyncio/base_events.py:687: in run_until_complete
    return future.result()
src/transformers/testing_utils.py:2368: in _stream_subprocess
    await asyncio.wait(
...
E TypeError: Passing coroutines is forbidden, use tasks explicitly.
```

See: https://docs.python.org/3.10/library/asyncio-task.html#asyncio.wait
See: https://docs.python.org/3.7/library/asyncio-task.html#asyncio.create_task

Signed-off-by:
Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
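The required change can be sketched in a minimal standalone form (illustrative coroutine, not the Transformers `_stream_subprocess` code):

```python
import asyncio

async def _work(x):
    return x * 2

async def main():
    # On Python >= 3.11, passing coroutines directly to asyncio.wait()
    # raises TypeError; wrap each one in a Task first.
    tasks = [asyncio.create_task(_work(i)) for i in range(3)]
    done, _pending = await asyncio.wait(tasks)
    return sorted(t.result() for t in done)

print(asyncio.run(main()))  # [0, 2, 4]
```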
-
XinyuanTong authored
[fix] Update optional keys in _validate_yarn_parameters to include original_max_position_embeddings
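The shape of such a required-vs-optional key validation can be sketched as follows (key names are illustrative of `rope_scaling`-style configs; this is not the actual `_validate_yarn_parameters`):

```python
# Hypothetical sketch: required vs. optional rope_scaling keys, with
# `original_max_position_embeddings` accepted as an optional key.
REQUIRED_KEYS = {"rope_type", "factor"}
OPTIONAL_KEYS = {"attention_factor", "beta_fast", "beta_slow",
                 "original_max_position_embeddings"}

def validate_yarn_parameters(rope_scaling):
    received = set(rope_scaling)
    missing = REQUIRED_KEYS - received
    unexpected = received - REQUIRED_KEYS - OPTIONAL_KEYS
    if missing or unexpected:
        raise KeyError(f"missing={sorted(missing)}, unexpected={sorted(unexpected)}")
```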
-
Raushan Turganbay authored
fix
-
AbdelKarim ELJANDOUBI authored
* fix Gemma3 Config
* fix config in modular gemma3
-
- 21 Mar, 2025 20 commits
-
-
Aritra Roy Gosthipaty authored
* Update installation.md
* Update README.md
-
Steven Liu authored
* initial
* fix
* fix
* update
* fix
* fixes
* quantization
* attention mask visualizer
* multimodal
* small changes
* fix code samples
-
Yoni Gozlan authored
* process flattened images in fast image proc
* process flattened images in slow proc and add tests
* remove print
* add unbalanced batch test fast image proc
* fix integration tests
-
Cyril Vallez authored
* better regex everywhere
* fix
* Update test_modeling_instructblip.py
* BC with explanations this time otherwise it makes no sense at all
* Update test_modeling_instructblip.py
* style
* CIs
* update _keep_in_fp32_modules in blip2
* Update modeling_utils.py
* Update modeling_utils.py
* style
* CIs
* add check
* trigger CIs
* Update modeling_utils.py
* trigger CIs
-
Sukriti Sharma authored
* move loss to generation class
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* code cleanup
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* test for resize and loss computation
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix tests
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix: test for resize and loss
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix resize embedding mllama test
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* review changes
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by:
Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
-
Arthur Zucker authored
-
Raushan Turganbay authored
* fix
* this wasn't supposed to be here, revert
* refine tests a bit more
-
Pablo Montalvo authored
fix attention mask dtype + outputs type
-
Daniël de Kok authored
* Use `deformable_detr` kernel from the Hub

Remove the `deformable_detr` kernel from `kernels/` and use the pre-built kernel from the Hub instead.

* Add license header

* Add `kernels` as an extra `hub-kernels`

Also add it to `testing`, so that the kernel replacement gets tested when using CUDA in CI.
-
Pablo Montalvo authored
tests expect greedy decoding
-
Pablo Montalvo authored
🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing (#36859)

* supersede paligemma forward to shift pos id indexing
* fix prepare_inputs_ as well
* fix modular error

---------

Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Arthur Zucker authored
-
Joao Gante authored
-
sebbaur authored
* Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output
* Add documentation and allow functions as activations (instead of just string)
* formatting change
* Use ACT2FN
* Formatting change
* Formatting changes
* force pooler_act to be string
* force pooler_act to be string
* Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy
* Making the same change in ijepa to make check_modular_conversion happy
* Add IJepaConfig to make CI happy
* rename pooler_size to pooler_output_size as defined in the config
* typo
* revert change to ignore variable
* Ran utils/check_docstrings.py --fix_and_overwrite
* revert unrelated change
* remove redundant defaults
* rename self.act -> self.activation
* tanh activation function in mapping
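A minimal sketch of the configurable-pooler idea (simplified `ACT2FN`-style string-to-callable mapping and made-up class names, not the actual ViT code):

```python
import math

# Simplified stand-in for Transformers' ACT2FN mapping; the real mapping
# covers many more activation functions.
ACT2FN = {"tanh": math.tanh, "identity": lambda x: x}

class PoolerConfig:
    # pooler_act is a string looked up in ACT2FN; pooler_output_size
    # controls the number of output channels.
    def __init__(self, pooler_act="tanh", pooler_output_size=4):
        self.pooler_act = pooler_act
        self.pooler_output_size = pooler_output_size

class Pooler:
    def __init__(self, config):
        self.activation = ACT2FN[config.pooler_act]
        self.output_size = config.pooler_output_size

    def __call__(self, features):
        # Reduce (here: truncate) to output_size, then apply the activation.
        return [self.activation(v) for v in features[: self.output_size]]
```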
-
Afanti authored
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* chore: fix typos in the tests
* fix: format codes
* chore: fix copy mismatch issue
* fix: format codes
* chore: fix copy mismatch issue
* chore: fix copy mismatch issue
* chore: fix copy mismatch issue
* chore: restore previous words
* chore: revert unexpected changes
-
regisss authored
-
Benjamin Bossan authored
The _fsdp_qlora_plugin_updates checks for LoraConfig, but other PEFT methods can also support quantized models, e.g. VeRA. Therefore, the isinstance check is now looking for PeftConfig in general.

Moreover, the fsdp_plugin variable may be undefined in the 2nd if condition, leading to an `UnboundLocalError`. This is fixed by not assigning the variable at all.

I checked for tests that may need updating but only found test_fsdp_config_transformers_auto_wrap associated with this change. AFAICT, this test does not cover the changed code, since the test does not start the training loop. Therefore, I haven't updated any tests. LMK if/how this fix should be tested.

Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com>
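Both issues can be reduced to small sketches (hypothetical class and function names, not the real Transformers/PEFT code):

```python
# Hypothetical config hierarchy mirroring the description above.
class PeftConfig: ...
class LoraConfig(PeftConfig): ...
class VeraConfig(PeftConfig): ...  # another PEFT method supporting quantized models

def supports_quantized_peft(config):
    # Checking against the base PeftConfig instead of only LoraConfig
    # lets methods such as VeRA pass the check too.
    return isinstance(config, PeftConfig)

def buggy_branch():
    is_lora = False
    if is_lora:
        fsdp_plugin = object()  # only assigned on this branch
    # Referencing the name when the branch didn't run raises UnboundLocalError.
    return fsdp_plugin is not None
```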
-
Joao Gante authored
* no image
* test
* revert jax version updates
* make fixup
* update autodoc path for model_addition_debugger
* shieldgemma2
* add missing pages to toctree
-
Raushan Turganbay authored
* fix mllama
* update test
* fix test
-