- 13 Feb, 2025 11 commits
-
Lysandre Debut authored
* Helium documentation fixes
* Update helium.md
* Update helium.md
* Update helium.md
-
Thomas Bauwens authored
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
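For context, a minimal usage sketch of the newly exported collator, assuming it mirrors the docs' implementation this commit consolidates; the checkpoint name and toy features below are illustrative, not taken from the PR.

```python
# Hedged sketch: using DataCollatorForMultipleChoice now that it is part of the
# public import structure. Checkpoint name and toy features are illustrative.
from transformers import AutoTokenizer, DataCollatorForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)

# Each example carries one tokenized sequence per answer choice; the collator
# pads and stacks them into (batch_size, num_choices, seq_len) tensors.
features = [
    {
        "input_ids": [[101, 2023, 102], [101, 2008, 2003, 102]],  # two choices
        "attention_mask": [[1, 1, 1], [1, 1, 1, 1]],
        "label": 0,
    }
]
batch = collator(features)
print(batch["input_ids"].shape)  # expected: torch.Size([1, 2, 4])
```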
-
CL-ModelCloud authored
* Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* add tokenizer class type test
* code review
* code opt
* fix bug
* Update test_tokenization_fast.py
* ruff check
* make style
* code opt
* Update test_tokenization_fast.py

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
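A quick way to see what this fix protects, as a sketch; the checkpoint, output directory, and expected class string are illustrative.

```python
# Sketch: after save_pretrained, tokenizer_config.json should record the tokenizer class
# so the tokenizer reloads as the same type. Names below are illustrative.
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokenizer.save_pretrained("saved_tokenizer")

with open("saved_tokenizer/tokenizer_config.json") as f:
    config = json.load(f)
print(config["tokenizer_class"])  # e.g. "BertTokenizer" for this checkpoint
```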
-
Marco Edward Gorelli authored
-
gewenbin0992 authored
* qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1
* fix
* fix
* fix
* fix
* add tests
* fix test bugs
* fix
* fix failed tests
* fix
-
Pavel Iakubovskii authored
* Trigger tests
* [run-slow] beit, detr, dinov2, vit, textnet
* Fix BEiT interpolate_pos_encoding
* Fix DETR test
* Update DINOv2 test
* Fix textnet
* Fix vit
* Fix DPT
* fix data2vec test
* Fix textnet test
* Update interpolation check
* Fix ZoeDepth tests
* Update interpolate embeddings for BEiT
* Apply suggestions from code review
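The feature these tests exercise, sketched with ViT; the checkpoint and the 384×384 input size are illustrative.

```python
# Sketch: feed images at a resolution other than the pretraining size and let the model
# interpolate its position embeddings. Checkpoint and input size are illustrative.
import torch
from transformers import ViTModel

model = ViTModel.from_pretrained("google/vit-base-patch16-224")
pixel_values = torch.randn(1, 3, 384, 384)  # larger than the 224x224 pretraining size

with torch.no_grad():
    outputs = model(pixel_values, interpolate_pos_encoding=True)
print(outputs.last_hidden_state.shape)  # (1, 577, 768): 24*24 patches + CLS token
```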
-
Lucain authored
-
Nerogar authored
fix gemma2 dtype issue when storing weights in float16 precision
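For context, a hedged sketch of the setting the fix targets; the checkpoint name is illustrative.

```python
# Sketch: loading Gemma 2 with weights stored in float16, the configuration in which
# the dtype issue appeared. Checkpoint name is illustrative.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    torch_dtype=torch.float16,  # half-precision weight storage
)
print(next(model.parameters()).dtype)  # torch.float16
```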
-
Ben Schneider authored
* update env command to log deepspeed version
* suppress deepspeed import logging
* Add reminder to include configs to repro description in bug report.
* make fixup
* [WIP] update import utils for deepspeed
* Change to using is_deepspeed_available() from integrations.
* make fixup
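The availability check this commit switches to, sketched outside the env command; the version-logging code around it is illustrative.

```python
# Sketch of the check the env command now relies on; the printing around it is illustrative.
from transformers.integrations import is_deepspeed_available

if is_deepspeed_available():
    import deepspeed
    print("DeepSpeed version:", deepspeed.__version__)
else:
    print("DeepSpeed version: not installed")
```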
-
Sambhav Dixit authored
* change order of unmasking of tokens
* library import
* class setup
* test function
* refactor
* add commit message
* test modified
* explicit initialisation of weights + made model smaller
* removed separate testing file
* fixup
* fixup core
* test attention mask with token types
* tests fixup
* removed PaliGemmaAttentionMaskTest class

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
-
Benjamin Badger authored
* pixel input assignment revoked
* double send
* Update src/transformers/models/mllama/modeling_mllama.py
  Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
-
- 12 Feb, 2025 20 commits
-
ivarflakstad authored
Add git lfs to AMD docker image
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix
* fix
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Zach Mueller authored
* Add more rigorous non-slow grad accum tests
* Further nits
* Re-add space
* Readability
* Use tinystories instead
* Revert transformer diff
* tweak thresholds
-
Ke Wen authored
Update doc about models' TP support
-
hsilva664 authored
* Adding option to save/reload scaler
* Removing duplicate variable
* Adding save/reload test
* Small fixes on deterministic algorithm call
* Moving LLM test to another file to isolate its environment
* Moving back to old file and using subprocess to run test isolated
* Reverting back accidental change
* Reverting back accidental change
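The mechanism behind the new save/reload option, as a standalone sketch rather than the Trainer's actual checkpoint code; the file path is illustrative.

```python
# Sketch: a GradScaler's state can be persisted and restored with state_dict()/load_state_dict(),
# which is what saving/reloading the scaler across checkpoints relies on. Path is illustrative.
import torch

scaler = torch.cuda.amp.GradScaler()
# ... training steps using scaler.scale(loss).backward(), scaler.step(optimizer), scaler.update() ...

torch.save(scaler.state_dict(), "scaler.pt")      # save next to the checkpoint
scaler.load_state_dict(torch.load("scaler.pt"))   # restore when resuming
```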
-
kang sheng authored
* Fix multi gpu loss sync condition, add doc and test
* rename function and class
* loss should not scale during inference
* fix typo
-
zhuHQ authored
* Added APOLLO optimizer integration
* fix comment
* Remove redundancy: Modularize low-rank optimizer construction
* Remove redundancy: Remove useless comment
* Fix comment: Add typing
* Fix comment: Rewrite apollo desc
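A hedged sketch of opting into the integration through TrainingArguments; the optim string "apollo_adamw" and the output directory are assumptions for illustration, not confirmed by the commit message.

```python
# Hedged sketch: selecting the APOLLO optimizer via TrainingArguments.
# The optim string "apollo_adamw" is an assumed name, not confirmed by the commit message.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="apollo_adamw",  # assumed APOLLO optimizer key
    learning_rate=1e-4,
)
```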
-
Dmitry Rogozhkin authored
* multi-gpu: fix inputs_embeds + position_embeds
  Fixing the following error in a few models:
  ```
  > hidden_states = inputs_embeds + pos_embeds
  E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
  ```
  Fixes: #35762
* multi-gpu: fix tensor device placements for various models
  Fixes: #35762
* Apply make fix-copies

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
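The general pattern behind this class of fix, as a sketch rather than the exact model code.

```python
# Sketch: under multi-GPU model parallelism, inputs_embeds and position embeddings can land
# on different devices; aligning them before the add avoids the RuntimeError quoted above.
import torch

def add_pos_embeds(inputs_embeds: torch.Tensor, pos_embeds: torch.Tensor) -> torch.Tensor:
    pos_embeds = pos_embeds.to(inputs_embeds.device)  # move onto the embeddings' device
    return inputs_embeds + pos_embeds
```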
-
Lucain authored
* Remove cache migration script
* remove dummy move_cache
-
dependabot[bot] authored
Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142)

Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bump transformers in /examples/research_projects/vqgan-clip

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Leon Engländer authored
Replace In-Place Operations for Deberta and Deberta-V2
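A hedged illustration of the kind of change described, not the DeBERTa source itself: an in-place tensor op replaced by its out-of-place counterpart, which is friendlier to autograd and tracing/export.

```python
# Illustrative only: out-of-place masked_fill instead of in-place masked_fill_.
import torch

x = torch.randn(4, 8, requires_grad=True)
mask = torch.zeros(4, 8, dtype=torch.bool)

# In-place style (the kind of op being replaced):
# x.masked_fill_(mask, 0.0)

# Out-of-place style (the replacement):
y = x.masked_fill(mask, 0.0)
print(y.shape)
```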
-
Joao Gante authored
rm deprecated/inoperative commands
-
Raushan Turganbay authored
* fix cached tests
* fix some tests
* fix pix2struct
* fix
-
Sambhav Dixit authored
* Reload transformers fix from cache
* add imports
* add test fn for clearing import cache
* ruff fix to core import logic
* ruff fix to test file
* fixup for imports
* fixup for test
* lru restore
* test check
* fix style changes
* added documentation for use case
* fixing

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
-
Joao Gante authored
* remove redundant test
* delete another test
* revert default max_length
* (wrong place, moving)
-
MilkClouds authored
* feat: added warning to Trainer when label_names is not specified for PeftModel
* Update trainer.py
* feat: peft detect with `_is_peft_model`
* Update src/transformers/trainer.py
  Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Applied formatting in trainer.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
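What the new warning asks for, sketched with TrainingArguments; the label key and output directory are illustrative.

```python
# Sketch: when the model passed to Trainer is a PeftModel, label_names is not inferred
# from the base model's signature, so set it explicitly. Values are illustrative.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    label_names=["labels"],  # explicit labels for the PEFT-wrapped model
)
# Trainer(model=peft_model, args=args, ...) would then avoid the new warning.
```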
-
nhamanasu authored
* add RAdamScheduleFree optimizer
* revert schedulefree version to the minimum requirement
* refine is_schedulefree_available so that it can take min_version
* refine documents

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Harry Mellor authored
* Add `base_model_pp_plan` to `PretrainedConfig`
* Add `_pp_plan` to `PreTrainedModel`
* Add both to Llama for testing
* Fix type error
* Update to suggested schema
* `_pp_plan` keys are not patterns
* Simplify schema
* Fix typing error
* Update input name for Llama
* Add pp plan to Aria
* Add pp plan to Bamba
* Add pp plan to Cohere 1 & 2
* Add pp plan to diffllama and emu3
* Add pp plan to Gemma 1 & 2
* Add pp plan to GLM and GPT NeoX
* Add pp plan to Granite and Helium
* Add pp plan to Mistral and Mixtral
* Add pp plan to OLMo 1 & 2
* Add pp plan to Phi and Phi 3
* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL
* Add pp plan for Starcoder 2
* Add enum for accessing inputs and outputs
* Update type hints to use tuples
* Change outer list to tuple

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
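A purely illustrative sketch of what a per-module pipeline-parallel plan could look like under this change, mapping module names to (inputs, outputs) tuples; the keys, argument names, and exact schema here are assumptions, not the merged schema.

```python
# Hypothetical illustration only; not the schema merged in this PR.
base_model_pp_plan = {
    "embed_tokens": (("input_ids",), ("inputs_embeds",)),
    "layers": (("hidden_states", "attention_mask"), ("hidden_states",)),
    "norm": (("hidden_states",), ("hidden_states",)),
}
print(base_model_pp_plan["layers"])
```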
-
- 11 Feb, 2025 9 commits
-
Fanli Lin authored
* update awq doc
* Update docs/source/en/quantization/awq.md
* Update docs/source/en/quantization/awq.md
* Update docs/source/en/quantization/awq.md
* Update docs/source/en/quantization/awq.md
* add note for inference

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Fanli Lin authored
fix
-
Sambhav Dixit authored
* make output_dir optional
* initiated a basic testing module to validate and verify the changes
* Test output_dir default to 'tmp_trainer' when unspecified.
* test existing functionality of output_dir.
* test that output dir only created when needed
* final check
* added doc string and changed the tmp_trainer to trainer_output
* make style fixes to test file.
* another round of fixup

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
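The new behavior, sketched; the default directory name is taken from the commit's "trainer_output" wording and should be treated as indicative.

```python
# Sketch: TrainingArguments can now be constructed without output_dir.
from transformers import TrainingArguments

args = TrainingArguments()   # no output_dir required any more
print(args.output_dir)       # expected default, per the commit: "trainer_output"
```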
-
Arthur authored
-
Pablo Montalvo authored
* make explicit gpu dep
* [run-slow] bamba
-
Hicham Tala authored
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
* Remove deprecated warnings and eliminate `max_size` usage
* Test use `int` as argument for `size`
  Add a test to ensure test can pass successfully and backward compatibility
* The test pipelines still use `max_size`
  Remove `max_size` from test pipelines and replace by `size` by a `Dict` with `'shortest_edge'` `'longest_edge'` as keys
* Reformatting
* Reformatting
* Revert "Reformatting"
  This reverts commit c3040acee75440357cffd1f60c9d29ff5b2744b8.
* Revert "Reformatting"
  This reverts commit ac4522e5c9a02d2d0c298295026db68ea26453df.
* Revert "The test pipelines still use `max_size`"
  This reverts commit eaed96f041ffc32459536e1524d87f7a12ddee29.
* Revert "Test use `int` as argument for `size`"
  This reverts commit 1925ee38c7c5eabb11832316712df1d4ba8043d0.
* Revert "Remove deprecated warnings and eliminate `max_size` usage"
  This reverts commit d8e7e6ff9025931468fc1f3827cda1fa391003d5.
* Change version `4.26` to "a future version"
* Reformatting
* Revert "Change version `4.26` to "a future version""
  This reverts commit 2b53f9e4
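The size-dict pattern the tests move to, in place of the removed max_size, sketched with an assumed DETR checkpoint and illustrative values.

```python
# Sketch: pass size as a dict with "shortest_edge"/"longest_edge" instead of max_size.
# Checkpoint, image, and edge values are illustrative.
import numpy as np
from PIL import Image
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50")
image = Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8))  # placeholder image

inputs = processor(
    images=image,
    size={"shortest_edge": 800, "longest_edge": 1333},  # replaces the deprecated max_size
    return_tensors="pt",
)
print(inputs["pixel_values"].shape)
```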
-
湛露先生 authored
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
-
Maxim Evtush authored
* Update tools.py
* Update text_generation.py
* Update question_answering.py
-
Pavel Iakubovskii authored
* Add is_torch_greater_or_equal test decorator
* Add common test for torch.export
* Fix bit
* Fix focalnet
* Fix imagegpt
* Fix seggpt
* Fix swin2sr
* Enable torch.export test for vision models
* Enable test for video models
* Remove json
* Enable for hiera
* Enable for ijepa
* Fix detr
* Fix conditional_detr
* Fix maskformer
* Enable test maskformer
* Fix test for deformable detr
* Fix custom kernels for export in rt-detr and deformable-detr
* Enable test for all DPT
* Remove custom test for deformable detr
* Simplify test to use only kwargs for export
* Add comment
* Move compile_compatible_method_lru_cache to utils
* Fix beit export
* Fix deformable detr
* Fix copies data2vec<->beit
* Fix typos, update test to work with dict
* Add seed to the test
* Enable test for vit_mae
* Fix beit tests
* [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr
* Add vitpose test
* Add textnet test
* Add dinov2 with registers
* Update tests/test_modeling_common.py
* Switch to torch.testing.assert_close
* Fix maskformer
* Remove save-load from test
* Add dab_detr
* Add depth_pro
* Fix and test RT-DETRv2
* Fix dab_detr
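The kind of check the new common test performs, sketched with ViT; the checkpoint and input size are illustrative, and the export call assumes a torch version that ships torch.export.

```python
# Sketch: export a vision model with torch.export using kwargs only (as the simplified test does).
# Checkpoint and input shape are illustrative.
import torch
from transformers import ViTModel

model = ViTModel.from_pretrained("google/vit-base-patch16-224").eval()
pixel_values = torch.randn(1, 3, 224, 224)

exported = torch.export.export(model, args=(), kwargs={"pixel_values": pixel_values})
print(exported.graph_signature)  # inspect the exported program's inputs/outputs
```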
-