- 16 Feb, 2024 4 commits
-
Raushan Turganbay authored
* fix max_length for inputs_embeds
* make style
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Static Cache: load models with MQA or GQA (#28975)
* fix
* fix tests
* fix tests
* Update src/transformers/generation/utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* more fixes
* make style
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
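A minimal sketch of the generation path the first fix touches, assuming a generic decoder-only checkpoint (gpt2 here is illustrative): when only `inputs_embeds` is passed, `max_length` has to be counted against the embedded prompt.

```python
# Hedged sketch: generating from inputs_embeds, where max_length must
# account for the embedded prompt (model choice is illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
inputs_embeds = model.get_input_embeddings()(input_ids)

# Only embeddings are passed in; consistent max_length handling for this
# path is what the fix above addresses.
output_ids = model.generate(inputs_embeds=inputs_embeds, max_length=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```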
-
Lysandre Debut authored
-
Lysandre Debut authored
* Script & Manual edition
* Update
-
Titus authored
-
- 15 Feb, 2024 8 commits
-
Sadra Barikbin authored
Update utils.py
-
Andrei Panferov authored
removed redundant field
-
amyeroberts authored
* Patch to skip currently failing tests
* Whoops - wrong place
-
Younes Belkada authored
Update modeling_utils.py
-
amyeroberts authored
-
Donggeun Yu authored
* Update ms_deform_attn_cuda.cu
* Update ms_deform_attn_cuda.cuh
* Update modeling_deformable_detr.py
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update modeling_deformable_detr.py
* python utils/check_copies.py --fix_and_overwrite
* Fix dtype mismatch error
* Update test_modeling_deformable_detr.py
* Update test_modeling_deformable_detr.py
* Update modeling_deformable_detr.py
* Update modeling_deformable_detr.py
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Sangbum Daniel Choi authored
* enable gradient checkpointing in DetaObjectDetection
* fix missing part in original DETA
* make style
* make fix-copies
* Revert "make fix-copies"
  This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358.
* remove fix-copies of DetaDecoder
* enable swin gradient checkpointing
* fix gradient checkpointing in donut_swin
* add tests for deta/swin/donut
* Revert "fix gradient checkpointing in donut_swin"
  This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d.
* change supports_gradient_checkpointing pipeline to PreTrainedModel
* Revert "add tests for deta/swin/donut"
  This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b.
* Revert "Revert "fix gradient checkpointing in donut_swin""
  This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f.
* Simple revert
* enable deformable detr gradient checkpointing
* add gradient in encoder
* add cuda_custom_kernel function in MSDA
* make style and fix input of DetaMSDA
* make fix-copies
* remove n_levels in input of DetaMSDA
* minor changes
* refactor custom_cuda_kernel like yoso format
  https://github.com/huggingface/transformers/blob/0507e69d34f8902422eb4977ec066dd6bef179a0/src/transformers/models/yoso/modeling_yoso.py#L53
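A minimal sketch of enabling the feature this commit adds, on one of the models it touches (the checkpoint name is illustrative; any DETA/Swin/DonutSwin checkpoint follows the same pattern):

```python
# Hedged sketch: gradient checkpointing trades extra compute for lower
# activation memory during training.
from transformers import AutoModelForObjectDetection

model = AutoModelForObjectDetection.from_pretrained("jozhang97/deta-swin-large")
model.gradient_checkpointing_enable()  # supported for DETA/Swin after this commit
model.train()  # checkpointing only takes effect in training mode
```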
-
Arthur authored
* wow I was scared!
* fix everything
* nits
* make it BC?
* add todo
* nits
* is_tracing should still be used to pass tracing tests
* nits
* some nits to make sure generation works with static cache uncompiled
* fix sdpa
* fix FA2 for both static and dynamic in a better way?
* style
* fix-copies
* fix fix copies
* fix sequential beam search
* style
* use `keys_to_ignore`
* nit
* correct dtype inference when init
* :( the fix for FA2 is still not optimal to investigate!
* styling
* nits
* nit
* this might work better
* add comment
* Update src/transformers/models/llama/modeling_llama.py
* "position_ids" -> "cache_position"
* style
* nit
* Remove changes that should not be propagated just yet
* Apply suggestions from code review
* Styling
* make sure we raise an error for static cache with FA2 enabled
* move to the bottom of the signature
* style
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
* nit in the name
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
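A hedged sketch of the static-cache generation path this commit hardens (model name illustrative; note the commit also raises an error when FA2 is enabled together with the static cache):

```python
# Sketch only: opt into the static KV cache on a Llama-style model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-hf"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)
model.generation_config.cache_implementation = "static"

inputs = tokenizer("The capital of France is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=8)
```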
-
- 14 Feb, 2024 15 commits
-
Arthur authored
* revert unrelated changes that got in
* style
-
Younes Belkada authored
FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009)
* fix trainer tags
* add test
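A hedged sketch of the fixed call (model choice illustrative; pushing requires a logged-in Hub account, so treat this as a sketch rather than a runnable recipe):

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
trainer = Trainer(model=model, args=TrainingArguments(output_dir="out"))

trainer.push_to_hub()               # no "tags" kwarg: the case this PR fixes
trainer.push_to_hub(tags=["demo"])  # passing tags explicitly still works
```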
-
Jiewen Tan authored
* Initial commit
* Add guards for the global mesh
* Address more comments
* Move the dataloader into integrations/tpu.py
* Fix linters
* Make kwarg more explicit
* Remove the move device logic
* Fix the CI
* Fix linters
* Re-enable checkpointing
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights
* Clarify pretrained import
* Use load_backbone instead
* Add backbone_kwargs to config
* Pass kwargs to constructors
* Fix up
* Input verification
* Add tests
* Tidy up
* Update tests/utils/test_backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
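A hedged sketch of the config-level backbone API these commits extend; DETR is used as one representative consumer, and the exact kwargs shown are illustrative:

```python
from transformers import DetrConfig, DetrForObjectDetection

config = DetrConfig(
    backbone="resnet50",
    use_timm_backbone=True,
    use_pretrained_backbone=True,  # instantiate with pretrained backbone weights
    backbone_kwargs={"out_indices": (1, 2, 3, 4)},  # forwarded to backbone creation
)
model = DetrForObjectDetection(config)
```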
-
JB (Don) authored
* Add tie_weights() to LM heads and set bias in set_output_embeddings()
  The biases were not tied correctly in some LM heads, and this change should fix that.
* Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin
* Adding _tie_weights() to MPNet and Vilt
* Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device
* Rename the test to save_load to match the convention
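A hedged sketch of the pattern being fixed; `LMHead` is an illustrative stand-in for the BERT-style heads this commit touches. The point is that the head's bias must be re-tied whenever the decoder is swapped or resized:

```python
import torch
from torch import nn

class LMHead(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.decoder = nn.Linear(hidden_size, vocab_size)
        self.bias = nn.Parameter(torch.zeros(vocab_size))
        self.decoder.bias = self.bias  # share one bias parameter

    def _tie_weights(self):
        # re-establish the share after set_output_embeddings()/resizing
        self.decoder.bias = self.bias
```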
-
Merve Noyan authored
* Create mask_generation.md
* add h1
* add to toctree
* Update docs/source/en/tasks/mask_generation.md (repeated review suggestions applied)
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
* Update mask_generation.md
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
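A hedged usage example for the task the new doc page covers (the SAM checkpoint and image path are illustrative):

```python
from transformers import pipeline

generator = pipeline("mask-generation", model="facebook/sam-vit-base")
outputs = generator("path/to/image.jpg", points_per_batch=64)
masks = outputs["masks"]    # one binary mask per detected object
scores = outputs["scores"]  # quality score per mask
```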
-
Raushan Turganbay authored
-
Zach Mueller authored
* Introduce AcceleratorConfig dataclass
* Extra second warn
* Move import
* Try moving import under is_accelerate_available
* Quality
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Clean
* Remove to_kwargs
* Change version
* Improve tests by including dispatch and split batches
* Improve reliability
* Update tests/trainer/test_trainer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixup tests and review nits
* Make tests pass
* protect import
* Protect import
* Empty-Commit
* Make training_args.to_dict handle the AcceleratorConfig
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
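A hedged sketch of the new dataclass (import path as introduced by this PR; the field values are illustrative):

```python
from transformers import TrainingArguments
from transformers.trainer_pt_utils import AcceleratorConfig

args = TrainingArguments(
    output_dir="out",
    accelerator_config=AcceleratorConfig(split_batches=True, dispatch_batches=False),
)
```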
-
Huazhong Ji authored
Co-authored-by: unit_test <test@unit.com>
-
amyeroberts authored
[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002)
* Trigger doc build
* Test removing references
* Importable from utils
* Trigger another run on a new commit for testing
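The doc build references these classes through the utils namespace; per the commit, the import below is what this fix makes work:

```python
from transformers.utils import BackboneConfigMixin, BackboneMixin
```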
-
Andrei Panferov authored
* aqlm init
* calibration and dtypes
* docs
* Readme update
* is_aqlm_available
* Simpler link in docs
* Test TODO real reference
* init _import_structure fix
* AqlmConfig autodoc
* integration aqlm
* integrations in tests
* docstring fix
* legacy typing
* Less typings
* More kernels information
* Performance -> Accuracy
* correct tests
* removed multi-gpu test
* Update docs/source/en/quantization.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/utils/quantization_config.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Brought back multi-gpu tests
* Update src/transformers/integrations/aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/aqlm_integration/test_aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
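A hedged sketch of loading an AQLM-quantized checkpoint through the new integration (the checkpoint name is illustrative, and the `aqlm` package must be installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "ISTA-DASLab/Mixtral-8x7b-AQLM-2Bit-1x16-hf"  # illustrative AQLM checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype="auto", device_map="auto")
```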
-
NielsRogge authored
* First draft
* Add CLIPForImageClassification
* Remove scripts
* Fix doctests
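A hedged sketch of the new head (the base checkpoint is illustrative; the classification head itself is freshly initialized, as the usual warning will note):

```python
from transformers import CLIPForImageClassification

model = CLIPForImageClassification.from_pretrained(
    "openai/clip-vit-base-patch32", num_labels=2
)
```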
-
Jonathan Tow authored
* Add `StableLM`
* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
* fix: re-add changes to address comments
* fix(readme): add links to paper
* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
* fix(tests): re-add `@slow` decorator to integration tests
* fix(tests): import slow...
* fix(readme_hd): remove whitespace edit
* fix(tokenizer): auto tokenizer tuple
* skip doctests for `modeling_stablelm`
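A hedged sketch showing that the new architecture resolves through the auto classes (checkpoint name is the reference StableLM model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-3b-4e1t")
model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-3b-4e1t")
```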
-
Younes Belkada authored
* enhance trainer + not support quant methods
* remove all old logic
* add version
-
Younes Belkada authored
ENH: Do not pass warning message in case `quantization_config` is in config but not passed as an arg (#28988)
* Update auto.py
* Update auto.py
* Update src/transformers/quantizers/auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 13 Feb, 2024 5 commits
-
amyeroberts authored
* Update the processing so bbox coords are adjusted for padding
* Just pad masks
* Tidy up, add tests
* Better tests
* Fix yolos and mark as slow for pycocotools
* Fix yolos - return_tensors
* Clarify padding and normalization behaviour
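A hedged sketch of the behaviour being fixed, using DETR's image processor as a representative of the affected family (the image and annotation values are illustrative COCO-format placeholders):

```python
from PIL import Image
from transformers import DetrImageProcessor

image = Image.new("RGB", (640, 480))
annotations = {
    "image_id": 0,
    "annotations": [{"bbox": [10, 10, 50, 50], "category_id": 1, "area": 2500.0, "iscrowd": 0}],
}

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
# With padding enabled, the returned label boxes are adjusted to the
# padded canvas rather than the original image size.
encoding = processor(images=image, annotations=annotations, return_tensors="pt")
```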
-
Aditya Kane authored
* Update configuration_llama.py: fix broken link
* [Nit] Explicit redirection not required
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Joao Gante authored
-
Hiroshi Matsuda authored
* add sudachi_projection option
* Upgrade sudachipy>=0.6.8
* add a test case for sudachi_projection
* Compatible with older versions of SudachiPy
* make fixup
* make style
* error message for unidic download
* revert jumanpp test cases
* format options for sudachi_projection
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* format options for sudachi_split_mode and sudachi_dict_type
* comment
* add tests for full_tokenizer kwargs
* pass projection arg directly
* require_sudachi_projection
* make style
* revert upgrade sudachipy
* check is_sudachi_projection_available()
* revert dependency_version_table and bugfix
* style format
* simply raise ImportError
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* simply raise ImportError
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
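A hedged sketch of the new option; the exact kwarg plumbing shown (via `sudachi_kwargs`) is an assumption based on the existing tokenizer signature, and it requires SudachiPy >= 0.6.8 plus a Sudachi dictionary:

```python
from transformers import BertJapaneseTokenizer

tokenizer = BertJapaneseTokenizer.from_pretrained(
    "cl-tohoku/bert-base-japanese",
    word_tokenizer_type="sudachi",
    sudachi_kwargs={"sudachi_split_mode": "A", "sudachi_projection": "normalized"},
)
```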
-
Arthur authored
* refactor with addedtokens decoder
* style
* get rid of lang code to id
* style
* keep some things for BC
* update tests
* add the mask token at the end of the vocab
* nits
* nits
* fix final tests
* style
* nits
* Update src/transformers/models/nllb/tokenization_nllb_fast.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* nits
* style?
* Update src/transformers/convert_slow_tokenizer.py
* make it a tad bit more custom
* ruff please stop
Co-authored-by: avidale <dale.david@mail.ru>
* Update
Co-authored-by: avidale <dale.david@mail.ru>
* Update
Co-authored-by: avidale <dale.david@mail.ru>
* oupts
* ouft
* nits
* test
* fix the remaining failing tests
* style
* fix failing test
* fix other test
* temp dir + test the raw init
* update test
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
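A hedged sketch of the behaviour the refactor preserves: language codes now live in the added-tokens decoder rather than a separate lang-code-to-id table (checkpoint per the NLLB docs):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M", src_lang="eng_Latn", tgt_lang="fra_Latn"
)
ids = tokenizer("Hello world").input_ids
print(tokenizer.convert_ids_to_tokens(ids))  # source text is prefixed with the eng_Latn token
```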
-
- 12 Feb, 2024 8 commits
-
Klaus Hipp authored
* Translate contributing.md to German
* Fix formatting issues in contributing.md
* Address review comments
* Fix capitalization
-
NielsRogge authored
Add video section
-
Klaus Hipp authored
Add language identifiers to code blocks
-
Yunxuan Xiao authored
clean up remaining tmp checkpoint dir
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
-
JB (Don) authored
Continue to initialize tied output_embeddings if it has a bias term
The bias term is not tied, and so will need to be initialized accordingly.
-
Alexey Fadeev authored
Updated datasets requirements. Need a package version >= 2.14.0
-
Joao Gante authored
-
cmahmut authored
updated docstring with vqa alias
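A hedged sketch of the alias the docstring now documents: "vqa" resolves to the visual-question-answering pipeline.

```python
from transformers import pipeline

vqa = pipeline("vqa")  # equivalent to pipeline("visual-question-answering")
```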
-