- 11 Apr, 2025 12 commits
-
-
Matt authored
🚨 🚨 Allow saving and loading multiple "raw" chat template files (#36588)

* Add saving in the new format (but no loading yet!)
* Add saving in the new format (but no loading yet!)
* A new approach to template files!
* make fixup
* make fixup, set correct dir
* Some progress but need to rework for cached_file
* Rework loading handling again
* Small fixes
* Looks like it's working now!
* make fixup
* Working!
* make fixup
* make fixup
* Add TODO so I don't miss it
* Cleaner control flow with one less indent
* Copy the new logic to processing_utils as well
* Proper support for dicts of templates
* make fixup
* define the file/dir names in a single place
* Update the processor chat template reload test as well
* Add processor loading of multiple templates
* Flatten correctly to match tokenizers
* Better support when files are empty sometimes
* Stop creating those empty templates
* Revert changes now we don't have empty templates
* Revert changes now we don't have empty templates
* Don't support separate template files on the legacy path
* Rework/simplify loading code
* Make sure it's always a chat_template key in chat_template.json
* Update processor handling of multiple templates
* Add a full save-loading test to the tokenizer tests as well
* Correct un-flattening
* New test was incorrect
* Correct error/offline handling
* Better exception handling
* More error handling cleanup
* Add skips for test failing on main
* Reorder to fix errors
* make fixup
* clarify legacy processor file docs and location
* Update src/transformers/processing_utils.py (Co-authored-by: Lucain <lucainp@gmail.com>)
* Update src/transformers/processing_utils.py (Co-authored-by: Lucain <lucainp@gmail.com>)
* Update src/transformers/processing_utils.py (Co-authored-by: Lucain <lucainp@gmail.com>)
* Update src/transformers/processing_utils.py (Co-authored-by: Lucain <lucainp@gmail.com>)
* Rename to _jinja and _legacy
* Stop saving multiple templates in the legacy format
* Cleanup the processing code
* Cleanup the processing code more
* make fixup
* make fixup
* correct reformatting
* Use correct dir name
* Fix import location
* Use save_jinja_files instead of save_raw_chat_template_files
* Correct the test for saving multiple processor templates
* Fix type hint
* Update src/transformers/utils/hub.py (Co-authored-by: Julien Chaumond <julien@huggingface.co>)
* Patch llava_onevision test
* Update src/transformers/processing_utils.py (Co-authored-by: Julien Chaumond <julien@huggingface.co>)
* Update src/transformers/tokenization_utils_base.py (Co-authored-by: Julien Chaumond <julien@huggingface.co>)
* Refactor chat template saving out into a separate function
* Update tests for the new default
* Don't do chat template saving logic when chat template isn't there
* Ensure save_jinja_files is propagated to tokenizer correctly
* Trigger tests
* Update more tests to new default
* Trigger tests

---------
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
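A hedged sketch of the new saving path, assuming the `save_jinja_files` flag named in the bullets above; the on-disk file name is an assumption inferred from the commit, not a verified spec:

```python
# A minimal sketch, assuming the `save_jinja_files` flag from the commit above.
# The output file name (chat_template.jinja) is inferred, not verified.
import os
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative checkpoint
tokenizer.chat_template = (
    "{% for message in messages %}{{ message['content'] }}{% endfor %}"
)

# Save the template as a raw Jinja file instead of a JSON-embedded string.
tokenizer.save_pretrained("./saved", save_jinja_files=True)
print(os.listdir("./saved"))  # expect a raw template file alongside the usual outputs
```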
-
Mohamed Mekkouri authored
fix
-
Wing Lian authored
prevent creating a view/leaf param for low rank optimizers
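For context, a plain-PyTorch illustration (not the PR's code) of why a view cannot back an optimizer parameter: a view of a tensor that requires grad is never a leaf, so optimizer state cannot attach to it.

```python
import torch

base = torch.randn(20, 20, requires_grad=True)

view = base[:10]          # slicing returns a view, which is a non-leaf tensor
print(view.is_leaf)       # False -> not usable as an optimizer parameter

# Materialize a genuine leaf parameter instead of a view:
param = base[:10].detach().clone().requires_grad_(True)
print(param.is_leaf)      # True
```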
-
Bowen Bao authored
-
Raushan Turganbay authored
* clean up multimodal processor tests * fixup * fix tests * fix one last test * forgot
-
Mohamed Mekkouri authored
* remove mlp for now * disable on docker
-
Lysandre Debut authored
Test fetcher
-
Arthur authored
* the fix that did not get in
* add kernels
* full graph does not work
* simpler is better
* Update src/transformers/integrations/hub_kernels.py (Co-authored-by: Daniël de Kok <me@danieldk.eu>)
* Update src/transformers/integrations/fbgemm_fp8.py (Co-authored-by: Daniël de Kok <me@danieldk.eu>)
* Update src/transformers/integrations/hub_kernels.py (Co-authored-by: Daniël de Kok <me@danieldk.eu>)
* fixup

---------
Co-authored-by: Daniël de Kok <me@danieldk.eu>
-
Arthur authored
* update `kernels` * oups * new pinned version
-
Lysandre Debut authored
* Reverse dependency map shouldn't be created when test_all is set
* [test_all] Remove dummies
* Modular fixes
* Update utils/check_repo.py (Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>)
* [test_all] Better docs
* [test_all] Update src/transformers/commands/chat.py (Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>)
* [test_all] Remove deprecated AdaptiveEmbeddings from the tests
* [test_all] Doc builder
* [test_all] is_dummy
* [test_all] Import utils
* [test_all] Doc building should not require all deps

---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Donggeun Yu authored
Corrects the file path used to locate the CUDA kernels for the Deformable Attention module. This ensures that the kernels are loaded correctly, resolving potential errors during module initialization and usage.
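The general pattern at play, as a hedged sketch (file names are illustrative, not the exact repository layout): kernel sources are JIT-compiled from paths resolved relative to the Python module, so a wrong relative path fails at load time.

```python
from pathlib import Path
from torch.utils.cpp_extension import load

def load_deformable_attn_kernels():
    # Resolve against this file's location, not the current working directory.
    kernel_dir = Path(__file__).resolve().parent / "kernels" / "deformable_detr"
    return load(
        name="MultiScaleDeformableAttention",
        sources=[
            str(kernel_dir / "vision.cpp"),
            str(kernel_dir / "cpu" / "ms_deform_attn_cpu.cpp"),
            str(kernel_dir / "cuda" / "ms_deform_attn_cuda.cu"),
        ],
        with_cuda=True,
    )
```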
-
Yao Matrix authored
* enhance require_deterministic_for_xpu (Signed-off-by: YAO Matrix <matrix.yao@intel.com>)
* fix style (Signed-off-by: YAO Matrix <matrix.yao@intel.com>)
* fix style (Signed-off-by: YAO Matrix <matrix.yao@intel.com>)

---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
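One plausible shape for such a test guard, as a minimal sketch (illustrative; the actual helper in transformers' testing utilities may differ):

```python
import functools
import torch

def require_deterministic_for_xpu(test_case):
    """Run the wrapped test with deterministic algorithms forced on XPU."""
    @functools.wraps(test_case)
    def wrapper(*args, **kwargs):
        if hasattr(torch, "xpu") and torch.xpu.is_available():
            previous = torch.are_deterministic_algorithms_enabled()
            torch.use_deterministic_algorithms(True)
            try:
                return test_case(*args, **kwargs)
            finally:
                torch.use_deterministic_algorithms(previous)
        return test_case(*args, **kwargs)
    return wrapper
```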
-
- 10 Apr, 2025 24 commits
-
-
cyyever authored
* Remove unneeded library version checks (Signed-off-by: cyy <cyyever@outlook.com>)
* Remove PyTorch condition (Signed-off-by: cyy <cyyever@outlook.com>)
* Remove PyTorch condition (Signed-off-by: cyy <cyyever@outlook.com>)
* Fix ROCm get_device_capability (Signed-off-by: cyy <cyyever@outlook.com>)
* Revert "Fix ROCm get_device_capability" (this reverts commit 0e756434bd7e74ffd73de5500476072b096570a6)
* Remove unnecessary check (Signed-off-by: cyy <cyyever@outlook.com>)
* Revert changes (Signed-off-by: cyy <cyyever@outlook.com>)

---------
Signed-off-by: cyy <cyyever@outlook.com>
-
duanjunwen authored
-
Mohamed Mekkouri authored
add myself
-
Mehant Kammakomati authored
* feat: custom tp_size, new transformers tp interface (Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>)
* fix: review cmt - error when tp_plan not set for tp_size (Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>)
* fix: nit in docs (Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>)

---------
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
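A hedged usage sketch based on the parameter names in the commit (`tp_plan`, `tp_size`); the checkpoint is illustrative and the merged API may differ:

```python
import torch
from transformers import AutoModelForCausalLM

# Typically launched with `torchrun --nproc-per-node 4 ...` so each rank
# holds one shard of the tensor-parallel model.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",  # illustrative checkpoint
    torch_dtype=torch.bfloat16,
    tp_plan="auto",  # use the model's built-in tensor-parallel plan
    tp_size=4,       # per the commit, this errors if no tp_plan is set
)
```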
-
Terrasse authored
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
-
Isotr0py authored
* add gemma3 gguf support (Signed-off-by: Isotr0py <2037008807@qq.com>)
* fix typo and add gguf limit (Signed-off-by: Isotr0py <2037008807@qq.com>)
* fix a typo (Signed-off-by: Isotr0py <2037008807@qq.com>)
* add vision conversion test (Signed-off-by: Isotr0py <2037008807@qq.com>)
* fix typos (Signed-off-by: Isotr0py <2037008807@qq.com>)

---------
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
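A hedged sketch of loading a GGUF checkpoint through transformers' `gguf_file` argument (the repo and file names below are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "google/gemma-3-4b-it-qat-q4_0-gguf"  # illustrative repo
gguf_file = "gemma-3-4b-it-q4_0.gguf"           # illustrative file name

# The GGUF weights are dequantized into a regular transformers model.
tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```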
-
Mohamed Mekkouri authored
* initial commit * style * update * change approach attention * clean up * fix import * update * update * fix style * change method * attention * add mlp back * change name * update name * fix copies * fix config * fix
-
Mohamed Mekkouri authored
* nit * fix * fix
-
Mario Michael Krell authored
Previously, the identity function was used for dropped tokens, together with an expert weight that was never applied to the hidden states. This was misleading, because dropping a token means its expert weight is zero. Instead of trying to fix the weight, we take the easier approach of initializing with zeros. Fixes https://github.com/huggingface/transformers/issues/37017
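A minimal sketch of the described fix (illustrative, not the model's exact code): accumulate expert outputs into a zero-initialized buffer so that dropped tokens contribute nothing.

```python
import torch

def moe_forward(hidden_states, experts, routing_weights, expert_mask):
    # hidden_states: (num_tokens, hidden_dim)
    # routing_weights: (num_tokens,) weight of the chosen expert per token
    # expert_mask: (num_experts, num_tokens) bool; False where a token was dropped
    final = torch.zeros_like(hidden_states)  # dropped tokens stay exactly zero
    for idx, expert in enumerate(experts):
        token_ids = torch.where(expert_mask[idx])[0]
        if token_ids.numel() == 0:
            continue
        expert_out = expert(hidden_states[token_ids])
        final[token_ids] += routing_weights[token_ids, None] * expert_out
    return final
```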
-
AbdelKarim ELJANDOUBI authored
* add classifier head to donut * add to transformers __init__ * add to auto model * fix typo * add loss for image classification * add checkpoint * remove unneeded import * reorder imports * format * consistency * add test of classifier * add doc * try ignore * update loss for all swin models
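The head being added follows the usual image-classification pattern; a hedged sketch (illustrative, not the DonutSwin code):

```python
import torch
from torch import nn

class ImageClassifierHead(nn.Module):
    """Pool encoder hidden states and project them to class logits."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        pooled = last_hidden_state.mean(dim=1)  # average over patch tokens
        return self.classifier(pooled)          # (batch, num_labels)
```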
-
Mohamed Mekkouri authored
* fix * empty commit * empty * nit * fix maybe ?
-
Yih-Dar authored
* fix
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Raushan Turganbay authored
* fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * don't enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overridden tests * should be green now
-
Arthur authored
use `rms_norm_eps`
-
ivarflakstad authored
* Allow rocm systems to run these tests * Fix skipTest logic * Use get_device_properties to check system capabilities
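An illustrative check in plain PyTorch (the PR uses a test helper in the same spirit): query the accelerator's properties instead of hard-coding CUDA-only assumptions, since ROCm builds also report through `torch.cuda`.

```python
import torch

if torch.cuda.is_available():  # True on ROCm builds as well
    props = torch.cuda.get_device_properties(0)
    print(props.name, props.total_memory)
    # On CUDA devices, (props.major, props.minor) is the compute capability
    # that tests can use to decide whether a feature is supported.
```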
-
Wang, Yi authored
* from_pretrained should handle xpu case (Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>)
* fmt (Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>)

---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
-
Yih-Dar authored
* send trainer/fsdp/deepspeed channel
* update
* change name
* no .
* final

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
* update `kernels` * oups
-
Wing Lian authored
-
Cyril Vallez authored
* first try (maybe race condition) * Update cache_utils.py * cannot avoid the race condition -> use 2 layers * Update cache_utils.py * Update cache_utils.py
-
Cyril Vallez authored
* add +1 * Update modeling_llama4.py
-
Mohamed Mekkouri authored
* explain tp_plan * add llama4 check * add clarification
-
Serge Panev authored
* Handle torch ver in flexattn * update
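A hedged sketch of the version gating involved (illustrative): import `flex_attention` only on torch versions that ship it.

```python
import torch
from packaging import version

if version.parse(torch.__version__) >= version.parse("2.5.0"):
    from torch.nn.attention.flex_attention import flex_attention
else:
    flex_attention = None  # caller falls back to another attention backend
```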
-
Manuel de Prada Corral authored
-
- 09 Apr, 2025 4 commits
-
-
Wing Lian authored
-
Arthur authored
* debugging improvements
* add debugging details
* add more debugging details
* debug more
* the fix that did not get in
* First fix flex
* fix query offset
* fix flex first
* fix device mask creation for speed
* small mask creation sdpa
* Update flex_attention.py
* remove chunked prefill from HybridChunkedCache
* never seen such a messed-up merge
* clean up layers + output
* add summary json file
* Efficient general cache
* Update cache_utils.py
* cleanup
* fix?
* fix!
* oups typo
* not everywhere
* more fixes
* revert unrelated changes
* Fix but ugly for now -> should use pad instead
* oups
* re-initialize the cache
* Use pad to simplify
* style
* correct slicing

---------
Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
-
Mohamed Mekkouri authored
* fix * keep fused * contiguous * rm print * update * update * rm print
-
DerekLiu35 authored
* update AwqQuantizer * fix style * add an arg to get_modules_to_not_convert to add get_keys_to_not_convert(model)
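Relatedly, a hedged usage sketch of excluding modules from AWQ conversion via `modules_to_not_convert` (the checkpoint is illustrative; requires the `autoawq` package):

```python
from transformers import AutoModelForCausalLM, AwqConfig

# Keep the LM head in full precision; only the remaining linear layers
# are replaced with quantized modules.
quant_config = AwqConfig(bits=4, modules_to_not_convert=["lm_head"])
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # illustrative AWQ checkpoint
    quantization_config=quant_config,
)
```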
-