- 10 Apr, 2025 2 commits
- 09 Apr, 2025 16 commits
-
Matej Sirovatka authored
-
Wing Lian authored
-
Arthur authored
* debugging improvements
* add debugging details
* add more debugging details
* debug more
* the fix that did not get in
* First fix flex
* fix query offset
* fix flex first
* fix device mask creation for speed
* small mask creation sdpa
* Update flex_attention.py
* remove chunked prefill from HybridChunkedCache
* never seen such a messed-up merge
* clean up layers + output
* add summary json file
* Efficient general cache
* Update cache_utils.py
* cleanup
* fix?
* fix!
* oops, typo
* not everywhere
* more fixes
* revert unrelated changes
* Fix, but ugly for now -> should use pad instead
* oops
* re-initialize the cache
* Use pad to simplify
* style
* correct slicing

Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
-
Mohamed Mekkouri authored
* fix
* keep fused
* contiguous
* rm print
* update
* update
* rm print
-
DerekLiu35 authored
* update AwqQuantizer
* fix style
* add an arg to `get_modules_to_not_convert` so it can include `get_keys_to_not_convert(model)`
-
Marc Sun authored
-
Brayden Zhong authored
Apply torchfix to replace deprecated functions: `_pytree._register_pytree_node` and `torch.cpu.amp.autocast` (#37372)
* fix: apply torchfix
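For readers unfamiliar with these deprecations, the rewrites can be sketched as below. The mapping is an assumption based on the function names in the commit message, not the actual diff: `_register_pytree_node` was superseded by the public `register_pytree_node`, and `torch.cpu.amp.autocast()` by the device-agnostic `torch.amp.autocast`.

```python
# Hypothetical sketch of the rewrites torchfix performs for these two
# deprecations; the exact replacements in the PR are assumed, not quoted.
TORCHFIX_REPLACEMENTS = {
    # private pytree registration -> public API
    "torch.utils._pytree._register_pytree_node": "torch.utils._pytree.register_pytree_node",
    # device-specific autocast -> device-agnostic torch.amp.autocast
    "torch.cpu.amp.autocast()": 'torch.amp.autocast("cpu")',
}

def apply_replacements(source: str) -> str:
    """Naive textual rewrite of the deprecated calls (illustration only;
    the real tool rewrites the AST, not raw strings)."""
    for old, new in TORCHFIX_REPLACEMENTS.items():
        source = source.replace(old, new)
    return source
```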
-
Sangyun_LEE (이상윤) authored
* add peft model in constant
* add test
* fix formatting
* make fixup execute
* change code
* check by self.task
* add test
* fixup test code
* fix minor typo
* fix pipeline test
* apply maintainers' requests
-
DerekLiu35 authored
* initial draft
* make documentation simpler
* Update docs/source/en/quantization/selecting.md (ten repeated review-pass commits, Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>)
* turn pros and cons into tables
* Apply suggestions from code review
* add links to each quant method page
* separate calibration vs no calibration methods
* add calibration time estimates

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Mehant Kammakomati authored
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
-
Mehant Kammakomati authored
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
-
Mehant Kammakomati authored
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
-
Marc Sun authored
* update
* create docker image
* 03
* uninstall pytest as it conflicts with transformers
* wrong one
* better
* see which package depends on pytest
* up
* reinstall
* fix
* deepspeed (eight repeated retry commits)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
* add changed
* Revert "add changed" (reverts commit 0a0166a1fe80556115a49fbf0c2132de0f4f85c9)
* update with NEW MODEL class called GLM4
* update
* Update glm4.md
* Name
* style
* fix copies
* fixup test

Co-authored-by: Yuxuan Zhang <2448370773@qq.com>
-
Jonas M. Kübler authored
Fix conversion script `no_rope_layers`: it should be either a list of NoPE layers or None, so that it is created in the config from the `no_rope_layer_interval`.
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
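The intended fallback could look roughly like this. It is a sketch assuming the common convention that every `no_rope_layer_interval`-th layer is a NoPE layer (marked 0, RoPE layers marked 1); the names follow the commit message, but the exact semantics in the conversion script are assumed.

```python
def build_no_rope_layers(no_rope_layers, num_hidden_layers, no_rope_layer_interval=4):
    """If no_rope_layers is None, derive it from the interval: every
    `no_rope_layer_interval`-th layer gets 0 (NoPE), all others 1 (RoPE).
    Illustrative sketch only; the real config logic may differ."""
    if no_rope_layers is not None:
        return no_rope_layers  # an explicit list wins over the interval
    return [
        int((layer_idx + 1) % no_rope_layer_interval != 0)
        for layer_idx in range(num_hidden_layers)
    ]
```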
-
Raushan Turganbay authored
* update composition flag usage
* remove print
* fix tests
* actually fix
* oh c'mon
* now should be fixed, right?
* fix copies
-
- 08 Apr, 2025 14 commits
-
Jerry Zhang authored
* Preserve requires_grad in pre-quantized model

  Summary: discovered this when running lm-eval for some models; the current code always sets requires_grad to True.
  Test Plan: lm_eval --model hf --model_args pretrained=jerryzh168/phi4-torchao-gguf-q4_k --tasks hellaswag --device cuda:0 --batch_size 8
* ruff format

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
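The bug class is easy to illustrate with a toy parameter model. This is not the torchao code, just a hypothetical sketch of the fix: copy the flag from the original parameter instead of hardcoding `True`.

```python
from dataclasses import dataclass

@dataclass
class Param:
    """Toy stand-in for a framework parameter (hypothetical, not torch.nn.Parameter)."""
    data: list
    requires_grad: bool = True

def quantize_param(p: Param) -> Param:
    """Toy 'quantization' that rounds values. The point is the flag:
    preserve p.requires_grad rather than resetting it to True (the old bug)."""
    qdata = [round(x) for x in p.data]
    return Param(qdata, requires_grad=p.requires_grad)
```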
-
Matt authored
* More limited setup -> setUpClass conversion
* make fixup
* Trigger tests
* Fixup UDOP
* Missed a spot
* tearDown -> tearDownClass where appropriate
* Couple more class fixes
* Fixups for UDOP and VisionTextDualEncoder
* Ignore errors when removing the tmpdir, in case it already got cleaned up somewhere
* CLIP fixes
* More correct classmethods
* Wav2Vec2Bert fixes
* More methods become static
* More class methods
* More class methods
* Revert changes for integration tests / modeling files
* Use a different tempdir for tests that actually write to it
* Remove addClassCleanup and just use tearDownClass
* Remove changes in modeling files
* Cleanup get_processor_dict() for got_ocr2
* Fix regression on Wav2Vec2BERT test that was masked by this before
* Rework tests that modify the tmpdir
* make fix-copies
* revert clvp modeling test changes
* Fix CLIP processor test
* make fix-copies
-
KimmiShi authored
* fix(qwen): fix shape error when using tp
* Update modeling_qwen2_vl.py

Co-authored-by: shidongxing <shidongxing@pjlab.org.cn>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Jonathan Mamou authored
* initial commit
* fix
* fix style
* set default to prune
* add tests
* comment
* remove prune flag from generate
* address Joao's comments
* deprecate_kwarg
* add doc
* fix target_vocab_size
* Update src/transformers/generation/candidate_generator.py (four repeated review-pass commits, Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>)
* fix deprecated argument assistant_model_device

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Joao Gante authored
-
Kerry authored
* Skip non-selected experts for mixtral and qwen2_moe
* Fix: tensor tolist()
* WIP: tokenization test
* fix modular source of truth
* nits

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
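The optimization in the first bullet can be sketched in plain Python. This is a hypothetical illustration, not the Mixtral/Qwen2-MoE code: group tokens by their routed expert and invoke only the experts that actually received tokens, so unselected experts do no work at all.

```python
from collections import defaultdict

def run_selected_experts(assignments, experts):
    """assignments[i] is the expert index routed for token i.
    Experts that receive no tokens are never called (the optimization)."""
    buckets = defaultdict(list)
    for token, expert_idx in enumerate(assignments):
        buckets[expert_idx].append(token)
    outputs = {}
    for expert_idx, tokens in buckets.items():  # iterate selected experts only
        expert_fn = experts[expert_idx]
        for t in tokens:
            outputs[t] = expert_fn(t)
    # restore original token order
    return [outputs[t] for t in range(len(assignments))]
```

In the real models each expert is an MLP applied to a batch of hidden states gathered by index, but the control flow (bucket, run only hit experts, scatter back) is the same idea.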
-
Joao Gante authored
l4 + dynamic rope decorator
-
Ryan Mullins authored
* Set vision config to None for Gemma 1B conversion
* Trigger tests

Co-authored-by: Matt <rocketknight1@gmail.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Cyril Vallez authored
* cleaning * CIs
-
cyyever authored
Signed-off-by: cyy <cyyever@outlook.com>
-
Minho Ryu authored
* convert yarn-related arguments in rope_scaling to float
* sort keys alphabetically

Co-authored-by: ryan.agile <ryan.agile@kakaobrain.com>
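The described normalization might look like this. The key names (`factor`, `beta_fast`, etc.) are assumptions about which YaRN arguments are meant; this is a sketch, not the merged code.

```python
def normalize_rope_scaling(rope_scaling: dict) -> dict:
    """Cast YaRN-related numeric arguments to float and return the dict
    with keys sorted alphabetically, as the commit describes."""
    yarn_keys = {"factor", "attention_factor", "beta_fast", "beta_slow"}  # assumed set
    normalized = {
        k: float(v) if k in yarn_keys and v is not None else v
        for k, v in rope_scaling.items()
    }
    return dict(sorted(normalized.items()))
```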
-
Alex Brooks authored
* Expose blip2qformer * Add missing args to blip2 config
-
Arthur authored
* update for fixes
* more fixes
* fix dynamic cache?
* style
* fix both training and generating; eager seems alright
* dynamic does not work
* fix most cases, use_cache or not, eager or not, no default cache (ex: not training but you want to get cache states)
* should be final fixes
* fix more stuff, no cat
* style
* fix
* style
* final style
* quality
* fix
* revert
-
- 07 Apr, 2025 8 commits
-
salman authored
* adding compile kwarg for torch 2.6
* fixing dynamic
* addressing comment
* typo
* Update src/transformers/integrations/flex_attention.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Wing Lian authored
* more fixes for post-training llama4 * use target_length instead of guarded past_key_values
-
Tugsbayasgalan Manlaibaatar authored
Co-authored-by:
Pavel Iakubovskii <qubvel@gmail.com>
-
logesh R authored
* Updated documentation for Donut model
* Update docs/source/en/model_doc/donut.md (repeated review-pass commits, Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>)
* Updated code suggestions
* Updated code suggestion to align with the AutoModel example
* Updated notes section, included code examples
* close hfoption block and indent

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Mohamed Mekkouri authored
* add bnb * style * update * add pre_quantized check
-
Parag Ekbote authored
* Update model card for jamba
* Apply the suggestions from code review
* Apply suggestions from code review-2
* update model page
* Apply suggestions from code review
* Update as per code review
* Update docs/source/en/model_doc/jamba.md as per code review (twice)
* update as per code review
* fixes

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Devesh Rahatekar authored
* Improved Model card for Gemma2
* Made changes in gemma2 as suggested
* Made more changes in the doc (adding image, notes, closing hfoptions)
* minor fixes

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Mohamed Mekkouri authored
clean up
-