- 13 Feb, 2025 30 commits
-
-
Matt authored
-
Yih-Dar authored
fix my bad
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Mohamed Mekkouri authored
fix
-
Wizyoung authored
fix load key name for _load_rng_state under torch.cuda
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
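The fix above is about the key under which the accelerator RNG state is looked up when restoring a checkpoint. A minimal, self-contained sketch of the defensive key-lookup pattern (the key names here are illustrative, not the Trainer's actual checkpoint schema):

```python
def load_rng_state(checkpoint: dict) -> bytes:
    """Return the accelerator RNG state from a checkpoint dict,
    tolerating both the current and a legacy key name.
    Key names are hypothetical, for illustration only."""
    for key in ("cuda", "gpu"):  # try the current name first, then the old one
        if key in checkpoint:
            return checkpoint[key]
    raise KeyError("no accelerator RNG state found in checkpoint")

# A checkpoint saved under the legacy key still loads correctly.
state = load_rng_state({"cpu": b"\x00", "gpu": b"\x01"})
print(state)
```

Looking the key up with a fallback (rather than indexing one fixed name) is what makes old and new checkpoints interchangeable.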
-
Yih-Dar authored
* speeddddd
* speeddddd
* speeddddd
* speeddddd
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Jiahao Li authored
* Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks
* Make rotary_pos_emb optional & fix type
* Adapt pre-computed cos/sin to Qwen2.5VL
* More concise
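The optimization in the first bullet, hoisting the rotary cos/sin computation out of the per-block loop, can be sketched in plain Python (no tensor library; the block interface and function names here are hypothetical, not the Qwen2VL code):

```python
import math

def rope_tables(seq_len: int, dim: int, base: float = 10000.0):
    """Precompute rotary-embedding cos/sin tables once, up front."""
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    cos = [[math.cos(p * f) for f in inv_freq] for p in range(seq_len)]
    sin = [[math.sin(p * f) for f in inv_freq] for p in range(seq_len)]
    return cos, sin

def run_blocks(hidden, blocks, seq_len, dim):
    # Compute the tables once and hand the same ones to every block,
    # instead of recomputing them inside each block.
    cos, sin = rope_tables(seq_len, dim)
    for block in blocks:
        hidden = block(hidden, cos, sin)
    return hidden

identity_block = lambda h, cos, sin: h  # stand-in for a real ViT block
out = run_blocks([0.0] * 4, [identity_block, identity_block], seq_len=4, dim=8)
```

The saving is simply that the trig tables are built once per forward pass rather than once per block.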
-
மனோஜ்குமார் பழனிச்சாமி authored
* Remove traces of the progressbar
* Use tqdm auto
-
Joao Gante authored
* tmp commit
* move tests to the right class
* remove ALL all_generative_model_classes = ...
* skip tf roberta
* skip InstructBlipForConditionalGenerationDecoderOnlyTest
* videollava
* reduce diff
* reduce diff
* remove on vlms
* fix a few more
* manual rebase bits
* more manual rebase
* remove all manual generative model class test entries
* fix up to ernie
* a few more removals
* handle remaining cases
* recurrent gemma
* it's better here
* make fixup
* tf idefics is broken
* tf bert + generate is broken
* don't touch tf :(
* don't touch tf :(
* make fixup
* better comments for test skips
* revert tf changes
* remove empty line removal
* one more
* missing one
-
Arthur authored
* add disable compile code * fix
-
Arthur authored
* fix training issues
* Update
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
-
Elvir Crnčević authored
* Resolve vptq conflict
* Rename spqr package to spqr_quant
* Get rid of aqlm mention
* Start working on tests; test updates; add gpu tag; mark tests as @slow again; test fixes
* Ruff/style passes: resolve ruff code checks, ruff format, isort, apply ruff fixes, ruff code quality, make style
* Rename to modules_to_not_convert
* Config update; docs and config update; update to update_torch_dtype
* spqr config parameter validation; check if the config contains proper shapes
* Check accelerate/spqr availability
* Docstring update; resolve typo; remove redundant log; remove absolute path
* Documentation update; overview update; update docs/source/en/quantization/spqr.md; update spqr.md
* Enable gptqmodel (#35012), squashing: gptqmodel need use checkpoint_format (#1); Revert quantizer_gptq.py (#2); Fix Transformer compat (#3) (hf_select_quant_linear now passes checkpoint_format and meta, and no longer selects ExllamaV2); fix unit test (#5) ("not self.use_exllama" is not equivalent to "self.use_exllama == False"); update gptqmodel version (#6); backend is loading_attributes (#7); update document (#9); review: update docs (#10, #12); plus assorted format/version-check/warning/memory/device/test fixes and doc updates to gptq.md and overview.md (asymmetric-quant note, Apple silicon typo, marlin typo, ROCm support)
* Fix: Nemotron Processor in GGUF conversion (#35708): fixing nemotron processor; make style
* Add missing TOC to doc
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
dependabot[bot] authored
Bump transformers in /examples/research_projects/adversarial
Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)
---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bump transformers in /examples/tensorflow/language-modeling-tpu
Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)
---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Joao Gante authored
* revert inputs_embeds len * Update test_utils.py * make fixup
-
Mohamed Mekkouri authored
* fix * fix
-
Arthur authored
test was weird
-
Joao Gante authored
skip modular checks based on diff
-
Pavel Iakubovskii authored
* Remove loading custom kernels * Remove config param * Fixup
-
Mohamed Mekkouri authored
* first commit
* adding kernels
* fix create_quantized_param
* fix quantization logic
* end2end
* fix style
* fix imports
* fix consistency
* update
* fix style
* update
* update after review
* make style
* update
* update
* fix
* update
* fix docstring
* update
* update after review
* update
* fix scheme
* update
* update
* fix
* update
* fix docstring
* add source
* fix test
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
-
Lysandre Debut authored
* Helium documentation fixes * Update helium.md * Update helium.md * Update helium.md
-
Thomas Bauwens authored
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
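The collation pattern this commit promotes into the library can be sketched in plain Python. This is only an illustration of the flatten-pad-regroup idea behind a multiple-choice collator, not the Transformers implementation (which operates on tokenizer outputs and returns tensors):

```python
def collate_multiple_choice(features, pad_id=0):
    """Illustrative sketch: each example is a list of token-id lists,
    one per answer choice. Flatten to (batch * num_choices) sequences,
    pad them to a common length, then regroup into
    (batch, num_choices, max_len)."""
    num_choices = len(features[0])
    flat = [choice for example in features for choice in example]
    max_len = max(len(seq) for seq in flat)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in flat]
    # Regroup consecutive runs of num_choices sequences back per example.
    return [padded[i:i + num_choices] for i in range(0, len(padded), num_choices)]

batch = collate_multiple_choice([[[1, 2], [3]], [[4], [5, 6]]])
```

Flattening first lets a single padding pass handle every choice of every example, which is why the custom per-doc implementations could be replaced by one shared collator.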
-
CL-ModelCloud authored
* Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* add tokenizer class type test
* code review
* code opt
* fix bug
* Update test_tokenization_fast.py
* ruff check
* make style
* code opt
* Update test_tokenization_fast.py
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
-
Marco Edward Gorelli authored
-
gewenbin0992 authored
* qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1
* fix
* fix
* fix
* fix
* add tests
* fix test bugs
* fix
* fix failed tests
* fix
-
Pavel Iakubovskii authored
* Trigger tests
* [run-slow] beit, detr, dinov2, vit, textnet
* Fix BEiT interpolate_pos_encoding
* Fix DETR test
* Update DINOv2 test
* Fix textnet
* Fix vit
* Fix DPT
* fix data2vec test
* Fix textnet test
* Update interpolation check
* Fix ZoeDepth tests
* Update interpolate embeddings for BEiT
* Apply suggestions from code review
-
Lucain authored
-
Nerogar authored
fix gemma2 dtype issue when storing weights in float16 precision
-
Ben Schneider authored
* update env command to log deepspeed version
* suppress deepspeed import logging
* Add reminder to include configs to repro description in bug report.
* make fixup
* [WIP] update import utils for deepspeed
* Change to using is_deepspeed_available() from integrations.
* make fixup
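The availability-check pattern mentioned above (testing whether a package is installed without importing it, since importing deepspeed is slow and logs noisily) is commonly built on `importlib.util.find_spec`. A minimal, hedged sketch, not the library's actual `is_deepspeed_available()`:

```python
import importlib.util

def is_package_available(name: str) -> bool:
    """True if `name` can be imported, without actually importing it,
    so a heavy or chatty package stays untouched until it's needed."""
    return importlib.util.find_spec(name) is not None

# Only import (and report a version) when the package is really there.
if is_package_available("deepspeed"):
    import deepspeed  # noqa: F401
    info = getattr(deepspeed, "__version__", "unknown")
else:
    info = "not installed"
print(f"DeepSpeed version: {info}")
```

Deferring the real import is what lets an `env` command report a version cleanly even on machines where the package is absent.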
-
Sambhav Dixit authored
* change order of unmasking of tokens
* library import
* class setup
* test function
* refactor
* add commit message
* test modified
* explicit initialisation of weights + made model smaller
* removed separate testing file
* fixup
* fixup core
* test attention mask with token types
* tests fixup
* removed PaliGemmaAttentionMaskTest class
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
-
Benjamin Badger authored
* pixel input assignment revoked
* double send
* Update src/transformers/models/mllama/modeling_mllama.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
-
- 12 Feb, 2025 10 commits
-
-
ivarflakstad authored
Add git lfs to AMD docker image
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix
* fix
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Zach Mueller authored
* Add more rigorous non-slow grad accum tests
* Further nits
* Re-add space
* Readability
* Use tinystories instead
* Revert transformer diff
* tweak thresholds
-
Ke Wen authored
Update doc about models' TP support
-
hsilva664 authored
* Adding option to save/reload scaler
* Removing duplicate variable
* Adding save/reload test
* Small fixes on deterministic algorithm call
* Moving LLM test to another file to isolate its environment
* Moving back to old file and using subprocess to run test isolated
* Reverting back accidental change
* Reverting back accidental change
-
kang sheng authored
* Fix multi gpu loss sync condition, add doc and test
* rename function and class
* loss should not scale during inference
* fix typo
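The "loss should not scale during inference" bullet refers to gradient-accumulation scaling: the loss is divided by the number of accumulation steps so the summed gradients average out, but that division must not apply at eval time. A tiny sketch of that condition (illustrative only, not the Trainer's code):

```python
def scale_loss(loss: float, accum_steps: int, training: bool) -> float:
    """Divide the loss by the gradient-accumulation step count so the
    accumulated gradients average correctly -- but only while training.
    During inference/eval the raw loss is reported unscaled."""
    if training and accum_steps > 1:
        return loss / accum_steps
    return loss
```

Guarding the division on `training` is the whole fix: without it, evaluation losses come out `accum_steps` times too small.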
-
zhuHQ authored
* Added APOLLO optimizer integration
* fix comment
* Remove redundancy: Modularize low-rank optimizer construction
* Remove redundancy: Remove useless comment
* Fix comment: Add typing
* Fix comment: Rewrite apollo desc
-
Dmitry Rogozhkin authored
* multi-gpu: fix inputs_embeds + position_embeds
Fixes the following error in a few models:
```
> hidden_states = inputs_embeds + pos_embeds
E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
```
Fixes: #35762
* multi-gpu: fix tensor device placements for various models
Fixes: #35762
* Apply make fix-copies
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
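The shape of the fix for the error quoted above is to move one operand onto the other's device before the addition. A self-contained sketch using a toy tensor stand-in (no real framework; the `Tensor` class here is purely illustrative):

```python
class Tensor:
    """Minimal stand-in for a framework tensor that records its device."""
    def __init__(self, data, device):
        self.data, self.device = data, device
    def to(self, device):
        return Tensor(self.data, device)
    def __add__(self, other):
        if self.device != other.device:
            raise RuntimeError("Expected all tensors to be on the same device")
        return Tensor([a + b for a, b in zip(self.data, other.data)], self.device)

inputs_embeds = Tensor([1, 2], "xpu:2")
pos_embeds = Tensor([10, 20], "xpu:3")
# The fix: explicitly move the position embeddings to the input
# embeddings' device before adding, instead of assuming they match.
hidden_states = inputs_embeds + pos_embeds.to(inputs_embeds.device)
print(hidden_states.device)
```

In model-parallel setups where layers live on different devices, such explicit `.to(...)` calls at layer boundaries are what keep cross-device additions from raising.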
-
Lucain authored
* Remove cache migration script * remove dummy move_cache
-