- 06 Dec, 2024 1 commit
-
Ivar Flakstad authored
-
- 05 Dec, 2024 12 commits
-
ivarflakstad authored
-
Jonathan Mamou authored
* initial commit * update strategy * add tradeoff FPR TPR with cost * all probs * fix * fix * fix style * Update src/transformers/generation/configuration_utils.py shorter docstring Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * import guard * fix style * add is_sklearn_available condition * vectorizing to flatten the for-loop * fix style * disable adaptation for UAG * update doc * add TestAssistedCandidateGeneratorUpdateStrategy * fix style * protect import * fix style --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Yih-Dar authored
* fix * Update src/transformers/testing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * fix * fix * fix * fix * fix * fix * fix * fix * check * check * check * check * check * check * Update src/transformers/testing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/testing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * check * check * check * Final space * Final adjustment --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Lucain <lucainp@gmail.com>
-
Arthur authored
* fix * style * values * fix
-
Raushan Turganbay authored
this is correct now
-
João Marcelo authored
* first draft * add IJepaEmbeddings class * fix copy-from for IJepa model * add weight conversion script * update attention class names in IJepa model * style changes * Add push_to_hub option to convert_ijepa_checkpoint function * add initial tests for I-JEPA * minor style changes to conversion script * make fixup related * rename conversion script * Add I-JEPA to sdpa docs * minor fixes * adjust conversion script * update conversion script * adjust sdpa docs * [run_slow] ijepa * [run-slow] ijepa * [run-slow] ijepa * [run-slow] ijepa * [run-slow] ijepa * [run-slow] ijepa * formatting issues * adjust modeling to modular code * add IJepaModel to objects to ignore in docstring checks * [run-slow] ijepa * fix formatting issues * add usage instruction snippet to docs * change pos encoding, add checkpoint for doc * add verify logits for all models * [run-slow] ijepa * update docs to include image feature extraction instructions * remove pooling layer from IJepaModel in image classification class * [run-slow] ijepa * remove pooling layer from IJepaModel constructor * update docs * [run-slow] ijepa * [run-slow] ijepa * small changes * [run-slow] ijepa * style adjustments * update copyright in init file * adjust modular ijepa * [run-slow] ijepa
-
Mohamed Mekkouri authored
* deprecate quanto * fix style
-
Isotr0py authored
* fix tie_word_embeddings Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>
-
Cyril Vallez authored
* Update convert_mistral_weights_to_hf.py * Update convert_mistral_weights_to_hf.py * Update convert_mistral_weights_to_hf.py
-
Arthur authored
bump to 0.21
-
eustlb authored
* handle single timestamp ending * include last timestamp token * handle single timestamp ending * avoid floating points arithm limitations * ensure float64 operations * new test * make fixup * make copies * handle edge case double tokens ending with different tokens * handle single timestamp ending * make fixup * handle conditioning on prev segments * fix * Update src/transformers/models/whisper/generation_whisper.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * [run-slow] whisper * don't call item() to avoid unnecessary sync * fix --------- Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
-
Yih-Dar authored
* fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 04 Dec, 2024 7 commits
-
Steven Liu authored
* auto-dtype * feedback
-
Fanli Lin authored
* add comment to offloading * Update docs/source/en/kv_cache.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Cyril Vallez authored
* update modular and add examples * style * improve example comments * style * fix small logic issue for imports * fix relative order issue when files do not make sense * Improve comments * trigger CIs
-
Anton Vlasjuk authored
* gpt neox flex attention + refactor * some formatting * small fix on dropout * add assertion on flex attn test * flaky ci :( * add head mask support * style * handle dtype, replace torch where * fixup flex with output attns * code review and several other fixes * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * style * remove unnecessary comment * remove incorrect comment * make flex attn check more agnostic to versions and centralized * change peft input dtype check to value since q and k could be affected by other stuff like RoPE * i forgor * flaky * code review and small fixes * Update src/transformers/models/gpt_neox/modeling_gpt_neox.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Vladislav Bronzov authored
* add base tp plan for qwen2 and qwen2moe * add parallel tp for starcoder2 * fix modular conversion * add infer dim for qkv states * Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Ivar Flakstad authored
-
Tianshu Wang authored
Fix pad_token_tensor is None in warning
-
- 03 Dec, 2024 9 commits
-
Fanli Lin authored
replace cuda
-
Fanli Lin authored
* fix on xpu * [run_all] * add the missing import for Image lib * add more devices in comment * bug fix * replace cuda
-
wwwbai authored
* community translation * Update docs/source/zh/community.md Co-authored-by: Isotr0py <2037008807@qq.com> --------- Co-authored-by: Isotr0py <2037008807@qq.com>
-
Fanli Lin authored
fix code bug
-
Wang, Yi authored
* fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * [run-slow] speecht5 --------- Signed-off-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Matt <rocketknight1@gmail.com>
-
Ivar Flakstad authored
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Aymeric Roucher authored
* Add monitoring to Agent and HfEngine children
-
Cyril Vallez authored
* compiled forward in PreTrainedModel * update * style * update name * trigger CIs * Add way to use custom compile args * style * switch parameterization to generation_config * Add to inits * Update configuration_utils.py * inits * style * docs * style * Update configuration_utils.py * back without dataclass for repo consistency * Update configuration_utils.py * style * style * style once again * add config serialization * update * true dataclass * trigger CIs * merge compile methods + remove serialization of compile config
-
- 02 Dec, 2024 11 commits
-
wwwbai authored
* bertology translation * Update docs/source/zh/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/zh/bertology.md Co-authored-by: blueingman <15329507600@163.com> * Update docs/source/zh/bertology.md Co-authored-by: blueingman <15329507600@163.com> * Update docs/source/zh/bertology.md Co-authored-by: Isotr0py <2037008807@qq.com> * Update docs/source/zh/bertology.md Co-authored-by: Isotr0py <2037008807@qq.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: blueingman <15329507600@163.com> Co-authored-by: Isotr0py <2037008807@qq.com>
-
Fanli Lin authored
* add the missing import for Image lib * add more devices in comment * bug fix
-
Ahmed Almaghz authored
* Add docs/source/ar/notebooks.md to Add_docs_source_ar_notebooks.md * Update notebooks.md * Update _toctree.yml
-
secrettoad authored
-
Henry Hyeonmok Ko authored
* Fixed typo in multi gpu docs and OLMoE version * Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction * Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs
-
Dmitry Rogozhkin authored
* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> * Fix test_eager_matches_sdpa_inference for XPU backend As of PyTorch 2.5 XPU backend supports only torch.nn.attention.SDPBackend.MATH which is implemented on PyTorch level using aten operators and is device agnostic with respect to implementation of each aten operator. Thus, we can reuse CUDA (or CPU) MATH weights for XPU. Fixes: #34888 Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> * Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> --------- Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
-
Jacky Lee authored
* feat: add gemma2 type hints * fix: mask is optional
-
Bojun Feng authored
fix typos
-
milesial authored
mllama encoder memory optimization
-
Weize Chen authored
* fix variable undefined bug when return_tensors is not specified in llava processor * improve readability
-
Joshua Lochner authored
* Only cast `cu_seqlens` when tracing * Formatting
-