- 28 Mar, 2025 11 commits
-
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
Arthur Zucker authored
-
jp authored
* Add image_token_id and video_token_id handling in Llava processors * fix: image to video * fix: correct image and video token ID handling in Llava processors * fix: improve image and video token ID handling in Llava processors
-
Arthur Zucker authored
-
Manuel Faysse authored
* fix sdpa implementation * ruff * also modify 2_5 for consistency
-
- 27 Mar, 2025 21 commits
-
-
Perry Gibson authored
* bug: fully remove legacy cache from Llama * bug: fix CI issues * bug: update jetmoe model * bug: apply =check_modular_conversion.py= fix * bug: apply make fix-copies * bug: fix ruff * PR suggestions * Remove trailing commas in auto-gen files * Trivial new line removal
-
Arthur Zucker authored
-
Finn-Ole Höner authored
-
cyyever authored
-
Prem Kumar M authored
Replace split with jnp's split function for flax models (#36854)
-
cyyever authored
-
cyyever authored
Fix typing for None-able variables
-
cyyever authored
* Avoid unnecessary tensor copy in loss computing * Add type
-
湛露先生 authored
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
-
Joao Gante authored
-
eustlb authored
* fix fft_bin_width computation * update docstring + enforce correct params * update test with correct value * update test * update feature extractors for concerned models * update * make * update docstring * update docstring
-
Raushan Turganbay authored
* add audio from video * typos * delete print * comments
-
Pavel Iakubovskii authored
* Fixup * trigger
-
Sungyoon Jeong authored
* Optimize to_py_obj for python-native numeric lists and scalars * Fix bug that tuple is not converted to list * Try np.array for more robust type checking * Apply review and add tests for to_py_obj
-
jiqing-feng authored
* fix pegasus init weights * fix the rest of models * fix test * fix informer init * init weight before checking * fix roformer tests * fix roformer tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
Parteek authored
* Added conversion Script * Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py * Updated Conversion Script * Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
-
Mohamed Mekkouri authored
* skip fp8 linear * add capability check * format
-
hoshi-hiyouga authored
* Update optimization.py * Update optimization.py
-
Yih-Dar authored
* fix * fix * fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kyle Sayers authored
support loading fp8
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Michael Goin authored
-
- 26 Mar, 2025 8 commits
-
-
Abu Bakr Soliman authored
* push ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * update __init__ loading * set imports for ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * remove debugging logs * update init_weights method * remove custom initialization for ModernBertForQuestionAnswering * apply make fix-copies * apply make style * apply make fix-copies * append ModernBertForQuestionAnswering to the pipeline supported models * remove unused file * remove invalid autoload value * update en/model_doc/modernbert.md * apply make fixup command * make fixup * Update dummies * update usage tips for ModernBertForQuestionAnswering * update usage tips for ModernBertForQuestionAnswering * add init * add lint * add consistency * update init test * change text to trigger stuck text * use self.loss_function instead of custom loss By @Cyrilvallez * Update modeling_modernbert.py make comparable commit to even it out * Match whitespace * whitespace
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Orion Weller <wellerorion@gmail.com>
Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com>
-
Yao Matrix authored
* fix transformers_cli relative import path issue * fix style
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Steven Liu authored
add image
-
Arthur Zucker authored
temp fix for TP: some attention layers' FP8 scales are too small + shared is local colwise and anything is local if FP8 because weights are used
-
cyyever authored
* Remove deprecated training arguments * More fixes * More fixes * More fixes
-
Afanti authored
* chore: enhance code comments * chore: enhance code comments * chore: enhance code comments * chore: enhance code comments * chore: enhance code comments * chore: enhance code comments * chore: enhance code comments
-
Marc Sun authored
* fix learning rate log * fix lr log * add lr
-
Mohamed Mekkouri authored
fix
-