- 28 Jan, 2025 5 commits

Celina Hanouti authored

Raushan Turganbay authored
* fix dtype as dict for some models + add test
* add comment in tests
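
A minimal sketch of what the dict-valued dtype looks like in practice for a composite (vision + text) model; the checkpoint name and sub-config keys here are illustrative examples, not taken from this commit:

```python
# Sketch: composite checkpoints can take one dtype per sub-config;
# "" acts as the fallback for weights outside any named sub-config.
import torch
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",  # example composite checkpoint
    torch_dtype={"text_config": torch.bfloat16, "vision_config": torch.float16, "": torch.float32},
)
```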

Cyril Vallez authored
* Add some tp plans!
* More tp plans!
* Add it in the comment
* style
* Update configuration_mixtral.py
* Update configuration_phi.py
* update the layout according to special archs
* fix mixtral
* style
* trigger CIs
* trigger CIs
* CIs
* olmo2

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
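
For context, a tensor-parallel plan in these configs maps module-name patterns to partitioning styles. The sketch below mirrors the Llama-style plan and is illustrative rather than copied from any one of the updated configs:

```python
# Illustrative base_model_tp_plan of the kind added here: input projections
# of attention/MLP are split column-wise, output projections row-wise, so
# each rank holds a shard of every layer and activations stay sharded.
base_model_tp_plan = {
    "layers.*.self_attn.q_proj": "colwise",
    "layers.*.self_attn.k_proj": "colwise",
    "layers.*.self_attn.v_proj": "colwise",
    "layers.*.self_attn.o_proj": "rowwise",
    "layers.*.mlp.gate_proj": "colwise",
    "layers.*.mlp.up_proj": "colwise",
    "layers.*.mlp.down_proj": "rowwise",
}
```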

ivarflakstad authored
* Use rocm6.2, as rocm6.3 only has nightly PyTorch wheels at the moment
* Use stable wheel index for torch libs

Yih-Dar authored
* use masked_fill
* remove comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
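
For reference, `masked_fill` replaces boolean-indexed assignment with a single fused op; a minimal, self-contained illustration (not the actual patched code):

```python
import torch

scores = torch.randn(2, 4)
mask = torch.tensor([[True, False, False, True],
                     [False, True, False, False]])

# Instead of `scores[mask] = float("-inf")`, which materializes index
# tensors, masked_fill applies the fill value in one pass:
scores = scores.masked_fill(mask, float("-inf"))
```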

- 27 Jan, 2025 11 commits

Steven Liu authored
fix code block

Matt authored
* close zamba2 code block
* Add Zamba2 to toctree

Matt authored
* Fix the config class comparison when repeatedly saving and loading remote code models
* once again you have committed your debug breakpoint

Steven Liu authored
uv install

CalOmnie authored
* Fix typing in audio_utils.chroma_filter_bank
* Apply make style

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>

Isotr0py authored
* clean up ggml test
* port remaining tests
* further cleanup
* format
* fix broken tests
* update comment
* fix
* reorganize tests
* k-quants use qwen2.5-0.5B
* move ggml tokenization test
* remove dead code
* add assert for serialization test
* use str for parameterize

Signed-off-by: Isotr0py <2037008807@qq.com>

Ross Wightman authored
Fix the image-classification pipeline: the single-label and multi-label probability squashing functions (sigmoid vs. softmax) were swapped (#35848)
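
The distinction the fix restores, in a minimal sketch: softmax normalizes across mutually exclusive classes, while sigmoid scores each class independently.

```python
import torch

logits = torch.tensor([2.0, 0.5, -1.0])

# single-label: classes are mutually exclusive, probabilities sum to 1
single_label_probs = logits.softmax(dim=-1)

# multi-label: each class is an independent yes/no; scores need not sum to 1
multi_label_probs = logits.sigmoid()
```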

Mikhail Moskovchenko authored
* Added `segmentation_maps` support for DPT image processor
* Added tests for DPT image processor
* Moved preprocessing into separate functions
* Added # Copied from statements
* Fixed # Copied from statements
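
A sketch of the new call shape, with dummy arrays standing in for a real image/mask pair:

```python
import numpy as np
from transformers import DPTImageProcessor

processor = DPTImageProcessor()
image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # dummy RGB image
seg_map = np.random.randint(0, 21, (480, 640), dtype=np.uint8)    # dummy class-id mask

# The segmentation map is resized alongside the image but, being class ids,
# is not rescaled or normalized like pixel values.
inputs = processor(images=image, segmentation_maps=seg_map, return_tensors="pt")
```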

ivarflakstad authored

pglorio authored
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position, position_ids, use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys" (reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5)
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md
* re-convert from modular

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>

Sugendran Ganess authored
Have the DETR examples default to using the fast image processor
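
That is, the examples now opt into the torchvision-backed processor class; a sketch of the flag involved:

```python
from transformers import AutoImageProcessor

# use_fast=True selects the torchvision-backed "fast" processor
# (e.g. DetrImageProcessorFast) when one exists for the checkpoint.
processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50", use_fast=True)
```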

- 26 Jan, 2025 1 commit

Steven Liu authored
doctest fixes

- 24 Jan, 2025 4 commits

Yih-Dar authored
my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fanli Lin authored
add xpu device

Arthur authored
* use torch.testing.assert_close instead to get more details about errors in CIs
* fix
* style
* test_all
* revert for IBert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strict
* ok I won't be strict
* skip and be done
* up
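
The motivation, sketched: `torch.testing.assert_close` reports the number of mismatched elements and the greatest absolute/relative differences, whereas a bare `allclose` assert only says True or False.

```python
import torch

expected = torch.tensor([1.0, 2.0, 3.0])
actual = torch.tensor([1.0, 2.0, 3.1])

# Fails with just "AssertionError", no diagnostics:
# assert torch.allclose(actual, expected)

# Raises AssertionError with mismatch count, max abs/rel difference,
# and the tolerances used -- far easier to debug from a CI log:
torch.testing.assert_close(actual, expected, rtol=1e-3, atol=1e-3)
```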

Suyuchen Wang authored
* Fix Llava OneVision's token padding
* Fix Llava Next and Llava Next Video's token unpadding for consistency

- 23 Jan, 2025 11 commits

CalOmnie authored
* Fix test_pipelines_video_classification that was always failing
* Update video pipeline docstring to reflect actual return type

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>

baoyf4244 authored
Fix apply_chat_template() so that `padding` accepts bool, str, or PaddingStrategy, and fix the docstring of pad()
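
After the fix, `padding` mirrors what `pad()` accepts; a hedged sketch (the checkpoint is only an example):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")  # example chat checkpoint
messages = [{"role": "user", "content": "Hello!"}]

# padding may now be a bool, a string like "max_length"/"longest",
# or a PaddingStrategy enum value, exactly as in tokenizer.pad():
ids = tok.apply_chat_template(messages, tokenize=True, padding="max_length", max_length=64)
```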

SilverSoldier authored

Yosshi999 authored
Fix contamination and missing paragraph in translation

Alex Brooks authored
* Add multimodal granite support
* Support multiple image feature layers
* Remove failing validation for visual encoders with no cls
* Update llava based models / configs to support list of feature layers
* Add tests for multiple feature layers
* Use conditional instead of except for misaligned feature shapes
* crop cls from each hidden state
* Fix formatting
* Support single vision feature int in vipllava
* Fix typo in vision feature selection strategy validation
* Add tentative integration test for granite vision models
* Add granite vision docs; replace multimodal granite refs with granite vision; add granite vision / llava next alias
* Use image url in granitevision example

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

Arthur authored
add tooslow for the fat ones

Jack Roberts authored
* rename tokenizer to processing_class in WandbCallback.on_train_end
* rename tokenizer to processing_class in ClearMLCallback and DVCLiveCallback
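
For context, these callbacks follow the Trainer rename where `tokenizer` became the more general `processing_class`. A minimal sketch of the call site, assuming a tiny example checkpoint (`hf-internal-testing/tiny-random-bert` here is an assumption, not from this commit):

```python
from datasets import Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments

tok = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-bert")
model = AutoModelForSequenceClassification.from_pretrained("hf-internal-testing/tiny-random-bert")
ds = Dataset.from_dict({"text": ["a", "b"], "label": [0, 1]}).map(
    lambda x: tok(x["text"], truncation=True), batched=True
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", report_to="none"),  # "wandb" would enable WandbCallback
    train_dataset=ds,
    processing_class=tok,  # was `tokenizer=tok`; the callbacks now read this name
)
```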

張庭瑜 authored
* Fix GA loss for DeepSpeed
* Turn off loss scaling in DeepSpeed engine via scale_wrt_gas
* Add comment linking to PR

ShuaiBai623 authored
* add qwen2.5vl
* fix
* pass check table
* add modular file
* fix style
* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
* padd copy check
* use modular
* fix
* fix
* fix
* update flashatt2&sdpa support_list
* Update docs/source/en/_toctree.yml
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update docs/source/en/model_doc/qwen2_5_vl.md
* Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py
* update config
* update
* fix hf path
* rename Qwen2_5_VLVideosKwargs
* fix
* fix
* update
* executed modular
* rollback init
* fix
* formatted
* simpler init
* fix
* fix
* fix
* fix
* fix
* update docs
* fix
* fix
* update Qwen2VLRotaryEmbedding for yarn
* fix

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: gewenbin0992 <gewenbin292@163.com>
Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>

Cyril Vallez authored
* support
* Update modeling_utils.py
* style
* most models
* Other models
* fix-copies
* tests + generation utils

Arthur authored
remove class from tests

- 22 Jan, 2025 8 commits

Marc Sun authored
fix type

Mohit Sharma authored
Disable the flash-attention backend for SDPA on AMD GPUs (PyTorch < 2.4.1)
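
The shape of the guard, sketched under the assumption that it keys off the HIP build and the torch version (the actual patch lives in the SDPA dispatch path and may differ):

```python
import torch
from packaging import version

# On ROCm builds older than 2.4.1, avoid the flash-attention SDPA backend
# and let SDPA fall back to the mem-efficient/math kernels instead.
if torch.version.hip is not None and version.parse(torch.__version__) < version.parse("2.4.1"):
    torch.backends.cuda.enable_flash_sdp(False)
```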

LRL-ModelCloud authored
For optimum versions below 1.23.99, the convert_model method only accepts a single nn.Module model parameter.

Joao Gante authored
docs fix

Isotr0py authored
fix gemma2 head dim

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

Joao Gante authored
* tmp commit
* add working chat
* add docs
* docs 2
* use auto dtype by default

Mohamed Mekkouri authored
fix nemotron gguf

Joao Gante authored
missing import