- 22 Oct, 2024 9 commits
-
-
Cyril Vallez authored
Fix FA2
-
HALLOUARD authored
* feat: [RT-DETR] Add onnx runtime config and fix onnx inference bug Optype (Where) * fix lint * use dtype instead of torch.float32 * add doc * remove onnx config * use dtype info * use tensor to fix lint
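A minimal sketch of the dtype pattern the last two bullets describe, with illustrative names (not the actual RT-DETR code): deriving the fill value from the input's own dtype, as a tensor, keeps both branches of the exported ONNX Where op at the same type.

```python
import torch

def mask_scores(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Derive the fill value from the input dtype instead of hard-coding
    # torch.float32, and wrap it in a tensor so both branches of the
    # exported ONNX Where op carry the same dtype.
    min_value = torch.tensor(torch.finfo(scores.dtype).min, dtype=scores.dtype, device=scores.device)
    return torch.where(mask, min_value, scores)
```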
-
Marc Sun authored
update PR template
-
Matt authored
* Sync video classification pipeline * Add disclaimer
-
regisss authored
Fix Korean doc _toctree.yml
-
Steven Liu authored
fix generation configs
-
Raushan Turganbay authored
* this worked in normal generation, needs more tests * fix almost all tests in t5 * nit * longt5, umt5, mt5 * style * udop, pix2struct * more models * fix some tests * fix onnx tests * tracing tests fixed * compile enabled and tested for t5 models * fix small bug in slow tests * [run-slow] t5 * uncomment * style * update with new generation refactoring * nit * fix copies * this is the fix, had to change t5 to fix copies * update * [run-slow] t5 * [run-slow] t5 * update * add test for encoder only T5 * clean up after rebase * fix pop2piano * add comment * style * fix copies after rebase * fix copies missed this one
-
Raushan Turganbay authored
* update * fix tests + fix copies * fix tests once more
-
Raushan Turganbay authored
* first try * codestyle * idefics2 is happy * [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma * fix-copies * [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo * blip-2 needs to init vision from config * when was this removed O_o * minor fix * tests * this way? * tests * model-agnostic code * codestyle * add tests for idefics * modify general test for VLMs * no generation test for vlm yet! * no generation test here also * warn in ViT-SDPA if output attn * add more tests * user can pass dict as attn impl * repo consistency * update * musicgen * no prints * forgot speech enc-dec and clip * how many composite models do we have? * musicgen melody is same as musicgen * +siglip * fix tests + add some more * remove idefics custom overridden code * make idefics2 automappable * nits * skip tests * doctests * Update src/transformers/models/idefics2/configuration_idefics2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/clip/test_modeling_clip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/idefics2/test_modeling_idefics2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/idefics2/test_modeling_idefics2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/configuration_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * major update, no need for automap * clean up * add FA2 test * more tests * style * skip tests * why did these start failing now? * no attributes for FA2 needed * one tiny test * address comment about FA2 false warning * style * add new models and resolve conflicts * fix copies * let it be this way for now, come back tomorrow to review * some more fixes * update * more updates * update * fix copies * style and tests * another big update * fix tests * fix tests * update * another update * fix tests * fix copies * fix tests --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
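One user-facing change buried in the log above ("user can pass dict as attn impl") is per-sub-model attention backends for composite models. A hedged usage sketch, assuming the dict is keyed by sub-config name (checkpoint and keys are illustrative):

```python
from transformers import AutoModelForVision2Seq

# Per-sub-model attention backends for a composite (vision + text) model;
# sub-models left out of the dict fall back to their default implementation.
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b",
    attn_implementation={"vision_config": "sdpa", "text_config": "flash_attention_2"},
)
```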
-
- 21 Oct, 2024 5 commits
-
-
Andrés Marafioti authored
The method `model_download_tool` was called `model_download_counter` earlier in the tutorial, which raises an error when following the code.
-
Matt authored
Add a section on writing generation prompts
-
Yoni Gozlan authored
* add fully functioning image_processing_detr_fast * Create tensors on the correct device * fix copies * fix doc * add tests equivalence cpu gpu * fix doc en * add relative imports and copied from * Fix copies and nit
-
Yoni Gozlan authored
* change import logging * fix CI
-
Raushan Turganbay authored
* don't rely on main input name * update
-
- 18 Oct, 2024 6 commits
-
-
Matthew Hoffman authored
* Only cast logits to float when computing loss (some misses from #31292 and #33902) * Move logits.float() into the existing `if labels is not None` branch
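The pattern being applied, sketched with a generic LM head (names are illustrative, not any specific model's code): the float32 upcast moves inside the labels branch so inference never pays for it.

```python
import torch.nn.functional as F

def lm_head_forward(hidden_states, lm_head, labels=None):
    logits = lm_head(hidden_states)
    loss = None
    if labels is not None:
        # Upcast only when a loss is computed; inference never pays
        # for a full-vocabulary float32 copy of the logits.
        logits = logits.float()
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    return loss, logits
```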
-
Matt authored
* Trigger UDOP tests * Try forcing dtype in LayoutLMV3 * Do checks to see where uint8 is getting in * Do checks to see where uint8 is getting in * Found it! * Add .astype(np.float32) * Remove forced check, make fixup * Checking where exactly the uint8 creeps in * More checking on the uint8 issues * Manually upcast in rescale() * Remove UDOP trigger
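A hedged sketch of the final fix ("Manually upcast in rescale()"), with illustrative names:

```python
import numpy as np

def rescale(image: np.ndarray, scale: float, dtype=np.float32) -> np.ndarray:
    # Upcast explicitly before scaling so a uint8 input cannot
    # propagate an integer dtype into downstream processing.
    return (image.astype(np.float64) * scale).astype(dtype)
```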
-
Cyril Vallez authored
* Create modular_glm.py * Update modular_glm.py * Finalize architecture without all attentions * Add all attentions modules * Finalize modular * Update given last version * Last update * Finalize model * Finalize converter * Update convert_glm_weights_to_hf.py * style * style * Create __init__.py * Add all inits * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Correct the rotary embeddings * Remove apply_residual_connection_post_layernorm (always false) * remove use_rms_norm (always true) * remove past_layer_norm (always true) * Update __init__.py * Update config and license * start adding tests and doc * Add doc + style * Update test_modeling_glm.py * Add dummies * Apply correct modeling * Refactor attention to follow llama * Update __init__.py * Update convert_glm_weights_to_hf.py * Correct bias * remove linear_bias and pdrop (never used) * apply modular * Simplify converter * remove dummies + style * add model_input_names * Add pretraining_tp to config for when eager attention is used * Update modular to remove all pretraining_tp * Update test_modeling_glm.py * Update the __all__ * Update __all__ * Update __init__.py * Update test_modeling_glm.py * add revisions * Add the correct repos and revisions * style * Update __init__.py * update exports * remove import of modular files * style * Apply Llama changes + refine converter * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * style * Use new modular converter * add pretrainedmodel to init * style * Update test_modeling_glm.py * Move config outside modular to please CI about docstrings * Add dummies to please CI * Update glm.md * Update glm.md
-
Lysandre Debut authored
* Informative * style * Informative 2 * Apply suggestions from code review Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com> --------- Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com>
-
byi8220 authored
* fix broken require_torch_up_to_2_accelerators * make style
-
Raushan Turganbay authored
fix
-
- 17 Oct, 2024 13 commits
-
-
Arthur authored
* fix copies, skip fx for llama * style * re-fix copies * last? * style
-
Zach Mueller authored
* bookmark * Bookmark * Bookmark * Actually implement * Pass in kwarg explicitly * Adjust for if we do or don't have labels * Bookmark fix for od * bookmark * Fin * closer * Negate accelerate grad accum div * Fixup not training long enough * Add in compute_loss to take full model output * Document * compute_loss -> compute_loss_fn * Add a test * Refactor * Refactor * Uncomment tests * Update tests/trainer/test_trainer.py Co-authored-by:
Daniel Han <danielhanchen@gmail.com> --------- Co-authored-by:
Daniel Han <danielhanchen@gmail.com>
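A hedged sketch of what the new hook enables, assuming the keyword landed as `compute_loss_func` and the callback receives the full model output, the labels, and an accumulation-aware item count:

```python
import torch.nn.functional as F

def my_loss(outputs, labels, num_items_in_batch=None):
    # Receives the full model output object, not just the logits.
    logits = outputs.logits
    loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), reduction="sum"
    )
    # Dividing by the true item count keeps gradient accumulation from
    # changing the effective loss scale (the "grad accum div" above).
    if num_items_in_batch is not None:
        loss = loss / num_items_in_batch
    return loss

# trainer = Trainer(model=model, args=args, compute_loss_func=my_loss, ...)
```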
-
Pedro Cuenca authored
* Support Llama 3.2 conversion (text models) Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com> * Fix rope factor * Update chat template Initialize from a well-known template. The guidance is that the changes should be applied to 3.1 models as well. * Remove import * Support Llama Guard 3 conversion * Tokenizer details * Fix eos added token in base models * Fix generation config for base models * Specify revision for known tokenizers * Style * Reuse chat templates for older models * Improve error when converting tokenizer < Llama 3 --------- Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com>
-
Arthur authored
* quick fix * 3 losses * oops * fix * nits * check how it scales for special models * propagate for conditional detr * propagate * propagate * propagate * fixes * propagate changes * update * fixup * nits * f string * fixes * more fixes * ? * nit * arg annoying f string * nits * grumble * update * nit * refactor * fix fetch tests * nit * nit * Update src/transformers/loss/loss_utils.py Co-authored-by:
Kashif Rasul <kashif.rasul@gmail.com> * update * nit * fixup * make pass * nits * port code to more models * fixup * nits * arf * update * update * nits * update * fix * update * nits * fine * agjkfslga.jsdlkgjklas * nits * fix fx? * update * update * style * fix imports * update * update * fixup to fix the torch fx? --------- Co-authored-by:
Kashif Rasul <kashif.rasul@gmail.com>
-
Joao Gante authored
* tmp * all visited * test all * Update src/transformers/models/moshi/modeling_moshi.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * delete another one :D --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
David Chanin authored
There's a bug on M1 Macs with transformers >= 4.43.0 and torch >= 2.1.0 where, if a model has tied embeddings, the fast loading from #31771 causes a bus error when the model is actually run. This can be solved by disabling `_supports_param_buffer_assignment` for these models. More info in the comments in #33357.
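The workaround pattern, sketched on a hypothetical model class (the flag itself is the real `PreTrainedModel` attribute named above):

```python
from transformers import PreTrainedModel

class MyTiedEmbeddingModel(PreTrainedModel):  # hypothetical model class
    # Opt out of the fast buffer-assignment loading path from #31771;
    # with tied embeddings on M1 + torch >= 2.1.0, that path can
    # trigger a bus error once the model is run.
    _supports_param_buffer_assignment = False
```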
-
Guang Yang authored
Llama3_1b and Llama2_7b are ExecuTorch compatible Co-authored-by:
Guang Yang <guangyang@fb.com>
-
Name authored
* removes decord dependency; optimize np; Revert "optimize" (this reverts commit faa136b51ec4ec5858e5b0ae40eb7ef89a88b475); helpers as documentation; pydoc; missing keys * make fixup * require_av
ad <hi@arnaudiaz.com>
-
Sebastian Schoennenbeck authored
* Strip final message * Do full strip instead of rstrip * Retrigger CI --------- Co-authored-by:
Matt <rocketknight1@gmail.com>
-
Christopher McGirr authored
* fix(Wav2Vec2ForCTC): torch export. Resolves the issue described in #34022 by implementing the masking of the hidden states using an elementwise multiplication rather than indexing with assignment. The torch.export functionality seems to mark the tensor as frozen even though the update is legal. This change is a workaround for now to allow the export of the model as an FxGraph. Further investigation is required to find the real solution in PyTorch. * [run-slow] hubert, unispeech, unispeech_sat, wav2vec2
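A sketch of the workaround with illustrative names: the in-place index assignment is replaced by an elementwise blend that torch.export can trace.

```python
import torch

def mask_hidden_states(hidden_states, mask_time_indices, masked_embed):
    # Equivalent to `hidden_states[mask_time_indices] = masked_embed`,
    # expressed as an elementwise blend so torch.export does not reject
    # the update on a tensor it considers frozen.
    mask = mask_time_indices.unsqueeze(-1).to(hidden_states.dtype)
    return hidden_states * (1.0 - mask) + masked_embed * mask
```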
-
Yih-Dar authored
* ping * fix * fix * fix * remove runner * update members --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Amos You authored
* change cpu offload warning for fp8 quantization * change cpu offload warning for fp4 quantization * change cpu offload variable name for fp8 and fp4 quantization
-
larin92 authored
Update 'trainer._get_eval_sampler()' to support the 'group_by_length' argument. Trainer didn't support grouping by length for evaluation, which made evaluation slow with 'eval_batch_size' > 1. The updated 'trainer._get_eval_sampler()' method is based on 'trainer._get_train_sampler()'.
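A method-body sketch of the change, assuming it mirrors `_get_train_sampler()` and reuses `LengthGroupedSampler` (details may differ from the merged code):

```python
from torch.utils.data import SequentialSampler
from transformers.trainer_pt_utils import LengthGroupedSampler

def _get_eval_sampler(self, eval_dataset):
    # Group samples of similar length so evaluation with
    # eval_batch_size > 1 wastes less compute on padding.
    if self.args.group_by_length:
        return LengthGroupedSampler(self.args.eval_batch_size, dataset=eval_dataset)
    return SequentialSampler(eval_dataset)
```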
-
- 16 Oct, 2024 7 commits
-
-
Reza Rahemtola authored
* fix(utils): Avoid using torch Tensor or PIL Image if not available * Trigger CI --------- Co-authored-by:
Matt <rocketknight1@gmail.com>
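The guard pattern, sketched with a hypothetical helper (the availability checks are real `transformers.utils` functions):

```python
from transformers.utils import is_torch_available, is_vision_available

def short_repr(obj) -> str:  # hypothetical helper
    # Only reference torch.Tensor / PIL.Image when the optional
    # backend is installed, so the utility works in minimal installs.
    if is_torch_available():
        import torch
        if isinstance(obj, torch.Tensor):
            return f"Tensor(shape={tuple(obj.shape)})"
    if is_vision_available():
        from PIL import Image
        if isinstance(obj, Image.Image):
            return f"Image(size={obj.size})"
    return repr(obj)
```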
-
Yoni Gozlan authored
* nit fix wrong llava onevision name in tokenization auto * add qwen2_vl and fix style
-
steveepreston authored
Revert `accelerate` bug
-
alpertunga-bile authored
* auto-gptq requirement is removed & model is changed & tokenizer pad token is assigned * values func is changed with extensions & sequence key value bug is fixed * map key value check is added in ExtensionsTree * empty trimmed_ids bug is fixed * tail_id IndexError is fixed * empty trimmed_ids bug fix is updated for failed test * too much specific case for specific tokenizer is removed * input_ids check is updated * require auto-gptq import is removed * key error check is changed with empty list check * empty input_ids check is added * empty trimmed_ids fix is checked with numel function * usage change comments are added * test changes are commented * comment style and quality bugs are fixed * test comment style and quality bug is fixed
-
Yoach Lacombe authored
* clean mimi commit * some nits suggestions from Arthur * make fixup * first moshi WIP * converting weights working + configuration + generation configuration * finalize converting script - still missing tokenizer and FE and processor * fix saving model w/o default config * working generation * use GenerationMixin instead of inheriting * add delay pattern mask * fix right order: moshi codes then user codes * unconditional inputs + generation config * get rid of MoshiGenerationConfig * blank user inputs * update convert script: fix conversion, add tokenizer, feature extractor and bf16 * add and correct Auto classes * update modeling code, configuration and tests * make fixup * fix some copies * WIP: add integration tests * add dummy objects * propose better readability and code organisation * update tokenization tests * update docstrings, eval and modeling * add .md * make fixup * add MoshiForConditionalGeneration to ignore Auto * revert mimi changes * re * further fix * Update moshi.md * correct md formatting * move prepare causal mask to class * fix copies * fix depth decoder causal * fix and correct some tests * make style and update .md * correct config checkpoint * Update tests/models/moshi/test_tokenization_moshi.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/moshi/test_tokenization_moshi.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * make style * Update src/transformers/models/moshi/__init__.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup * change firm in copyrights * update config with nested dict * replace einsum * make style * change split to True * add back split=False * remove tests in convert * Update tests/models/moshi/test_modeling_moshi.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * add default config repo + add model to FA2 docstrings * remove logits float * fix some tokenization tests and ignore some others * make style tokenization tests * update modeling with sliding window + update modeling tests * [run-slow] moshi * remove prepare for generation from CausalLM * isort * remove copied from * ignore offload tests * update causal mask and prepare 4D mask aligned with recent changes * further test refine + add back prepare_inputs_for_generation for depth decoder * correct conditional use of prepare mask * update slow integration tests * fix multi-device forward * remove previous solution to device_map * save_load is flaky * fix generate multi-devices * fix device * move tensor to int --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Marc Sun <marc@huggingface.co>
-
Raushan Turganbay authored
* support embeds * use cache from config * style... * fix tests after rebase