- 03 Sep, 2024 2 commits
- 30 Aug, 2024 9 commits
- 28 Aug, 2024 1 commit
-
ydshieh authored
-
- 27 Aug, 2024 4 commits
-
ydshieh authored
-
Yih-Dar authored
Disable scheduled daily CI temporarily
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Aya authored
* fix: multilingual model converted to TFLite gets the wrong token
* fix: modify test_force_tokens_logits_processor to check against scores.dtype.min
Co-authored-by: kent.sc.hung <kent.sc.hung@benq.com>
Co-authored-by: Aya <kent831217@gmail.com>
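A minimal numpy sketch of the idea behind this fix (not the library code): mask the non-forced logits with the dtype's minimum value rather than -inf, so the mask stays representable after conversion to reduced-precision runtimes such as TFLite.

```python
import numpy as np

def force_token(scores: np.ndarray, token_id: int) -> np.ndarray:
    # Mask every logit except `token_id` with the dtype minimum instead of
    # float("-inf"); the finite value survives TFLite conversion.
    masked = np.full_like(scores, np.finfo(scores.dtype).min)
    masked[..., token_id] = 0.0
    return masked

scores = np.random.randn(1, 51865).astype(np.float32)
print(force_token(scores, token_id=50259).argmax(-1))  # -> [50259]
```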
-
Sai-Suraj-27 authored
* Fixed failing CodeGenTokenizationTest::test_truncation.
* [run_slow] codegen
-
- 26 Aug, 2024 9 commits
-
Zach Mueller authored
Fix up Python 3.8 compatibility
-
Pablo Montalvo authored
* fix documentation
* update config
-
Sai-Suraj-27 authored
Fixed the required pydantic version in the dockerfiles.
-
Ritik Nandwal authored
* Add changes for the uroman package to handle non-Roman characters
* Update docs for the uroman changes
* Downgrade the error message to a warning, for backward compatibility
* Update the instructions for users to install uroman
* Update docs for the uroman Python version dependency and backward compatibility
* Update the warning message about Python version compatibility with uroman
* Refine docs
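A short sketch of the workflow these docs describe: romanize non-Roman text with the uroman package before passing it to the MMS/VITS tokenizer. The checkpoint name is illustrative, and the `Uroman.romanize_string` call is the API documented by the uroman PyPI package.

```python
import uroman as ur
from transformers import VitsTokenizer

romanizer = ur.Uroman()
text = "이봐 무슨 일이야"                      # non-Roman input
romanized = romanizer.romanize_string(text)   # Roman-character equivalent

# The tokenizer only understands Roman characters, hence the new warning.
tokenizer = VitsTokenizer.from_pretrained("facebook/mms-tts-kor")
inputs = tokenizer(text=romanized, return_tensors="pt")
```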
-
Joao Gante authored
-
Joao Gante authored
-
Joao Gante authored
-
Shijie authored
* support-qwen2-vl
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* hyphen->underscore
* make style
* add-flash2-tipd
* delete-tokenize=False
* remove-image_processor-in-init-file
* add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES
* format-doc
* support-Qwen2VLVisionConfig
* remove-standardize_cache_format
* fix-letter-variables
* remove-torch-in-image-processor
* remove-useless-docstring
* fix-one-letter-variable-name
* change-block-name
* default-quick-gelu-in-vision
* remove-useless-doc
* use-preimplemented-flash-forward
* fix-doc
* fix-image-processing-doc
* fix-apply-rotary-embed
* fix-flash-attn-sliding-window
* refactor
* remove-default_template
* remove-reorder_cache
* simple-get-rope_deltas
* update-prepare_inputs_for_generation
* update-attention-mask
* update-rotary_seq_len
* remove-state
* kv_seq_length
* remove-warning
* _supports_static_cache
* remove-legacy-cache
* refactor
* fix-replace
* mrope-section-doc
* code-quality
* code-quality
* polish-doc
* fix-image-processing-test
* update readme
* Update qwen2_vl.md
* fix-test
* Update qwen2_vl.md
* nit
* processor-kwargs
* hard-code-norm_layer
* code-quality
* discard-pixel-values-in-gen
* fix-inconsistent-error-msg
* unify-image-video
* hidden_act
* add-docstring
* vision-encode-as-PreTrainedModel
* pixel-to-target-dtype
* update doc and low memory vit
* format
* format
* channel-format
* fix vit_flashatt
* format
* inherit-Qwen2VLPreTrainedModel
* simplify
* format-test
* remove-one-line-func-in-image-processing
* avoid-one-line-reshape
* simplify-rotary_seq_len
* avoid-single-letter-variable
* no-for-loop-sdpa
* avoid-single-letter-variable
* remove-one-line-reshape
* remove-one-line-reshape
* remove-no-rope-in-vit-logic
* default-mrope
* add-copied-from
* more-docs-for-mrope
* polish-doc
* comment-and-link
* polish-doc
* single-letter-variables
* simplify-image-processing
* video->images
* kv_seq_len-update
* vision-rope-on-the-fly
* vision-eager-attention
* change-processor-order
Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com>
Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
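A hedged usage sketch for the newly supported Qwen2-VL; the checkpoint name and image URL are placeholders, and the message format follows the qwen2_vl docs added in this commit.

```python
import requests
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # placeholder checkpoint name
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```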
-
S M Jishanul Islam authored
-
- 23 Aug, 2024 7 commits
-
Matt authored
-
Arun Prakash A authored
* added docstring to SchedulerType class
* Remove trailing whitespace in src/transformers/trainer_utils.py (Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>)
* fixup
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
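For context, a small sketch of where the newly documented SchedulerType values surface, via the `lr_scheduler_type` training argument:

```python
from transformers import TrainingArguments
from transformers.trainer_utils import SchedulerType

# Enumerate the accepted scheduler names, e.g. "linear", "cosine", "constant", ...
print([s.value for s in SchedulerType])
args = TrainingArguments(output_dir="out", lr_scheduler_type="cosine")
```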
-
Donggeun Yu authored
* Update modeling_deformable_detr.py
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py (Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>)
* Update ms_deform_attn_cuda.cu
* Update modeling_deformable_detr.py
* Update modeling_deformable_detr.py
* [empty] this is an empty commit
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Add new Jinja features:
  - Do extension
  - Break/continue in loops
  - Call strftime to get current datetime in any format
* Add new Jinja features:
  - Do extension
  - Break/continue in loops
  - Call strftime to get current datetime in any format
* Fix strftime template
* Add template strip() just to be safe
* Remove the do extension to make porting easier, and also because it's the least useful
* Rename test
* strftime -> strftime_now
* Split test
* Update test to use strftime_now
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
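A small sketch of the two features that survived this PR, `strftime_now` and break/continue in loops, applied through a tokenizer's chat template; the template text itself is illustrative.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any tokenizer can render a template
tokenizer.chat_template = (
    "Today is {{ strftime_now('%d %b %Y') }}.\n"      # new callable for the current datetime
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}{% break %}{% endif %}"  # loop controls now enabled
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)
chat = [{"role": "user", "content": "Hi!"}, {"role": "system", "content": "skipped"}]
print(tokenizer.apply_chat_template(chat, tokenize=False))
```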
-
Jason (Siyu) Zhu authored
* add liger integration
* fix syntax
* fix import issue
* add trainer.md
* Use _apply_liger_kernel()
* Fixed log message
* Update docs/source/en/trainer.md (Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>)
* Update docs/source/en/trainer.md (Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>)
* Update src/transformers/training_args.py (Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>)
* Update src/transformers/trainer.py (Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>)
* Update src/transformers/training_args.py (Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>)
* Update docs/source/en/trainer.md (Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>)
* Fixed checkstyle and updated readme
* Added test
* Fixed checkstyle
* fix docstring
* rename use_liger to use_liger_kernel
* Trigger Build
* Added test
* add fix-copies
* Fixed copy inconsistencies
Co-authored-by: shimizust <sshimizu@linkedin.com>
Co-authored-by: Steven Shimizu <shimizust@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
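A minimal sketch of turning the new integration on; assumes `pip install liger-kernel` and a model family the kernels support (e.g. Llama):

```python
from transformers import TrainingArguments

# Patches supported model ops with Liger's Triton kernels when the Trainer loads the model.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    use_liger_kernel=True,
)
```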
-
Joao Gante authored
Forbid `PretrainedConfig` from saving `generate` parameters; Update deprecations in `generate`-related code 🧹 (#32659)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
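Since generation defaults can no longer be saved on `PretrainedConfig`, the pattern is to keep them on a `GenerationConfig`; a short sketch:

```python
from transformers import AutoModelForCausalLM, GenerationConfig

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.generation_config = GenerationConfig(max_new_tokens=32, do_sample=True, top_p=0.9)
model.save_pretrained("my-model")  # defaults land in generation_config.json, not config.json
```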
-
Cyril Vallez authored
* Add .float() in all generation methods logit outputs
* Switch float-casting of logits to training only for main models
* Add `num_logits_to_keep` in Llama and add it by default in generate
* Apply style
* Add num_logits_to_keep as arg in prepare_input_for_generation
* Add support for Mistral
* Revert models except llama and mistral
* Fix default None value in _supports_num_logits_to_keep()
* Fix dimension of dummy input
* Add exception for prophetnet in _supports_num_logits_to_keep()
* Update _supports_num_logits_to_keep() to use inspect.signature()
* Add deprecation cycle + remove modification with pretraining_tp
* Apply style
* Add most used models
* Apply style
* Make `num_logits_to_keep` an int in all cases to remove if-else clause
* Add compile check for the warning
* Fix torch versions
* style
* Add gemma2
* Update warning version
* Add comment about .float operations in generation utils
* Add tests in GenerationTesterMixin and ModelTesterMixin
* Fix batch size for assisted decoding in tests
* fix small issues in test
* refactor test
* fix slicing removing dim issue
* Add nemotron support (should fix check-copy issue in CIs)
* Trigger new CIs
* Trigger new CIs
* Bump version
* Bump version in TODO
* Trigger CIs
* remove blank space
* Trigger CIs
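A sketch of the new `num_logits_to_keep` argument on the supported decoder models (Llama, Mistral, Gemma2, ...): during prefill only the last position's logits are materialized, which is what `generate` now requests by default. The checkpoint name is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "meta-llama/Llama-2-7b-hf"  # illustrative; any Llama-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, num_logits_to_keep=1)  # keep logits for the last token only
print(out.logits.shape)  # (batch, 1, vocab_size) instead of (batch, seq_len, vocab_size)
```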
-
- 22 Aug, 2024 8 commits
-
Stefano Fiorucci authored
fix outdated link
-
Joao Gante authored
-
Jinuk authored
* docs: ko: tasks/knowledge_distillation_for_image_classification.md
* feat: nmt draft
* fix: manual edits
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
-
Franz Louis Cesista authored
fix save_pretrained
-
Andrés Marafioti authored
-
Joao Gante authored
-
Shaopeng Fu authored
fix: (issue #32689) `AttributeError` raised when using `Trainer` with `eval_on_start=True` in Jupyter Notebook. (#32849)
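For context, a minimal sketch of the option involved in the fix:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    eval_strategy="steps",
    eval_on_start=True,  # run one evaluation pass before the first training step
)
```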
-
Isotr0py authored
* add chat_template to gguf tokenizer
* add template through tokenizer config
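A sketch of what this enables, loading a tokenizer straight from a GGUF file and reading its chat template; the repo and file names are placeholders.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen2-0.5B-Instruct-GGUF",            # placeholder GGUF repo
    gguf_file="qwen2-0_5b-instruct-q4_0.gguf",  # placeholder file name
)
print(tokenizer.chat_template)  # now populated from the GGUF metadata
```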
-