- 03 Sep, 2024 2 commits
- 30 Aug, 2024 9 commits
- 28 Aug, 2024 1 commit
-
ydshieh authored
-
- 27 Aug, 2024 4 commits
-
ydshieh authored
-
Yih-Dar authored
Disable scheduled daily CI temporarily
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Aya authored
* fix: multilingual model converted to TFLite gets the wrong token
* fix: modify test_force_tokens_logits_processor to check against scores.dtype.min
Co-authored-by: kent.sc.hung <kent.sc.hung@benq.com>
Co-authored-by: Aya <kent831217@gmail.com>
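A minimal numpy sketch of the idea behind this fix (not the library code): mask the non-forced logits with the dtype's minimum value rather than -inf, so the mask stays representable after conversion to reduced-precision runtimes such as TFLite.

```python
import numpy as np

def force_token(scores: np.ndarray, token_id: int) -> np.ndarray:
    # Mask every logit except `token_id` with the dtype minimum instead of
    # float("-inf"); the finite value survives TFLite conversion.
    masked = np.full_like(scores, np.finfo(scores.dtype).min)
    masked[..., token_id] = 0.0
    return masked

scores = np.random.randn(1, 51865).astype(np.float32)
print(force_token(scores, token_id=50259).argmax(-1))  # -> [50259]
```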
-
Sai-Suraj-27 authored
* Fixed failing CodeGenTokenizationTest::test_truncation.
* [run_slow] codegen
-
- 26 Aug, 2024 9 commits
-
Zach Mueller authored
Fix up Python 3.8 compatibility
-
Pablo Montalvo authored
* fix documentation
* update config
-
Sai-Suraj-27 authored
Fixed the required pydantic version in the dockerfiles.
-
Ritik Nandwal authored
* Add changes for the uroman package to handle non-Roman characters
* Update docs for the uroman changes
* Downgrade the error message to a warning, for backward compatibility
* Update the instructions for users to install uroman
* Update docs for the uroman Python version dependency and backward compatibility
* Update the warning message about Python version compatibility with uroman
* Refine docs
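A short sketch of the workflow these docs describe: romanize non-Roman text with the uroman package before passing it to the MMS/VITS tokenizer. The checkpoint name is illustrative, and the `Uroman.romanize_string` call is the API documented by the uroman PyPI package.

```python
import uroman as ur
from transformers import VitsTokenizer

romanizer = ur.Uroman()
text = "이봐 무슨 일이야"                      # non-Roman input
romanized = romanizer.romanize_string(text)   # Roman-character equivalent

# The tokenizer only understands Roman characters, hence the new warning.
tokenizer = VitsTokenizer.from_pretrained("facebook/mms-tts-kor")
inputs = tokenizer(text=romanized, return_tensors="pt")
```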
-
Joao Gante authored
-
Joao Gante authored
-
Joao Gante authored
-
Shijie authored
* support-qwen2-vl
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* hyphen->underscore
* make style
* add-flash2-tipd
* delete-tokenize=False
* remove-image_processor-in-init-file
* add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES
* format-doc
* support-Qwen2VLVisionConfig
* remove-standardize_cache_format
* fix-letter-variables
* remove-torch-in-image-processor
* remove-useless-docstring
* fix-one-letter-variable-name
* change-block-name
* default-quick-gelu-in-vision
* remove-useless-doc
* use-preimplemented-flash-forward
* fix-doc
* fix-image-processing-doc
* fix-apply-rotary-embed
* fix-flash-attn-sliding-window
* refactor
* remove-default_template
* remove-reorder_cache
* simple-get-rope_deltas
* update-prepare_inputs_for_generation
* update-attention-mask
* update-rotary_seq_len
* remove-state
* kv_seq_length
* remove-warning
* _supports_static_cache
* remove-legacy-cache
* refactor
* fix-replace
* mrope-section-doc
* code-quality
* code-quality
* polish-doc
* fix-image-processing-test
* update readme
* Update qwen2_vl.md
* fix-test
* Update qwen2_vl.md
* nit
* processor-kwargs
* hard-code-norm_layer
* code-quality
* discard-pixel-values-in-gen
* fix-inconsistent-error-msg
* unify-image-video
* hidden_act
* add-docstring
* vision-encode-as-PreTrainedModel
* pixel-to-target-dtype
* update doc and low memory vit
* format
* format
* channel-format
* fix vit_flashatt
* format
* inherit-Qwen2VLPreTrainedModel
* simplify
* format-test
* remove-one-line-func-in-image-processing
* avoid-one-line-reshape
* simplify-rotary_seq_len
* avoid-single-letter-variable
* no-for-loop-sdpa
* avoid-single-letter-variable
* remove-one-line-reshape
* remove-one-line-reshape
* remove-no-rope-in-vit-logic
* default-mrope
* add-copied-from
* more-docs-for-mrope
* polish-doc
* comment-and-link
* polish-doc
* single-letter-variables
* simplify-image-processing
* video->images
* kv_seq_len-update
* vision-rope-on-the-fly
* vision-eager-attention
* change-processor-order
Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com>
Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
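A hedged usage sketch for the newly supported Qwen2-VL; the checkpoint name and image URL are placeholders, and the message format follows the qwen2_vl docs added in this commit.

```python
import requests
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # placeholder checkpoint name
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```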
-
S M Jishanul Islam authored
-
- 23 Aug, 2024 7 commits
-
Matt authored
-
Arun Prakash A authored
* added docstring to SchedulerType class
* Remove trailing whitespace in src/transformers/trainer_utils.py (Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>)
* fixup
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
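For context, a small sketch of where the newly documented SchedulerType values surface, via the `lr_scheduler_type` training argument:

```python
from transformers import TrainingArguments
from transformers.trainer_utils import SchedulerType

# Enumerate the accepted scheduler names, e.g. "linear", "cosine", "constant", ...
print([s.value for s in SchedulerType])
args = TrainingArguments(output_dir="out", lr_scheduler_type="cosine")
```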
-
Donggeun Yu authored
* Update modeling_deformable_detr.py
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py (Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>)
* Update ms_deform_attn_cuda.cu
* Update modeling_deformable_detr.py
* Update modeling_deformable_detr.py
* [empty] this is an empty commit
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Add new Jinja features:
  - Do extension
  - Break/continue in loops
  - Call strftime to get current datetime in any format
* Add new Jinja features:
  - Do extension
  - Break/continue in loops
  - Call strftime to get current datetime in any format
* Fix strftime template
* Add template strip() just to be safe
* Remove the do extension to make porting easier, and also because it's the least useful
* Rename test
* strftime -> strftime_now
* Split test
* Update test to use strftime_now
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
* Refactor everything out into chat_template_utils
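A small sketch of the two features that survived this PR, `strftime_now` and break/continue in loops, applied through a tokenizer's chat template; the template text itself is illustrative.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any tokenizer can render a template
tokenizer.chat_template = (
    "Today is {{ strftime_now('%d %b %Y') }}.\n"      # new callable for the current datetime
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}{% break %}{% endif %}"  # loop controls now enabled
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)
chat = [{"role": "user", "content": "Hi!"}, {"role": "system", "content": "skipped"}]
print(tokenizer.apply_chat_template(chat, tokenize=False))
```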
-
Jason (Siyu) Zhu authored
* add liger integration
* fix syntax
* fix import issue
* add trainer.md
* Use _apply_liger_kernel()
* Fixed log message
* Update docs/source/en/trainer.md (Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>)
* Update docs/source/en/trainer.md (Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>)
* Update src/transformers/training_args.py (Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>)
* Update src/transformers/trainer.py (Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>)
* Update src/transformers/training_args.py (Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>)
* Update docs/source/en/trainer.md (Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>)
* Fixed checkstyle and updated readme
* Added test
* Fixed checkstyle
* fix docstring
* rename use_liger to use_liger_kernel
* Trigger Build
* Added test
* add fix-copies
* Fixed copy inconsistencies
Co-authored-by: shimizust <sshimizu@linkedin.com>
Co-authored-by: Steven Shimizu <shimizust@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
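A minimal sketch of turning the new integration on; assumes `pip install liger-kernel` and a model family the kernels support (e.g. Llama):

```python
from transformers import TrainingArguments

# Patches supported model ops with Liger's Triton kernels when the Trainer loads the model.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    use_liger_kernel=True,
)
```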
-
Joao Gante authored
Forbid `PretrainedConfig` from saving `generate` parameters; Update deprecations in `generate`-related code 🧹 (#32659)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
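Since generation defaults can no longer be saved on `PretrainedConfig`, the pattern is to keep them on a `GenerationConfig`; a short sketch:

```python
from transformers import AutoModelForCausalLM, GenerationConfig

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.generation_config = GenerationConfig(max_new_tokens=32, do_sample=True, top_p=0.9)
model.save_pretrained("my-model")  # defaults land in generation_config.json, not config.json
```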
-
Cyril Vallez authored
* Add .float() in all generation methods logit outputs
* Switch float-casting of logits to training only for main models
* Add `num_logits_to_keep` in Llama and add it by default in generate
* Apply style
* Add num_logits_to_keep as arg in prepare_input_for_generation
* Add support for Mistral
* Revert models except llama and mistral
* Fix default None value in _supports_num_logits_to_keep()
* Fix dimension of dummy input
* Add exception for prophetnet in _supports_num_logits_to_keep()
* Update _supports_num_logits_to_keep() to use inspect.signature()
* Add deprecation cycle + remove modification with pretraining_tp
* Apply style
* Add most used models
* Apply style
* Make `num_logits_to_keep` an int in all cases to remove if-else clause
* Add compile check for the warning
* Fix torch versions
* style
* Add gemma2
* Update warning version
* Add comment about .float operations in generation utils
* Add tests in GenerationTesterMixin and ModelTesterMixin
* Fix batch size for assisted decoding in tests
* fix small issues in test
* refactor test
* fix slicing removing dim issue
* Add nemotron support (should fix check-copy issue in CIs)
* Trigger new CIs
* Trigger new CIs
* Bump version
* Bump version in TODO
* Trigger CIs
* remove blank space
* Trigger CIs
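A sketch of the new `num_logits_to_keep` argument on the supported decoder models (Llama, Mistral, Gemma2, ...): during prefill only the last position's logits are materialized, which is what `generate` now requests by default. The checkpoint name is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "meta-llama/Llama-2-7b-hf"  # illustrative; any Llama-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, num_logits_to_keep=1)  # keep logits for the last token only
print(out.logits.shape)  # (batch, 1, vocab_size) instead of (batch, seq_len, vocab_size)
```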
-
- 22 Aug, 2024 8 commits
-
Stefano Fiorucci authored
fix outdated link
-
Joao Gante authored
-
Jinuk authored
* docs: ko: tasks/knowledge_distillation_for_image_classification.md
* feat: nmt draft
* fix: manual edits
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review (Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>)
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
-
Franz Louis Cesista authored
fix save_pretrained
-
Andrés Marafioti authored
-
Joao Gante authored
-
Shaopeng Fu authored
fix: (issue #32689) `AttributeError` raised when using `Trainer` with `eval_on_start=True` in Jupyter Notebook. (#32849)
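For context, a minimal sketch of the option involved in the fix:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    eval_strategy="steps",
    eval_on_start=True,  # run one evaluation pass before the first training step
)
```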
-
Isotr0py authored
* add chat_template to gguf tokenizer
* add template through tokenizer config
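A sketch of what this enables, loading a tokenizer straight from a GGUF file and reading its chat template; the repo and file names are placeholders.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen2-0.5B-Instruct-GGUF",            # placeholder GGUF repo
    gguf_file="qwen2-0_5b-instruct-q4_0.gguf",  # placeholder file name
)
print(tokenizer.chat_template)  # now populated from the GGUF metadata
```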
-