Commits · 618697ef5363ed7f990e8ca4b45a2a13f39ce9e8 · zhusg / transformers-new

13 Mar, 2023 21 commits

[deepspeed docs] Activation Checkpointing (#22099) · 618697ef

Stas Bekman authored 2 years ago


* [deepspeed docs] Activation Checkpointing

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update deepspeed.mdx

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


[trainer] fix bug in grad accum with multiple epochs (#22098) · 5b85add7
Stas Bekman authored 2 years ago
```
* [trainer] fix bug in grad accum

* comment out debug

* fix one-off

* rename counter
```
Enforce same behavior as PyTorch 2.0 for older versions (#22136) · 1c801d65
Sylvain Gugger authored 2 years ago

Trainer: let generate pick its inputs (#22108) · e16cbe88
Joao Gante authored 2 years ago
```
* Let generate pick its inputs

* fix squad seq2seq example
```

[`Whisper`] add `get_input_embeddings` to `WhisperForAudioClassification` (#22133) · d979cf6e

Younes Belkada authored 2 years ago


* add `get_input_embeddings` to `WhisperForAudioClassification`

* add common tests

* fix another common test

* Update tests/models/whisper/test_modeling_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix style

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Update configuration_align.py (projected_dim=640) (#22139) · 98797237
bishmdl76 authored 2 years ago
```
Update configuration_align.py

updated projected_dim=640 from 512 in arguments of AlignConfig
```
Add a new script to check model testers' config (#22063) · 54ee56b1
Yih-Dar authored 2 years ago
```
* Add script

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
Adding Type Hints to TF_Pegasus model (#21941) · a096eaca
mollerup23 authored 2 years ago
```
* Adding Type Hints to TF_Pegasus model

* Updated some parameters per maintainer comments
```
Fix doc link for MGP-STR (#22138) · 6cb5132a
Sylvain Gugger authored 2 years ago


Zero-shot image classification task guide (#22132) · 8def252d

Maria Khalusova authored 2 years ago


* WIP

* WIP

* manual inference example

* make style

* Apply suggestions from code review

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

---------

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>


Fix gradient checkpointing bug in trocr (#22126) · e61081e7

Karim Foda authored 2 years ago


* Fix gradient checkpointing bug in trocr

* Fix format

* Update src/transformers/models/trocr/modeling_trocr.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>


Fix gradient checkpointing bug in LongT5 (#22130) · ef74e7e7
Karim Foda authored 2 years ago

Fix gradient checkpointing bug in xmod (#22129) · c1db6a3b
Karim Foda authored 2 years ago

[`Blip2`] skip accelerate test (#22124) · 6652e7da
Younes Belkada authored 2 years ago
```
skip accelerate test
```
Added big_models.mdx italian translation #17600 (#22115) · dd3a0580
Nicola Procopio authored 2 years ago
```
* updated toctree

* italian translation big_model.mdx

* italian translation big_models
```
Fix gradient checkpointing bug in xlm_roberta_xl (#22128) · 0768c5e2
Karim Foda authored 2 years ago

Fix gradient checkpointing bug in Trajectory Transformer (#22125) · 4c14c1f4
Karim Foda authored 2 years ago

Fix gradient checkpointing bug in xglm (#22127) · d0876a09
Karim Foda authored 2 years ago

Add pr_checks.mdx Italian translation (#17459) (#22116) · 0c883766
Alex Calabrese authored 2 years ago
```
* Add pr_checks.mdx Italian translation (#17459)

* Updated pr_checks.mdx Italian translation (#17459)
```

add new model of MGP-STR (#21418) · 102b5ff4

wangpeng authored 2 years ago


* add new model of MGP-STR

* fix the check failings

* remove torch and numpy from mgp_tokenization

* remove unused import from modeling_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str.py

* add test_processing_mgp_str

* add test_processing_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str and add softmax outs to model

* rm test_processing_mgp_str and add softmax outs to model

* rewrite the code of mgp-str according to PR suggestions

* rewrite the code of mgp-str according to PR suggestions

* add new model of MGP-STR

* fix the check failings

* remove torch and numpy from mgp_tokenization

* remove unused import from modeling_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str.py

* add test_processing_mgp_str

* add test_processing_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str and add softmax outs to model

* rewrite the code of mgp-str according to PR suggestions

* rewrite the code of mgp-str according to PR suggestions

* remove representation_size from MGPSTRConfig

* reformat configuration_mgp_str.py

* format test_processor_mgp_str.py

* add test for tokenizer and complete model/processor test and model file

* rm unnecessary tuple in modeling_mgp_str

* reduce hidden_size/layers/label_size in test_model

* add integration tests and change MGPSTR to Mgpstr

* add test for logit values

* reformat test model file

---------

Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>


Add AutoModelForZeroShotImageClassification (#22087) · 32e3466d
Alara Dirik authored 2 years ago
```
Adds AutoModelForZeroShotImageClassification to transformers
```

11 Mar, 2023 1 commit
- [Whisper] Remove embed_tokens from encoder docstring (#21996) · b90fbc7e
  Sanchit Gandhi authored 2 years ago
```
* [Whisper] Remove embed_tokens from encoder docstring

* new line to retrigger CI

* remove new line
```
10 Mar, 2023 11 commits
- Revert "[GPT2] Propose fix for #21080" (#22093) · 2f320661
  Yih-Dar authored 2 years ago
```
Revert "[GPT2] Propose fix for #21080 (#21853)" to avoid CI failure

This reverts commit a3fef89b.
```
- Fix imports of TF MobileViT (#22065) · 499770c0
  Sylvain Gugger authored 2 years ago
```
* Fix imports of TF MobileViT

* Fix copies
```
- GPT-J specific half precision on CPU note (#22086) · bdec2768
  Maria Khalusova authored 2 years ago
```
* re: #21989

* update re: #21989

* removed cpu option

* make style
```
- handle numpy inputs in whole word mask data collator (#22032) · 2f4cdd97
  Dean Wyatte authored 2 years ago
  
- Fix hint in src/transformers/modeling_utils.py (#22074) · a70da86b
  J-shang authored 2 years ago
```
fix hint
```
- Fix gradient checkpointing bug in Speecht5 (#22080) · 419d979f
  Karim Foda authored 2 years ago
```
* Fix gradient checkpointing bug in Speecht5

* Update modeling_speech_to_text.py

* Update src/transformers/models/speech_to_text/modeling_speech_to_text.py

* Fix change errors

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
```
- Generate - Fix broken documentation links (#22078) · 7014fc36
  Joao Gante authored 2 years ago
```
fix broken links
```
- Fix small typo in flan-ul2.mdx (#22068) · ade26bf9
  Kevin Jiang authored 2 years ago
```
* Update flan-ul2.mdx

* Update flan-ul2.mdx
```
- [GPT2] Propose fix for #21080 (#21853) · a3fef89b
  Arthur authored 2 years ago
```
* Make sure position ids are masked

* test that padded input produce the same results

* fix failing tests

* fixup

* fix batch test
```
- Fix gradient checkpointing bug in switch transformer (#22081) · eee195b3
  Karim Foda authored 2 years ago
  
- Fix gradient checkpointing bug in Speech2Text (#22079) · b9273353
  Karim Foda authored 2 years ago
```
* Fix gradient checkpointing bug in Speech2Text

* Update modeling_speech_to_text.py

* Update modeling_speech_to_text_2.py
```
09 Mar, 2023 7 commits
- Add a progress bar for the total download of shards (#22062) · a9bd5df1
  Sylvain Gugger authored 2 years ago
```
* Add a progress bar for the total download of shards

* Check for no cache at all

* Fix check
```
- Fix case when using --gradient_accumulation_steps with DDP disabled. (#22007) · 1a5fc300
  aws-sangeetha authored 2 years ago
```
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-42-72.us-west-2.compute.internal>
```
- Update tiny model creation script (#22058) · 6d9031f2
  Yih-Dar authored 2 years ago
```
Update the script

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Add setters by type of args to TrainingArguments (#21570) · 7a2b915e
  Sylvain Gugger authored 2 years ago
```
* Add setters by type of args to TrainingArguments

* Define more setters
```
- Skip 3 tests for `WhisperEncoderModelTest` (#22060) · ab81d31d
  Yih-Dar authored 2 years ago
```
* skip 3 tests

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Edit the docstring of `image_processing_donut` to match code (#22033) · 8434cb87
  Jiali Mei authored 2 years ago
```
* Edit the docstring of `image_processing_donut` to match code

* improve style

* more style improvement after installing quality
```
- [deepspeed] offload + non-cpuadam optimizer exception (#22043) · ec24132b
  Stas Bekman authored 2 years ago
```
* [deepspeed] offload + non-cpuadam optimizer exception

* flip

* revert min version
```