Commits · b439129e74bb207138e49ffb1f147bd94aa58574 · zhusg / transformers-new

01 Sep, 2023 7 commits

[VITS] Add to TTA pipeline (#25906) · b439129e

Sanchit Gandhi authored 1 year ago


* [VITS] Add to TTA pipeline

* Update tests/pipelines/test_pipelines_text_to_audio.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* remove extra spaces

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>


Revert frozen training arguments (#25903) · be0e189b
Zach Mueller authored 1 year ago
```
* Revert frozen training arguments

* TODO
```
Remove broken docs for MusicGen (#25905) · 69c5b8f1
Omar Sanseviero authored 1 year ago
```
Update musicgen.md
```

Better error message for pipeline loading (#25912) · 16d6e308

Yih-Dar authored 1 year ago


* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Falcon: Add RoPE scaling (#25878) · 53e2fd78
Joao Gante authored 1 year ago


fix FSDP model resume optimizer & scheduler (#25852) · 024acd27

pkumc authored 1 year ago


* fix FSDP resume optimizer & scheduler

* improve trainer code quality

---------

Co-authored-by: machi04 <machi04@meituan.com>


add VITS model (#24085) · 4ece3b94

Matthijs Hollemans authored 1 year ago


* add VITS model

* let's vits

* finish TextEncoder (mostly)

* rename VITS to Vits

* add StochasticDurationPredictor

* add flow model

* add generator

* correctly set vocab size

* add tokenizer

* remove processor & feature extractor

* add PosteriorEncoder

* add missing weights to SDP

* also convert LJSpeech and VCTK checkpoints

* add training stuff in forward

* add placeholder tests for tokenizer

* add placeholder tests for model

* starting cleanup

* let the great renaming begin!

* use config

* global_conditioning

* more cleaning

* renaming variables

* more renaming

* more renaming

* it never ends

* reticulating the splines

* more renaming

* HiFi-GAN

* doc strings for main model

* fixup

* fix-copies

* don't make it a PreTrainedModel

* fixup

* rename config options

* remove training logic from forward pass

* simplify relative position

* use actual checkpoint

* style

* PR review fixes

* more review changes

* fixup

* more unit tests

* fixup

* fix doc test

* add integration test

* improve tokenizer tests

* add tokenizer integration test

* fix tests on GPU (gave OOM)

* conversion script can handle repos from hub

* add conversion script for all MMS-TTS checkpoints

* automatically create a README for the converted checkpoint

* small changes to config

* push README to hub

* only show uroman note for checkpoints that need it

* remove conversion script because code formatting breaks the readme

* make WaveNet layers configurable

* rename variables

* simplifying the math

* output attentions and hidden states

* remove VitsFlip in flow model

* also got rid of the other flip

* fix tests

* rename more variables

* rename tokenizer, add phonemization

* raise error when phonemizer missing

* re-order config docstrings to match method

* change config naming

* remove redundant str -> list

* fix copyright: vits authors -> kakao enterprise

* (mean, log_variances) -> (prior_mean, prior_log_variances)

* if return dict -> if not return dict

* speed -> speaking rate

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update fused tanh sigmoid

* reduce dims in tester

* audio -> output_values

* audio -> output_values in tuple out

* fix return type

* fix return type

* make _unconstrained_rational_quadratic_spline a function

* all nn's to accept a config

* add spectro to output

* move {speaking rate, noise scale, noise scale duration} to config

* path -> attn_path

* idxs -> valid idxs -> padded idxs

* output values -> waveform

* use config for attention

* make generation work

* harden integration test

* add spectrogram to dict output

* tokenizer refactor

* make style

* remove 'fake' padding token

* harden tokenizer tests

* ron norm test

* fprop / save tests deterministic

* move uroman to tokenizer as much as possible

* better logger message

* fix vivit imports

* add uroman integration test

* make style

* up

* matthijs -> sanchit-gandhi

* fix tokenizer test

* make fix-copies

* fix dict comprehension

* fix config tests

* fix model tests

* make outputs consistent with reverse/not reverse

* fix key concat

* more model details

* add author

* return dict

* speaker error

* labels error

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vits/convert_original_checkpoint.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove uromanize

* add docstrings

* add docstrings for tokenizer

* upper-case skip messages

* fix return dict

* style

* finish tests

* update checkpoints

* make style

* remove doctest file

* revert

* fix docstring

* fix tokenizer

* remove uroman integration test

* add sampling rate

* fix docs / docstrings

* style

* add sr to model output

* fix outputs

* style / copies

* fix docstring

* fix copies

* remove sr from model outputs

* Update utils/documentation_tests.txt

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add sr as allowed attr

---------

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


31 Aug, 2023 11 commits

remove torch_dtype override (#25894) · ef10dbce

Marc Sun authored 1 year ago


* remove torch_dtype override

* style

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Smarter check for `is_tensor` (#25871) · 0f08cd20

Sylvain Gugger authored 1 year ago


* Smarter check for `is_tensor`

* Use protected functions

* Do others too

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Address review comments

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Update `setup.py` (#25893) · 3fb1535b

Yih-Dar authored 1 year ago


update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Add type hints for tf models batch 1 (#25853) · eaf5e98e

David Reguera authored 1 year ago

* Add type hints to `TFBlipTextModel`

* Add missing type hints to DPR family models

* Add type hints to `TFLEDModel`

* Add type hints to `TFLxmertForPreTraining`

* Add missing type hints to `TFMarianMTModel` and `TFMarianModel`

* Add missing type hints to `TFRagModel` & `TFRagTokenForGeneration`

* Make type hints annotations consistent


[`InstructBlip`] FINAL Fix instructblip test (#25887) · 9c5acca0
Younes Belkada authored 1 year ago
```
fix instructblip test
```
Save image_processor while saving pipeline (ImageSegmentationPipeline) (#25884) · 2be8a909
raghavanone authored 1 year ago
```
* Save image_processor while saving pipeline (ImageSegmentationPipeline)

* Fix black issues
```
[`CodeLlama`] Fix CI (#25890) · a39ebbf8
Arthur authored 1 year ago
```
* Fix codellama

* style
```

[`TokenizerFast`] `can_save_slow_tokenizer` as a property for when... · 3b39b906

Arthur authored 1 year ago

[`TokenizerFast`] `can_save_slow_tokenizer` as a property for when `vocab_file`'s folder was removed (#25626)

* pad token should be None by default

* fix tests

* nits

* check if isfile vocabfile

* add warning if sp model folder was deleted

* save SPM when missing folder for slow

* update the `can_save_slow_tokenizer` to be a property

* first batch

* second batch

* missing one


Modify efficient GPU training doc with now-available adamw_bnb_8bit optimizer (#25807) · 99fc3ac8

Vibhor Kumar authored 1 year ago


* Modify single-GPU efficient training doc with now-available adamw_bnb_8bit optimizer

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


fix ds z3 checkpointing when `stage3_gather_16bit_weights_on_model_save=False` (#25817) · e95bcaee
Sourab Mangrulkar authored 1 year ago
```
* fix ds z3 checkpointing when `stage3_gather_16bit_weights_on_model_save=False`

* refactoring
```

For xla tensors, use an alternative way to get a unique id (#25802) · f8468b4f

qihqi authored 1 year ago

* For xla tensors, use an alternative way to get a unique id

Because xla tensors don't have storage.

* add is_torch_tpu_available check


30 Aug, 2023 11 commits
- [ViTDet] Fix doc tests (#25880) · 716bb2e3
  NielsRogge authored 1 year ago
```
Fix docstrings
```
- Reduce CI output (#25876) · 1c6f072d
  Yih-Dar authored 1 year ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- pin pandas==2.0.3 (#25875) · 9219d142
  Yih-Dar authored 1 year ago
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Docs: fix example failing doctest in `generation_strategies.md` (#25874) · 459bc673
  Joao Gante authored 1 year ago
  
- fix max_memory for bnb (#25842) · 72298178
  Marc Sun authored 1 year ago
  
- Fix imports (#25869) · f73c2097
  Yih-Dar authored 1 year ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Remote tools are turned off (#25867) · ed290b08
  Lysandre Debut authored 1 year ago
  
- Add Blip2 model in VQA pipeline (#25532) · 09dc9951
  Juan Pizarro authored 1 year ago
```
* Add Blip2 model in VQA pipeline

* use require_torch_gpu for test_large_model_pt_blip2

* use can_generate in vqa pipeline

* test Blip2ForConditionalGeneration using float16

* remove custom can_generate from Blip2ForConditionalGeneration
```
- Add flax installation in daily doctest workflow (#25860) · 62399d6f
  Yih-Dar authored 1 year ago
```
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- minor typo fix in PeftAdapterMixin docs (#25829) · 52574026
  Aman Gupta Karmani authored 1 year ago
```
fix minor documentation typo
```
- Update README.md (#25832) · 1bf2f36d
  Nino Risteski authored 1 year ago
```
deleted unnecessary comma in the Adding a new model section.
```
29 Aug, 2023 11 commits

Generate: models with custom `generate()` return `True` in `can_generate()` (#25838) · 07998ef3
Joao Gante authored 1 year ago

Update README.md (#25834) · 8c75cfda
Nino Risteski authored 1 year ago
```
Fixed a broken link in the _toctree.yml file.
```

Support loading base64 images in pipelines (#25633) · dbc16f44

Haylee Schäfer authored 1 year ago


* support loading base64 images

* add test

* mention in docs

* remove the logging

* sort imports

* update error message

* Update tests/utils/test_image_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* restructure to catch base64 exception

* doesn't like the newline

* download files

* format

* optimize imports

* guess it needs a space?

* support loading base64 images

* add test

* remove the logging

* sort imports

* restructure to catch base64 exception

* doesn't like the newline

* download files

* optimize imports

* guess it needs a space?

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


MaskFormer,Mask2former - reduce memory load (#25741) · ce2d4bc6
amyeroberts authored 1 year ago
```
Allocate result array ahead of time
```
[AutoTokenizer] Add data2vec to mapping (#25835) · 0daeeb40
Sanchit Gandhi authored 1 year ago

update remaining `Pop2Piano` checkpoints (#25827) · 0e59c939
Susnato Dhar authored 1 year ago
```
update checkpoints
```
update warning to If you want to use the new behaviour, set `legacy=… (#25833) · 245dcc49
Arthur authored 1 year ago
```
update warning to say "If you want to use the new behaviour, set `legacy=False`" instead of `legacy=True`
```

[i18n-KO] Translated `community.md` to Korean (#25674) · aade754b

Sohyun Sim authored 1 year ago


* docs: ko: community.md

* feat: deepl draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>


[i18n-KO] Translated `add_new_pipeline.md` to Korean (#25498) · d97fd871

heuristicwave authored 1 year ago


* docs: ko: add_new_pipeline.mdx

* feat: chatgpt draft

* fix: manual edits

* docs: ko: add_new_pipeline

Update _toctree

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* Update docs/source/ko/add_new_pipeline.md

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

---------

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>


Tests: detect lines removed from "utils/not_doctested.txt" and doctest ALL... · a35f889a
Joao Gante authored 1 year ago
```
Tests: detect lines removed from "utils/not_doctested.txt" and doctest ALL generation files (#25763)
```

Error with checking args.eval_accumulation_steps to gather tensors (#25819) · 483861d5

Chau Nguyen authored 1 year ago

* Update trainer.py (error with checking steps in args.eval_accumulation_steps to gather tensors)

While the deprecated code has the correct check (line 3772): 
"if args.eval_accumulation_steps is not None and (step + 1) % args.eval_accumulation_steps == 0:"

The current code does not (line 3196):
"if args.eval_accumulation_steps is not None and self.accelerator.sync_gradients:"

We need to check "(step + 1) % args.eval_accumulation_steps == 0". Hence, line 3196 should be modified to:
"if args.eval_accumulation_steps is not None and (step + 1) % args.eval_accumulation_steps == 0 and self.accelerator.sync_gradients:"

* Fix error with checking args.eval_accumulation_steps to gather tensors
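
The difference between the two conditions quoted above can be sketched in isolation. This is a minimal illustration, not the actual `trainer.py` code: the function names and parameters below are hypothetical, and the boolean `sync_gradients` stands in for `self.accelerator.sync_gradients`.

```python
def gather_steps_without_modulo(num_steps, eval_accumulation_steps, sync_gradients=True):
    """Steps that trigger a gather under the condition quoted for line 3196 (hypothetical helper)."""
    return [
        step
        for step in range(num_steps)
        if eval_accumulation_steps is not None and sync_gradients
    ]


def gather_steps_with_modulo(num_steps, eval_accumulation_steps, sync_gradients=True):
    """Steps that trigger a gather once the modulo check is added (hypothetical helper)."""
    return [
        step
        for step in range(num_steps)
        if eval_accumulation_steps is not None
        and (step + 1) % eval_accumulation_steps == 0
        and sync_gradients
    ]


# With eval_accumulation_steps=2 over 6 evaluation steps (0-based indices):
print(gather_steps_without_modulo(6, 2))  # [0, 1, 2, 3, 4, 5]
print(gather_steps_with_modulo(6, 2))     # [1, 3, 5]
```

Under these assumptions, the condition without the modulo check gathers on every step whenever `sync_gradients` is true, while the modified condition gathers only on every `eval_accumulation_steps`-th step.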
