1. 18 Jan, 2024 9 commits
    • [Whisper Tok] Move token ids to CPU when computing offsets (#28485) · 619ecfe2
      Sanchit Gandhi authored
      * move token ids to cpu
      
      * check for torch attr
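      A hedged sketch of the change above (helper name hypothetical; the real fix lives in the Whisper tokenizer's offset computation): token ids may arrive as a GPU torch tensor, so they are moved to host memory before any numpy-based offset math, duck-typing on the `cpu` attribute as in the second commit.

      ```python
      import numpy as np

      def token_ids_to_host(token_ids):
          # token_ids may be a torch.Tensor on an accelerator; numpy cannot read
          # device memory, so duck-type on the `cpu` attribute ("check for torch attr")
          if hasattr(token_ids, "cpu"):
              token_ids = token_ids.cpu()
          return np.asarray(token_ids)
      ```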
    • [ASR Pipe] Update init to set model type and subsequently call parent init method (#28486) · 0eaa5ea3
      Sanchit Gandhi authored
      * add image processor arg
      
      * super
      
      * rm args
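      A minimal sketch of the init pattern named in the title, with a stub base class standing in for the real `transformers.Pipeline` (the actual parent init also wires up the tokenizer, feature extractor, image processor, and device):

      ```python
      class BasePipeline:  # stand-in for transformers.Pipeline
          def __init__(self, model, **kwargs):
              self.model = model

      class AsrPipeline(BasePipeline):
          def __init__(self, model, **kwargs):
              # set the model type first, then defer all remaining setup to the parent init
              self.type = "seq2seq" if getattr(model.config, "is_encoder_decoder", False) else "ctc"
              super().__init__(model, **kwargs)
      ```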
    • Fix the documentation checkpoint for xlm-roberta-xl (#28567) · c662c78c
      Jeremy Fowers authored
      * Fix the documentation checkpoint for xlm-roberta-xl
      
      * Improve docstring consistency
    • Use `LoggingLevel` context manager in 3 tests (#28575) · 0754217c
      Yih-Dar authored
      
      * move checks inside `with LoggingLevel`
      
      * remove is_flaky
      
      ---------
      
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Add new meta w2v2-conformer BERT-like model (#28165) · d2cdefb9
      Yoach Lacombe authored
      
      * first commit
      
      * correct default value non causal
      
      * update config and modeling code
      
      * update converting checkpoint
      
      * clean modeling and fix tests
      
      * make style
      
      * add new config parameters to docstring
      
      * fix copied from statements
      
      * Apply suggestions from code review
      
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * make position_embeddings_type docstrings clearer
      
      * clean converting script
      
      * remove function not used
      
      * clean modeling file
      
      * apply suggestion for test file + add convert script to not_doctested
      
      * modify tests according to review - cleaner logic and more tests
      
      * Apply nit suggestions from code review
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add checker of valid position embeddings type
      
      * instantiate new layer norm layer with the right eps
      
      * fix freeze_feature_encoder since it can be None in some cases
      
      * add test same output in convert script
      
      * restore wav2vec2conformer and add new model
      
      * create processor and FE + clean
      
      * add new model code
      
      * fix convert script and set default config parameters
      
      * correct model id paths
      
      * make style
      
      * make fix-copies and cleaning files
      
      * fix copied from statements
      
      * complete .md and fix copies
      
      * clean convert script argument defaults
      
      * fix config parameters docstrings
      
      * fix config docstring
      
      * add copied from and enrich FE tests
      
      * fix copied from and repo-consistency
      
      * add autotokenizer
      
      * make test input length shorter and change docstring code
      
      * fix docstrings and copied from
      
      * add add_adapter to ASR training example
      
      * make testing of adapters more robust
      
      * adapt to multi adapter layers
      
      * refactor input_values->input_features and remove w2v2-bert feature extractor
      
      * remove pretraining model
      
      * remove deprecated features and useless lines
      
      * add copied from and ignore statements to modeling tests
      
      * remove pretraining model #2
      
      * change import in convert script
      
      * change default in convert script
      
      * update readme and remove useless line
      
      * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * refactor BERT to Bert for consistency
      
      * remove useless ignore copy statement
      
      * add persistent to buffer in rotary
      
      * add eps in LayerNorm init and remove copied from
      
      * add adapter activation parameters and add copied from statements
      
      * Fix copied statements and add unitest.skip reasons
      
      * add copied statement in test_processor
      
      * refactor processor
      
      * make style
      
      * replace numpy random by torch rand
      
      * remove expected output CTC
      
      * improve converting script with processor class
      
      * Apply suggestions from code review
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * remove gumbel class
      
      * remove tests related to previously deleted class
      
      * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * correct typos
      
      * remove unused parameters
      
      * update processor to take both text and audio
      
      * update checkpoints
      
      * update expected output and add ctc expected output
      
      * add label_attention_mask
      
      * replace pt with np in processor tests
      
      * fix typo
      
      * revert to behaviour with labels_attention_mask
      
      ---------
      
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
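      A hedged usage sketch for the new model (checkpoint id assumed; note the commit above renaming `input_values` to `input_features`):

      ```python
      import torch
      from transformers import AutoFeatureExtractor, Wav2Vec2BertModel

      checkpoint = "facebook/w2v-bert-2.0"  # assumed checkpoint id
      feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
      model = Wav2Vec2BertModel.from_pretrained(checkpoint)

      waveform = torch.randn(16000).numpy()  # 1 s of dummy 16 kHz audio
      inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")
      with torch.no_grad():
          # the model consumes filterbank `input_features`, not raw `input_values`
          out = model(input_features=inputs.input_features)
      print(out.last_hidden_state.shape)
      ```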
    • chore: Fix multiple typos (#28574) · 5d8eb93e
      hugo-syn authored
    • [`Core Tokenization`] Support a fix for spm fast models (#26678) · 81899778
      Arthur authored
      * fix
      
      * last attempt
      
      * current work
      
      * fix forward compatibility
      
      * save all special tokens
      
      * current state
      
      * revert additional changes
      
      * updates
      
      * remove tokenizer.model
      
      * add a test and the fix
      
      * nit
      
      * revert one more break
      
      * fix typefield issue
      
      * quality
      
      * more tests
      
      * fix fields for FC
      
      * more nits?
      
      * new additional changes
      
      * how
      
      * some updates
      
      * the fix
      
      * where do we stand
      
      * nits
      
      * nits
      
      * revert unrelated changes
      
      * nits nits nits
      
      * styling
      
      * don't break llama just yet
      
      * revert llama changes
      
      * safe arg check
      
      * fixup
      
      * Add a test for T5
      
      * Necessary changes
      
      * Tests passing; added tokens need to not be normalized. If the added tokens are normalized, the normalizer strips them, which seems unwanted for normal functioning
      
      * Add even more tests, when normalization is set to True (which does not work 😓 )
      
      * Update to main
      
      * nits
      
      * fmt
      
      * more and more test
      
      * comments
      
      * revert change as tests are failing
      
      * make the test more readable
      
      * nits
      
      * refactor the test
      
      * nit
      
      * updates
      
      * simplify
      
      * style
      
      * style
      
      * style convert slow
      
      * Update src/transformers/convert_slow_tokenizer.py
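      The substantive fix in this chain: added tokens on sentencepiece-backed fast tokenizers must not be normalized, or the normalizer strips them. A hedged usage sketch (checkpoint illustrative):

      ```python
      from transformers import AddedToken, AutoTokenizer

      tok = AutoTokenizer.from_pretrained("t5-small")  # illustrative spm-backed checkpoint
      # normalized=False keeps the normalizer from stripping/altering the new token
      tok.add_tokens([AddedToken("<my_token>", normalized=False)])
      print(tok.tokenize("hello <my_token> world"))
      ```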
    • Use `weights_only` only if torch >= 1.13 (#28506) · a1668cc7
      Yih-Dar authored
      
      * fix
      
      * fix
      
      * fix
      
      ---------
      
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
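      A sketch of the version gate (file path illustrative): the `weights_only` keyword only reached `torch.load` in PyTorch 1.13, so it is passed conditionally.

      ```python
      import torch
      from packaging import version

      load_kwargs = {}
      if version.parse(torch.__version__) >= version.parse("1.13"):
          load_kwargs["weights_only"] = True  # not accepted by torch.load before 1.13

      state_dict = torch.load("checkpoint.bin", map_location="cpu", **load_kwargs)  # illustrative path
      ```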
    • Save `Processor` (#27761) · 3005f965
      Yih-Dar authored
      
      * save processor
      
      * Update tests/models/auto/test_processor_auto.py
      
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update tests/test_processing_common.py
      
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix
      
      ---------
      
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
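      The round-trip this enables, sketched (checkpoint illustrative): saving a composite `Processor` writes its feature extractor and tokenizer side by side so the whole thing reloads from one directory.

      ```python
      from transformers import AutoProcessor

      processor = AutoProcessor.from_pretrained("openai/whisper-tiny")  # illustrative checkpoint
      processor.save_pretrained("./saved_processor")  # writes feature extractor + tokenizer files together
      reloaded = AutoProcessor.from_pretrained("./saved_processor")
      ```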
  2. 17 Jan, 2024 7 commits
  3. 16 Jan, 2024 10 commits
    • Add is_model_supported for fx (#28521) · 7142bdfa
      inisis authored
      
      * modify check_if_model_is_supported to return bool
      
      * add is_model_supported and have check_if_model_is_supported use that
      
      * Update src/transformers/utils/fx.py
      
      Fantastic
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      ---------
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
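      The split the commits describe, sketched (function names come from the commit messages; the registry is an illustrative stand-in): a boolean predicate, plus the pre-existing checker expressed in terms of it.

      ```python
      _SUPPORTED_MODELS = ("BertModel", "GPT2Model")  # illustrative subset

      def is_model_supported(model) -> bool:
          # boolean predicate introduced by the commit above
          return model.__class__.__name__ in _SUPPORTED_MODELS

      def check_if_model_is_supported(model) -> None:
          # the raising checker, now built on top of the predicate
          if not is_model_supported(model):
              raise NotImplementedError(f"Model {model.__class__.__name__} is not supported yet for FX tracing.")
      ```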
    • Clearer error for SDPA when explicitly requested (#28006) · 02f8738e
      fxmarty authored
      * clearer error for sdpa
      
      * better message
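      A hedged sketch of the check (helper name hypothetical; `_supports_sdpa` is the real per-class flag in transformers): when a user explicitly asks for SDPA on a model class that lacks it, fail with a message that says so instead of a generic error.

      ```python
      def check_sdpa_support(model_class, attn_implementation: str) -> None:
          # raise an informative error when SDPA is explicitly requested but the
          # model class does not implement it
          if attn_implementation == "sdpa" and not getattr(model_class, "_supports_sdpa", False):
              raise ValueError(
                  f"{model_class.__name__} does not support an attention implementation "
                  "through torch.nn.functional.scaled_dot_product_attention yet."
              )
      ```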
    • [`SpeechT5Tokenization`] Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme (#28522) · fe23256b
      Arthur authored
      
      * Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme
      
      * fixup
      
      * add a small test
      
      * style test file
      
      * nits
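      A hedged sketch of what "match the fast decoding scheme" means for a sentencepiece tokenizer: concatenate the pieces and map the `▁` meta symbol back to spaces, rather than special-casing token boundaries.

      ```python
      SPIECE_UNDERLINE = "▁"

      def convert_tokens_to_string(tokens: list) -> str:
          # join sentencepiece pieces, then turn the word-boundary marker into spaces
          return "".join(tokens).replace(SPIECE_UNDERLINE, " ").strip()

      print(convert_tokens_to_string(["▁hello", "▁wor", "ld"]))  # -> "hello world"
      ```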
    • [`TokenizationRoformerFast`] Fix the save and loading (#28527) · 96d08831
      Arthur authored
      * cleanup
      
      * add a test
      
      * update the test
      
      * style
      
      * revert part that allows to pickle the tokenizer
    • [`TokenizationUtils`] Fix `add_special_tokens` when the token is already there (#28520) · 716df5fb
      Arthur authored
      
      * fix adding special tokens when the token is already there.
      
      * add a test
      
      * add a test
      
      * nit
      
      * fix the test: make sure the order is preserved
      
      * Update tests/test_tokenization_common.py
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      ---------
      
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
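      A usage sketch of the fixed behavior (checkpoint illustrative): re-adding a special token that is already registered must neither duplicate it nor disturb the existing order.

      ```python
      from transformers import AutoTokenizer

      tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative checkpoint
      tok.add_special_tokens({"additional_special_tokens": ["<tok_a>", "<tok_b>"]})
      before = list(tok.additional_special_tokens)

      # re-adding a token that is already there: after the fix this is a no-op
      # that preserves the original order
      tok.add_special_tokens({"additional_special_tokens": ["<tok_a>"]}, replace_additional_special_tokens=False)
      assert list(tok.additional_special_tokens) == before
      ```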
    • Fix/speecht5 bug (#28481) · 07ae53e6
      Nima Yaqmuri authored
      * Fix bug in SpeechT5 speech decoder prenet's forward method
      
      - Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues.
      - Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact.
      - This change resolves a critical bug affecting the model's performance in handling speaker embeddings.
      
      * Refactor SpeechT5 text to speech integration tests
      
      - Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite.
      - Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations.
      - Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing.
      - Fixed existing test cases where incorrect assumptions about output shapes led to potential errors.
      
      * Enhance handling of speaker embeddings in SpeechT5
      
      - Refined the generate and generate_speech functions in the SpeechT5 class to robustly handle two scenarios for speaker embeddings: matching the batch size (one embedding per sample) and one-to-many (a single embedding for all samples in the batch).
      - The update includes logic to repeat the speaker embedding when a single embedding is provided for multiple samples, and a ValueError is raised for any mismatched dimensions.
      - Also added corresponding test cases to validate both scenarios, ensuring complete coverage and functionality for diverse speaker embedding situations.
      
      * Improve Test Robustness with Randomized Speaker Embeddings
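      A sketch of the two supported speaker-embedding cases described above (function name hypothetical): one embedding per sample, or a single embedding broadcast across the batch, with a `ValueError` for anything else.

      ```python
      import torch

      def broadcast_speaker_embeddings(speaker_embeddings: torch.Tensor, batch_size: int) -> torch.Tensor:
          if speaker_embeddings.size(0) == batch_size:
              return speaker_embeddings  # one embedding per sample
          if speaker_embeddings.size(0) == 1:
              return speaker_embeddings.repeat(batch_size, 1)  # one-to-many broadcast
          raise ValueError(
              f"speaker_embeddings batch dimension ({speaker_embeddings.size(0)}) must be 1 or {batch_size}."
          )
      ```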
    • Fix mismatching loading in from_pretrained with/without accelerate (#28414) · 66db33dd
      fxmarty authored
      * fix mismatching behavior in from_pretrained with/without accelerate
      
      * meaningful refactor
      
      * remove added space
      
      * add test
      
      * fix model on the hub
      
      * comment
      
      * use tiny model
      
      * style
    • Improving Training Performance and Scalability Documentation (#28497) · 002566f3
      Hamza FILALI authored
      
      * Improve the Training Performance and Scalability documentation by adding PEFT techniques to the suggestions for reducing memory requirements during training
      
      * Update docs/source/en/perf_train_gpu_one.md
      
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      
      ---------
      
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
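      The docs change points readers to PEFT for cutting training memory; a minimal LoRA sketch of the kind now suggested (model and target modules illustrative):

      ```python
      from peft import LoraConfig, get_peft_model
      from transformers import AutoModelForCausalLM

      model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative model
      lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, target_modules=["c_attn"])
      model = get_peft_model(model, lora_config)
      model.print_trainable_parameters()  # only the small adapter matrices are trainable
      ```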
    • Remove `task` arg in `load_dataset` in image-classification example (#28408) · 0cdcd7a2
      regisss authored
      * Remove `task` arg in `load_dataset` in image-classification example
      
      * Manage case where "train" is not in dataset
      
      * Add new args to manage image and label column names
      
      * Similar to audio-classification example
      
      * Fix README
      
      * Update tests
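      A hedged sketch of the resulting pattern in the example script (dataset and column names illustrative): instead of the deprecated `task="image-classification"` argument, the script takes explicit column-name arguments, mirroring the audio-classification example.

      ```python
      from datasets import load_dataset

      ds = load_dataset("beans")  # illustrative dataset; no task="image-classification" arg
      image_column, label_column = "image", "labels"  # now configurable via script arguments

      # handle datasets that ship without a "train" split, as in the second commit
      train_split = ds["train"] if "train" in ds else ds[list(ds.keys())[0]]
      print(train_split.features[image_column], train_split.features[label_column])
      ```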
  4. 15 Jan, 2024 13 commits
  5. 13 Jan, 2024 1 commit