Commits · larger_runner · 某某某 / transformers-new

19 Mar, 2024 12 commits

[test_all] larger runner n=12 · ffbcc360
ydshieh authored 1 year ago

[test_all] larger runner n=16 · 9abcc1e8
ydshieh authored 1 year ago

[test_all] larger runner · 3cd599a5
ydshieh authored 1 year ago

Fix `check_copies` not capturing the diff in model/paper title and link (#29724) · 66ce9593
Yih-Dar authored 1 year ago
```
* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Llama: partial 4d masks (#29731) · 4294f0c3

Joao Gante authored 1 year ago


* partial 4d masks

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Clean-up generation tests after moving methods to private (#29582) · 425ba56c

Raushan Turganbay authored 1 year ago

* clean-up tests

* refine comments

* fix musicgen tests

* make style

* remove slow decorator from a test

* more clean-up

* fix other failing tests


Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) · 56baa033

StevenBucaille authored 1 year ago


* Added SuperPoint docs

* Added tests

* Removed commented part

* Commit to create and fix add_superpoint branch with a new branch

* Fixed dummy_pt_objects

* Committed missing files

* Fixed README.md

* Apply suggestions from code review

Fixed small changes

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Moved ImagePointDescriptionOutput from modeling_outputs.py to modeling_superpoint.py

* Removed AutoModelForKeypointDetection and related stuff

* Fixed inconsistencies in image_processing_superpoint.py

* Moved infer_on_model logic simply in test_inference

* Fixed bugs, added labels to forward method with checks whether it is properly a None value, also added tests about this logic in test_modeling_superpoint.py

* Added tests to SuperPointImageProcessor to ensure that images are properly converted to grayscale

* Removed remaining mentions of MODEL_FOR_KEYPOINT_DETECTION_MAPPING

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fixed from (w, h) to (h, w) as input for tests

* Removed unnecessary condition

* Moved last_hidden_state to be the first returned

* Moved last_hidden_state to be the first returned (bis)

* Moved last_hidden_state to be the first returned (ter)

* Switched image_width and image_height in tests to match recent changes

* Added config as first SuperPointConvBlock init argument

* Reordered README's after merge

* Added missing first config argument to SuperPointConvBlock instantiations

* Removed formatting error

* Added SuperPoint to README's de, pt-br, ru, te and vi

* Checked out README_fr.md

* Fixed README_fr.md

* Test fix README_fr.md

* Test fix README_fr.md

* Last make fix-copies !

* Updated checkpoint path

* Removed unused SuperPoint doc

* Added missing image

* Update src/transformers/models/superpoint/modeling_superpoint.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Removed unnecessary import

* Update src/transformers/models/superpoint/modeling_superpoint.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added SuperPoint to _toctree.yml

---------

Co-authored-by: steven <steven.bucaillle@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>


[`GemmaConverter`] use user_defined_symbols (#29473) · 2f9a3edb

Arthur authored 1 year ago

* use user_defined_symbols

* fixup

* nit

* add a very robust test

* make sure all models are tested with the `pretrained_tokenizer_to_test`

* should we make sure we test all of them?

* merge

* remove the id

* fix test

* update

* ousies

* oups

* fixup

* fix copies check

* remove `pretrained_tokenizer_to_test`


[`Gemma`] final fixes to the modeling (#29729) · 8e2fc52e

Arthur authored 1 year ago


* gelu_pytorch_tanh

* Force config.hidden_act to be approx gelu

* Gemma bug fixes

* force_use_exact_gelu

* Update configuration_gemma.py

* Update modeling_gemma.py

* update

* update for simpler handling

* nit

* nit

* fixpup

* update

* also update the jax modeling!

* add `"gelu_pytorch_tanh": partial(nn.gelu, approximate=True),`

* fixup

* fix order

* act vs act_fn

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>


[tests] add more tests to `NOT_DEVICE_TESTS` (#29670) · 229ac72b
Fanli Lin authored 1 year ago
```
* add more tests

* remove 2 tests

* add more tests
```

FEAT / Optim: Add GaLore optimizer (#29588) · f6261d7d

Younes Belkada authored 1 year ago


* add galore v1

* add import

* add tests and doc

* fix doctest

* forward contrib credits from discussions

* forward contrib credits from discussions

* Apply suggestions from code review

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix failing tests'

* switch to `optim_target_modules` and clarify docs

* more clarification

* enhance lookup logic

* update a test to add peak memory

* add regex, all-linear and single string support

* add layer-wise optimization through DummyOptimizers and LRSchedulers

* forward contrib credits from discussions and original idea

* add a section about DDP not supported in layerwise

* Update src/transformers/trainer.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix self

* check only if layer_wise

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* oops

* make use of intervals

* clarify comment

* add matching tests

* GaLoRe -> GaLore

* move to `get_scheduler`

* add note on docs

* add a warning

* adapt a bit the docs

* update docstring

* support original API

* Update docs/source/en/trainer.md

* slightly refactor

* Update docs/source/en/trainer.md

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix args parsing and add tests

* remove warning for regex

* fix type hint

* add note about extra args

* make `is_regex` return optional

---------

Co-authored-by: Maxime <maximegmd@users.noreply.github.com>
Co-authored-by: Wing Lian <winglian@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: hiyouga <hiyouga@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>


Use logging.warning instead of warnings.warn in pipeline.__call__ (#29717) · 484e10f7
Motoki Wu authored 1 year ago
```
* Use logging.warning instead of warnings.warn in pipeline.__call__

* Update src/transformers/pipelines/base.py
```

18 Mar, 2024 5 commits

Update the pipeline tutorial to include `gradio.Interface.from_pipeline` (#29684) · 838b87ab

Abubakar Abid authored 1 year ago


* Update pipeline_tutorial.md to include gradio

* Update pipeline_tutorial.md

* Update docs/source/en/pipeline_tutorial.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/pipeline_tutorial.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/pipeline_tutorial.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/pipeline_tutorial.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update pipeline_tutorial.md

* Update docs/source/en/pipeline_tutorial.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


FIX [`bnb`] Make `unexpected_keys` optional (#29420) · c852d4fb

Younes Belkada authored 1 year ago


* make `unexpected_keys` optional

* push

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Fix `filter_models` (#29710) · 87e2ea33

Yih-Dar authored 1 year ago


* update

* update

* update

* check

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Add MusicGen Melody (#28819) · c43b380e

Yoach Lacombe authored 1 year ago


* first modeling code

* make repository

* still WIP

* update model

* add tests

* add latest change

* clean docstrings and copied from

* update docstrings md and readme

* correct chroma function

* correct copied from and remove unreleated test

* add doc to toctree

* correct imports

* add convert script to notdoctested

* Add suggestion from Sanchit

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct get_uncoditional_inputs docstrings

* modify README according to SANCHIT feedback

* add chroma to audio utils

* clean librosa and torchaudio hard dependencies

* fix FE

* refactor audio decoder -> audio encoder for consistency with previous musicgen

* refactor conditional -> encoder

* modify sampling rate logics

* modify license at the beginning

* refactor all_self_attns->all_attentions

* remove ignore copy from causallm generate

* add copied from for from_sub_models

* fix make copies

* add warning if audio is truncated

* add copied from where relevant

* remove artefact

* fix convert script

* fix torchaudio and FE

* modify chroma method according to feedback-> better naming

* refactor input_values->input_features

* refactor input_values->input_features and fix import fe

* add input_features to docstrigs

* correct inputs_embeds logics

* remove dtype conversion

* refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation

* change warning for chroma length

* Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* change way to save wav, using soundfile

* correct docs and change to soundfile

* fix import

* fix init proj layers

* remove line breaks from md

* fix issue with docstrings

* add FE suggestions

* improve is in logics and remove useless imports

* remove custom from_pretrained

* simplify docstring code

* add suggestions for modeling tests

* make style

* update converting script with sanity check

* remove encoder attention mask from conditional generation

* replace musicgen melody checkpoints with official orga

* rename ylacombe->facebook in checkpoints

* fix copies

* remove unecessary warning

* add shape in code docstrings

* add files to slow doc tests

* fix md bug and add md to not_tested

* make fix-copies

* fix hidden states test and batching

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>


CI / generate: batch size computation compatible with all models (#29671) · bf3dfd11
Joao Gante authored 1 year ago


15 Mar, 2024 15 commits

[docs] Spanish translation of attention.md (#29681) · 00c1d87a

Aaron Jimenez authored 1 year ago

* add attention to es/ and edit es/_toctree.yml

* translate attention.md

* fix transformers

* fix transformers


Revert "Fix wrong condition used in `filter_models`" (#29682) · 5011908e
Yih-Dar authored 1 year ago
```
Revert "Fix wrong condition used in `filter_models` (#29673)"

This reverts commit 174aecd0.
```

[FIX] Fix speech2test modeling tests (#29672) · 4e98d594

Yoach Lacombe authored 1 year ago


* fix speech_to_test generation tests

* Add details to comment

* Update tests/models/speech_to_text/test_modeling_speech_to_text.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Generate: replace breaks by a loop condition (#29662) · 9e4df7c4

Joao Gante authored 1 year ago


* replace breaks by a loop condition

* Update src/transformers/generation/utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


[Quantization] Quanto quantizer (#29023) · 28de2f4d

Marc Sun authored 1 year ago


* start integration

* fix

* add and debug tests

* update tests

* make pytorch serialization works

* compatible with device_map and offload

* fix tests

* make style

* add ref

* guard against safetensors

* add float8 and style

* fix is_serializable

* Fix shard_checkpoint compatibility with quanto

* more tests

* docs

* adjust memory

* better

* style

* pass tests

* Update src/transformers/modeling_utils.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* add is_safe_serialization instead

* Update src/transformers/quantizers/quantizer_quanto.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* add QbitsTensor tests

* fix tests

* simplify activation list

* Update docs/source/en/quantization.md

Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>

* better comment

* Update tests/quantization/quanto_integration/test_quanto.py

Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>

* Update tests/quantization/quanto_integration/test_quanto.py

Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>

* find and fix edge case

* Update docs/source/en/quantization.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* pass weights_only_kwarg instead

* fix shard_checkpoint loading

* simplify update_missing_keys

* Update tests/quantization/quanto_integration/test_quanto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* recursion to get all tensors

* block serialization

* skip serialization tests

* fix

* change by cuda:0 for now

* fix regression

* update device_map

* fix doc

* add noteboon

* update torch_dtype

* update doc

* typo

* typo

* remove comm

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>


Rename `glue` to `nyu-mll/glue` (#29679) · f02aea27
Quentin Lhoest authored 1 year ago
```
* Update run_glue.py

* Update run_glue.py

* Update run_glue_no_trainer.py
```
fix: typos (#29653) · 03847ef4
guangwu authored 1 year ago
```
Signed-off-by: guoguangwu <guoguangwug@gmail.com>
```
Fix wrong condition used in `filter_models` (#29673) · 174aecd0
Yih-Dar authored 1 year ago
```
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

[tests] ensure device-required software is available in the testing... · 272f48e7

Fanli Lin authored 1 year ago

[tests] ensure device-required software is available in the testing environment before testing (#29477)

* gix

* fix style

* add warning

* revert

* no newline

* revert

* revert

* add CUDA as well


Fix AutoformerForPrediction example code (#29639) · 8a3cfaac

Maciej Torhan authored 1 year ago


Removed static_real_features from AutoformerForPrediction example code

Signed-off-by: Maciej Torhan <maciek97x@gmail.com>


[tests] remove deprecated tests for model loading (#29450) · c1993e68
Fanli Lin authored 1 year ago
```
* gix

* fix style

* remove equivalent tests

* add back for image_processor

* remove again
```

Cohere Model Release (#29622) · 0e4a1c34

Saurabh Dash authored 1 year ago


* Cohere Model Release (#1)

Cohere Model Release

* Remove unnecessary files and code (#2)

Some cleanup

* Delete cohere-model directory (#3)

* Make Fix (#5)

* Pr fixes (#6)

* fixes for pr

* pr fixes for the format

* pr fixes for the format

* src/transformers/models/auto/tokenization_auto.py

* Tokenizer test (#8)

* tokenizer test

* format fix

* Adding Docs and other minor changes (#7)

* Add modeling tests (#9)

* Smol Fix (#11)

* tokenization tests are fixed

* format fixes

* fix pr doc tests

* fix pr doc tests

* fix pr doc tests

* fix pr style check

* small changes in cohere.md

* FIX: Address final comments for transformers integration (#13)

* fix modeling final nits and add proper test file

* for now leave empty tests

* add integration test

* push new test

* fix modeling cohere (#14)

* Update chat templates to use the new API (#15)

---------

Co-authored-by: ahmetustun <ahmetustun89@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>


Pipeline: use tokenizer pad token at generation time if the model pad token is unset. (#29614) · 53d89124
Joao Gante authored 1 year ago

Trainer: fail early in the presence of an unsavable `generation_config` (#29675) · c47fcd08
Joao Gante authored 1 year ago


Extend import utils to cover "editable" torch versions (#29000) · f62407f7

bhack authored 1 year ago


* Extend import utils to cover "editable" torch versions

* Re-add type hint

* Remove whitespaces

* Double quote strings

* Update comment

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Restore package_exists

* Revert "Restore package_exists"

This reverts commit 66fd2cd5c33d1b9a26a8f3e8adef2e6ec1214868.

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>


14 Mar, 2024 8 commits

Inaccurate code example within inline code-documentation (#29661) · 56b64bf1
Madhur Prajapati authored 1 year ago
```
* docs:inaccurate_code_example

* Inaccurate code example within inline code-documentation
```

Allow apply_chat_template to pass kwargs to the template and support a dict of templates (#29658) · 48fbab73

Matt authored 1 year ago

* Allow apply_chat_template to pass kwargs to the template

* Fix priority for template_kwargs

* Fix docstring

* style fix

* Add the option for the model to have a dict of templates

* Error message cleanup

* Add test for chat template dicts

* Simplify the chat template dict test and apply it to all tokenizers in self.get_tokenizers()

* Save chat template dicts as lists with fixed key names

* Add test for serialization/reloading

* Add require_jinja just to be safe, even though I don't think we use it


Generate: handle `cache_position` update in `generate` (#29467) · 23db187d
Joao Gante authored 1 year ago


Fix PVT v2 tests (#29660) · 7b87ecb0

Yih-Dar authored 1 year ago


* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Add `dataset_revision` argument to `RagConfig` (#29610) · 2cc3cc83
Yih-Dar authored 1 year ago
```
* add arg

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
Fix TPU checkpointing inside Trainer (#29657) · 956f44f1
Shubham Krishna authored 1 year ago
```
Manually call sync step
```
[`PEFT`] Fix `save_pretrained` to make sure adapters weights are also saved on TPU (#29388) · c9e3c0b4
Shubham Krishna authored 1 year ago
```
* Fix for saving adapter weights when using PEFT

* Change supported-classes to PushToHubMixin
```
Add newly added PVTv2 model to all README files. (#29647) · b4b96251
robinverduijn authored 1 year ago
```
Add newly added models to all README files.

Also fix one relative path in README_ru.md.
```