- 22 Mar, 2024 3 commits
-
Ilyas Moutawwakil authored
* enable amd ci
* remove unnecessary clean up
-
Steven Madere authored
Fix type hint for train_dataset param of Trainer.__init__() to allow IterableDataset. Issue 29678 (#29738)
* Fixed type hint for train_dataset param in Trainer.__init__(). Added IterableDataset option.
* make fixup
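For reference, a minimal sketch of the widened annotation (the real Trainer.__init__ takes many more parameters, omitted here):
```
from typing import Optional, Union

from torch.utils.data import Dataset, IterableDataset

# Sketch: train_dataset now also accepts an IterableDataset
class Trainer:
    def __init__(
        self,
        train_dataset: Optional[Union[Dataset, IterableDataset]] = None,
    ):
        self.train_dataset = train_dataset
```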
-
Arthur authored
* update quality check
* make it nice
* update
* let's make sure it runs and we have the logs actually
* update workflow
* nits
-
- 21 Mar, 2024 15 commits
-
Raushan Turganbay authored
* change in-place -> out-of-place
* add tests
* add more tests
* naming consistency
* fix doctest
* forgot min-length processors
* empty
* Revert "fix doctest"
  This reverts commit 4772768457f9bc057f1d4d9d67ea94eb7224eb8d.
* revert change in docstring
* Update tests/generation/test_logits_process.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/generation/test_logits_process.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
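A minimal sketch of the in-place -> out-of-place pattern this change applies: instead of mutating the caller's `scores` tensor, a processor works on a clone and returns it. The class name and internals below are illustrative, not the exact diff:
```
import torch

class MinLengthSketchProcessor:
    def __init__(self, min_length: int, eos_token_id: int):
        self.min_length = min_length
        self.eos_token_id = eos_token_id

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # out-of-place: clone first, leave the input tensor untouched
        scores_processed = scores.clone()
        if input_ids.shape[-1] < self.min_length:
            scores_processed[:, self.eos_token_id] = -float("inf")
        return scores_processed
```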
-
Raushan Turganbay authored
* prepend "bos" to blip generation * minor changes * Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/instructblip/modeling_instructblip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add generation tester mixin --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
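A hedged sketch of what prepending a BOS token to generated ids can look like; the helper name and shapes are illustrative, not the exact BLIP-2 code:
```
import torch

def prepend_bos(outputs: torch.Tensor, bos_token_id: int) -> torch.Tensor:
    # outputs: (batch, seq_len) generated token ids without a leading BOS
    bos = torch.full(
        (outputs.shape[0], 1), bos_token_id, dtype=outputs.dtype, device=outputs.device
    )
    return torch.cat([bos, outputs], dim=1)
```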
-
Joao Gante authored
* always convert the mask
* rebase and fix copies
-
Joao Gante authored
-
Jacky Lee authored
feat: add support for torch_dtype
Co-authored-by: Jacky Lee <jackylee328@gmail.com>
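The commit message does not say which component gained the argument; for context, this is how `torch_dtype` is already passed at load time elsewhere in the library:
```
import torch
from transformers import AutoModelForCausalLM

# Illustrative only: from_pretrained accepts torch_dtype to control the
# dtype the weights are loaded in
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)
```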
-
Zach Mueller authored
* Add deterministic config
* Add note on slowdown
* English fails me again
-
Zach Mueller authored
* Remove deprecations
* Clean
-
Matt authored
* Cast bfloat16 to float32 for NumPy conversions
* Add test
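NumPy has no native bfloat16 dtype, so a lossless upcast to float32 is needed before calling `.numpy()`. A minimal sketch of the idea (the function name is illustrative):
```
import torch

def to_numpy(t: torch.Tensor):
    # bfloat16 -> float32 is exact, and NumPy understands float32
    if t.dtype == torch.bfloat16:
        t = t.to(torch.float32)
    return t.numpy()
```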
-
Arthur authored
* path llava-next
* styling
* styling
-
Yih-Dar authored
* update
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
théo gigant authored
fix issue with logit processor in beam search in Flax
-
Matthias Dittrich authored
Fixes
```
  File "/nix/store/rv8xdwghdad9jv2w86b8g08kan9l6ksm-python3.11-transformers-4.38.2/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 987, in <module>
    class AutoConfig:
  File "/nix/store/rv8xdwghdad9jv2w86b8g08kan9l6ksm-python3.11-transformers-4.38.2/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1011, in AutoConfig
    @replace_list_option_in_docstrings()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/rv8xdwghdad9jv2w86b8g08kan9l6ksm-python3.11-transformers-4.38.2/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 966, in docstring_decorator
    lines = docstrings.split("\n")
            ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'
```
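A minimal sketch of the kind of guard that avoids this crash, assuming the decorator reads `fn.__doc__` (which is `None` when docstrings are stripped, e.g. under `python -OO`):
```
def docstring_decorator(fn):
    docstrings = fn.__doc__
    # Nothing to rewrite when docstrings have been stripped
    if docstrings is None:
        return fn
    lines = docstrings.split("\n")
    fn.__doc__ = "\n".join(lines)
    return fn
```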
-
Rahul Vinod Vishwakarma authored
* Calculating box_bias at the start once, then reusing it at inference
* Updating the compute_box_bias function for backwards compatibility
* Caching compute_box_bias function
* Bug fix
* Update owlv2 accordingly to ensure repo consistency
* Co-authored by: nvbinh15 <binh.pdc01@gmail.com>
* Fixup changes
* Made copied code consistent
* Co-authored by: nvbinh15 <binh.pdc01@gmail.com>

Co-authored-by: Nguyen Van Binh <>
Co-authored-by: Nguyen Van Binh <binh.pdc01@gmail.com>
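A hedged sketch of memoizing a pure tensor computation with `functools.lru_cache`; the real `compute_box_bias` depends on the model's feature-map geometry, which is simplified away here:
```
from functools import lru_cache

import torch

@lru_cache(maxsize=2)
def compute_box_bias(num_patches: int) -> torch.Tensor:
    # Patch-center coordinates, strictly inside (0, 1)
    coords = (torch.arange(num_patches, dtype=torch.float32) + 0.5) / num_patches
    # Inverse-sigmoid (logit) bias, computed once per num_patches then cached;
    # callers should treat the returned tensor as read-only
    return torch.log(coords) - torch.log1p(-coords)
```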
-
Ash Kuroki authored
Update quantization_config.py
Fixed typo for clarity and correctness: changed "input time" to "input type".
-
Michael authored
[docs] Remove redundant "and" from custom_tools.md
-
- 20 Mar, 2024 16 commits
-
Arthur authored
* attempt to fix
* the actual fix that works with compilation!
* this?
* temporary update
* nit?
* dispatch to memory efficient?
* update both models that have static cache support
* fix copies fix compile
* make sure fix
* fix cohere and gemma
* fix beams?
* nit
* slipped through the cracks
* nit
* nits
* update
* fix-copies
* skip failing tests
* nits
-
Benjamin Ye authored
[`BitsAndBytesConfig`] Warning for unused `kwargs` & safety checkers for `load_in_4bit` and `load_in_8bit` (#29761)
* added safety checkers for load_in_4bit and load_in_8bit on init, as well as their setters
* Update src/transformers/utils/quantization_config.py
  typo correction for load_in_8bit setter checks
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
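A hedged sketch of the kind of safety checks described: validate the flags on init and in their setters, and reject enabling 4-bit and 8-bit together. The class name and messages are illustrative:
```
class BitsAndBytesConfigSketch:
    def __init__(self, load_in_4bit: bool = False, load_in_8bit: bool = False):
        self._load_in_4bit = False
        self._load_in_8bit = False
        # Route through the setters so init gets the same checks
        self.load_in_8bit = load_in_8bit
        self.load_in_4bit = load_in_4bit

    @property
    def load_in_4bit(self) -> bool:
        return self._load_in_4bit

    @load_in_4bit.setter
    def load_in_4bit(self, value: bool):
        if not isinstance(value, bool):
            raise TypeError("load_in_4bit must be a boolean")
        if value and self.load_in_8bit:
            raise ValueError("load_in_4bit and load_in_8bit cannot both be True")
        self._load_in_4bit = value

    @property
    def load_in_8bit(self) -> bool:
        return self._load_in_8bit

    @load_in_8bit.setter
    def load_in_8bit(self, value: bool):
        if not isinstance(value, bool):
            raise TypeError("load_in_8bit must be a boolean")
        if value and self.load_in_4bit:
            raise ValueError("load_in_4bit and load_in_8bit cannot both be True")
        self._load_in_8bit = value
```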
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Zach Mueller authored
* Update test reqs
* Clean
-
NielsRogge authored
* First draft
* Fix tests, add docs
* Improve docstrings
* Fix test
* Address comments
* Address comments
* Remove vocab_size attribute
* Remove batch_size
* Address comment
* Add image processor tests
* Support fx
* Update docstring
* Add support for 34b
* Convert 34b model
* Add integration tests
* Update checkpoints
* Convert vicuna-13b, remove doc tests
* Remove script
* Remove file
* Address comments
* Improve docstrings
* Deprecate vocab_size
* Remove aspect_ratio_setting
* Address comments
* Update READMEs
* Add tips about chat templates
* Fix tests
* Deprecate vocab_size safely
* Update tests

Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Add correct batched handling for apply_chat_template
* Fix warning method
* Add error for incompatible options
* expand tests
* Add a skip for markuplm
* Add skips for other layout models
* Skip for LayoutLMv2
* Slightly update the warning message
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* typo fix
* Update docstring for conversation kwarg
* Update return docstring
* Remove the warning, improve error message
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/test_tokenization_common.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/test_tokenization_common.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Remove return_dict=None
* Fix up some merge cruft
* More merge cruft
* Add another skip
* Add another skip

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
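With batched handling, a list of conversations is templated element-wise. A usage sketch (the checkpoint name is illustrative; any tokenizer with a chat template works):
```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
conversations = [
    [{"role": "user", "content": "Hi there!"}],
    [{"role": "user", "content": "What is a tensor?"}],
]
# Returns one rendered string per conversation
texts = tokenizer.apply_chat_template(
    conversations, tokenize=False, add_generation_prompt=True
)
```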
-
amyeroberts authored
-
Arthur Zucker authored
-
Matt authored
* Initial commit (still lots of unfinished bits)
* (Still untested) add safetensors sharding to save_pretrained
* Fix safetensors saving, update default shard size to match PT
* Add proper loading of TF-format safetensors
* Revert default size in case that changes things
* Fix incorrect index name
* Update loading priority
* Update tests
* Make the tests a little more stringent
* Expand tests
* Add sharded cross-test
* Fix argument name
* One more test fix
* Adding mlx to the list of allowed formats
* Remove irrelevant block for safetensors
* Refactor warning logging into a separate function
* Remove unused skip_logger_warnings arg
* Update src/transformers/modeling_tf_utils.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Move function def

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
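A usage sketch of sharded safetensors saving from the TF side (shard size and output directory are illustrative):
```
from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("bert-base-uncased")
# With safe_serialization=True, weights are written as .safetensors files,
# sharded once they exceed max_shard_size, alongside an index file
model.save_pretrained("./my_model", safe_serialization=True, max_shard_size="5GB")
```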
-
Ricardo authored
-
Kola authored
* Update docstring for RMSNorm
* Update cache_params object to correct MambaCache type
* Update docstrings and type info
* Pass through use_cache
* ruff
* Reformat with 119 char limit per line (thanks Arthur)
* Pass through use_cache specifically to the backbone rather than all keyword arguments
* Update src/transformers/models/mamba/modeling_mamba.py
* Update src/transformers/models/mamba/modeling_mamba.py
* Update src/transformers/models/mamba/modeling_mamba.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/mamba/modeling_mamba.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tab
* Update src/transformers/models/mamba/modeling_mamba.py
* Update src/transformers/models/mamba/modeling_mamba.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
NielsRogge authored
Remove unused code
-
peterjc123 authored
-
Peng Wei authored
fixed the issue of the DPO trainer when using one node with multiple GPUs and device_map='auto' (#29695)
* fixed the issue of the DPO trainer when using one node with multiple GPUs
* before update, add the assert
* run the ruff formatter
* Update src/transformers/trainer.py
  Thank you.
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* remember to do make style and make quality before commit
* Update src/transformers/trainer.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yih-Dar authored
larger runner
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Joao Gante authored
* make fix-copies
* some tests fixed
* tests fixed
-
- 19 Mar, 2024 6 commits
-
Yih-Dar authored
* fix
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Joao Gante authored
* partial 4d masks
* Apply suggestions from code review
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Raushan Turganbay authored
* clean-up tests
* refine comments
* fix musicgen tests
* make style
* remove slow decorator from a test
* more clean-up
* fix other failing tests
-
StevenBucaille authored
* Added SuperPoint docs
* Added tests
* Removed commented part
* Commit to create and fix add_superpoint branch with a new branch
* Fixed dummy_pt_objects
* Committed missing files
* Fixed README.md
* Apply suggestions from code review
  Fixed small changes
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Moved ImagePointDescriptionOutput from modeling_outputs.py to modeling_superpoint.py
* Removed AutoModelForKeypointDetection and related stuff
* Fixed inconsistencies in image_processing_superpoint.py
* Moved infer_on_model logic simply in test_inference
* Fixed bugs, added labels to forward method with checks whether it is properly a None value, also added tests about this logic in test_modeling_superpoint.py
* Added tests to SuperPointImageProcessor to ensure that images are properly converted to grayscale
* Removed remaining mentions of MODEL_FOR_KEYPOINT_DETECTION_MAPPING
* Apply suggestions from code review
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixed from (w, h) to (h, w) as input for tests
* Removed unnecessary condition
* Moved last_hidden_state to be the first returned
* Moved last_hidden_state to be the first returned (bis)
* Moved last_hidden_state to be the first returned (ter)
* Switched image_width and image_height in tests to match recent changes
* Added config as first SuperPointConvBlock init argument
* Reordered README's after merge
* Added missing first config argument to SuperPointConvBlock instantiations
* Removed formatting error
* Added SuperPoint to README's de, pt-br, ru, te and vi
* Checked out README_fr.md
* Fixed README_fr.md
* Test fix README_fr.md
* Test fix README_fr.md
* Last make fix-copies !
* Updated checkpoint path
* Removed unused SuperPoint doc
* Added missing image
* Update src/transformers/models/superpoint/modeling_superpoint.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Removed unnecessary import
* Update src/transformers/models/superpoint/modeling_superpoint.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Added SuperPoint to _toctree.yml

Co-authored-by: steven <steven.bucaillle@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
-
Arthur authored
* use user_defined_symbols
* fixup
* nit
* add a very robust test
* make sure all models are tested with the `pretrained_tokenizer_to_test`
* should we make sure we test all of them?
* merge
* remove the id
* fix test
* update
* ousies
* oups
* fixup
* fix copies check
* remove `pretrained_tokenizer_to_test`
-
Arthur authored
* gelu_pytorch_tanh
* Force config.hidden_act to be approx gelu
* Gemma bug fixes
* force_use_exact_gelu
* Update configuration_gemma.py
* Update modeling_gemma.py
* update
* update for simpler handling
* nit
* nit
* fixup
* update
* also update the jax modeling!
* add `"gelu_pytorch_tanh": partial(nn.gelu, approximate=True),`
* fixup
* fix order
* act vs act_fn

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
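For context, `gelu_pytorch_tanh` is the tanh approximation of GELU; in PyTorch it corresponds to `nn.GELU(approximate="tanh")`, and the commit adds the equivalent JAX mapping `partial(nn.gelu, approximate=True)`. A quick check of the formula:
```
import torch
from torch import nn

act = nn.GELU(approximate="tanh")
x = torch.randn(2, 4)
# tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
expected = 0.5 * x * (1.0 + torch.tanh(0.7978845608 * (x + 0.044715 * x**3)))
print(torch.allclose(act(x), expected, atol=1e-6))  # True
```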
-