Commits · tied_weights_warning_check · zhusg / transformers-new

03 Nov, 2022 7 commits
- First fix pass. · 4f326510
  Nicolas Patry authored 2 years ago
  
- Style. · 01b555d6
  Nicolas Patry authored 2 years ago
  
- Attempting to test automatically the `_keys_to_ignore`. · f13cc5a7
  Nicolas Patry authored 2 years ago
  
- [Doctest] Add configuration_camembert.py (#20039) · 790ff254
  Saad Mahmud authored 2 years ago
```
* Add example docstring for CamembertConfig

* Add configuration_camembert to documentation_tests
```
- Fix some doctests after PR 15775 (#20036) · 9ccea7ac
  Yih-Dar authored 2 years ago
```
* Add skip_special_tokens=True in some doctest

* For T5

* Fix for speech_to_text.mdx

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Add **kwargs (#20037) · a639ea9e
  amyeroberts authored 2 years ago
  
- Now supporting pathlike in pipelines too. (#20030) · ec6878f6
  Nicolas Patry authored 2 years ago
  
02 Nov, 2022 13 commits

reorganize glossary (#20010) · aa39967b
Steven Liu authored 2 years ago

Show installed libraries and their versions in CI jobs (#20026) · 305e8718

Yih-Dar authored 2 years ago


* Show versions

* check

* store outputs

* revert

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix Issue 15003: SentencePiece Tokenizers Not Adding Special Tokens in `convert_tokens_to_string` (#15775) · 9f9ddcc2

Ben Eyal authored 2 years ago

* Add test for SentencePiece not adding special tokens to strings

* Add SentencePieceStringConversionMixin to fix issue 15003

* Fix conversion from tokens to string for most SentencePiece tokenizers

Tokenizers fixed:
- AlbertTokenizer
- BarthezTokenizer
- CamembertTokenizer
- FNetTokenizer
- M2M100Tokenizer
- MBart50Tokenizer
- PegasusTokenizer
- Speech2TextTokenizer

* Fix MarianTokenizer, adjust SentencePiece test to accommodate vocab

* Fix DebertaV2Tokenizer

* Ignore LayoutXLMTokenizer in SentencePiece string conversion test

* Run 'make style' and 'make quality'

* Clean convert_tokens_to_string test

Instead of explicitly ignoring LayoutXLMTokenizer in the test,
override the test in LayoutLMTokenizationTest and do nothing in it.

* Remove commented out code

* Improve robustness of convert_tokens_to_string test

Instead of comparing lengths of re-tokenized text and input_ids,
check that converting all special tokens to string yields a string
with all special tokens.

* Inline and remove SentencePieceStringConversionMixin

The convert_tokens_to_string method is now implemented
in each relevant SentencePiece tokenizer.

* Run 'make style' and 'make quality'

* Revert removal of space in convert_tokens_to_string

* Remove redundant import

* Revert test text to original

* Uncomment the lowercasing of the reverse_text variable

* Mimic Rust tokenizer behavior for tokenizers

- Albert
- Barthez
- Camembert
- MBart50
- T5

* Fix accidentally skipping test in wrong tokenizer

* Add test for equivalent Rust and slow tokenizer behavior

* Override _decode in BigBirdTokenizer to mimic Rust behavior

* Override _decode in FNetTokenizer to mimic Rust behavior

* Override _decode in XLNetTokenizer to mimic Rust behavior

* Remove unused 're' import

* Update DebertaV2Tokenizer to mimic Rust tokenizer

* Deberta tokenizer now behaves like Albert and its `convert_tokens_to_string` is not tested.

* Ignore problematic tests in Deberta V2

* Add comment on why the Deberta V2 tests are skipped

Fix doctest (#20023) · fb7cbe23

Yih-Dar authored 2 years ago


* Fix doctest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Improve model tester (#19984) · f69eb24b

Yih-Dar authored 2 years ago


* part 1

* part 2

* part 3

* fix

* For CANINE

* For ESMFold

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[Doctest] Add configuration_deberta_v2.py (#19995) · 74877437

Saad Mahmud authored 2 years ago

* Add example docstring for DebertaV2Config

* Add DebertaV2Config to documentation_tests

* Fix mistake with directory name

Update auto processor to check image processor created (#20021) · 9aedce99
amyeroberts authored 2 years ago

Quality (#20002) · 49b77b89
Sylvain Gugger authored 2 years ago

Fix gradient checkpoint test in encoder-decoder (#20017) · c6c9db3d
Yih-Dar authored 2 years ago
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Add Image Processors (#19796) · a6b77598

amyeroberts authored 2 years ago


* Add CLIP image processor

* Crop size as dict too

* Update warning

* Actually use logger this time

* Normalize doesn't change dtype of input

* Add perceiver image processor

* Tidy up

* Add DPT image processor

* Add Vilt image processor

* Tidy up

* Add poolformer image processor

* Tidy up

* Add LayoutLM v2 and v3 image processors

* Tidy up

* Add Flava image processor

* Tidy up

* Add deit image processor

* Tidy up

* Add ConvNext image processor

* Tidy up

* Add levit image processor

* Add segformer image processor

* Add in post processing

* Fix up

* Add ImageGPT image processor

* Fixup

* Add mobilevit image processor

* Tidy up

* Add postprocessing

* Fixup

* Add VideoMAE image processor

* Tidy up

* Add ImageGPT image processor

* Fixup

* Add ViT image processor

* Tidy up

* Add beit image processor

* Add mobilevit image processor

* Tidy up

* Add postprocessing

* Fixup

* Fix up

* Fix flava and remove tree module

* Fix image classification pipeline failing tests

* Update feature extractor in trainer scripts

* Update pad_if_smaller to accept tuple and int size

* Update for image segmentation pipeline

* Update src/transformers/models/perceiver/image_processing_perceiver.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Update src/transformers/image_processing_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/beit/image_processing_beit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* PR comments - docstrings; remove accidentally added resize; var names

* Update docstrings

* Add exception if size is not in the right format

* Fix exception check

* Fix up

* Use shortest_edge in tuple in script

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

make sentencepiece import conditional in bertjapanesetokenizer (#20012) · 2e3452af
Ripose authored 2 years ago

clean up vision/text config dict arguments (#19954) · 8827e1b2

Yih-Dar authored 2 years ago


* clean up

* For backward compatibility

* clean up

* Same changes for more models

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Update object detection pipeline to use post_process_object_detection methods(#20004) · cb630ffa
Alara Dirik authored 2 years ago

01 Nov, 2022 12 commits

fix typo (#20006) · 79c720c0
Steven Liu authored 2 years ago

Generate: contrastive search with full optional outputs (#19963) · 831590f6

Joao Gante authored 2 years ago

* Use beam search functionality; Add extra outputs and test

* Add full tests for contrastive search

* Add error message on unconventional cache format

Add LayoutLMv3 resource (#19932) · ab74ac11
Steven Liu authored 2 years ago
```
* add layoutlmv3 resource

* add layoutlmv2 resources

* fix button
```

Add BERT resources (#19852) · dec8578e

Steven Liu authored 2 years ago

* add resources for bert

* add course chapters

* apply reviews

* add pipeline icons and community resource

* fix buttons

add dataset (#20005) · 1f6885ba
Steven Liu authored 2 years ago

Add ESMFold code sample (#20000) · 4f1e5e4e

Matt authored 2 years ago

* Add ESMFold code sample

* sorry sylvain

* make fixup

* sorry sylvain again

Add Japanese translated README (#19945) · 38e5b71a

Ikko Ashimine authored 2 years ago


* Add japanese translated README.md

* Add README_ja.md link

* Add Japanese translation to check_copies.py

* Add guide to Japanese README.md

* Update README_ja.md

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update utils/check_copies.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

typo (#20001) · 4f90fc1d
Wang Ran (汪然) authored 2 years ago

Update image_classification.mdx (#19996) · c87ae86a
Sayak Paul authored 2 years ago

Added onnx config whisper (#19525) · c796b6de

Mohit Sharma authored 2 years ago

* Added onnx config whisper

* added whisper support onnx

* add audio input data

* added whisper support onnx

* fixed the seqlength value

* Updated the whisper onnx config

* restore files to old version

* removed attention mask from inputs

* Updated get_dummy_input_onnxruntime docstring

* Updated relative imports and token generation

* update docstring

v4.25.0.dev0 · c3a93d8d
Sylvain Gugger authored 2 years ago

Add ESMFold (#19977) · 7f9b7b3f

Matt authored 2 years ago


* initial commit

* First draft that gets outputs without crashing!

* Add all the ported openfold dependencies

* testing

* Restructure config files for ESMFold

* Debugging to find output discrepancies

* Mainly style

* Make model runnable without extra deps

* Remove utils and merge them to the modeling file

* Use correct gelu and remove some debug prints

* More cleanup

* Update esm docs

* Update conversion script to support ESMFold properly

* Port some top-level changes from ESMFold repo

* Expand EsmFold docstrings

* Make attention_mask optional (default to all 1s)

* Add inference test for ESMFold

* Use config and not n kwargs

* Add modeling output class

* Remove einops

* Remove chunking in ESM FFN

* Update tests for ESMFold

* Quality

* Repo consistency

* Remove tree dependency from ESMFold

* make fixup

* Add an error in case my structure map function breaks later

* Remove needless code

* Stop auto-casting the LM to float16 so CPU tests pass

* Stop auto-casting the LM to float16 so CPU tests pass

* Final test updates

* Split test file

* Copyright and quality

* Unpin PyTorch to see built doc

* Fix config file to_dict() method

* Add some docstrings to the output

* Skip TF checkpoint tests for ESM until we reupload those

* make fixup

* More docstrings

* Unpin to get even with main

* Flag example to write

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

31 Oct, 2022 8 commits
- Add support for gradient checkpointing (#19990) · 4c9e0f02
  NielsRogge authored 2 years ago
```
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
```
- Pin torch to < 1.13 temporarily (#19989) · 8214a9f6
  Yih-Dar authored 2 years ago
```
* pin torch to < 1.13

* pin torch to < 1.13

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Transformers documentation translation to Italian #17459 (#19988) · 6aede2d6
  Jean Charles Kouame authored 2 years ago
  
- [ASR] Update 'tasks' for model card (#19986) · f38a1454
  Sanchit Gandhi authored 2 years ago
  
- [modelcard] Update for ASR (#19985) · 9406c7bc
  Sanchit Gandhi authored 2 years ago
```
* [modelcard] Update for ASR

* style
```
- gradient checkpointing for GPT-NeoX (#19946) · 225c36fb
  Chiao authored 2 years ago
```
* gradient checkpointing for GPT-NeoX

* initialize gradient checkpointing flag

* must set flag before init
```
- [Doctest] Add configuration_deberta.py (#19968) · 6176e136
  Saad Mahmud authored 2 years ago
```
* Add Example docstring to DebertaConfig

* Add configuration_deberta to documentation_tests

* Add microsoft/deberta-base to example docstring

* Fix example docstring mistake
```
- donut -> donut-swin (#19920) · b0474726
  Yih-Dar authored 2 years ago
```
* donut -> donut-swin

* remove ("donut-swin", "DonutProcessor")

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```