Commits · api_big2 · zhusg / transformers-new

30 May, 2022 1 commit

fex fixes · 20b95cb2

younesbelkada authored 3 years ago


fix tokenizer autodoc

fix minor CI issues

fix minor CI issues

fix minor CI issues

fix style issue

fix minor import issues

fix few issues

remove def main on the test

add require torch

replace decorator with 'with'

fix style

change to bloom

add quick fix tokenizer

fix tokenizer file

fix tokenizer

- merge tests
- small fixes

fix import issue

add bloom to readme

fix consistency

Update docs/source/en/model_doc/bloom.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Apply suggestions from code review

fix comment issues on file headers

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

fix doc issue

small fix - modeling test

some changes

- refactor some code
- taking into account reviews
- more tests should pass
- removed pruning tests

remove useless division

more tests should pass

more tests should pass

more tests should pass

let's try this one

- add alibi offset
- remove all permutes to make the grad operations work
- fingers crossed

Update data2vec.mdx to include a Colab Notebook link (that shows fine-tuning) (#17194)

* Update data2vec.mdx

* Update data2vec.mdx

* Update docs/source/en/model_doc/data2vec.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Dev version

Add test to ensure models can take int64 inputs (#17210)

* Add test to ensure models can take int64 inputs

* is_integer is an attribute, not a method

* Fix test when some inputs aren't tensors

* Add casts to blenderbot and blenderbot-small

* Add casts to the other failing models

Fix dependency table

update BART docs (#17212)

Black preview (#17217)

* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black

Fix typo in bug report template (#17178)

* Fix typo

* Force rerun workflows

Co-authored-by: Felix Marty <felix@huggingface.co>

Added translation of installation.mdx to Portuguese Issue #16824 (#16979)

* Added translation of installation.mdx to Portuguese, as well
as default templates of _toctree.yml and _config.py

* [ build_documentation.yml ] - Updated doc_builder to build
documentation in Portuguese.
[ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx.

* [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder.

[ pipeline_tutorial.mdx ] - Grammar changes.

* [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial.

* [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial.

[ training.mdx ] - Added portuguese translation for training tutorial.

* [ preprocessing.mdx ] - WIP

* Update _toctree.yml

* Adding Pré-processamento to _toctree.yml

* Update accelerate.mdx

* Nits and eliminate preprocessing file while it is ready

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

OPT-fix (#17229)

* try fixes

* Revert "try fixes"

This reverts commit a8ad75ef69d4fc03a402ef61bd034b018aa8555e.

* add correct shape

* add correct path

OPT - fix docstring and improve tests slightly (#17228)

* correct some stuff

* fix doc tests

* make style

Update self-push workflow (#17177)

* update push ci

* install git-python

* update comment

* update deepspeed jobs

* fix report

* skip 2 more tests that require fairscale

* Fix changes in test_fetcher.py (to deal with `setup.py` is changed)

* set RUN_PT_TF_CROSS_TESTS=1 and final clean-up

* remove SIGOPT_API_TOKEN

* remove echo "$matrix_folders"

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

fix --gpus option for docker (#17235)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Handle copyright in add-new-model-like (#17218)

Fix Trainer for Datasets that don't have dict items (#17239)

install dev. version of accelerate (#17243)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix push CI channel (#17242)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Add PR title to push CI report (#17246)

* add PR title to push CI report

* add link

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial (#17076)

* [ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial

* Delete docs/source/pt-br directory

* [ fast_tokenizers.mdx ] - Continuing work on file

* [ fast_tokenizers.mdx ] - Continuing work on file

* Add fast tokenizers to _toctree.yml

* Eliminated config and toctree.yml

* Nits in fast_tokenizers.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Translated version of model_sharing.mdx doc to spanish (#16184)

* Translated version of model_sharing to spanish

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Update docs/source_es/model_sharing.mdx

* Adding model sharing to _toctree.yml

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Guide to create custom models in Spanish (#17158)

* file copied and toctree updated

* Intro and configuration translated

* model section translated

* enter hotfix

* Translation over, correction pending

* Typos and corrections

* Update docs/source/es/create_a_model.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/create_a_model.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/create_a_model.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/create_a_model.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Fix obvious typos in flax decoder impl (#17279)

Change config.encoder_ffn_dim -> config.decoder_ffn_dim for decoder.

TF - Fix convnext classification example (#17261)

[WIP] [doc] performance/scalability revamp (#15723)

* [doc] performance/scalability revamp

* link the new docs

* no :

* mixed precision

* work on the first doc

* expand the main doc

* Trigger CI

* style

* revamp single GPU training section

* work on training performance

* remove files not used anymore or will be added later

* final touches

* fix rebase

* Add hardware section to toctree

* fix toctree again

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove `fast_tokenizers` entry that was copied in rebase

* add warning about DP vs DDP

* remove todo

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix missing closure of codeblock

* Update docs/source/en/perf_train_gpu_many.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* sync with #16860

* update toc

Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

fixed bug in run_mlm_flax_stream.py (#17203)

* fixed bug run_mlm_flax_stream.py

Fixed bug caused by an update to tokenizer keys introduced in recent transformers versions (between `4.6.2` and `4.18.0`) where additional keys were introduced to the tokenizer output.

* Update run_mlm_flax_stream.py

* adding missing parenthesis

* formatted to black

* remove cols from dataset instead

* reformat to black

* moved rem. columns to map

* formatted to black

Co-authored-by: KennethEnevoldsen <kennethcenevolsen@gmail.com>
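
The fix described above deals with tokenizer output that gained extra keys between library versions. As a rough illustration only, and not the code from `run_mlm_flax_stream.py`, the sketch below drops any keys a downstream step does not expect. The helper name `keep_expected_keys` and the `EXPECTED_KEYS` tuple are hypothetical; the commit itself reports removing the unwanted columns in the dataset `map` step instead.

```python
from typing import Dict, List, Sequence

# Hypothetical list of keys the downstream collator expects; adjust per model.
EXPECTED_KEYS = ("input_ids", "attention_mask")


def keep_expected_keys(
    batch: Dict[str, List[List[int]]],
    expected: Sequence[str] = EXPECTED_KEYS,
) -> Dict[str, List[List[int]]]:
    """Return a copy of `batch` that contains only the expected keys."""
    return {key: batch[key] for key in expected if key in batch}


# A tokenizer-like output with one extra key that older code did not anticipate.
batch = {
    "input_ids": [[101, 7592, 102]],
    "attention_mask": [[1, 1, 1]],
    "special_tokens_mask": [[1, 0, 1]],
}
filtered = keep_expected_keys(batch)
```

Filtering on an allow-list rather than deleting known-bad keys means that keys added by future tokenizer versions are also ignored.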

Updated checkpoint support for Sagemaker Model Parallel (#17219)

* adding partial checkpoint support for optimizer state

* formatted trainer.py

* Refactoring based on comments

* reformatting

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update codeparrot data preprocessing (#16944)

* add new preprocessing arguments

* add new filters

* add new filters to readme

* fix config and test count, update function names and docstrings

* reformat code

* update readme

* Update readme

* rename config_test filter

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* rename few_assignments filter

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* rename tokenizer in arguments

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* rename functions and add limit_line argument for config_test filter

* update threshold for config_test filter

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>

CodeParrot data pretokenization (#16932)

* add pretokenization arguments

* add pretokenization script

* add support for pretokenized data

* reformat code

* fix run command for training

* fix model call from config

* remove a package

* add comments on pretokenization in the readme

* remove explicit parallelization

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* update readme

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* update readme -remove username

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* update readme -remove username

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* keep data parallelization

* reformat code

* reformat code

* update readme

* reformat code

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>

Remove next sentence prediction from supported ONNX tasks (#17276)

Align logits and labels in OPT (#17237)

Mlflowcallback fix nonetype error (#17171)

* Fix edge cases TypeError: 'NoneType' object is not callable

* fix style

Automatically sort auto mappings (#17250)

* Automatically sort auto mappings

* Better class extraction

* Some auto class magic

* Adapt test and underlying behavior

* Remove re-used config

* Quality

Make TrainerHyperParameterSigOptIntegrationTest slow test (#17288)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Better error in the Auto API when a dep is missing (#17289)

Fix FlavaForPreTrainingIntegrationTest CI test (#17232)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Use the PR URL in CI report (#17269)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

logging documentation update (#17174)

* logging documentation

* style

Co-authored-by: Sander Land <sander@chatdesk.com>

docs(transformers): fix typo (#17263)

Add Tensorflow Swin model (#16988)

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

[Tests] Fix slow opt tests (#17282)

* fix opt tests

* remove unused tok

* make style

* make flake8 happy

* Update tests/models/opt/test_modeling_opt.py

Fix test_model_parallelization (#17249)

* Fix test_model_parallelization

* Modify

Add Wav2Vec2Conformer (#16812)

* save intermediate

* add wav2vec2 conformer

* add more code

* more

* first test passes

* make all checkpoints work

* update

* up

* more clean ups

* save clean-up

* save clean-up

* save more

* remove bogus

* finalize design conformer

* remove vision

* finish all tests

* more changes

* finish code

* add doc tests

* add slow tests

* fix autoconfig test

* up

* correct docstring

* up

* update

* fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Update docs/source/en/model_doc/wav2vec2-conformer.mdx

* upload

* save copied from

* correct configs

* fix model outputs

* add to docs

* fix imports

* finish

* finish code

* correct copied from

* correct again

* correct make fix

* improve make fix copies

* save

* correct fix copy from

* correct init structure

* correct

* fix import

* apply suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

Fix missing job action button in CI report (#17270)

* use matrix.machine_type

* fix job names used in job_link

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix wrong PT/TF categories in CI report (#17272)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[ConvNeXT] Fix drop_path_rate (#17280)

* Fix drop_path_rate

* Fix TF's drop path rate

fix retribert's `test_torch_encode_plus_sent_to_model` (#17231)

Fix tests of mixed precision now that experimental is deprecated (#17300)

* Fix tests of mixed precision now that experimental is deprecated

* Fix mixed precision in training_args_tf.py too

Rewrite TensorFlow train_step and test_step (#17057)

* Initial commit

* Better label renaming

* Remove breakpoint before pushing (this is your job)

* Test a lot more in the Keras fit() test

* make fixup

* Clarify the case where we flatten y dicts into tensors

* Clarify the case where we flatten y dicts into tensors

* Extract label name remapping to a method

correct opt (#17301)

refactor

- refactor code
- style changes
- add new threshold for test

major changes

- change BLOOM to Bloom
- add quick doc on bloom.mdx
- move embeddings test on modeling test

modify readme

small fixes

small fix

- better threshold for a test

remove old test file from fetcher

fix small typo

major change

- change BloomLMHead to BloomForCausalLM

remove onnx config

major changes

- refactor the code
- remove asserts
- change tol for test

make style

small change

adding a slow test + commenting old ones for now

make style

Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

make style

fix duplicates

cleaning comments on config

clean a bit conversion file

refactor a bit modeling file

refactor tokenizer file

fix tokenization test issue

fix tokenization issue second try

fix tokenization issue #2

fix test issue

make style + add suggestions

change test fetcher

try this one

- slow tests should pass
- fingers crossed

possible final changes

make style

try fix padding side issue

fix side

fix padding issue

fix ko-readme

fix config auto

cleaning modeling file

keep bloom in caps in ko

update config docs

remove pretraining_pp

remove model parallel

update config

- add correct config files

fix duplicates

fix fetcher

fix refactor issue

- remove divide function

try to remove alibi

small fixes

- fix alibi
- remove seq length
- refactor a bit the code

put correct values

- fix bos and eos token ids

fix attention mask loop

Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

small fixes:

- remove skip bias add

small fixes

- fix typo in readme
- fix typos in config

small changes

- remove a test
- add reconstruction test
- change config

small changes

- change Scaled Softmax to BloomScaledSoftmax

small fixes

- fix alibi dtype

major changes

- removing explicit dtype when loading modules
- fixing test args (torch_dtype=auto)
- add docstring

fix readmes

major changes

- now bloom supports alibi shifting
- refactor a bit the code
- better test tolerance now
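
The ALiBi entries in this log (alibi offset, alibi dtype, alibi shifting) concern attention biases that grow linearly with token distance. As a rough illustration only, and not the code from this branch, the sketch below computes per-head slopes and a causal bias matrix using the geometric-slope recipe from the ALiBi paper for a power-of-two number of heads. The function names `alibi_slopes` and `alibi_bias` are hypothetical.

```python
from typing import List


def alibi_slopes(num_heads: int) -> List[float]:
    """Geometric per-head slopes; this sketch assumes a power-of-two head count."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles a power-of-two number of heads")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (i + 1) for i in range(num_heads)]


def alibi_bias(slope: float, seq_len: int) -> List[List[float]]:
    """bias[i][j] = slope * (j - i) for keys j <= query i.

    Future positions (j > i) are left at 0.0 here and are assumed to be
    handled by a separate causal mask.
    """
    return [
        [slope * (j - i) if j <= i else 0.0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)
bias = alibi_bias(slopes[0], 3)
```

The bias is added to the attention scores before the softmax, so more distant keys receive a larger penalty, scaled differently for each head.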

refactor a bit

refactor a bit

put correct name on test

change docstring

small changes

- fix docstring modeling
- fix test tolerance

fix small nit

- take dtype from tensors in the conversion script

minor fix

- fix mdx issue

minor fix

- change config docstring

forward contrib credits from PR14084

Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

apply modifications

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

resolve softmax upcast

Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

final changes modeling

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Merge commit 'd156898f'

merge commit

Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

apply suggestions

Apply suggestions from Stas comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>


12 May, 2022 19 commits

put , · 2837a1c4
younesbelkada authored 3 years ago


Single commit · 8892e6fa

Thomwolf authored 3 years ago


adding template

update model

model update

update conf for debug model

update conversion

update conversion script

update conversion script

fix missing keys check

add tests to test the tokenizer in the local machine

Change variable name

add tests on xnli dataset

add more description

add descriptions + clearer code

clearer code

adding new tests + skipping few tests because of env problems

change comment

add dtype on the configuration

add test embeddings

add hardcoded test

fix dtype issue

adding torch.float16 to config

adding more metrics (min, max, mean)

add sum

now the test passes with almost equal

add files for conversion - test passes on cpu and gpu

add final changes

cleaning code

add new args in the docstring

fix one liner function

remove macros

remove forward attention

clean up init function

add comments on the issue

rm scale mask softmax

do make style

fix dtype in init

fixing for loop on att probs

fix style with black

fix style + doc error

fix and debug CI errors (docs + style)

some updates

- change new operations
- finally add scaled softmax
- added new args in the config

make use cache working

add changes

- save sharded models
- final changes on the modeling script

add changes

- comment on alibi
- add TODO on seq length

test commit

- added a text to test the commit

Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

final changes

- attention mask change
- generation works on BS176b

Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

changes - model + conversion

move to correct dir


Bad rebase. · 44eb5a48
Nicolas Patry authored 3 years ago

Recovering on corrupted files on disk. · a6d6c362
Nicolas Patry authored 3 years ago

Workaround inference mode. · ac2ed40f
Nicolas Patry authored 3 years ago

Bad rebase. · 3d4c1ef1
Nicolas Patry authored 3 years ago

Better fix for GPT2ForTokenClassification · 4d5912e4
Nicolas Patry authored 3 years ago

Remove recursion depth error. · 819f55a1
Nicolas Patry authored 3 years ago

Infinite sequence for gpt2. last real override of API. · 3953d3d7
Nicolas Patry authored 3 years ago


migrate azure blob for beit checkpoints (#16902) · a42242da

Li Dong authored 3 years ago

## Motivation

We are going to use a new blob account to store the checkpoints.

## Modification

Modify the azure blob storage URLs for BEiT checkpoints.


Add OPT (#17088) · b971c769

Younes Belkada authored 3 years ago


* First version - OPT model

* Final changes

- putting use cache to False

* few changes

- remove commented block

* few changes

- remove unnecessary files

* fix style issues

* few changes

- remove a test file
- added the logits test

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add gen tests

* few changes

- rm mask filling example on docstring

* few changes

- remove useless args

* some changes

- more tests should pass now
- needs to clean more
- documentation still needs to be done

* fix code quality

* major changes

- change attention architecture to BART-like
- modify some tests
- style fix

* rm useless classes

- remove opt for:
- QA
- cond generation
- seq classif

* Removed autodoc calls to non-existent classes

Tokenizers are not implemented

* Update src/transformers/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/auto/modeling_tf_auto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Replaced OPTTokeniser with GPT2 tokenizer

* added GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")

* Removed OPTTokenizer

* make style

* Make style replaces

``` ...).unsqueeze(```
by
``` >>>).unsqueeze(```

* make repo consistency

* Removed PretrainedOPTModel

* fix opt.mdx removed other heads

* fix init, removed 3 heads

* removed heads

* finished cleaning head

* removed sequence classif and question answering

* removed unused imports

* removed useless dummy object for QA, SC and CG

* removed tests for removed useless dummy object for QA, SC and CG

* Removed head_mask using encoder layers which don't exist

* fixed test

* fix line

* added OPT to toctree

* Updated model path with pushed weights

* fix model path

* fixed code quality

* fixed embeddings and generation tests

* update paths

* clean comments

* removed OPTClassificationHead for sentence classification

* renamed hidden layer

* renamed num layers to standard num_hidden_layers

* num_attention_heads fix

* changes for 125m

* add first version for 125m

* add first version - flax

* add new version

* causal LM output

* replace output type with BaseModelOutputWithPastAndCrossAttentions

* revert working config from 150m to 350m

* clean

* removed decoder input ids

* fixed embed dim

* more embed_dim issues

* make style + removed enc_dec test

* update flax model

* removed troublesome copy

* added is_encoder_decoder=False to config

* added set_input emb function to model class

* requires torch on embed test

* use head mask instead of decoder head mask input param solves a test

* 8 test remaining, update

* Updated create_and_check_decoder_model_past_large_inputs

* Make style

* update op tokenizer with condition

* make style

* See if I can push

* some clean up

* remove linear head hack

* save intermediate

* save correct attention

* add copied from from bart

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix part of the reviews
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* same changes in naming / conversion

* correct mask

* more fixes

* delete FlaxOPT and TfOPT

* clean traces of Flax and Tf

* fix mask

* fixed positional embedding length when past key value is provided

* get 125m, 6.7b to work

* Added do_layer_norm

* solved mismatch in load dictionary

* clean up prepare opt input dict

* fixed past key value as bool

* fix previous

* fixed return dict False tuple issue

* All tests are passing

* Make style

* Ignore OPTDecoder non tested

* make fix-copies

* make repo consistency

* small fix

* removed useless @torch.no_grad decorator

* make style

* fix previous opt test

* style

* make style

* added opt documentation

* update OPT_PRETRAINED_MODEL_ARCHIVE_LIST

* up

* more fixes

* model & config work

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* added comment on padding hack (+2)

* cleanup

* review update

* docstring for missing arg

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/opt/__init__.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update pretrained map

* update path and tests

* make style

* styling

* make consistency

* add gpt2 tok new

* more tok fixes

* Update src/transformers/models/auto/tokenization_auto.py

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/opt/test_modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update based on reviews

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* make style

* make tokenizer auto tests pass

* apply Lysandre suggestion

* finish tests

* add some good tokenizer tests

* improve docs slightly

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


ViT and Swin symbolic tracing with torch.fx (#17182) · 8c7481f3

Michael Benayoun authored 3 years ago

* Support tracing for ViT

* Swin support

* Fix copies

* Fix type annotation issue

* Removed unused import


Fix contents in index.mdx to match docs' sidebar (#17198) · 1a688709
Omar U. Espejel authored 3 years ago
```
* Fix contents in index.mdx to match docs' sidebar

* Eliminates api section from contents
```
Fix style error in Spanish docs (#17197) · b17b7889
Omar Sanseviero authored 3 years ago


Translate index.mdx (to ES) and add Spanish models to quicktour.mdx examples (#16685) · 1a66a6c6

Omar U. Espejel authored 3 years ago

* Change nits in Spanish for quicktour.mdx

- Add tasks names in English too.
- Fix small nits in Spanish

* Translate index.mdx to Spanish

* Translate body of index.
* Translated the compatible models list (not the papers' names). Since this should not be updated manually, I can come back to the original text.

* Add models and a dataset for Spanish in the code examples

* Replaced the English models to Spanish versions.

* Add index to _toctree.yml and fix Spanish

* Fix double ““ error

* Change negative example in ASR example

* make style

* Debug style in quicktour.mdx


Documentation: Spanish translation of fast_tokenizers.mdx (#16882) · e2d678b7

Jorge Loayza R authored 3 years ago


* Spanish translation of fast_tokenizers.mdx

* add fast_tokenizers to the spanish _toctree.yml

* Update docs/source/es/fast_tokenizers.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/fast_tokenizers.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/fast_tokenizers.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/fast_tokenizers.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/fast_tokenizers.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/fast_tokenizers.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>


Added es version of language_modeling.mdx doc (#17021) · ae82da21

Joaq authored 3 years ago


* Spanish version of language_modeling.mdx doc file

* modification to toctree.yml file

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/language_modeling.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Correct position of Guías conceptuales

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

ae82da21

Spanish translation of philosophy.mdx #15947 (#16922) · 36ddcc0d

jkmg authored 3 years ago


* adding philosophy.mdx translation to Spanish

* adding philosophy.mdx translation to Spanish

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source/es/philosophy.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* philosophy translation to Spanish

* Update _toctree.yml

* Update _toctree.yml

* nits

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

36ddcc0d

Remove duplicated os.path.join (#17192) · d1d5ebb1
Shijie Wu authored 3 years ago

d1d5ebb1

11 May, 2022 13 commits

[feat] Add FLAVA model (#16654) · a10f6183

Amanpreet Singh authored 3 years ago

* [WIP] Add FLAVA model

This PR aims to add the [FLAVA](https://arxiv.org/abs/2112.04482) model to the transformers repo.

The following checklist lists what must be done for this PR to be complete:

[x] Flava init
[x] Flava base models
[x] Flava layers
[x] Flava Configs
[x] Flava encoders
[x] Flava pretraining models
[ ] Flava classification/retrieval models (To be added in a separate PR)
[x] Documentation updates 
[x] Imports updates 
[x] Argstring updates
[x] Flava pretrained checkpoints 
[x] Flava tests
[x] Flava processors 
[x] Sanity check
[x] Lint

a10f6183

Remove columns before passing to data collator (#17187) · 7b95825d
Antoni Baum authored 3 years ago

7b95825d

add shift_tokens_right in FlaxMT5 (#17188) · 934e21cd
Suraj Patil authored 3 years ago

934e21cd
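
For context, `shift_tokens_right` is the kind of helper that builds decoder inputs from labels in encoder-decoder models. Below is a minimal pure-Python sketch of the usual behaviour: shift every row one position to the right, prepend the decoder start token, and replace the `-100` loss-masking value with the pad token. It works on nested lists rather than JAX arrays, is not the code added in #17188, and the token ids in the usage are made up.

```python
from typing import List


def shift_tokens_right(
    input_ids: List[List[int]],
    pad_token_id: int,
    decoder_start_token_id: int,
) -> List[List[int]]:
    """Shift each row one position to the right (illustrative sketch)."""
    shifted = []
    for row in input_ids:
        # Drop the last token and prepend the decoder start token.
        new_row = [decoder_start_token_id] + list(row[:-1])
        # -100 is the conventional "ignore" label; decoder inputs need a real id.
        new_row = [pad_token_id if tok == -100 else tok for tok in new_row]
        shifted.append(new_row)
    return shifted


# Hypothetical label ids: 1 plays the role of EOS, -100 marks ignored positions.
labels = [[5, 6, 7, 1], [8, 1, -100, -100]]
decoder_input_ids = shift_tokens_right(labels, pad_token_id=0, decoder_start_token_id=0)
print(decoder_input_ids)  # [[0, 5, 6, 7], [0, 8, 1, 0]]
```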

Ensure tensors are at least 1d for pad and concat (#17179) · 47412c7d

Antoni Baum authored 3 years ago

* Ensure tensors are at least 1d for pad and concat

* Compatibility

* Fix

* Fix

* Add test

* Retrigger CI

* Consistency with master

* Retrigger CI

47412c7d

Fix LED documentation (#17181) · c76afa51

Manuel R. Ciosici authored 3 years ago

* Fix markdown code block

* Use consistent spelling for self-attention

* Fix typos and phrasing

* Fix code style

c76afa51

Remove unnecessary columns for all dataset types in `Trainer` (#17166) · edcc66d2

Antoni Baum authored 3 years ago

* Remove unneeded columns for IterableDataset

* Add test

* Update trainer tests

* Edit docstring

* Lint

* Apply feedback

* Apply feedback

edcc66d2
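
The idea behind this change can be sketched without `Trainer`: inspect the signature of the model's forward function and drop every dataset column it does not accept, doing so lazily so that it also works for streamed (iterable) datasets. The names `forward` and `keep_model_columns` below are hypothetical and only illustrate the idea; the logic in `Trainer` differs in detail.

```python
import inspect
from typing import Any, Callable, Dict, Iterable, Iterator


def forward(input_ids, attention_mask=None, labels=None):
    """Stand-in for a model's forward method (hypothetical)."""
    return len(input_ids)


def keep_model_columns(
    examples: Iterable[Dict[str, Any]], fn: Callable[..., Any]
) -> Iterator[Dict[str, Any]]:
    """Lazily yield examples restricted to the arguments `fn` accepts."""
    accepted = set(inspect.signature(fn).parameters)
    for example in examples:
        yield {k: v for k, v in example.items() if k in accepted}


# A one-element stream with two columns the model cannot consume.
stream = iter([{"input_ids": [1, 2], "labels": 0, "text": "raw string", "idx": 7}])
cleaned = list(keep_model_columns(stream, forward))
print(cleaned)  # [{'input_ids': [1, 2], 'labels': 0}]
```

Because the filter is a generator, nothing is materialised up front, which is the property an iterable dataset needs.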

[WIP] Enable reproducibility for distributed trainings (#16907) · c33f6046

hasan salim kanmaz authored 3 years ago


* add seed worker and set_deterministic_seed_for_cuda function to enforce reproducibility

* change function name to enable determinism, add docstrings, reproducibility support for tf

* change function name to enable_determinism_for_distributed_training

* revert changes in set_seed and call set_seed within enable_full_determinism

* add one position argument for seed_worker function

* add full_determinism flag in training args and call enable_full_determinism when it is true

* add enable_full_determinism to documentation

* apply make fixup after the last commit

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

c33f6046
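
The bullets above describe a seeding helper that calls `set_seed`, plus a `seed_worker` function for data-loading workers. The following standard-library sketch shows that general pattern only. The function bodies, the `base_seed` parameter and the `PYTHONHASHSEED` line are assumptions of this sketch, not taken from the PR; the framework-specific determinism switches that the real `enable_full_determinism` sets are omitted here.

```python
import os
import random


def set_seed(seed: int) -> None:
    """Seed Python's RNG (a real setup would also seed numpy/torch/tf)."""
    random.seed(seed)


def enable_full_determinism_sketch(seed: int) -> None:
    """Hypothetical stand-in: seed first, then set determinism-related switches."""
    set_seed(seed)
    # Assumption of this sketch: only affects subprocesses started after this
    # point, not the hash seed of the current interpreter.
    os.environ["PYTHONHASHSEED"] = str(seed)


def seed_worker(worker_id: int, base_seed: int = 0) -> None:
    """Give each data-loading worker its own reproducible seed."""
    set_seed(base_seed + worker_id)


enable_full_determinism_sketch(42)
first = [random.random() for _ in range(3)]
enable_full_determinism_sketch(42)
second = [random.random() for _ in range(3)]
print(first == second)  # True
```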

Add missing RetriBERT tokenizer tests (#17017) · 5229744b

Martin Pömsl authored 3 years ago


* Create RetriBERT tests folder

* Add missing RetriBERT tokenizer test file

* Apply style corrections

* Add non-English filter

* Update tests/retribert/test_tokenization_retribert.py

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* Update tests/retribert/test_tokenization_retribert.py

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* Move test files to new directory

* Update import path for testing utils to new test file structure

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

5229744b

Convert image to rgb for clip model (#17101) · 6bc6797e
Heng Kuan Wee authored 3 years ago

Co-authored-by: kuanwee.heng <kuanwee.heng@aaqua.live>

6bc6797e

Fix repo consistency · 0a2bea47
Sylvain Gugger authored 3 years ago

0a2bea47

propagate "attention_mask" dtype for "use_past" in OnnxConfig.generate_dummy_inputs (#17105) · 0645b07d
arampacha authored 3 years ago

* propagate attention_mask dtype

* fixup&style

0645b07d

Extend Transformers Trainer Class to Enable PyTorch SGD/Adagrad Optimizers for Training (#17154) · 0e6ec2a4

jianan-gu authored 3 years ago


* add torch SGD and Adagrad optimizer bits

* refine naming

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

0e6ec2a4

[M2M100 doc] remove duplicate example (#17175) · 63517fdf
Suraj Patil authored 3 years ago

* remove duplicate example

* remove code block

63517fdf

10 May, 2022 7 commits

MobileBERT tokenizer tests (#16896) · 4a419d49

Leon Derczynski authored 3 years ago


* unhardcode pretrained model path, make it a class var

* add tests for mobilebert tokenizer

* allow tempfiles for vocab & merge similarity test to autodelete

* add explanatory comments

* remove unused imports, let make style do its.. thing

* remove inheritance and use BERT tok tests for MobileBERT

* Update tests/mobilebert/test_tokenization_mobilebert.py

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* amend class names, remove unused import, add fix for mobilebert's hub pathname

* unhardcode pretrained model path, make it a class var

* add tests for mobilebert tokenizer

* allow tempfiles for vocab & merge similarity test to autodelete

* add explanatory comments

* remove unused imports, let make style do its.. thing

* remove inheritance and use BERT tok tests for MobileBERT

* Update tests/mobilebert/test_tokenization_mobilebert.py

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* amend class names, remove unused import, add fix for mobilebert's hub pathname

* amend paths for model tests being in models/ subdir of /tests

* explicitly rm test from prev path

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

4a419d49

Add DebertaV2ForMultipleChoice (#17135) · 48a8f3da
Jason Phang authored 3 years ago

48a8f3da

Fix template init (#17163) · 4ad2f68e
Sylvain Gugger authored 3 years ago

4ad2f68e

Add MLFLOW_FLATTEN_PARAMS support in MLflowCallback (#17148) · e99f0efe

Nicolas Brousse authored 3 years ago

* add support for MLFLOW_FLATTEN_PARAMS

* ensure key is str

* fix style and update warning msg

* Empty commit to trigger CI

* fix bug in check_inits.py

* add unittest for flatten_dict utils

* fix 'NoneType' object is not callable on __del__

* add generic flatten_dict unittest to SPECIAL_MODULE_TO_TEST_MAP

* fix style

e99f0efe
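
MLflow logs parameters as flat key/value pairs, so a nested configuration dictionary has to be flattened before it is logged. Here is a minimal sketch of such a helper. The function body and the `.` delimiter are assumptions for illustration and may differ from the `flatten_dict` utility added in this PR; the parameter names in the usage are made up.

```python
from collections.abc import Mapping
from typing import Any, Dict


def flatten_dict(d: Mapping, parent_key: str = "", delimiter: str = ".") -> Dict[str, Any]:
    """Flatten nested mappings into a single level with joined string keys."""
    flat: Dict[str, Any] = {}
    for key, value in d.items():
        # Keys are always converted to str so the result is safe to log.
        full_key = f"{parent_key}{delimiter}{key}" if parent_key else str(key)
        if isinstance(value, Mapping):
            flat.update(flatten_dict(value, full_key, delimiter))
        else:
            flat[full_key] = value
    return flat


params = {"lr": 0.001, "optimizer": {"name": "adamw", "betas": {"b1": 0.9, "b2": 0.999}}}
flat_params = flatten_dict(params)
print(flat_params)
# {'lr': 0.001, 'optimizer.name': 'adamw', 'optimizer.betas.b1': 0.9, 'optimizer.betas.b2': 0.999}
```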

missing file (#17164) · 976835d5
Stas Bekman authored 3 years ago

976835d5

Fixing the output of code examples in the preprocessing chapter (#17162) · 259eeb6d
Patrick Haller authored 3 years ago

259eeb6d

[Deepspeed] add many more models to the model zoo test (#12695) · f8615044

Stas Bekman authored 3 years ago

* model zoo take 2

* add deberta

* new param for zero2

* doc update

* doc update

* add layoutlm

* bump deepspeed

* add deberta-v2, funnel, longformer

* new models

* style

* add t5_v1

* update TAPAS status

* reorg problematic models

* move doc to another PR

* style

* fix checkpoint check test

* making progress on more models running

* cleanup

* new version

* cleanup

f8615044