- 02 Feb, 2021 5 commits
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Initial work
* Fix doc styler and other models
-
Lysandre Debut authored
* ALBERT Tokenizer integration test
* Batching
* Style
-
Patrick von Platen authored
-
Patrick von Platen authored
* change tokenizer requirement
* split line
* Correct typo from list to str
* improve style
* make other function pretty as well
* add comment
* correct typo
* add new test
* pass tests for tok without padding token
* Apply suggestions from code review
-
- 01 Feb, 2021 11 commits
-
Jan Jitse Venselaar authored
* Change documentation to correctly specify loss tensor size
* Change documentation to correct input format for labels
* Corrected output size of loss tensor for sequence classifier, multiple choice model and question answering
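For reference, a minimal sketch of the shapes involved for sequence classification, as PyTorch actually returns them (checkpoint and inputs are illustrative):

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

inputs = tokenizer(["great movie", "terrible movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])          # classification labels: shape (batch_size,)

outputs = model(**inputs, labels=labels)
print(outputs.loss.shape)              # torch.Size([]) -- the loss is a scalar
```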
-
Suraj Patil authored
* fix conversion script
* typo
* import nn
-
Patrick von Platen authored
* add new model logic
* fix docs
* change structure
* improve add_new_model
* push new changes
* up
* up
* correct spelling
* improve docstring
* correct line length
* update readme
* correct links
* correct typos
* only add rst file for now
* Apply suggestions from code review 1
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
  Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
  Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
  Co-authored-by: Stefan Schweter <stefan@schweter.it>
  Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
  Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
* finish adding all suggestions
* make style
* apply Niels feedback
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply Sylvain's suggestions
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Suraj Patil authored
-
CeShine Lee authored
This affects Adafactor with relative_step=False and scale_parameter=True. Updating `group["lr"]` makes the result of `._get_lr()` depend on the previous call, i.e., on the scale of other parameters. This isn't supposed to happen.
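A minimal, self-contained sketch of the failure mode (the `get_lr` stand-in below is illustrative, not the actual `Adafactor._get_lr`):

```python
def get_lr(group, param_scale):
    # Illustrative stand-in: the returned rate depends on the group's
    # base lr and on the current parameter's scale.
    return group["lr"] * param_scale

# Buggy pattern: the computed rate is written back into the shared
# param-group dict, so the next parameter's rate depends on this one.
group = {"lr": 0.01}
group["lr"] = get_lr(group, param_scale=10.0)  # 0.1, mutates shared state
lr_buggy = get_lr(group, param_scale=0.5)      # 0.05 -- leaked scale!

# Fixed pattern: keep the computed rate local to the current parameter.
group = {"lr": 0.01}
lr_p1 = get_lr(group, param_scale=10.0)        # 0.1
lr_p2 = get_lr(group, param_scale=0.5)         # 0.005, independent of p1
```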
-
Sylvain Gugger authored
* Remove subclass for sortish sampler
* Use old Seq2SeqTrainer in script
* Styling
-
wlhgtc authored
* MOD: fit Chinese wwm to new datasets
* MOD: move wwm to new folder
* MOD: format code
* Styling
* MOD: add param and recover trainer
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
Stas Bekman authored
* [t5 doc] fix typos: a few runaway backticks @sgugger
* style
* [trainer] put fp16 args together: a purely cosmetic change that puts all the fp16 args together, so they are easier to manage/read @sgugger
* style
* [wandb] make WANDB_DISABLED disable wandb with any value. This PR solves part of https://github.com/huggingface/transformers/issues/9623 and does what https://github.com/huggingface/transformers/issues/9699 requested/discussed: any value of `WANDB_DISABLED` should disable wandb. The current behavior is that it has to be one of `ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}`. I have been using `WANDB_DISABLED=true` everywhere in scripts, as it was originally advertised; it isn't clear why this was narrowed to a subset of possible values, and the change isn't documented anywhere. @sgugger
* WANDB_DISABLED=true to disable; make tf trainer consistent
* style
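A hedged sketch of the new behavior (the helper name below is hypothetical, not the library's actual function):

```python
import os

def wandb_disabled() -> bool:
    # Hypothetical helper for illustration: any non-empty value of
    # WANDB_DISABLED now turns the integration off, instead of only
    # the values in ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}.
    return bool(os.environ.get("WANDB_DISABLED", ""))
```

With this, `WANDB_DISABLED=true` (or any other non-empty value) disables wandb, as originally advertised.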
-
Stas Bekman authored
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Add {decoder_,}head_mask to fsmt_modeling.py
* Enable test_headmasking and make some changes to docs
* Remove test_head_masking flag from test_modeling_fsmt.py, since test_head_masking is True by default (thus it is redundant to store)
* Merge master and remove test_head_masking = True
* Rebase (necessary due to an update of jaxlib)
* Remove the redundant test_head_masking=True in tests/test_modeling_fsmt.py
-
- 31 Jan, 2021 2 commits
-
Kiyoung Kim authored
* TFBart labels consider both pad token and -100
* make style
* fix for all other models
Co-authored-by: kykim <kykim>
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
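A sketch of the masking rule (function and tensor names are illustrative, not the in-tree loss code):

```python
import tensorflow as tf

def loss_mask(labels, pad_token_id):
    # A position contributes to the loss only if it is neither the
    # ignore index (-100) nor the pad token id.
    return tf.logical_and(tf.not_equal(labels, -100),
                          tf.not_equal(labels, pad_token_id))

labels = tf.constant([[5, 1, -100], [7, 8, 1]])
print(loss_mask(labels, pad_token_id=1))
# [[ True False False]
#  [ True  True False]]
```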
-
lewtun authored
* Clarify definition of seed argument in Trainer
* Update src/transformers/training_args.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args_tf.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix style
* Update src/transformers/training_args.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 30 Jan, 2021 1 commit
-
Stas Bekman authored
Apparently nested inline markup in RST is invalid (see https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible), so currently this line doesn't get rendered properly: the inner markup is left unrendered and the link is broken. This PR removes the bold, which fixes the link.
-
- 29 Jan, 2021 6 commits
-
Stas Bekman authored
-
Stas Bekman authored
-
Sylvain Gugger authored
* When on SageMaker, use its env variables for saves
* Address review comments
* Quality
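A hedged sketch of the idea (`SM_MODEL_DIR` is SageMaker's standard output variable; exactly how the Trainer consumes it may differ from this):

```python
import os

# On SageMaker, artifacts saved under SM_MODEL_DIR are exported with the
# training job, so saves should default there when the variable is set.
output_dir = os.environ.get("SM_MODEL_DIR", "./output")
print(f"Saving checkpoints to {output_dir}")
```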
-
Julien Plu authored
-
Ethan Chau authored
-
Nicolas Patry authored
* Adding a new `return_full_text` parameter to TextGenerationPipeline. For text generation, the pipeline input is sometimes used as prompting text; in that context, prefixing `generated_text` with the actual input forces the caller to take an extra step to remove it. The proposed change adds a new parameter, `return_full_text`, that lets the caller prevent prepending the prompt (defaulting to the old behavior for backward compatibility).
* Doc quality.
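A short usage sketch (model and prompt are illustrative):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Once upon a time"
out = generator(prompt, max_length=30, return_full_text=False)
# With return_full_text=False the prompt is stripped from the output,
# so the caller gets only the generated continuation.
print(out[0]["generated_text"])
```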
-
- 28 Jan, 2021 11 commits
-
abhishek thakur authored
-
abhishek thakur authored
-
Stas Bekman authored
* expand install instructions
* fix
* white space
* rewrite as discussed in the PR
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change the wording to encourage issue reports
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Daniel Stancl authored
* Remove redundant test_head_masking = True flags
* Remove all redundant test_head_masking flags in PyTorch test_modeling_* files
* Make test_head_masking = True the default choice in test_modeling_tf_common.py
* Remove all redundant test_head_masking flags in TensorFlow test_modeling_tf_* files
* Put back test_head_masking=False for TFT5 models
-
Joe Davison authored
-
Sylvain Gugger authored
-
Funtowicz Morgan authored
* Fix computation of attention_probs when head_mask is provided.
  Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Apply changes to the template
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
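For context, the head-masking step in BERT-style attention follows this pattern (a sketch of the technique, not the exact diff):

```python
import torch
import torch.nn.functional as F

def attend(scores, value, head_mask=None, p_drop=0.1):
    # scores: (batch, num_heads, seq, seq); value: (batch, num_heads, seq, dim)
    probs = F.softmax(scores, dim=-1)
    probs = F.dropout(probs, p=p_drop, training=False)
    if head_mask is not None:
        # head_mask is broadcastable to probs, e.g. (1, num_heads, 1, 1);
        # 1.0 keeps a head, 0.0 zeroes out its attention weights
        probs = probs * head_mask
    return torch.matmul(probs, value)
```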
-
Nicolas Patry authored
-
Lysandre Debut authored
-
Lysandre Debut authored
* Allow partial loading of a cached tokenizer
* Warning > Info
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Raise error if not local_files_only
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
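Illustrative call (the checkpoint name is arbitrary; the behavior described follows this change as I read it):

```python
from transformers import AutoTokenizer

# With local_files_only=True, a partially cached tokenizer can still be
# loaded from the local cache (an informational message replaces the old
# warning); without it, the missing-files case raises an error instead.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", local_files_only=True)
```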
-
abhishek thakur authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
- 27 Jan, 2021 4 commits
-
Stefan Schweter authored
* tests: add integration tests for new Bort model
* bort: add conversion script from GluonNLP to Transformers 🚀
* bort: minor cleanup (BORT -> Bort)
* add docs
* make fix-copies
* clean doc a bit
* correct docs
* Update docs/source/model_doc/bort.rst
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/bort.rst
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* correct dialogpt doc
* correct link
* Update docs/source/model_doc/bort.rst
* Update docs/source/model_doc/dialogpt.rst
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* fix --lr_scheduler_type choices
* rewrite to fix for all enum-based cl args
* cleanup
* adjust test
* style
* Proposal that should work
* Remove needless code
* Fix test
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
Sylvain Gugger authored
* Allow --arg Value for booleans in HfArgumentParser
* Update last test
* Better error message
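A minimal sketch of the new boolean handling (the dataclass and values are illustrative):

```python
from dataclasses import dataclass, field
from transformers import HfArgumentParser

@dataclass
class Args:
    do_train: bool = field(default=False)

parser = HfArgumentParser(Args)
# Per this change, both spellings are accepted: the bare flag
# `--do_train` and the explicit `--arg Value` form such as `--do_train true`.
(args,) = parser.parse_args_into_dataclasses(args=["--do_train", "true"])
print(args.do_train)  # True
```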
-
Sylvain Gugger authored
* When resuming training from checkpoint, Trainer loads model
* Finish cleaning tests
* Address review comment
* Use global_step from state
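Typical usage after this change (the `trainer` object and checkpoint path are assumed; the argument name follows later releases and may differ from the exact signature in this PR):

```python
# Given an existing `trainer` (transformers.Trainer): resuming now reloads
# the model weights from the checkpoint, in addition to the optimizer and
# scheduler state, and restores global_step from the saved trainer state.
trainer.train(resume_from_checkpoint="output_dir/checkpoint-500")
```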
-