Commits · noua/bloom_cugraph · zhusg / transformers-new

02 Sep, 2022 4 commits
- remove prints · 7f78d4c8
  NouamaneTazi authored 2 years ago
  
  7f78d4c8
- WIP cuda graph with NCCL · 529fd3c0
  NouamaneTazi authored 2 years ago
  
  529fd3c0
- make input_ids static in greedy search · 611eaf1e
  NouamaneTazi authored 2 years ago
  
  611eaf1e
- we can now capture bloom · c110dfc0
  NouamaneTazi authored 2 years ago
  
  c110dfc0
17 Aug, 2022 3 commits

WIP · f4d0dc3c
thomasw21 authored 2 years ago

f4d0dc3c

Fix Yolos ONNX export test (#18606) · c99e9846

Yih-Dar authored 2 years ago


Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

c99e9846

Examples: add Bloom support for token classification (#18632) · 358478e7

Stefan Schweter authored 2 years ago

* examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow)

* examples: remove support for Bloom in token classification (FLAX and TensorFlow currently have no support for it)

358478e7

16 Aug, 2022 7 commits

[bnb] Minor modifications (#18631) · 6d175c11

Younes Belkada authored 2 years ago


* bnb minor modifications

- refactor documentation
- add troubleshooting README
- add PyPI library to Dockerfile

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* put in one block

- put bash instructions in one block

* update readme

- refactor hardware requirements a bit

* change text a bit

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* apply suggestions

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* add link to paper

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update tests/mixed_int8/README.md

* Apply suggestions from code review

* refactor a bit

* add instructions for Turing & Ampere

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add A6000

* clarify a bit

* remove small part

* Update tests/mixed_int8/README.md

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

6d175c11

Update run_translation_no_trainer.py (#18637) · 25e651a2

zhoutang776 authored 2 years ago

* Update run_translation_no_trainer.py

Found an error in selecting `no_decay` parameters, plus some small modifications for when the user continues training from a checkpoint.

* fix `no_decay` and `resume_step` issue

1. change the `no_decay` list
2. if users continue training their model from a provided checkpoint, `resume_step` will not be initialized properly when `args.gradient_accumulation_steps != 1`

25e651a2
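
The two issues named in this commit message can be illustrated with a minimal sketch. It is not the script's actual code: the function names, the marker strings, and the assumption that a checkpoint records completed optimizer steps (rather than batches) are all hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def split_decay_groups(param_names: List[str], no_decay: List[str]) -> Tuple[List[str], List[str]]:
    """Split parameter names into (decay, no_decay) groups by substring match."""
    decay, skip = [], []
    for name in param_names:
        # A parameter is exempt from weight decay if any marker occurs in its name.
        if any(marker in name for marker in no_decay):
            skip.append(name)
        else:
            decay.append(name)
    return decay, skip


def resume_position(
    completed_optimizer_steps: int,
    gradient_accumulation_steps: int,
    batches_per_epoch: int,
) -> Tuple[int, int]:
    """Convert an optimizer-step count into (starting_epoch, batches_to_skip).

    Assumption: the checkpoint stores optimizer steps, and each optimizer
    step consumes `gradient_accumulation_steps` batches.
    """
    batches_seen = completed_optimizer_steps * gradient_accumulation_steps
    starting_epoch = batches_seen // batches_per_epoch
    batches_to_skip = batches_seen % batches_per_epoch
    return starting_epoch, batches_to_skip
```

If the multiplication by `gradient_accumulation_steps` is left out, the resumed run would skip too few batches whenever that value is not 1, which matches the symptom the message describes.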

Update longt5.mdx (#18634) · a27195b1
flozi00 authored 2 years ago

a27195b1
TF: Fix generation repetition penalty with XLA (#18648) · fd9aa82b
Joao Gante authored 2 years ago

fd9aa82b
Add checks for some workflow jobs (#18583) · 81ab1112
Yih-Dar authored 2 years ago

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

81ab1112
Change scheduled CIs to use torch 1.12.1 (#18644) · 510c2a0b
Yih-Dar authored 2 years ago

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

510c2a0b

mac m1 `mps` integration (#18598) · 9cf27468

Sourab Mangrulkar authored 2 years ago


* mac m1 `mps` integration

* Update docs/source/en/main_classes/trainer.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* Apply suggestions from code review

Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* resolve comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

9cf27468

14 Aug, 2022 1 commit

Flax Remat for LongT5 (#17994) · d6eeb871

Karim Foda authored 2 years ago


* [Flax] Add remat (gradient checkpointing)

* fix variable naming in test

* flip: checkpoint using a method

* fix naming

* fix class naming

* apply PVP's suggestions from code review

* add gradient_checkpointing to examples

* Add gradient_checkpointing to run_mlm_flax

* Add remat to longt5

* Add gradient checkpointing test longt5

* Fix args errors

* Fix remaining tests

* Make fixup & quality fixes

* replace kwargs

* remove unnecessary kwargs

* Make fixup changes

* revert long_t5_flax changes

* Remove return_dict and copy to LongT5

* Remove test_gradient_checkpointing

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>

d6eeb871

12 Aug, 2022 14 commits

small change (#18584) · 1ccd2515
Younes Belkada authored 2 years ago

1ccd2515

[fsmt] deal with -100 indices in decoder ids (#18592) · b3ff7c68

Stas Bekman authored 2 years ago

* [fsmt] deal with -100 indices in decoder ids

Fixes: https://github.com/huggingface/transformers/issues/17945

Decoder ids get the default index -100, which breaks the model. As in t5 and many other models, add a fix that replaces -100 with the correct pad index.

For some reason this use case hasn't been exercised with this model until recently, so this issue seems to have been there since the beginning.

Any suggestions on how to add a simple test here? Or perhaps we have something similar already? The user's script is quite massive.

* style

b3ff7c68
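
The replacement described in this commit message can be sketched in a few lines. This is a plain-Python illustration on nested lists, not the model's code: the function name is hypothetical, and in a PyTorch model the same step would normally be a single masked fill on the tensor.

```python
from typing import List

IGNORE_INDEX = -100  # label value conventionally ignored by the loss


def replace_ignore_index(decoder_input_ids: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Return a copy of the ids with every -100 replaced by `pad_token_id`."""
    return [
        [pad_token_id if token == IGNORE_INDEX else token for token in row]
        for row in decoder_input_ids
    ]
```

The replacement is needed because -100 is not a valid vocabulary index, so it cannot be passed to an embedding lookup.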

[doc] fix anchors (#18591) · 37c59918

Stas Bekman authored 2 years ago

the manual anchors end up being duplicated with automatically added anchors and no longer work.

37c59918

Update BLOOM parameter counts (#18531) · 56ef0ba4
Niklas Muennighoff authored 2 years ago

* Update BLOOM parameter counts

* Update BLOOM parameter counts

56ef0ba4

Fix URLs (#18604) · 153d1361

NielsRogge authored 2 years ago


Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

153d1361

Add Donut (#18488) · 2ab790e8

NielsRogge authored 2 years ago


* First draft

* Improve script

* Update script

* Make conversion work

* Add final_layer_norm attribute to Swin's config

* Add DonutProcessor

* Convert more models

* Improve feature extractor and convert base models

* Fix bug

* Improve integration tests

* Improve integration tests and add model to README

* Add doc test

* Add feature extractor to docs

* Fix integration tests

* Remove register_buffer

* Fix toctree and add missing attribute

* Add DonutSwin

* Make conversion script work

* Improve conversion script

* Address comment

* Fix bug

* Fix another bug

* Remove deprecated method from docs

* Make Swin and Swinv2 untouched

* Fix code examples

* Fix processor

* Update model_type to donut-swin

* Add feature extractor tests, add token2json method, improve feature extractor

* Fix failing tests, remove integration test

* Add do_thumbnail for consistency

* Improve code examples

* Add code example for document parsing

* Add DonutSwin to MODEL_NAMES_MAPPING

* Add model to appropriate place in toctree

* Update namespace to appropriate organization

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

2ab790e8

Supporting seq2seq models for `bitsandbytes` integration (#18579) · a5ca56ff

Younes Belkada authored 2 years ago

* Supporting seq2seq models for `bitsandbytes` integration

- `bitsandbytes` integration now supports seq2seq models
- check if a model has tied weights as an additional check

* small modification

- tie the weights before looking at tied weights!

a5ca56ff

Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261) · ed1924e8
Joao Gante authored 2 years ago

* validate generate model_kwargs

* generate tests -- not all models have an attn mask

ed1924e8
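
One way to catch typos in keyword arguments, in the spirit of this commit's title, is to compare the supplied names against the signature of the function that will receive them. The sketch below is an assumption about the general technique, not the library's implementation; `validate_model_kwargs` and `toy_forward` are hypothetical names.

```python
import inspect
from typing import Any, Callable, Dict


def validate_model_kwargs(forward: Callable[..., Any], model_kwargs: Dict[str, Any]) -> None:
    """Raise ValueError if `model_kwargs` contains names `forward` does not declare."""
    params = inspect.signature(forward).parameters
    # A **kwargs parameter accepts any name, so nothing can be flagged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return
    unused = [key for key in model_kwargs if key not in params]
    if unused:
        raise ValueError(f"These model_kwargs are not used by the model: {unused}")


def toy_forward(input_ids, attention_mask=None):
    """Hypothetical stand-in for a model's forward method."""
    return input_ids
```

With this check, `validate_model_kwargs(toy_forward, {"attention_msk": [1, 1]})` raises a `ValueError` that names the misspelled key, instead of the argument being silently ignored.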
Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600) · 2156619f
Yih-Dar authored 2 years ago

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

2156619f
FSDP bug fix for `load_state_dict` (#18596) · 4eed2bec
Sourab Mangrulkar authored 2 years ago

4eed2bec
typos (#18594) · d344534b
Stas Bekman authored 2 years ago

d344534b

update doc for perf_train_cpu_many, add intel mpi introduction (#18576) · 3cdaea47

Wang, Yi authored 2 years ago


* update doc for perf_train_cpu_many, add mpi introduction

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

3cdaea47

Add type hints for ViLT models (#18577) · 46d09410

Ian Castillo authored 2 years ago

* Add type hints for ViLT models

* Add missing return type for TokenClassification class

46d09410

Load sharded pt to flax (#18419) · bce36ee0

Arthur authored 2 years ago


* initial commit

* add small test

* add cross pt tf flag to test

* fix quality

* style

* update test with new repo

* fix failing test

* update

* fix wrong param ordering

* style

* update based on review

* update related to recent new caching mechanism

* quality

* Update based on review

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* quality and style

* Update src/transformers/modeling_flax_utils.py

Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

bce36ee0

11 Aug, 2022 11 commits

Return the permuted hidden states if return_dict=True (#18578) · c8b6ae85
amyeroberts authored 2 years ago

c8b6ae85
fix owlvit tests, update docstring examples (#18586) · f28f2408
Alara Dirik authored 2 years ago

f28f2408

Bump nbconvert in /examples/research_projects/visual_bert (#18566) · 05d3a43c

dependabot[bot] authored 2 years ago

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

05d3a43c

Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565) · 713ab6fd

dependabot[bot] authored 2 years ago

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

713ab6fd

Fix docstrings with last version of hf-doc-builder styler (#18581) · c23cbdff
Sylvain Gugger authored 2 years ago

* Fix docstrings with last version of hf-doc-builder styler

* Remove empty Parameter block

c23cbdff

[FX] _generate_dummy_input supports audio-classification models for labels (#18580) · 42b8940b

Michael Benayoun authored 2 years ago

* Support audio classification architectures for label generation, and provide a flag to control whether warnings are printed

* Use ENV_VARS_TRUE_VALUES

42b8940b

Deberta V2: Fix critical trace warnings to allow ONNX export (#18272) · d53dffec

iiLaurens authored 2 years ago


* Fix critical trace warnings to allow ONNX export

* Force input to `sqrt` to be float type

* Cleanup code

* Remove unused import statement

* Update model sew

* Small refactor

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Use broadcasting instead of repeat

* Implement suggestion

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Match deberta v2 changes in sew_d

* Improve code quality

* Update code quality

* Consistency of small refactor

* Match changes in sew_d

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

d53dffec

german docs translation (#18544) · 5d3f0374

flozi00 authored 2 years ago

* Create _config.py

* Create _toctree.yml

* Create index.mdx

not sure about "du / ihr" or "sie"

* Create quicktour.mdx

* Update _toctree.yml

* Update build_documentation.yml

* Update build_pr_documentation.yml

* fix build

* Update index.mdx

* Update quicktour.mdx

* Create installation.mdx

* Update _toctree.yml

5d3f0374

Change BartLearnedPositionalEmbedding's forward method signature to support... · 80468251

Dan Jones authored 2 years ago

Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486)

* changing BartLearnedPositionalEmbedding forward signature and references to it

* removing debugging dead code (thanks style checker)

* blackened modeling_bart file

* removing copy inconsistencies via make fix-copies

* changing references to copied signatures in Bart variants

* make fix-copies once more

* using expand over repeat (thanks @michaelbenayoun)

* expand instead of repeat for all model copies

Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>

80468251

Skip broken tests · 3f0707b2
Sylvain Gugger authored 2 years ago

3f0707b2

Fix LayoutLMv3 documentation (#17932) · 4c8ec66a

Wonseok Lee (Jack) authored 2 years ago

* fix typos

* fix sequence_length docs of LayoutLMv3Model

* delete trailing white spaces

* fix layoutlmv3 docs more

* apply make fixup & quality

* change to two versions of input docstring

* apply make fixup & quality

4c8ec66a