- 09 Oct, 2023 2 commits
- 09 Sep, 2023 1 commit
-
-
Arthur authored
* skip failing tests until #26054 is merged
* fixup
-
- 08 Sep, 2023 5 commits
-
-
Arthur authored
* fix `set_infilling_processor` to properly reset
* Add docstring!
* fixups
* more details in the documentation about the tokenization
* style
-
Harheem Kim authored
* docs: ko-llama.md
* fix: chatgpt draft
* feat: manual edits
* fix: resolve suggestions
-
Angela Yi authored
* Ignore warning if tracing with dynamo
* fix import error
* separate into a function
* add test
-
Thien Tran authored
* add missing doc for activation dropout
* fix doc for SEW-D dropout
* deprecate hidden_dropout for SEW-D
-
Alexander Krauck authored
This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically:
1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used; the regular `dropout` was applied instead. This has been corrected so that `attention_dropout` is used.
2. The `activation_dropout` for the activations in the feed-forward layers was missing; the regular `dropout` was used instead. This commit applies `activation_dropout` in the feed-forward layers.
These changes make the dropout implementation match the original Graphormer and deliver empirically better performance.
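To make the distinction concrete, here is a minimal sketch of the corrected wiring in PyTorch. Module and argument names are simplified for illustration and are not the actual Graphormer classes; the real changes live in GraphormerMultiheadAttention and the encoder layer's feed-forward block.

```python
import torch
import torch.nn as nn

class FeedForwardSketch(nn.Module):
    """Illustrative only: `activation_dropout` is applied after the
    activation, the regular `dropout` after the output projection.
    The buggy version applied `dropout` in both places and never
    used `attention_dropout` at all."""

    def __init__(self, embed_dim, ffn_dim, dropout, activation_dropout):
        super().__init__()
        self.fc1 = nn.Linear(embed_dim, ffn_dim)
        self.fc2 = nn.Linear(ffn_dim, embed_dim)
        self.activation_dropout = nn.Dropout(activation_dropout)  # was: nn.Dropout(dropout)
        self.dropout = nn.Dropout(dropout)

    def forward(self, hidden_states):
        hidden_states = self.activation_dropout(torch.relu(self.fc1(hidden_states)))
        return self.dropout(self.fc2(hidden_states))

# In the attention module, the attention weights likewise get their own rate:
#   attn_weights = nn.functional.dropout(attn_weights, p=self.attention_dropout, training=self.training)
```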
-
- 07 Sep, 2023 9 commits
-
-
dumpmemory authored
* fix inconsistent loss after resume (#25340)
* fix typo
* clean code
* reformatted code
* adjust code according to comments
* adjust check_dataloader_randomsampler location
* return sampler only
* handle sampler is None
* Update src/transformers/trainer_pt_utils.py (thanks @amyeroberts)

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
MyungHa Kwon authored
fix typo
-
raghavanone authored
* Fix vilt config init parameters to match the ones in the documentation
* Fix the documentation
-
Muskan Kumar authored
* Added HerBERT to README.md
* Update README.md to contain HerBERT (#26016)
* Resolved #26016: Updated READMEs and index.md to contain HerBERT, and ran make fix-copies
-
Sanchit Gandhi authored
* fix tokenizer
* make bs even
* fix multi gpu test
* style
* model forward
* fix torch import
* revert tok pin
-
CokeDong authored
* Add tgs metrics
* bugfix and black formatting
* workaround for token counting
* formatting and bugfix
* Fix
* Add opt-in for tgs metrics (usage sketch below)
* make style and fix error
* Fix doc
* fix docbuild
* hf-doc-build
* fix
* test
* Update src/transformers/training_args.py (renaming)
* Update src/transformers/training_args.py (renaming)
* Fix some symbols
* test
* Update src/transformers/trainer_utils.py (match naming patterns)
* Update src/transformers/training_args.py
* Update src/transformers/trainer.py
* Fix reviews
* Fix
* Fix black

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
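A hedged usage sketch for the opt-in above. The flag name `include_tokens_per_second` is an assumption based on this PR's `training_args.py` changes, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Opt in to tokens-per-second-per-device ("tgs") speed metrics during
# training; the flag name is assumed from this PR, not guaranteed here.
args = TrainingArguments(
    output_dir="out",  # placeholder
    include_tokens_per_second=True,
)
```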
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kai authored
-
Zach Mueller authored
* Fix err
* Use version check
-
- 06 Sep, 2023 7 commits
-
-
Marc Sun authored
* add new arg for gptq
* add tests
* add min version autogptq
* fix order
* skip test
* fix
* Update src/transformers/modeling_utils.py
* fix style
* change model path

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Matt authored
Remove falcon from undocumented list
-
Harheem Kim authored
* docs: ko: llm_tutorial.md
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
* fix: resolve suggestions
-
zspo authored
* fix some small bugs in the README
* Update docs/README.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* stash commit
* More OPT updates
* Update src/transformers/models/opt/modeling_tf_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Lysandre Debut authored
* Fix revision propagation
* Cleaner
-
Nino Risteski authored
fixed a typo
-
- 05 Sep, 2023 16 commits
-
-
tju_skywalker authored
* fix conversion of Megatron models that are too large
-
Tanay Mehta authored
* add: potential fix for the MEGA chunking bug in decoder-only models
* add: test for decoder with chunking
* add: input_mask passed with input_ids
-
Arthur authored
* revision did not exist
* correct revision
-
Arthur authored
* start with error too
* fix ?
* start with nit
* one more path
* use `job_name`
* mark pipeline test as slow
-
Injin Paek authored
* docs: feat: model resources for llama
* fix: resolve suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
-
Sanchit Gandhi authored
* [Wav2Vec2 Conformer] Fix float16 inference
* fix test
* fix test more
* clean pipe test
-
Sourab Mangrulkar authored
DeepSpeed resume-from-checkpoint fixes, and support for the DeepSpeed optimizer with an HF scheduler (#25863)
* Add support for deepspeed optimizer and HF scheduler
* fix bug
* fix the import
* fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario
* fix loading of hf scheduler when loading deepspeed checkpoint
* fix import of `DeepSpeedSchedulerWrapper`
* add tests
* add the comment and skip the failing tests
* address comment
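A sketch of the newly supported pairing, under the assumption that the DeepSpeed config declares an optimizer but no scheduler, so Trainer creates the HF scheduler chosen in TrainingArguments (all values illustrative):

```python
from transformers import TrainingArguments

# DeepSpeed owns the optimizer; because no "scheduler" section is given,
# the HF scheduler from `lr_scheduler_type` is used alongside it, which
# is the combination this commit adds support for.
ds_config = {
    "optimizer": {"type": "AdamW", "params": {"lr": "auto"}},
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": "auto",
}

args = TrainingArguments(
    output_dir="out",            # placeholder
    lr_scheduler_type="cosine",  # HF scheduler, created by Trainer
    deepspeed=ds_config,
)
```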
-
raghavanone authored
* Add TFDebertaV2ForMultipleChoice
* Import newer model in main init
* Fix import issues
* Fix copies
* Add doc
* Fix tests
* Fix copies
* Fix docstring
-
andreeahedes authored
* no_split_modules
* no_split_modules
* inputs_embeds + pos on the same device
* update _no_split_modules
* update _no_split_modules
-
Abhilash Majumder authored
* patch with accelerate xpu
* patch with accelerate xpu
* formatting
* fix tests
* revert unrelated ruff fixes
* revert unrelated ruff fixes
* revert unrelated ruff fixes
* fix test
* review fixes
* review fixes
* black fixes
* review commits
* review commits
* style fix
* use pytorch_utils
* revert markuplm test
-
Yih-Dar authored
* update
* update
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Joao Gante authored
-
Sahel Sharify authored
This change iterates over a list of the dict's keys, rather than over the dict's items, while updating the dict's elements. It fixes the following error:

  File "..../transformers/training_args.py", line 1544, in __post_init__
    for k, v in self.fsdp_config.items():
  RuntimeError: dictionary keys changed during iteration
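The underlying Python pitfall, reduced to a standalone example (the dict contents are hypothetical, mimicking an `fsdp_config`):

```python
cfg = {"fsdp_min_num_params": 0, "fsdp_xla": False}

# Buggy: renaming keys while iterating the live dict view raises
#   RuntimeError: dictionary keys changed during iteration
# for k, v in cfg.items():
#     cfg[k[len("fsdp_"):]] = v
#     del cfg[k]

# Fixed: iterate over a snapshot of the keys, then mutate freely.
for k in list(cfg.keys()):
    cfg[k[len("fsdp_"):]] = cfg.pop(k)

print(cfg)  # {'min_num_params': 0, 'xla': False}
```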
-
Traun Leyden authored
Update README.md with correct path to examples/seq2seq
-
Julien Chaumond authored
-
Yih-Dar authored
* fix
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-