- 09 Oct, 2023 2 commits
- 09 Sep, 2023 1 commit
-
-
Arthur authored
* skip failing tests until #26054 is merged
* fixup
-
- 08 Sep, 2023 5 commits
-
-
Arthur authored
* fix `set_infilling_processor` to properly reset
* Add docstring!
* fixups
* more details in the documentation about the tokenization
* style
-
Harheem Kim authored
* docs: ko-llama.md
* fix: chatgpt draft
* feat: manual edits
* fix: resolve suggestions
-
Angela Yi authored
* Ignore warning if tracing with dynamo
* fix import error
* separate into a function
* add test
-
Thien Tran authored
* add missing doc for activation dropout
* fix doc for SEW-D dropout
* deprecate hidden_dropout for SEW-D
-
Alexander Krauck authored
This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically:
1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used; the regular `dropout` was applied instead. This has been corrected so that `attention_dropout` is used.
2. The `activation_dropout` for the activations in the feed-forward layers was missing; the regular `dropout` was used instead. This commit applies `activation_dropout` in the feed-forward layers.
These changes make the dropout implementation match the original Graphormer and deliver empirically better performance.
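To make the distinction concrete, here is a minimal sketch of the corrected wiring in PyTorch. Module and argument names are simplified for illustration and are not the actual Graphormer classes; the real changes live in GraphormerMultiheadAttention and the encoder layer's feed-forward block.

```python
import torch
import torch.nn as nn

class FeedForwardSketch(nn.Module):
    """Illustrative only: `activation_dropout` is applied after the
    activation, the regular `dropout` after the output projection.
    The buggy version applied `dropout` in both places and never
    used `attention_dropout` at all."""

    def __init__(self, embed_dim, ffn_dim, dropout, activation_dropout):
        super().__init__()
        self.fc1 = nn.Linear(embed_dim, ffn_dim)
        self.fc2 = nn.Linear(ffn_dim, embed_dim)
        self.activation_dropout = nn.Dropout(activation_dropout)  # was: nn.Dropout(dropout)
        self.dropout = nn.Dropout(dropout)

    def forward(self, hidden_states):
        hidden_states = self.activation_dropout(torch.relu(self.fc1(hidden_states)))
        return self.dropout(self.fc2(hidden_states))

# In the attention module, the attention weights likewise get their own rate:
#   attn_weights = nn.functional.dropout(attn_weights, p=self.attention_dropout, training=self.training)
```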
-
- 07 Sep, 2023 9 commits
-
-
dumpmemory authored
* fix inconsistent loss after resume (#25340)
* fix typo
* clean code
* reformatted code
* adjust code according to comments
* adjust check_dataloader_randomsampler location
* return sampler only
* handle sampler is None
* Update src/transformers/trainer_pt_utils.py (thanks @amyeroberts)

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
MyungHa Kwon authored
fix typo
-
raghavanone authored
* Fix vilt config init parameters to match the ones in the documentation
* Fix the documentation
-
Muskan Kumar authored
* Added HerBERT to README.md
* Update README.md to contain HerBERT (#26016)
* Resolved #26016: Updated READMEs and index.md to contain HerBERT, and ran make fix-copies
-
Sanchit Gandhi authored
* fix tokenizer
* make bs even
* fix multi gpu test
* style
* model forward
* fix torch import
* revert tok pin
-
CokeDong authored
* Add tgs metrics
* bugfix and black formatting
* workaround for token counting
* formatting and bugfix
* Fix
* Add opt-in for tgs metrics (usage sketch below)
* make style and fix error
* Fix doc
* fix docbuild
* hf-doc-build
* fix
* test
* Update src/transformers/training_args.py (renaming)
* Update src/transformers/training_args.py (renaming)
* Fix some symbols
* test
* Update src/transformers/trainer_utils.py (match naming patterns)
* Update src/transformers/training_args.py
* Update src/transformers/trainer.py
* Fix reviews
* Fix
* Fix black

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
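A hedged usage sketch for the opt-in above. The flag name `include_tokens_per_second` is an assumption based on this PR's `training_args.py` changes, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Opt in to tokens-per-second-per-device ("tgs") speed metrics during
# training; the flag name is assumed from this PR, not guaranteed here.
args = TrainingArguments(
    output_dir="out",  # placeholder
    include_tokens_per_second=True,
)
```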
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kai authored
-
Zach Mueller authored
* Fix err
* Use version check
-
- 06 Sep, 2023 7 commits
-
-
Marc Sun authored
* add new arg for gptq
* add tests
* add min version autogptq
* fix order
* skip test
* fix
* Update src/transformers/modeling_utils.py
* fix style
* change model path

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Matt authored
Remove falcon from undocumented list
-
Harheem Kim authored
* docs: ko: llm_tutorial.md
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
* fix: resolve suggestions
-
zspo authored
* fix some small bugs in the README
* Update docs/README.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* stash commit
* More OPT updates
* Update src/transformers/models/opt/modeling_tf_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Lysandre Debut authored
* Fix revision propagation
* Cleaner
-
Nino Risteski authored
fixed a typo
-
- 05 Sep, 2023 16 commits
-
-
tju_skywalker authored
* fix conversion of Megatron models that are too large
-
Tanay Mehta authored
* add: potential fix for the MEGA chunking bug in decoder-only models
* add: test for decoder with chunking
* add: input_mask passed with input_ids
-
Arthur authored
* revision did not exist
* correct revision
-
Arthur authored
* start with error too
* fix ?
* start with nit
* one more path
* use `job_name`
* mark pipeline test as slow
-
Injin Paek authored
* docs: feat: model resources for llama
* fix: resolve suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
-
Sanchit Gandhi authored
* [Wav2Vec2 Conformer] Fix float16 inference
* fix test
* fix test more
* clean pipe test
-
Sourab Mangrulkar authored
DeepSpeed resume-from-checkpoint fixes, and support for the DeepSpeed optimizer with an HF scheduler (#25863)
* Add support for deepspeed optimizer and HF scheduler
* fix bug
* fix the import
* fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario
* fix loading of hf scheduler when loading deepspeed checkpoint
* fix import of `DeepSpeedSchedulerWrapper`
* add tests
* add the comment and skip the failing tests
* address comment
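A sketch of the newly supported pairing, under the assumption that the DeepSpeed config declares an optimizer but no scheduler, so Trainer creates the HF scheduler chosen in TrainingArguments (all values illustrative):

```python
from transformers import TrainingArguments

# DeepSpeed owns the optimizer; because no "scheduler" section is given,
# the HF scheduler from `lr_scheduler_type` is used alongside it, which
# is the combination this commit adds support for.
ds_config = {
    "optimizer": {"type": "AdamW", "params": {"lr": "auto"}},
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": "auto",
}

args = TrainingArguments(
    output_dir="out",            # placeholder
    lr_scheduler_type="cosine",  # HF scheduler, created by Trainer
    deepspeed=ds_config,
)
```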
-
raghavanone authored
* Add TFDebertaV2ForMultipleChoice
* Import newer model in main init
* Fix import issues
* Fix copies
* Add doc
* Fix tests
* Fix copies
* Fix docstring
-
andreeahedes authored
* no_split_modules
* no_split_modules
* inputs_embeds + pos on the same device
* update _no_split_modules
* update _no_split_modules
-
Abhilash Majumder authored
* patch with accelerate xpu
* patch with accelerate xpu
* formatting
* fix tests
* revert unrelated ruff fixes
* revert unrelated ruff fixes
* revert unrelated ruff fixes
* fix test
* review fixes
* review fixes
* black fixes
* review commits
* review commits
* style fix
* use pytorch_utils
* revert markuplm test
-
Yih-Dar authored
* update
* update
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Joao Gante authored
-
Sahel Sharify authored
This change iterates over a list of the dict's keys, rather than over the dict's items, while updating the dict's elements. It fixes the following error:

  File "..../transformers/training_args.py", line 1544, in __post_init__
    for k, v in self.fsdp_config.items():
  RuntimeError: dictionary keys changed during iteration
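The underlying Python pitfall, reduced to a standalone example (the dict contents are hypothetical, mimicking an `fsdp_config`):

```python
cfg = {"fsdp_min_num_params": 0, "fsdp_xla": False}

# Buggy: renaming keys while iterating the live dict view raises
#   RuntimeError: dictionary keys changed during iteration
# for k, v in cfg.items():
#     cfg[k[len("fsdp_"):]] = v
#     del cfg[k]

# Fixed: iterate over a snapshot of the keys, then mutate freely.
for k in list(cfg.keys()):
    cfg[k[len("fsdp_"):]] = cfg.pop(k)

print(cfg)  # {'min_num_params': 0, 'xla': False}
```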
-
Traun Leyden authored
Update README.md with correct path to examples/seq2seq
-
Julien Chaumond authored
-
Yih-Dar authored
* fix
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-