- 29 Mar, 2023 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head))" (#22444) Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head)) (#21627)" This reverts commit bad83008.
-
- 23 Mar, 2023 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
Enforce for device_map strategies
-
- 20 Mar, 2023 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
- 15 Mar, 2023 3 commits
-
-
Sylvain Gugger authored
* Fix regression in pipeline when device=-1 is passed * Add regression test
-
amyeroberts authored
Revert changes
- 14 Mar, 2023 11 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)" This reverts commit 1c801d65.
-
Stas Bekman authored
* [trainer] add --optim adamw_torch_fused * change optim default * deal with non-torch * revert default change; prep; add fp16/amp assert * typo * typo
-
amyeroberts authored
* Don't rescale if in and in range 0-255 * Raise value error if int values too large * Update tests/test_image_transforms.py * Update tests/test_image_transforms.py
-
Alara Dirik authored
* create MaskedImageCompletionOutput * fix bugs * fix bugs
-
Sylvain Gugger authored
* Fix big model inference for T5 models in float16 * Apply suggestions from code review Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Style * Trigger CI with latest release --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Nicola Procopio authored
* added translated files added perf_train_cpu and perf_train_cpu_many * updated toctree
-
Yih-Dar authored
update values Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
* Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues
-
Yih-Dar authored
* Move `is_pipeline_test_to_skip` to specific model test classes --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
* temp fix * temporary fix * update * fix tests * fixup * update based on reveiew Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * update to fix tests * update docstring --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 13 Mar, 2023 20 commits
-
-
MichaelRipa authored
* Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added link to 'Pipeline for inference' tutorial * Trigger CI * Update docs/source/en/glossary.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added entry for self supervised learning, added deleted entries + fixed broken links * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Patrick von Platen authored
* [Safetensors] Add explicit flag to from pretrained * add test * remove @ * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Remove backend enforcment for torch.compile * Update error * Update src/transformers/training_args.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Style --------- Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
-
Stas Bekman authored
* [deepspeed docs] Activation Checkpointing * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update deepspeed.mdx --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* [trainer] fix bug in grad accum * comment out debug * fix one-off * rename counter
-
Sylvain Gugger authored
-
Joao Gante authored
* Let generate pick its inputs * fix squad seq2seq example
-
Younes Belkada authored
* add `get_input_embeddings` to `WhisperForAudioClassification` * add common tests * fix another common test * Update tests/models/whisper/test_modeling_whisper.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
bishmdl76 authored
Update configuration_align.py updated projected_dim=640 from 512 in arguments of AlignConfig
-
Yih-Dar authored
* Add script --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
mollerup23 authored
* Adding Type Hints to TF_Pegasus model * Updated some parameters per maintainer comments
-
Sylvain Gugger authored
-
Maria Khalusova authored
* WIP * WIP * manual inference example * make style * Apply suggestions from code review Co-authored-by:
Alara Dirik <8944735+alaradirik@users.noreply.github.com> --------- Co-authored-by:
Alara Dirik <8944735+alaradirik@users.noreply.github.com>
-
Karim Foda authored
* Fix gradient checkpointing bug in trocr * Fix format * Update src/transformers/models/trocr/modeling_trocr.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Karim Foda authored
-
Karim Foda authored
-
Younes Belkada authored
skip accelerate test
-
Nicola Procopio authored
* updated toctree * italian translation big_model.mdx * italian translation big_models
-
Karim Foda authored
-