- 13 Mar, 2023 21 commits
-
-
Stas Bekman authored
* [deepspeed docs] Activation Checkpointing * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update deepspeed.mdx --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* [trainer] fix bug in grad accum * comment out debug * fix one-off * rename counter
-
Sylvain Gugger authored
-
Joao Gante authored
* Let generate pick its inputs * fix squad seq2seq example
-
Younes Belkada authored
* add `get_input_embeddings` to `WhisperForAudioClassification` * add common tests * fix another common test * Update tests/models/whisper/test_modeling_whisper.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
bishmdl76 authored
Update configuration_align.py: updated projected_dim=640 from 512 in arguments of AlignConfig
-
Yih-Dar authored
* Add script --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
mollerup23 authored
* Adding Type Hints to TF_Pegasus model * Updated some parameters per maintainer comments
-
Sylvain Gugger authored
-
Maria Khalusova authored
* WIP * WIP * manual inference example * make style * Apply suggestions from code review Co-authored-by:
Alara Dirik <8944735+alaradirik@users.noreply.github.com> --------- Co-authored-by:
Alara Dirik <8944735+alaradirik@users.noreply.github.com>
-
Karim Foda authored
* Fix gradient checkpointing bug in trocr * Fix format * Update src/transformers/models/trocr/modeling_trocr.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Karim Foda authored
-
Karim Foda authored
-
Younes Belkada authored
skip accelerate test
-
Nicola Procopio authored
* updated toctree * italian translation big_model.mdx * italian translation big_models
-
Karim Foda authored
-
Karim Foda authored
-
Karim Foda authored
-
Alex Calabrese authored
* Add pr_checks.mdx Italian translation (#17459) * Updated pr_checks.mdx Italian translation (#17459)
-
wangpeng authored
* add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * remove representation_size from MGPSTRConfig * reformat configuration_mgp_str.py * format test_processor_mgp_str.py * add test for tokenizer and complete model/processor test and model file * rm unnecessary tuple in modeling_mgp_str * reduce hidden_size/layers/label_size in test_model * add integration tests and change MGPSTR to Mgpstr * add test for logit values * reformat test model file --------- Co-authored-by:
yue kun <yuekun.wp@alibaba-inc.com>
-
Alara Dirik authored
Adds AutoModelForZeroShotImageClassification to transformers
-
- 11 Mar, 2023 1 commit
-
-
Sanchit Gandhi authored
* [Whisper] Remove embed_tokens from encoder docstring * new line to retrigger CI * remove new line
-
- 10 Mar, 2023 11 commits
-
-
Sylvain Gugger authored
* Fix imports of TF MobileViT * Fix copies
-
Maria Khalusova authored
* re: #21989 * update re: #21989 * removed cpu option * make style
-
Dean Wyatte authored
-
J-shang authored
fix hint
-
Karim Foda authored
* Fix gradient checkpointing bug in Speecht5 * Update modeling_speech_to_text.py * Update src/transformers/models/speech_to_text/modeling_speech_to_text.py * Fix change errors --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com>
-
Joao Gante authored
fix broken links
-
Kevin Jiang authored
* Update flan-ul2.mdx * Update flan-ul2.mdx
-
Arthur authored
* Make sure position ids are masked * test that padded input produce the same results * fix failing tests * fixup * fix batch test
-
Karim Foda authored
-
Karim Foda authored
* Fix gradient checkpointing bug in Speech2Text * Update modeling_speech_to_text.py * Update modeling_speech_to_text_2.py
- 09 Mar, 2023 7 commits
-
-
Sylvain Gugger authored
* Add a progress bar for the total download of shards * Check for no cache at all * Fix check
-
aws-sangeetha authored
Co-authored-by:
EC2 Default User <ec2-user@ip-172-31-42-72.us-west-2.compute.internal>
-
Yih-Dar authored
Update the script Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
* Add setters by type of args to TrainingArguments * Define more setters
-
Yih-Dar authored
* skip 3 tests --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Jiali Mei authored
* Edit the docstring of `image_processing_donut` to match code * improve style * more style improvement after installing quality
-
Stas Bekman authored
* [deepspeed] offload + non-cpuadam optimizer exception * flip * revert min version
-