- 02 Sep, 2022 4 commits
-
-
NouamaneTazi authored
-
NouamaneTazi authored
-
NouamaneTazi authored
-
NouamaneTazi authored
-
- 17 Aug, 2022 3 commits
-
-
thomasw21 authored
-
Yih-Dar authored
Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Stefan Schweter authored
* examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow)
* examples: remove support for Bloom in token classification (FLAX and TensorFlow currently have no support for it)
-
- 16 Aug, 2022 7 commits
-
-
Younes Belkada authored
* bnb minor modifications
  - refactor documentation
  - add troubleshooting README
  - add PyPI library to the Dockerfile
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* put in one block
  - put bash instructions in one block
* update readme
  - refactor hardware requirements a bit
* change text a bit
* Apply suggestions from code review
  Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* apply suggestions
  Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* add link to paper
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update tests/mixed_int8/README.md
* Apply suggestions from code review
* refactor a bit
* add instructions for Turing & Ampere
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* add A6000
* clarify a bit
* remove small part
* Update tests/mixed_int8/README.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
zhoutang776 authored
* Update run_translation_no_trainer.py: found an error in selecting `no_decay` parameters, plus some small modifications for when the user continues training from a checkpoint
* fixes `no_decay` and `resume_step` issue
  1. change the `no_decay` list
  2. if users continue training their model from a provided checkpoint, `resume_step` is not initialized properly when `args.gradient_accumulation_steps != 1`
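The two issues named in this commit message can be illustrated with a small, dependency-free sketch. It is not the code from `run_translation_no_trainer.py`: the function names, the `step_N` checkpoint naming, and the example `no_decay` entries are assumptions made for illustration. The idea is that a checkpoint labelled by optimizer updates must be converted to dataloader batches by multiplying by the gradient accumulation factor, and that parameters are excluded from weight decay by substring match on their names.

```python
def resume_position(checkpoint_name, gradient_accumulation_steps, batches_per_epoch):
    """Map a 'step_N' checkpoint name to (starting_epoch, batches_to_skip).

    Assumption: N counts optimizer updates, so the number of dataloader
    batches already consumed is N * gradient_accumulation_steps.
    """
    optimizer_steps = int(checkpoint_name.replace("step_", ""))
    batches_seen = optimizer_steps * gradient_accumulation_steps
    return divmod(batches_seen, batches_per_epoch)


def split_by_weight_decay(param_names, no_decay):
    """Split parameter names into (decayed, not_decayed) by substring match."""
    decayed = [n for n in param_names if not any(tag in n for tag in no_decay)]
    not_decayed = [n for n in param_names if any(tag in n for tag in no_decay)]
    return decayed, not_decayed


if __name__ == "__main__":
    # 300 optimizer updates with 4 accumulation steps = 1200 batches consumed.
    print(resume_position("step_300", 4, 500))

    # Hypothetical parameter names and no_decay tags.
    names = [
        "encoder.block.0.layer_norm.weight",
        "encoder.block.0.dense.weight",
        "encoder.block.0.dense.bias",
    ]
    print(split_by_weight_decay(names, ["bias", "layer_norm.weight"]))
```

Because matching is by substring, a tag such as `LayerNorm.weight` would not match a parameter named `layer_norm.weight`, which is one way a `no_decay` list can silently miss parameters.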
-
flozi00 authored
-
Joao Gante authored
-
Yih-Dar authored
Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Sourab Mangrulkar authored
* mac m1 `mps` integration
* Update docs/source/en/main_classes/trainer.mdx
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* addressing comments
* Apply suggestions from code review
  Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
* resolve comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
-
- 14 Aug, 2022 1 commit
-
-
Karim Foda authored
* [Flax] Add remat (gradient checkpointing)
* fix variable naming in test
* flip: checkpoint using a method
* fix naming
* fix class naming
* apply PVP's suggestions from code review
* add gradient_checkpointing to examples
* Add gradient_checkpointing to run_mlm_flax
* Add remat to longt5
* Add gradient checkpointing test longt5
* Fix args errors
* Fix remaining tests
* Make fixup & quality fixes
* replace kwargs
* remove unnecessary kwargs
* Make fixup changes
* revert long_t5_flax changes
* Remove return_dict and copy to LongT5
* Remove test_gradient_checkpointing
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
-
- 12 Aug, 2022 14 commits
-
-
Younes Belkada authored
-
Stas Bekman authored
* [fsmt] deal with -100 indices in decoder ids
  Fixes: https://github.com/huggingface/transformers/issues/17945
  Decoder ids get the default index -100, which breaks the model. As in t5 and many other models, add a fix to replace -100 with the correct pad index. For some reason this use case hasn't been exercised with this model until recently, so the issue seems to have been there since the beginning. Any suggestions on how to add a simple test here? Or perhaps we have something similar already? The user's script is quite massive.
* style
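The fix described in this message, replacing the loss-ignore index -100 with the pad index before the ids reach the decoder, can be sketched as follows. This is a list-based illustration under stated assumptions, not the FSMT implementation: the function name is hypothetical, and tensor-based model code would apply the same replacement to a tensor rather than to nested lists.

```python
IGNORE_INDEX = -100  # label value conventionally skipped by the loss


def replace_ignore_index(label_ids, pad_token_id):
    """Return a copy of label_ids with every IGNORE_INDEX replaced by pad_token_id.

    -100 is not a valid vocabulary index, so it has to be swapped for a real
    token id before the ids are used for an embedding lookup.
    """
    return [
        [pad_token_id if token == IGNORE_INDEX else token for token in row]
        for row in label_ids
    ]


if __name__ == "__main__":
    labels = [[5, 6, 2, -100, -100], [7, 2, -100, -100, -100]]
    # pad_token_id=1 is an arbitrary value chosen for the example.
    print(replace_ignore_index(labels, pad_token_id=1))
```

The input is left unmodified, so the original labels (still containing -100) remain available for the loss computation.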
-
Stas Bekman authored
the manual anchors end up being duplicated with automatically added anchors and no longer work.
-
Niklas Muennighoff authored
* Update BLOOM parameter counts
* Update BLOOM parameter counts
-
NielsRogge authored
Co-authored-by:
Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
NielsRogge authored
* First draft
* Improve script
* Update script
* Make conversion work
* Add final_layer_norm attribute to Swin's config
* Add DonutProcessor
* Convert more models
* Improve feature extractor and convert base models
* Fix bug
* Improve integration tests
* Improve integration tests and add model to README
* Add doc test
* Add feature extractor to docs
* Fix integration tests
* Remove register_buffer
* Fix toctree and add missing attribute
* Add DonutSwin
* Make conversion script work
* Improve conversion script
* Address comment
* Fix bug
* Fix another bug
* Remove deprecated method from docs
* Make Swin and Swinv2 untouched
* Fix code examples
* Fix processor
* Update model_type to donut-swin
* Add feature extractor tests, add token2json method, improve feature extractor
* Fix failing tests, remove integration test
* Add do_thumbnail for consistency
* Improve code examples
* Add code example for document parsing
* Add DonutSwin to MODEL_NAMES_MAPPING
* Add model to appropriate place in toctree
* Update namespace to appropriate organization
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Younes Belkada authored
* Supporting seq2seq models for `bitsandbytes` integration
  - `bitsandbytes` integration now supports seq2seq models
  - check if a model has tied weights as an additional check
* small modification
  - tie the weights before looking at tied weights!
-
Joao Gante authored
* validate generate model_kwargs
* generate tests -- not all models have an attn mask
-
Yih-Dar authored
Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Sourab Mangrulkar authored
-
Stas Bekman authored
-
Wang, Yi authored
* update doc for perf_train_cpu_many, add mpi introduction
  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Update docs/source/en/perf_train_cpu_many.mdx
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_cpu_many.mdx
  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Ian Castillo authored
* Add type hints for Vilt models
* Add missing return type for TokenClassification class
-
Arthur authored
* initial commit
* add small test
* add cross pt tf flag to test
* fix quality
* style
* update test with new repo
* fix failing test
* update
* fix wrong param ordering
* style
* update based on review
* update related to recent new caching mechanism
* quality
* Update based on review
  Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* quality and style
* Update src/transformers/modeling_flax_utils.py
  Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 11 Aug, 2022 11 commits
-
-
amyeroberts authored
-
Alara Dirik authored
-
dependabot[bot] authored
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
  - [Release notes](https://github.com/jupyter/nbconvert/releases)
  - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)
---
updated-dependencies:
  - dependency-name: nbconvert
    dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
  - [Release notes](https://github.com/jupyter/nbconvert/releases)
  - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)
---
updated-dependencies:
  - dependency-name: nbconvert
    dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix docstrings with last version of hf-doc-builder styler
* Remove empty Parameter block
-
Michael Benayoun authored
* Support audio classification architectures for labels generation, and provide a flag to control whether warnings are printed
* Use ENV_VARS_TRUE_VALUES
-
iiLaurens authored
* Fix critical trace warnings to allow ONNX export
* Force input to `sqrt` to be float type
* Cleanup code
* Remove unused import statement
* Update model sew
* Small refactor
  Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Use broadcasting instead of repeat
* Implement suggestion
  Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Match deberta v2 changes in sew_d
* Improve code quality
* Update code quality
* Consistency of small refactor
* Match changes in sew_d
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
-
flozi00 authored
* Create _config.py
* Create _toctree.yml
* Create index.mdx (not sure about "du / ihr" or "sie")
* Create quicktour.mdx
* Update _toctree.yml
* Update build_documentation.yml
* Update build_pr_documentation.yml
* fix build
* Update index.mdx
* Update quicktour.mdx
* Create installation.mdx
* Update _toctree.yml
-
Dan Jones authored
Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486)
* changing BartLearnedPositionalEmbedding forward signature and references to it
* removing debugging dead code (thanks style checker)
* blackened modeling_bart file
* removing copy inconsistencies via make fix-copies
* changing references to copied signatures in Bart variants
* make fix-copies once more
* using expand over repeat (thanks @michaelbenayoun)
* expand instead of repeat for all model copies
Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
-
Sylvain Gugger authored
-
Wonseok Lee (Jack) authored
* fix typos
* fix sequence_length docs of LayoutLMv3Model
* delete trailing white spaces
* fix layoutlmv3 docs more
* apply make fixup & quality
* change to two versions of input docstring
* apply make fixup & quality
-