- 03 Nov, 2022 7 commits
-
-
Nicolas Patry authored
-
Nicolas Patry authored
-
Nicolas Patry authored
-
Saad Mahmud authored
* Add example docstring for CamembertConfig
* Add configuration_camembert to documentation_tests
-
Yih-Dar authored
* Add skip_special_tokens=True in some doctest
* For T5
* Fix for speech_to_text.mdx
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
-
Nicolas Patry authored
-
- 02 Nov, 2022 13 commits
-
-
Steven Liu authored
-
Yih-Dar authored
* Show versions
* check
* store outputs
* revert
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Ben Eyal authored
Fix Issue 15003: SentencePiece Tokenizers Not Adding Special Tokens in `convert_tokens_to_string` (#15775)
* Add test for SentencePiece not adding special tokens to strings
* Add SentencePieceStringConversionMixin to fix issue 15003
* Fix conversion from tokens to string for most SentencePiece tokenizers
  Tokenizers fixed:
  - AlbertTokenizer
  - BarthezTokenizer
  - CamembertTokenizer
  - FNetTokenizer
  - M2M100Tokenizer
  - MBart50Tokenizer
  - PegasusTokenizer
  - Speech2TextTokenizer
* Fix MarianTokenizer, adjust SentencePiece test to accommodate vocab
* Fix DebertaV2Tokenizer
* Ignore LayoutXLMTokenizer in SentencePiece string conversion test
* Run 'make style' and 'make quality'
* Clean convert_tokens_to_string test
  Instead of explicitly ignoring LayoutXLMTokenizer in the test, override the test in LayoutLMTokenizationTest and do nothing in it.
* Remove commented out code
* Improve robustness of convert_tokens_to_string test
  Instead of comparing lengths of re-tokenized text and input_ids, check that converting all special tokens to string yields a string with all special tokens.
* Inline and remove SentencePieceStringConversionMixin
  The convert_tokens_to_string method is now implemented in each relevant SentencePiece tokenizer.
* Run 'make style' and 'make quality'
* Revert removal of space in convert_tokens_to_string
* Remove redundant import
* Revert test text to original
* Uncomment the lowercasing of the reverse_text variable
* Mimic Rust tokenizer behavior for tokenizers
  - Albert
  - Barthez
  - Camembert
  - MBart50
  - T5
* Fix accidentally skipping test in wrong tokenizer
* Add test for equivalent Rust and slow tokenizer behavior
* Override _decode in BigBirdTokenizer to mimic Rust behavior
* Override _decode in FNetTokenizer to mimic Rust behavior
* Override _decode in XLNetTokenizer to mimic Rust behavior
* Remove unused 're' import
* Update DebertaV2Tokenizer to mimic Rust tokenizer
* Deberta tokenizer now behaves like Albert and its `convert_tokens_to_string` is not tested.
* Ignore problematic tests in Deberta V2
* Add comment on why the Deberta V2 tests are skipped
-
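Note on the `convert_tokens_to_string` fix (#15775): the commit message describes checking that converting all special tokens to a string yields a string that still contains every special token. The following is a minimal, library-free sketch of that idea, not the code from the commit. The special-token list, the `decode_pieces` stand-in for a SentencePiece decode, and the joining with single spaces are all assumptions made for illustration.

```python
from typing import List

# Hypothetical special tokens for this toy example (not taken from a real vocab).
SPECIAL_TOKENS = ["<s>", "</s>", "<pad>"]


def decode_pieces(pieces: List[str]) -> str:
    """Toy stand-in for a SentencePiece decode: "▁" marks a word boundary."""
    return "".join(pieces).replace("▁", " ").strip()


def convert_tokens_to_string(tokens: List[str]) -> str:
    """Decode ordinary pieces in runs, passing special tokens through verbatim."""
    parts: List[str] = []
    run: List[str] = []
    for token in tokens:
        if token in SPECIAL_TOKENS:
            # Flush the pending run of ordinary pieces before the special token.
            if run:
                parts.append(decode_pieces(run))
                run = []
            parts.append(token)
        else:
            run.append(token)
    if run:
        parts.append(decode_pieces(run))
    return " ".join(parts)


text = convert_tokens_to_string(["<s>", "▁Hello", "▁wor", "ld", "</s>"])

# The robustness check described in the commit message, applied to the toy:
all_specials = convert_tokens_to_string(SPECIAL_TOKENS)
specials_survive = all(tok in all_specials for tok in SPECIAL_TOKENS)
```

In this sketch the special tokens never reach the piece decoder, which is why they cannot be dropped during conversion. The real tokenizers may handle spacing differently.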
Yih-Dar authored
* Fix doctest
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* part 1
* part 2
* part 3
* fix
* For CANINE
* For ESMFold
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Saad Mahmud authored
* Add example docstring for DebertaV2Config
* Add DebertaV2Config to documentation_tests
* Fix mistake with directory name
-
amyeroberts authored
-
Sylvain Gugger authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
* Add CLIP image processor
* Crop size as dict too
* Update warning
* Actually use logger this time
* Normalize doesn't change dtype of input
* Add perceiver image processor
* Tidy up
* Add DPT image processor
* Add Vilt image processor
* Tidy up
* Add poolformer image processor
* Tidy up
* Add LayoutLM v2 and v3 image processors
* Tidy up
* Add Flava image processor
* Tidy up
* Add deit image processor
* Tidy up
* Add ConvNext image processor
* Tidy up
* Add levit image processor
* Add segformer image processor
* Add in post processing
* Fix up
* Add ImageGPT image processor
* Fixup
* Add mobilevit image processor
* Tidy up
* Add postprocessing
* Fixup
* Add VideoMAE image processor
* Tidy up
* Add ImageGPT image processor
* Fixup
* Add ViT image processor
* Tidy up
* Add beit image processor
* Add mobilevit image processor
* Tidy up
* Add postprocessing
* Fixup
* Fix up
* Fix flava and remove tree module
* Fix image classification pipeline failing tests
* Update feature extractor in trainer scripts
* Update pad_if_smaller to accept tuple and int size
* Update for image segmentation pipeline
* Update src/transformers/models/perceiver/image_processing_perceiver.py
  Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* Update src/transformers/image_processing_utils.py
  Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/beit/image_processing_beit.py
  Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* PR comments - docstrings; remove accidentally added resize; var names
* Update docstrings
* Add exception if size is not in the right format
* Fix exception check
* Fix up
* Use shortest_edge in tuple in script
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
Ripose authored
-
Yih-Dar authored
* clean up
* For backward compatibility
* clean up
* Same changes for more models
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
-
- 01 Nov, 2022 12 commits
-
-
Steven Liu authored
-
Joao Gante authored
* Use beam search functionality; Add extra outputs and test
* Add full tests for contrastive search
* Add error message on unconventional cache format
-
Steven Liu authored
* add layoutlmv3 resource
* add layoutlmv2 resources
* fix button
-
Steven Liu authored
* add resources for bert
* add course chapters
* apply reviews
* add pipeline icons and community resource
* fix buttons
-
Steven Liu authored
-
Matt authored
* Add ESMFold code sample
* sorry sylvain
* make fixup
* sorry sylvain again
-
Ikko Ashimine authored
* Add japanese translated README.md
* Add README_ja.md link
* Add japanese translate to check_copies.py
* Add guide to Japanese README.md
* Update README_ja.md
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update utils/check_copies.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Wang Ran (汪然) authored
-
Sayak Paul authored
-
Mohit Sharma authored
* Added onnx config whisper
* added whisper support onnx
* add audio input data
* added whisper support onnx
* fixed the seqlength value
* Updated the whisper onnx config
* restore files to old version
* removed attention mask from inputs
* Updated get_dummy_input_onnxruntime docstring
* Updated relative imports and token generation
* update docstring
-
Sylvain Gugger authored
-
Matt authored
* initial commit
* First draft that gets outputs without crashing!
* Add all the ported openfold dependencies
* testing
* Restructure config files for ESMFold
* Debugging to find output discrepancies
* Mainly style
* Make model runnable without extra deps
* Remove utils and merge them to the modeling file
* Use correct gelu and remove some debug prints
* More cleanup
* Update esm docs
* Update conversion script to support ESMFold properly
* Port some top-level changes from ESMFold repo
* Expand EsmFold docstrings
* Make attention_mask optional (default to all 1s)
* Add inference test for ESMFold
* Use config and not n kwargs
* Add modeling output class
* Remove einops
* Remove chunking in ESM FFN
* Update tests for ESMFold
* Quality
* Repo consistency
* Remove tree dependency from ESMFold
* make fixup
* Add an error in case my structure map function breaks later
* Remove needless code
* Stop auto-casting the LM to float16 so CPU tests pass
* Stop auto-casting the LM to float16 so CPU tests pass
* Final test updates
* Split test file
* Copyright and quality
* Unpin PyTorch to see built doc
* Fix config file to_dict() method
* Add some docstrings to the output
* Skip TF checkpoint tests for ESM until we reupload those
* make fixup
* More docstrings
* Unpin to get even with main
* Flag example to write
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
-
- 31 Oct, 2022 8 commits
-
-
NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Yih-Dar authored
* pin torch to < 1.13
* pin torch to < 1.13
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Jean Charles Kouame authored
-
Sanchit Gandhi authored
-
Sanchit Gandhi authored
* [modelcard] Update for ASR
* style
-
Chiao authored
* gradient checkpointing for GPT-NeoX
* initialize gradient checkpointing flag
* must set flag before init
-
Saad Mahmud authored
* Add Example docstring to DebertaConfig
* Add configuration_deberta to documentation_tests
* Add microsoft/deberta-base to example docstring
* Fix example docstring mistake
-
Yih-Dar authored
* donut -> donut-swin
* remove ("donut-swin", "DonutProcessor")
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-