- 19 Dec, 2023 5 commits
-
-
Mike Zellinger authored
In docstring for PreTrainedModel.resize_token_embeddings, correct definition of new_num_tokens parameter to read "the new number of tokens" (meaning the new size of the vocab) rather than "the number of new tokens" (number of newly added tokens only).
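The distinction in this docstring fix can be shown with a minimal pure-Python sketch. It does not use the library: `resize_embedding_rows` is a hypothetical stand-in, written only to mirror the corrected meaning of `new_num_tokens` as the total vocabulary size after resizing, not the count of newly added tokens.

```python
from typing import List


def resize_embedding_rows(
    embeddings: List[List[float]], new_num_tokens: int
) -> List[List[float]]:
    """Hypothetical helper: return a matrix with exactly `new_num_tokens` rows.

    `new_num_tokens` is the NEW TOTAL number of rows (the new vocab size).
    Existing rows are kept (or truncated); extra rows are zero-initialised.
    """
    if new_num_tokens < 0:
        raise ValueError("new_num_tokens must be non-negative")
    dim = len(embeddings[0]) if embeddings else 0
    kept = [row[:] for row in embeddings[:new_num_tokens]]
    added = [[0.0] * dim for _ in range(new_num_tokens - len(kept))]
    return kept + added


old_vocab_size = 5
num_added_tokens = 2
embeddings = [[float(i)] * 3 for i in range(old_vocab_size)]

# Correct reading: pass the new total size (old size + newly added tokens).
resized = resize_embedding_rows(embeddings, old_vocab_size + num_added_tokens)

# Misreading the parameter as "number of new tokens" shrinks the matrix instead.
shrunk = resize_embedding_rows(embeddings, num_added_tokens)

print(len(resized), len(shrunk))
```

With the real library, the usual pattern is to pass the tokenizer length after adding tokens, e.g. `model.resize_token_embeddings(len(tokenizer))`, which matches the "new total size" reading.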
-
Arthur authored
* default config should not use sliding window * update the doc * nits * add a proper test * update * update * update expected value * Update src/transformers/tokenization_utils_fast.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * convert to float * average then N**2 * comment * revert nit * good to go * fixup * Update tests/models/mixtral/test_modeling_mixtral.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * revert unrelated change --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
Joao Gante authored
* speculative decoding * fix test * space * better comments * remove redundant test * test nit * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * PR comments --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
amyeroberts authored
-
qihqi authored
* When saving a model, make a copy to be moved to CPU; don't move the original model * make deepcopy inside of _save_tpu * Move to TPU without copy
-
- 18 Dec, 2023 11 commits
-
-
Aaron Jimenez authored
Fix token link
-
Mike Salvatore authored
-
Steven Liu authored
* doc fix friday * deprecated objects * update not_doctested * update toctree
-
Rockerz authored
Update semantic_segmentation.md
-
Matt authored
* More build_in_name_scope() * Make sure we set the save spec now we don't do it with dummies anymore * make fixup
-
Lucain authored
remove warning if DISABLE_TELEMETRY is used
-
Daize Dong authored
* Disable jitter noise during evaluation * Update outdated configuration information * Formatting * Add new line
-
lain authored
-
Wang, Yi authored
To reduce storage size and checkpoint-saving time when training with DeepSpeed. Signed-off-by:
Wang, Yi <yi.a.wang@intel.com>
-
Aeneas Stankowski authored
Update mixtral.md: correct minor typo in overview
-
Younes Belkada authored
add SDPA into llava
-
- 17 Dec, 2023 2 commits
-
-
cyyever authored
-
Poedator authored
* edits to _prepare_4d_causal_attention_mask() * initial tests for 4d mask * attention_mask_for_sdpa support * added test for inner model hidden * added autotest decorators * test mask dtype to torch.int64 * torch.testing.assert_close Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * torch_device and @torch_gpu in tests * upd tests * +torch decorators * torch decorators fixed * more decorators! * even more decorators * fewer decorators --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 16 Dec, 2023 1 commit
-
-
Sourab Mangrulkar authored
* fix resuming from ckpt when using FSDP with FULL_STATE_DICT * update tests * fix tests
-
- 15 Dec, 2023 18 commits
-
-
Steven Liu authored
* mps docs * toctree
-
Steven Liu authored
* first draft * add to toctree * edits * feedback
-
Younes Belkada authored
* Update vipllava.md * Update modeling_vipllava.py
-
Ligeng Zhu authored
* Fix wrong examples in llava usage. * Update modeling_llava.py
-
Kotaro Tanahashi authored
Fix `low_cpu_mem_usage` Flag Conflict with DeepSpeed Zero 3 in `from_pretrained` for Models with `keep_in_fp32_modules` (#27762) Fix `from_pretrained` Logic for `low_cpu_mem_usage` with DeepSpeed Zero3
-
Quentin Lhoest authored
* fix hf-internal-testing/fixtures_image_utils * fix test * comments
-
dumpmemory authored
* add multi-node training setting * fix style
-
Julien Chaumond authored
* make torch.load a bit safer * Fixes --------- Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Ke Wen authored
* Put device in tensor constructor instead of to() * Fix copy
-
Adilzhan Ismailov authored
Add past_key_values to _skip_keys_device_placement for LLaVa
-
Yoach Lacombe authored
* skip test from SpeechInput * refine description of skip
-
Younes Belkada authored
* Update convert_mixtral_weights_to_hf.py * forward contrib credits from original fix --------- Co-authored-by:
thomasw21 <thomasw21@users.noreply.github.com>
-
Cylis authored
-
Yoach Lacombe authored
-
Sanchit Gandhi authored
-
Sanchit Gandhi authored
* [Flax BERT] Update deprecated 'split' method * fix copies
-
Younes Belkada authored
fix for mistral
-
Younes Belkada authored
* fix fa-2 issue * fix test * Update src/transformers/modeling_utils.py Co-authored-by:
fxmarty <9808326+fxmarty@users.noreply.github.com> * cleaner fix * up * add more robust tests * Update src/transformers/modeling_utils.py Co-authored-by:
fxmarty <9808326+fxmarty@users.noreply.github.com> * fixup * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * pop * add test --------- Co-authored-by:
fxmarty <9808326+fxmarty@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 14 Dec, 2023 3 commits
-
-
amyeroberts authored
Remove warning when enum is created
-
Matt authored
Replace build() with build_in_name_scope() for some tests
-
Matt authored
* Add a convenience method for building in your own name scope * Second attempt at auto layer building * Revert "Second attempt at auto layer building" This reverts commit e03a3aaecf9ec41a805582b83cbdfe3290a631be. * Attempt #3 * Revert "Attempt #3" This reverts commit b9df7a0857560d29b5abbed6127d9e9eca77cf47. * Add missing attributes that we're going to need later * Add some attributes we're going to need later * A fourth attempt! Feel the power flow through you! * Revert "A fourth attempt! Feel the power flow through you!" This reverts commit 6bf4aaf3875d6f28485f50187617a4c616c8aff7. * Add more values we'll need later * TF refactor that we'll need later * Revert "TF refactor that we'll need later" This reverts commit ca07202fb5b7b7436b893baa8d688b4f348ea7b9. * Revert "Revert "TF refactor that we'll need later"" This reverts commit 1beb0f39f293ed9c27594575e1c849aadeb15c13. * make fixup * Attempt five! * Revert "Attempt five!" This reverts commit 3302207958dfd0374b0447a51c06eea51a506044. * Attempt six - this time don't add empty methods * Revert "Attempt six - this time don't add empty methods" This reverts commit 67d60129be75416b6beb8f47c7d38d77b18d79bb. * Attempt seven - better base model class detection! * Revert "Attempt seven - better base model class detection!" This reverts commit 5f14845e92ea0e87c598da933bfbfee10f553bc9. * Another attribute we'll need later * Try again with the missing attribute! * Revert "Try again with the missing attribute!" This reverts commit 760c6f30c5dffb3e04b0e73c34a77d1882a0fef7. * This is the attempt that will pierce the heavens! * Revert "This is the attempt that will pierce the heavens!" This reverts commit c868bb657de057aca7a5260350a3f831fc4dfee6. * Attempt seven - snag list is steadily decreasing * Revert "Attempt seven - snag list is steadily decreasing" This reverts commit 46fbd975deda64429bfb3e5fac4fc0370c00d316. * Attempt eight - will an empty snag list do it? * Revert "Attempt eight - will an empty snag list do it?" 
This reverts commit 7c8a3c2b083253649569e9877e02054ae5cec67b. * Fixes to Hubert issues that cause problems later * Trying again with Conv1D/SeparableConv fixes * Revert "Trying again with Conv1D/SeparableConv fixes" This reverts commit 55092bca952bc0f750aa1ffe246a640bf1e2036e. * Apply the build shape fixes to Wav2Vec2 as well * One more attempt! * Revert "One more attempt!" This reverts commit 5ac3e4cb01b9458cc93312873725f9444ae7261c. * Another attempt! * Revert "Another attempt!" This reverts commit ea16d890e019d7de8792a3b8e72f3b1c02adae50. * Let's see how many failures we get without the internal build method * Fix OpenAI * Fix MobileBERT * (Mostly) fix GroupVIT * Fix BLIP * One more BLIP fix * One more BLIP fix! * Fix Regnet * Finally fully fix GroupViT * Fix Data2Vec and add the new AdaptivePool * Fix Segformer * Fix Albert * Fix Deberta/DebertaV2 * Fix XLM * Actually fix XLM * Fix Flaubert * Fix lxmert * Fix Resnet * Fix ConvBERT * Fix ESM * Fix Convnext / ConvnextV2 * Fix SAM * Fix Efficientformer * Fix LayoutLMv3 * Fix speech_to_text * Fix mpnet and mobilevit * Fix Swin * Fix CTRL * Fix CVT * Fix DPR * Fix Wav2Vec2 * Fix T5 * Fix Hubert * Fix GPT2 * Fix Whisper * Fix DeiT * Fix the encoder-decoder / dual-encoder classes * make fix-copies * build in name scope * Fix summarization test * Fix tied weight names for BART + Blenderbot * Fix tied weight name building * Fix to TFESM weight building * Update TF SAM * Expand all the shapes out into Big Boy Shapes
-