- 21 Dec, 2023 6 commits
-
-
Arthur authored
* some nits
* update test
* add support d\sd[a
* remove some dummy inputs
* all good
* style
* nits
* fixes
* fix more copies
* nits
* styling
* fix
* Update src/transformers/models/mistral/modeling_mistral.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add a slow test just to be sure
* fixup
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Sanchit Gandhi authored
* [Whisper] Use torch for stft if available
* update docstring
* mock patch decorator
* fit on one line
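The fallback this commit introduces can be sketched as a dispatch on whether torch is importable: prefer `torch.stft`, otherwise fall back to a slow pure-Python path. The function names and the naive DFT below are illustrative stand-ins, not the actual transformers feature-extractor code:

```python
import importlib.util
import math

def _dft_frame(frame):
    """Naive one-sided DFT magnitudes for one frame (slow fallback path)."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(-2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(-2 * math.pi * k * i / n) for i, x in enumerate(frame))
        out.append(math.hypot(re, im))
    return out

def stft_magnitudes(signal, frame_length, hop_length):
    """Use torch.stft when torch is installed, else the pure-Python fallback."""
    if importlib.util.find_spec("torch") is not None:
        import torch  # fast path, mirroring the commit's intent
        spec = torch.stft(
            torch.tensor(signal, dtype=torch.float32),
            n_fft=frame_length, hop_length=hop_length,
            return_complex=True, center=False,
        )
        return spec.abs().T.tolist()  # (frames, freq_bins)
    frames = [signal[i:i + frame_length]
              for i in range(0, len(signal) - frame_length + 1, hop_length)]
    return [_dft_frame(f) for f in frames]
```

Both paths agree on framing (no centering, rectangular window), so swapping in the torch path changes speed, not results.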
-
Joao Gante authored
-
Poedator authored
* updated bitsandbytes.py
* rm test_raise_* from test_4bit.py
* add test_4bit_serialization.py
* modeling_utils bulk edits
* bnb_ver 0.41.3 in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* @slow reinstated
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* bnb ver 0.41.3 in src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* rm bnb version todo in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* moved 4b serialization tests to test_4bit
* tests upd for opt
* to torch_device
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* ruff fixes to tests
* rm redundant bnb version check in mod_utils
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* restore _hf_peft_config_loaded modeling_utils.py::2188
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* restore _hf_peft_config_loaded test in modeling_utils.py::2199
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* fixed NOT getattr(self, "is_8bit_serializable")
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* setting model.is_4bit_serializable
* rm separate fp16_statistics arg from set_module...
* rm else branch in integrations::bnb::set_module
* bnb 4bit dtype check
* upd comment on 4bit weights
* upd tests for FP4 safe
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Dean Wyatte authored
disable retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest
- 20 Dec, 2023 11 commits
-
-
amyeroberts authored
* Fix yolos resizing
* Update tests
* Add a test
-
Joao Gante authored
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
-
Steven Liu authored
* fsdp, debugging, gpu selection
* fix hfoption
* fix
-
amyeroberts authored
* Iterate over out_features instead of stage_names
* Update for all backbones
* Add tests
* Fix
* Align timm backbone behaviour with other backbones
* Fix tests
* Stricter checks on set out_features and out_indices
* Revert back stage selection logic
* Remove out-of-order logic
* Document restriction in docstrings
-
amyeroberts authored
* Update FA2 exception msg to point to hub discussions
* Use path for hub url
-
Yih-Dar authored
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
peter-sk authored
* move code to Trainer.evaluate to enable use of that function with multiple datasets
* test
* update doc string
* and a tip
* forgot the type
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
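The behaviour this commit enables — `evaluate` accepting either one dataset or a dict of named datasets, with per-split metric prefixes — can be sketched as follows. This is a toy stand-in where a "dataset" is just a list of losses, not the actual Trainer code:

```python
def evaluate(eval_dataset, metric_key_prefix="eval"):
    """Sketch: a dict of named datasets is evaluated split by split,
    each split's metrics prefixed with its name (e.g. "eval_val1_loss")."""
    if isinstance(eval_dataset, dict):
        metrics = {}
        for name, dataset in eval_dataset.items():
            # Recurse per split with a name-qualified prefix.
            metrics.update(
                evaluate(dataset, metric_key_prefix=f"{metric_key_prefix}_{name}")
            )
        return metrics
    # Single-dataset path: the stand-in "loss" is just the mean of the values.
    loss = sum(eval_dataset) / len(eval_dataset)
    return {f"{metric_key_prefix}_loss": loss}
```

The prefixing keeps metrics from different evaluation splits from colliding in one flat metrics dict.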
-
Jong-hun Shin authored
* add attention_bias hparam for a model trained without attention biases
* fix argument documentation error
-
Sourab Mangrulkar authored
* fix fa2
* fix FA2 for popular models
* improve warning and add Younes as co-author
Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix the warning
* Add Tip
* typo fix
* nit
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Abolfazl Shahbazi authored
Signed-off-by: Abolfazl Shahbazi <abolfazl.shahbazi@intel.com>
-
- 19 Dec, 2023 6 commits
-
-
Aaron Jimenez authored
Fix mistral link in mixtral.md
-
Mike Zellinger authored
In the docstring for PreTrainedModel.resize_token_embeddings, correct the definition of the new_num_tokens parameter to read "the new number of tokens" (the new total vocabulary size) rather than "the number of new tokens" (only the count of newly added tokens).
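The corrected semantics can be illustrated with a toy version of the method, where `new_num_tokens` is the resulting total vocabulary size. The names, the zero-initialized padding rows, and the list-of-rows representation are illustrative, not the library implementation:

```python
def resize_token_embeddings(embeddings, new_num_tokens, dim=4):
    """Toy model of the semantics: `new_num_tokens` is the NEW TOTAL vocabulary
    size, not the number of newly added tokens. `embeddings` is a list of rows."""
    old = len(embeddings)
    if new_num_tokens <= old:
        return embeddings[:new_num_tokens]        # shrink: drop trailing rows
    added = new_num_tokens - old                  # grow: append fresh rows
    return embeddings + [[0.0] * dim for _ in range(added)]
```

In transformers the typical call is `model.resize_token_embeddings(len(tokenizer))` after adding tokens to the tokenizer — i.e. you pass the new total, exactly as the corrected docstring says.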
-
Arthur authored
* default config should not use sliding window
* update the doc
* nits
* add a proper test
* update
* update
* update expected value
* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* convert to float
* average then N**2
* comment
* revert nit
* good to go
* fixup
* Update tests/models/mixtral/test_modeling_mixtral.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* revert unrelated change
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
-
Joao Gante authored
* speculative decoding
* fix test
* space
* better comments
* remove redundant test
* test nit
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* PR comments
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
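Speculative decoding itself can be sketched in a few lines. This greedy exact-match variant — a cheap draft model proposes tokens, the target model verifies them, and the first disagreement is replaced by the target's own token — is a simplification: the real algorithm verifies the whole draft in one batched target pass and, when sampling, uses probabilistic acceptance. `draft` and `target` here are stand-in deterministic next-token functions:

```python
def speculative_decode(draft, target, prompt, num_draft=4, max_new=8):
    """Greedy speculative decoding sketch. `draft`/`target` map a token
    sequence to the next token id."""
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        # 1. Draft proposes a short continuation.
        proposal = []
        for _ in range(num_draft):
            proposal.append(draft(seq + proposal))
        # 2. Target verifies token by token (one batched pass in practice).
        for tok in proposal:
            expected = target(seq)
            if tok == expected:
                seq.append(tok)          # accepted: keep the draft token
            else:
                seq.append(expected)     # rejected: take the target's token
                break                    # and discard the rest of the draft
            if len(seq) - len(prompt) >= max_new:
                break
    return seq
```

When the draft agrees with the target, each round advances by several tokens at the cost of one (batched) target call, which is where the speedup comes from.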
-
amyeroberts authored
-
qihqi authored
* When saving a model, make a copy to be moved to CPU; don't move the original model
* make deepcopy inside of _save_tpu
* Move to tpu without copy
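The fix's pattern — copy first, then move only the copy off the accelerator — can be sketched with stand-in "tensors" (plain dicts carrying a `device` field; not actual XLA/TPU code, and the names are illustrative):

```python
import copy

def save_checkpoint(state_dict, save_fn):
    """Deep-copy the state before 'moving' it to CPU, so the live training
    state stays on its device and training can continue unaffected."""
    cpu_state = copy.deepcopy(state_dict)
    for tensor in cpu_state.values():
        tensor["device"] = "cpu"       # the copy moves; the original does not
    save_fn(cpu_state)
    return state_dict                  # original still on its device
```

Without the copy, moving tensors to CPU in-place would leave the model off-device mid-training — the bug the commit addresses.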
-
- 18 Dec, 2023 11 commits
-
-
Aaron Jimenez authored
Fix token link
-
Mike Salvatore authored
-
Steven Liu authored
* doc fix friday
* deprecated objects
* update not_doctested
* update toctree
-
Rockerz authored
Update semantic_segmentation.md
-
Matt authored
* More build_in_name_scope()
* Make sure we set the save spec now we don't do it with dummies anymore
* make fixup
-
Lucain authored
remove warning if DISABLE_TELEMETRY is used
-
Daize Dong authored
* Disable jitter noise during evaluation
* Update outdated configuration information
* Formatting
* Add new line
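Router jitter of this kind (as in Switch Transformers-style MoE routing) multiplies the router inputs by uniform noise in `[1 - eps, 1 + eps]` during training only, so evaluation stays deterministic. A minimal sketch with illustrative names:

```python
import random

def apply_jitter(values, jitter_noise, training, rng=None):
    """Multiplicative jitter on router inputs: applied only in training mode."""
    if not training or jitter_noise == 0.0:
        return list(values)            # eval path: no noise, deterministic
    rng = rng or random.Random()
    lo, hi = 1.0 - jitter_noise, 1.0 + jitter_noise
    return [v * rng.uniform(lo, hi) for v in values]
```

Gating on the training flag is exactly the change described: before the fix, noise leaked into evaluation and made eval metrics non-deterministic.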
-
lain authored
-
Wang, Yi authored
Reduce checkpoint storage size and save time during checkpoint saving when using DeepSpeed for training
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
-
Aeneas Stankowski authored
Update mixtral.md
correct minor typo in overview
-
Younes Belkada authored
add SDPA into llava
-
- 17 Dec, 2023 2 commits
-
-
cyyever authored
-
Poedator authored
* edits to _prepare_4d_causal_attention_mask()
* initial tests for 4d mask
* attention_mask_for_sdpa support
* added test for inner model hidden
* added autotest decorators
* test mask dtype to torch.int64
* torch.testing.assert_close
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* torch_device and @torch_gpu in tests
* upd tests
* +torch decorators
* torch decorators fixed
* more decorators!
* even more decorators
* fewer decorators
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
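What `_prepare_4d_causal_attention_mask` produces can be sketched in pure Python: a `(batch, key_length)` padding mask is expanded to an additive `(batch, 1, query_length, key_length)` mask, `0.0` where attention is allowed and a large negative value where it is not. This is a simplified illustration of the shape and semantics, not the library code:

```python
def prepare_4d_causal_attention_mask(attention_mask_2d, query_length,
                                     min_value=float("-inf")):
    """Expand a (batch, key_length) 0/1 padding mask into an additive
    (batch, 1, query_length, key_length) causal mask (head dim broadcasts)."""
    batch = len(attention_mask_2d)
    key_length = len(attention_mask_2d[0])
    offset = key_length - query_length     # past (cached) keys are all visible
    mask = []
    for b in range(batch):
        rows = []
        for q in range(query_length):
            row = []
            for k in range(key_length):
                causal_ok = k <= q + offset              # no attending ahead
                not_padded = attention_mask_2d[b][k] == 1
                row.append(0.0 if causal_ok and not_padded else min_value)
            rows.append(row)
        mask.append([rows])                # head-broadcast dim of size 1
    return mask
```

Because the mask is additive, it is simply summed onto the attention scores before softmax, which is also the form `scaled_dot_product_attention` accepts.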
-
- 16 Dec, 2023 1 commit
-
-
Sourab Mangrulkar authored
* fix resuming from ckpt when using FSDP with FULL_STATE_DICT
* update tests
* fix tests
-
- 15 Dec, 2023 3 commits
-
-
Steven Liu authored
* mps docs
* toctree
-
Steven Liu authored
* first draft
* add to toctree
* edits
* feedback
-
Younes Belkada authored
* Update vipllava.md
* Update modeling_vipllava.py
-