- 25 Jul, 2024 2 commits
-
-
Huazhong Ji authored
Remove unnecessary guard code related to PyTorch versions 1.4.2 ~ 1.7.0
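A rough sketch of the kind of guard this removes (the version bound and flag name are hypothetical): once the library's minimum supported PyTorch is well past 1.7.0, the legacy branch is dead code.

```python
from packaging import version
import torch

# Hypothetical shape of the guards being removed: with a modern minimum torch
# requirement the condition is always true, so the `else` branch can be deleted.
if version.parse(torch.__version__) >= version.parse("1.7.0"):
    supports_new_api = True
else:
    supports_new_api = False  # torch 1.4.2–1.6.x fallback, now unreachable
```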
-
Sanchit Gandhi authored
* [whisper] fix short-form output type * add test * make style * update long-form tests * fixes * last fix * finalise test
-
- 24 Jul, 2024 11 commits
-
-
Sai-Suraj-27 authored
Replaced deprecated unittest method with the correct one.
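The log does not name the method; a common case is the long-deprecated `assertEquals` alias being replaced by `assertEqual`, as in this illustrative sketch:

```python
import unittest

class ExampleTest(unittest.TestCase):
    def test_value(self):
        # `assertEquals` is a deprecated alias that emits a DeprecationWarning;
        # `assertEqual` is the supported spelling.
        self.assertEqual(1 + 1, 2)
```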
-
Matt authored
* No more default chat templates * Add the template to the GPT-SW3 tests since it's not available by default now * Fix GPT2 test * Fix Bloom test * Fix Bloom test * Remove default templates again
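With class-level default templates gone, a template has to live on the tokenizer (or be passed per call). A minimal sketch, assuming an illustrative checkpoint and template:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative checkpoint without a built-in template

# Attach a template to the tokenizer explicitly...
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)

messages = [{"role": "user", "content": "Hello!"}]
text = tokenizer.apply_chat_template(messages, tokenize=False)

# ...or pass one for a single call via the `chat_template` argument of
# `apply_chat_template`, instead of relying on a class-level default.
```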
-
Penut Chen authored
* support gguf fp16 * support gguf bf16 with pytorch * add gguf f16 test * remove bf16
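A hedged loading sketch; the repo id and filename are placeholders, and loading goes through the `gguf_file` argument described in the library's GGUF support docs:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholders only; the point is that fp16 (and, with this change, bf16)
# GGUF tensors can be dequantized into a PyTorch model.
repo_id = "some-org/some-model-gguf"
gguf_file = "model.f16.gguf"

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```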
-
Marc Sun authored
* Fix float8_e4m3fn in modeling_utils * style * fix * comment
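The exact fix is not shown in the log; the sketch below only illustrates the underlying constraint, namely that `torch.float8_e4m3fn` exists only in recent PyTorch releases and should be looked up defensively:

```python
import torch

# Look the dtype up with getattr rather than assuming the attribute is present,
# since older torch builds do not define float8_e4m3fn.
float8_e4m3fn = getattr(torch, "float8_e4m3fn", None)

def is_float8_weight(tensor: torch.Tensor) -> bool:
    return float8_e4m3fn is not None and tensor.dtype == float8_e4m3fn
```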
-
Raushan Turganbay authored
Fix resize when using DeepSpeed
-
Arthur authored
* let's not warn when someone is running a forward without cache + self.training * more models * fixup
-
Joao Gante authored
* relaxed rope check * lets also accept rope_type=None, defaulting to the original implementation * type and rope_type can coexist
-
amyeroberts authored
Remove conversation pipeline tests
-
Dr. Artificial曾小健 authored
* Update qwen2.md outdated description * Update qwen2.md amended * Update qwen2.md Update * Update qwen2.md fix wrong version code, now good to go
-
조준래 authored
fix: default value reflects the runtime environment variables rather than the ones present at import time. (#32153) * fix: default value reflects the runtime environment variables rather than the ones present at import time. * Fix: Change `deterministic` to None by default; use env var if None
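A generic illustration of the pattern described (names are illustrative, not the touched code): read the environment variable when the function runs, not when the module is imported.

```python
import os
from typing import Optional

# Import-time snapshot: later changes to the environment are ignored.
_DETERMINISTIC_AT_IMPORT = os.environ.get("EXAMPLE_DETERMINISTIC", "0") == "1"

def run_step(deterministic: Optional[bool] = None) -> bool:
    # With `None` as the default, the *current* value of the environment variable
    # decides the behaviour each time the function is called.
    if deterministic is None:
        deterministic = os.environ.get("EXAMPLE_DETERMINISTIC", "0") == "1"
    return deterministic
```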
-
Rohit Dwivedula authored
* adds extra_repr() to MambaRMSNorm to include the hidden size of the layer * style fix with ruff
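A minimal sketch of what `extra_repr()` buys on an RMSNorm-style module: `print(model)` now shows the hidden size and epsilon. This mirrors the described change rather than the exact source.

```python
import torch
from torch import nn

class SimpleRMSNorm(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states

    def extra_repr(self) -> str:
        # Shown inside the parentheses when the module is printed,
        # e.g. SimpleRMSNorm(768, eps=1e-06)
        return f"{self.weight.shape[0]}, eps={self.variance_epsilon}"

print(SimpleRMSNorm(768))
```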
-
- 23 Jul, 2024 26 commits
-
-
Fanli Lin authored
fix
-
Sai-Suraj-27 authored
Fixed an if condition always evaluating to true.
-
Joao Gante authored
-
Lysandre authored
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
-
Lysandre authored
-
Sai-Suraj-27 authored
* Updated ruff version and fixed the required code according to the latest version. * Updated ruff version and fixed the required code according to the latest version. * Added noqa directive to ignore 1 error shown by ruff
-
RhuiDih authored
* add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formatting for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py
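A standalone sketch of the idea (the collator name in the commit may have changed during review, so this is not the library API): concatenate samples into a single row and emit `position_ids` that restart per sample, so a FlashAttention-2 style path can recover sequence boundaries without padding.

```python
from typing import Dict, List
import torch

def flatten_batch(features: List[Dict[str, List[int]]]) -> Dict[str, torch.Tensor]:
    # Concatenate all samples into one row; position_ids restart at 0 for each
    # sample so the attention kernel can infer sequence boundaries.
    input_ids, position_ids, labels = [], [], []
    for feature in features:
        ids = feature["input_ids"]
        input_ids.extend(ids)
        position_ids.extend(range(len(ids)))
        labels.extend(feature.get("labels", ids))
    return {
        "input_ids": torch.tensor([input_ids]),
        "position_ids": torch.tensor([position_ids]),
        "labels": torch.tensor([labels]),
    }

batch = flatten_batch([{"input_ids": [1, 2, 3]}, {"input_ids": [4, 5]}])
# batch["position_ids"] -> tensor([[0, 1, 2, 0, 1]])
```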
-
Deep Gandhi authored
Update integration_utils.py: added an additional kwarg
-
Alvaro Moran authored
* feat(cache): StaticCache uses index_copy_ to avoid useless copy Using index_copy_ allows for explicit in-place change of the tensor. Some backends (XLA) will otherwise copy the tensor, making the code slower and using more memory. Proposed implementation will end up using less memory and on XLA will result in less compilation, but the change is also quite generic, making no change whatsoever on CUDA or CPU backend. * feat(cache): SlidingWindowCache uses index_copy_ to avoid useless copy Applying the same change done in StaticCache. * fix(cache): fallback of index_copy_ when not implemented * fix(cache): in index_copy_ ensure tensors are on same device * [run slow] llama * fix(cache): add move of cache_position to same device in SlidingWindowCache * Revert "[run slow] llama" This reverts commit 02608dd14253ccd464e31c108e0cd94364f0e8b9.
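A minimal illustration of the difference described; tensor names and shapes are made up. Advanced-indexing assignment may be lowered to a copy on some backends (XLA), while `index_copy_` is an explicit in-place scatter.

```python
import torch

key_cache = torch.zeros(1, 8, 128, 64)   # (batch, heads, max_seq_len, head_dim)
key_states = torch.randn(1, 8, 4, 64)    # new entries for 4 positions
cache_position = torch.arange(10, 14)    # where they go in the cache

# Advanced-indexing assignment; some backends (XLA) may materialise a copy here.
key_cache[:, :, cache_position] = key_states

# Explicit in-place scatter along the sequence dimension; identical result on
# CUDA/CPU, but avoids the extra copy and recompilation on XLA.
key_cache.index_copy_(2, cache_position, key_states)
```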
-
amyeroberts authored
-
Sanchit Gandhi authored
Revert "Incorrect Whisper long-form decoding timestamps (#32003)" This reverts commit cd48553f.
-
Amit Garg authored
* renamed phi3 rope_scaling type * fixed trailing whitespaces * fixed test * added warning * fixed format
-
Alexandre TL authored
* Update README.md * tests: forward ok * backward test done * done testing * removed check. scripts * Update README.md * added use_mambapy arg * fixed typo in warning * protected imports w/ mambapy package * delete pscan.py + raise rather than assert * Update import_utils.py * fix whitespaces and unused import * trailing whitespace + import block unformatted * Update modeling_mamba.py * transpose before pscan * shape comment * ran make style * use_mambapy=False by default Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * ran make fix-copies --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Merve Noyan authored
--------- Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
-
Cyril Vallez authored
Add the lru_cache for speed
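The log does not say which function was cached; a generic `functools.lru_cache` sketch of the technique:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_lookup(name: str) -> int:
    # Computed once per distinct argument; later calls return the cached result.
    return sum(ord(c) for c in name)

expensive_lookup("rotary")   # computed
expensive_lookup("rotary")   # served from the cache
```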
-
Ita Zaporozhets authored
* gguf conversion forces add_prefix_space=False for llama3; this is not required and forces from_slow, which fails. Changing to None + test * typo * clean test
-
Joao Gante authored
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
bayllama authored
* Change resize_token_embeddings to make it return the same class that is passed to it * Add explanatory comment as requested in review * Add explanatory comments for add resizing function in lxmert * Add comment for padding_idx and moving _resize_bias in lxmert to LxmertForPreTraining --------- Co-authored-by: Prashanth Sateesh <prasatee@Prashanths-MBP.attlocal.net> Co-authored-by: Prashanth Sateesh <prasatee@Prashanths-MacBook-Pro.local>
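From the user's side the visible effect is that the object returned by `resize_token_embeddings` keeps the class (and settings such as `padding_idx`) of the original embedding layer rather than always being a plain `torch.nn.Embedding`. A hedged usage sketch with an illustrative new token:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tokenizer.add_tokens(["<my_new_token>"])            # illustrative new token
embeddings = model.resize_token_embeddings(len(tokenizer))

# With this change, the returned embedding preserves the original embedding subclass.
print(type(embeddings))
```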
-
Daniel Lok authored
add attribute to model Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
-
mig-mfreitas authored
* Add YaRN and Dynamic-YaRN RoPE Scaling Methods YaRN (Yet another RoPE extension method) combines the NTK-By-Parts Interpolation and Attention Scaling methods, improving upon existing RoPE interpolation methods for longer context window sizes. Fine-tuned models maintain their original performance across benchmarks while enabling efficient extrapolation and transfer learning for quicker convergence, especially in compute-limited environments. We implement YaRN and Dynamic-YaRN for the following list of models: - LLaMA - Falcon - GPT-NeoX - Olmo - Persimmon - Phi - StableLM - OpenLLaMA New unit tests are added to assert YaRN's correct behavior on both short and long sequence inputs. For more details, please refer to https://arxiv.org/abs/2309.00071 . Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt> * Refactor YaRN implementation for LLaMA Iterate on YaRN implementation for LLaMA and remove diff from remaining models for increased PR modularity. This commit includes the following changes: - Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries - Remove unnecessary attributes ('extrapolation_factor' and 'finetuned') from YaRN classes - Inherit 'forward' method in YaRN classes from superclass - Rename 'yarn' method to 'compute_yarn_scaling' - Extend YaRN tests with further assertions - Fix style inconsistencies Co-authored-by: Miguel Monte e Freitas <miguelmontefreitas@tecnico.ulisboa.pt> * Refactor Tensor Building Logic for YaRN - Comply with the tensor building logic introduced in #30743 - Add referencing to the optimized Attention Factor equation - Remove Dynamic YaRN for a more agile deployment Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com> * remove unwanted file --------- Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt> Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com> Co-authored-by: Joao Gante <joao@huggingface.co>
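YaRN is configured through the `rope_scaling` dictionary on the model config. The key names below follow the PR discussion and may differ in the released API, the sizes are toy values, and the snippet may raise on versions without YaRN support:

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Illustrative, tiny configuration; YaRN roughly multiplies the usable context
# window by `factor` relative to the original training length.
config = LlamaConfig(
    hidden_size=256,
    num_hidden_layers=2,
    num_attention_heads=4,
    max_position_embeddings=4096 * 4,
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 4096,
    },
)
model = LlamaForCausalLM(config)
```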
-
KonradSzafer authored
encapsulate chat template logic
-
Anton Vlasjuk authored
* fix mask creation of gpt2 and gpt_neox caused by me * forgot the reshape of masks when shape > 2 * add tests for gpt neox and gpt2 * nit on a comment
-
Sanchit Gandhi authored
* [whisper] remove unnecessary transpose for fa2 attention * propagate
-
Sanchit Gandhi authored
* [whisper integration] use parquet dataset for testing * propagate to others * more propagation * last one
-
Raushan Turganbay authored
* pad on right if training * docs * add tests
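An illustration of the general rule behind the fix (shown on a plain tokenizer, not the code touched here): pad on the right while training so labels stay aligned with their positions, pad on the left for generation.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Training: right padding keeps labels aligned with their input positions.
tokenizer.padding_side = "right"
train_batch = tokenizer(["short", "a longer example"], padding=True, return_tensors="pt")

# Generation: left padding keeps the last real token adjacent to the new tokens.
tokenizer.padding_side = "left"
gen_batch = tokenizer(["short", "a longer example"], padding=True, return_tensors="pt")
```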
-
James Thewlis authored
* Add llama3-llava-next-8b to llava_next conversion script Adds support for the lmms-lab/llama3-llava-next-8b model to the convert_llava_next_weights_to_hf.py script, along with an example prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT repo. * Exclude <|begin_of_text|> from prompt example This token gets added automatically, so it should not be included in the prompt example. * Add llava-next-72b and llava-next-110b Adds the Qwen-based LLaVA-Next models to the conversion script, along with changes to load the models on multiple GPUs for inference. * Add llama3 and qwen prompt formats to docs * Chat prompt and padding side left for llama3 batched * update * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove code * better naming --------- Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 22 Jul, 2024 1 commit
-
-
Marc Sun authored
* Add new quant method * update * fix multi-device * add test * add offload * style * style * add simple example * initial doc * docstring * style again * works? * better docs * switch to non-persistent * remove print * fix init * code review
-