- 25 Jul, 2024 2 commits
-
-
Huazhong Ji authored
Remove unnecessary guard code related to PyTorch versions 1.4.2 ~ 1.7.0
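A rough sketch of the kind of guard this removes (the version bound and flag name are hypothetical): once the library's minimum supported PyTorch is well past 1.7.0, the legacy branch is dead code.

```python
from packaging import version
import torch

# Hypothetical shape of the guards being removed: with a modern minimum torch
# requirement the condition is always true, so the `else` branch can be deleted.
if version.parse(torch.__version__) >= version.parse("1.7.0"):
    supports_new_api = True
else:
    supports_new_api = False  # torch 1.4.2–1.6.x fallback, now unreachable
```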
-
Sanchit Gandhi authored
* [whisper] fix short-form output type * add test * make style * update long-form tests * fixes * last fix * finalise test
-
- 24 Jul, 2024 11 commits
-
-
Sai-Suraj-27 authored
Replaced deprecated unittest method with the correct one.
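The log does not name the method; a common case is the long-deprecated `assertEquals` alias being replaced by `assertEqual`, as in this illustrative sketch:

```python
import unittest

class ExampleTest(unittest.TestCase):
    def test_value(self):
        # `assertEquals` is a deprecated alias that emits a DeprecationWarning;
        # `assertEqual` is the supported spelling.
        self.assertEqual(1 + 1, 2)
```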
-
Matt authored
* No more default chat templates * Add the template to the GPT-SW3 tests since it's not available by default now * Fix GPT2 test * Fix Bloom test * Fix Bloom test * Remove default templates again
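With class-level default templates gone, a template has to live on the tokenizer (or be passed per call). A minimal sketch, assuming an illustrative checkpoint and template:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative checkpoint without a built-in template

# Attach a template to the tokenizer explicitly...
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)

messages = [{"role": "user", "content": "Hello!"}]
text = tokenizer.apply_chat_template(messages, tokenize=False)

# ...or pass one for a single call via the `chat_template` argument of
# `apply_chat_template`, instead of relying on a class-level default.
```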
-
Penut Chen authored
* support gguf fp16 * support gguf bf16 with pytorch * add gguf f16 test * remove bf16
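A hedged loading sketch; the repo id and filename are placeholders, and loading goes through the `gguf_file` argument described in the library's GGUF support docs:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholders only; the point is that fp16 (and, with this change, bf16)
# GGUF tensors can be dequantized into a PyTorch model.
repo_id = "some-org/some-model-gguf"
gguf_file = "model.f16.gguf"

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```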
-
Marc Sun authored
* Fix float8_e4m3fn in modeling_utils * style * fix * comment
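The exact fix is not shown in the log; the sketch below only illustrates the underlying constraint, namely that `torch.float8_e4m3fn` exists only in recent PyTorch releases and should be looked up defensively:

```python
import torch

# Look the dtype up with getattr rather than assuming the attribute is present,
# since older torch builds do not define float8_e4m3fn.
float8_e4m3fn = getattr(torch, "float8_e4m3fn", None)

def is_float8_weight(tensor: torch.Tensor) -> bool:
    return float8_e4m3fn is not None and tensor.dtype == float8_e4m3fn
```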
-
Raushan Turganbay authored
Fix resize when using DeepSpeed
-
Arthur authored
* let's not warn when someone is running a forward without cache + self.training * more models * fixup
-
Joao Gante authored
* relaxed rope check * lets also accept rope_type=None, defaulting to the original implementation * type and rope_type can coexist
-
amyeroberts authored
Remove conversation pipeline tests
-
Dr. Artificial曾小健 authored
* Update qwen2.md outdated description * Update qwen2.md amended * Update qwen2.md Update * Update qwen2.md fix wrong version code, now good to go
-
조준래 authored
fix: default value reflects the runtime environment variables rather than the ones present at import time. (#32153) * fix: default value reflects the runtime environment variables rather than the ones present at import time. * Fix: Change `deterministic` to None by default; use env var if None
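A generic illustration of the pattern described (names are illustrative, not the touched code): read the environment variable when the function runs, not when the module is imported.

```python
import os
from typing import Optional

# Import-time snapshot: later changes to the environment are ignored.
_DETERMINISTIC_AT_IMPORT = os.environ.get("EXAMPLE_DETERMINISTIC", "0") == "1"

def run_step(deterministic: Optional[bool] = None) -> bool:
    # With `None` as the default, the *current* value of the environment variable
    # decides the behaviour each time the function is called.
    if deterministic is None:
        deterministic = os.environ.get("EXAMPLE_DETERMINISTIC", "0") == "1"
    return deterministic
```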
-
Rohit Dwivedula authored
* adds extra_repr() to MambaRMSNorm to include the hidden size of the layer * style fix with ruff
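A minimal sketch of what `extra_repr()` buys on an RMSNorm-style module: `print(model)` now shows the hidden size and epsilon. This mirrors the described change rather than the exact source.

```python
import torch
from torch import nn

class SimpleRMSNorm(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states

    def extra_repr(self) -> str:
        # Shown inside the parentheses when the module is printed,
        # e.g. SimpleRMSNorm(768, eps=1e-06)
        return f"{self.weight.shape[0]}, eps={self.variance_epsilon}"

print(SimpleRMSNorm(768))
```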
-
- 23 Jul, 2024 26 commits
-
-
Fanli Lin authored
fix
-
Sai-Suraj-27 authored
Fixed an if condition always evaluating to true.
-
Joao Gante authored
-
Lysandre authored
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
-
Lysandre authored
-
Sai-Suraj-27 authored
* Updated ruff version and fixed the required code according to the latest version. * Updated ruff version and fixed the required code according to the latest version. * Added noqa directive to ignore 1 error shown by ruff
-
RhuiDih authored
* add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formatting for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py
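A standalone sketch of the idea (the collator name in the commit may have changed during review, so this is not the library API): concatenate samples into a single row and emit `position_ids` that restart per sample, so a FlashAttention-2 style path can recover sequence boundaries without padding.

```python
from typing import Dict, List
import torch

def flatten_batch(features: List[Dict[str, List[int]]]) -> Dict[str, torch.Tensor]:
    # Concatenate all samples into one row; position_ids restart at 0 for each
    # sample so the attention kernel can infer sequence boundaries.
    input_ids, position_ids, labels = [], [], []
    for feature in features:
        ids = feature["input_ids"]
        input_ids.extend(ids)
        position_ids.extend(range(len(ids)))
        labels.extend(feature.get("labels", ids))
    return {
        "input_ids": torch.tensor([input_ids]),
        "position_ids": torch.tensor([position_ids]),
        "labels": torch.tensor([labels]),
    }

batch = flatten_batch([{"input_ids": [1, 2, 3]}, {"input_ids": [4, 5]}])
# batch["position_ids"] -> tensor([[0, 1, 2, 0, 1]])
```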
-
Deep Gandhi authored
Update integration_utils.py: added an additional kwarg
-
Alvaro Moran authored
* feat(cache): StaticCache uses index_copy_ to avoid useless copy Using index_copy_ allows for explicit in-place change of the tensor. Some backends (XLA) will otherwise copy the tensor, making the code slower and using more memory. Proposed implementation will end up using less memory and on XLA will result in less compilation, but the change is also quite generic, making no change whatsoever on CUDA or CPU backend. * feat(cache): SlidingWindowCache uses index_copy_ to avoid useless copy Applying the same change done in StaticCache. * fix(cache): fallback of index_copy_ when not implemented * fix(cache): in index_copy_ ensure tensors are on same device * [run slow] llama * fix(cache): add move of cache_position to same device in SlidingWindowCache * Revert "[run slow] llama" This reverts commit 02608dd14253ccd464e31c108e0cd94364f0e8b9.
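A minimal illustration of the difference described; tensor names and shapes are made up. Advanced-indexing assignment may be lowered to a copy on some backends (XLA), while `index_copy_` is an explicit in-place scatter.

```python
import torch

key_cache = torch.zeros(1, 8, 128, 64)   # (batch, heads, max_seq_len, head_dim)
key_states = torch.randn(1, 8, 4, 64)    # new entries for 4 positions
cache_position = torch.arange(10, 14)    # where they go in the cache

# Advanced-indexing assignment; some backends (XLA) may materialise a copy here.
key_cache[:, :, cache_position] = key_states

# Explicit in-place scatter along the sequence dimension; identical result on
# CUDA/CPU, but avoids the extra copy and recompilation on XLA.
key_cache.index_copy_(2, cache_position, key_states)
```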
-
amyeroberts authored
-
Sanchit Gandhi authored
Revert "Incorrect Whisper long-form decoding timestamps (#32003)" This reverts commit cd48553f.
-
Amit Garg authored
* renamed phi3 rope_scaling type * fixed trailing whitespaces * fixed test * added warning * fixed format
-
Alexandre TL authored
* Update README.md * tests: forward ok * backward test done * done testing * removed check. scripts * Update README.md * added use_mambapy arg * fixed typo in warning * protected imports w/ mambapy package * delete pscan.py + raise rather than assert * Update import_utils.py * fix whitespaces and unused import * trailing whitespace + import block unformatted * Update modeling_mamba.py * transpose before pscan * shape comment * ran make style * use_mambapy=False by default Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * ran make fix-copies --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Merve Noyan authored
--------- Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
-
Cyril Vallez authored
Add the lru_cache for speed
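The log does not say which function was cached; a generic `functools.lru_cache` sketch of the technique:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_lookup(name: str) -> int:
    # Computed once per distinct argument; later calls return the cached result.
    return sum(ord(c) for c in name)

expensive_lookup("rotary")   # computed
expensive_lookup("rotary")   # served from the cache
```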
-
Ita Zaporozhets authored
* gguf conversion forces add_prefix_space=False for llama3; this is not required and forces from_slow, which fails. Changing to None + test * typo * clean test
-
Joao Gante authored
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
bayllama authored
* Change resize_token_embeddings to make it return the same class that is passed to it * Add explanatory comment as requested in review * Add explanatory comments for add resizing function in lxmert * Add comment for padding_idx and moving _resize_bias in lxmert to LxmertForPreTraining --------- Co-authored-by: Prashanth Sateesh <prasatee@Prashanths-MBP.attlocal.net> Co-authored-by: Prashanth Sateesh <prasatee@Prashanths-MacBook-Pro.local>
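From the user's side the visible effect is that the object returned by `resize_token_embeddings` keeps the class (and settings such as `padding_idx`) of the original embedding layer rather than always being a plain `torch.nn.Embedding`. A hedged usage sketch with an illustrative new token:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tokenizer.add_tokens(["<my_new_token>"])            # illustrative new token
embeddings = model.resize_token_embeddings(len(tokenizer))

# With this change, the returned embedding preserves the original embedding subclass.
print(type(embeddings))
```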
-
Daniel Lok authored
add attribute to model Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
-
mig-mfreitas authored
* Add YaRN and Dynamic-YaRN RoPE Scaling Methods YaRN (Yet another RoPE extension method) combines the NTK-By-Parts Interpolation and Attention Scaling methods, improving upon existing RoPE interpolation methods for longer context window sizes. Fine-tuned models maintain their original performance across benchmarks while enabling efficient extrapolation and transfer learning for quicker convergence, especially in compute-limited environments. We implement YaRN and Dynamic-YaRN for the following list of models: - LLaMA - Falcon - GPT-NeoX - Olmo - Persimmon - Phi - StableLM - OpenLLaMA New unit tests are added to assert YaRN's correct behavior on both short and long sequence inputs. For more details, please refer to https://arxiv.org/abs/2309.00071 . Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt> * Refactor YaRN implementation for LLaMA Iterate on YaRN implementation for LLaMA and remove diff from remaining models for increased PR modularity. This commit includes the following changes: - Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries - Remove unnecessary attributes ('extrapolation_factor' and 'finetuned') from YaRN classes - Inherit 'forward' method in YaRN classes from superclass - Rename 'yarn' method to 'compute_yarn_scaling' - Extend YaRN tests with further assertions - Fix style inconsistencies Co-authored-by: Miguel Monte e Freitas <miguelmontefreitas@tecnico.ulisboa.pt> * Refactor Tensor Building Logic for YaRN - Comply with the tensor building logic introduced in #30743 - Add referencing to the optimized Attention Factor equation - Remove Dynamic YaRN for a more agile deployment Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com> * remove unwanted file --------- Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt> Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com> Co-authored-by: Joao Gante <joao@huggingface.co>
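YaRN is configured through the `rope_scaling` dictionary on the model config. The key names below follow the PR discussion and may differ in the released API, the sizes are toy values, and the snippet may raise on versions without YaRN support:

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Illustrative, tiny configuration; YaRN roughly multiplies the usable context
# window by `factor` relative to the original training length.
config = LlamaConfig(
    hidden_size=256,
    num_hidden_layers=2,
    num_attention_heads=4,
    max_position_embeddings=4096 * 4,
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 4096,
    },
)
model = LlamaForCausalLM(config)
```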
-
KonradSzafer authored
encapsulate chat template logic
-
Anton Vlasjuk authored
* fix mask creation of gpt2 and gpt_neox caused by me * forgot the reshape of masks when shape > 2 * add tests for gpt neox and gpt2 * nit on a comment
-
Sanchit Gandhi authored
* [whisper] remove unnecessary transpose for fa2 attention * propagate
-
Sanchit Gandhi authored
* [whisper integration] use parquet dataset for testing * propagate to others * more propagation * last one
-
Raushan Turganbay authored
* pad on right if training * docs * add tests
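An illustration of the general rule behind the fix (shown on a plain tokenizer, not the code touched here): pad on the right while training so labels stay aligned with their positions, pad on the left for generation.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Training: right padding keeps labels aligned with their input positions.
tokenizer.padding_side = "right"
train_batch = tokenizer(["short", "a longer example"], padding=True, return_tensors="pt")

# Generation: left padding keeps the last real token adjacent to the new tokens.
tokenizer.padding_side = "left"
gen_batch = tokenizer(["short", "a longer example"], padding=True, return_tensors="pt")
```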
-
James Thewlis authored
* Add llama3-llava-next-8b to llava_next conversion script Adds support for the lmms-lab/llama3-llava-next-8b model to the convert_llava_next_weights_to_hf.py script, along with an example prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT repo. * Exclude <|begin_of_text|> from prompt example This token gets added automatically, so it should not be included in the prompt example. * Add llava-next-72b and llava-next-110b Adds the Qwen-based LLaVA-Next models to the conversion script, along with changes to load the models on multiple GPUs for inference. * Add llama3 and qwen prompt formats to docs * Chat prompt and padding side left for llama3 batched * update * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove code * better naming --------- Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 22 Jul, 2024 1 commit
-
-
Marc Sun authored
* Add new quant method * update * fix multi-device * add test * add offload * style * style * add simple example * initial doc * docstring * style again * works? * better docs * switch to non-persistent * remove print * fix init * code review
-