- 21 Feb, 2024 1 commit
-
ydshieh authored
-
- 20 Feb, 2024 7 commits
-
ydshieh authored
-
ydshieh authored
-
ydshieh authored
-
Younes Belkada authored
* Add RMSProp to Trainer
* Revert some change
* Update src/transformers/trainer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
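A minimal sketch of what this enables, selecting the new optimizer through the `optim` flag (output directory and hyperparameters are placeholders):

```python
from transformers import TrainingArguments

# Select RMSProp instead of the default AdamW via the `optim` flag.
args = TrainingArguments(
    output_dir="out",   # placeholder
    optim="rmsprop",
    learning_rate=1e-4,
)
# Trainer(model=model, args=args, train_dataset=train_dataset).train()
```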
-
Erich Schubert authored
Move misplaced line, improve code comment
-
Arthur authored
* Default to use it
* Style
-
Nilesh authored
* Fixed NLL loss with label_smoothing
* Resolved conflicts by rebase
* Added label_smoothing to config file
* Fixed nits
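The commit itself is model-specific, but the underlying idea is standard; a minimal PyTorch sketch of label-smoothed NLL (not the PR's exact code):

```python
import torch
import torch.nn.functional as F

def label_smoothed_nll(log_probs, target, epsilon=0.1):
    # Mix the NLL of the target token with the mean NLL over the vocabulary.
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)
    smooth = -log_probs.mean(dim=-1)
    return ((1.0 - epsilon) * nll + epsilon * smooth).mean()

logits = torch.randn(4, 10)              # (batch, vocab)
target = torch.randint(0, 10, (4,))
loss = label_smoothed_nll(F.log_softmax(logits, dim=-1), target)
```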
-
- 19 Feb, 2024 11 commits
-
Shijie Wu authored
* Report grad_norm during training
* Support getting grad_norm from DeepSpeed
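For reference, the logged value is the total gradient norm; a hedged sketch of the general pattern (not the Trainer's exact code, and under DeepSpeed the engine reports the norm instead):

```python
import torch

model = torch.nn.Linear(8, 2)            # stand-in model
loss = model(torch.randn(4, 8)).sum()
loss.backward()
# clip_grad_norm_ returns the total norm, which a training loop can log.
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
print({"grad_norm": float(grad_norm)})
```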
-
Sadra Barikbin authored
* Update base.py
* Fix a typo
-
Titus authored
* Generated text on A10G
* Generated text in CI
* Apply suggestions from code review; add explanatory comments

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Max Baak authored
The output_logits option behaves like output_scores, but returns the raw, unprocessed prediction logits, i.e. the values before they undergo logit processing and/or warping, which happens by default for the regular output scores. Having the unprocessed logits is useful in certain circumstances. For example, with causal LM models they make it possible to determine the probability of a particular answer, e.g. when asking a question with a yes/no answer: the next-token probabilities of both "yes" and "no" (and/or their relative ratio) can then be used for classification. These values should be taken _before_ logit processing and/or warping because (a) processing can change the probabilities, or (b) it may reject the tokens of interest and reduce the candidate set to a single token. For an example use case, see "TabLLM: Few-shot Classification of Tabular Data with Large Language Models" by Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag (https://arxiv.org/abs/2210.10723).

In addition:
* Added a dedicated unit test, tests/generation/test_utils/test_return_unprocessed_logit_scores, which checks that logits are returned when output_logits=True in generation.
* Set output_logits=True in all other generation unit tests that also set output_scores=True.

Implemented @gante's and @amyeroberts' review feedback.

Co-authored-by: kx79wq <max.baak@ing.com>
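A minimal sketch of the yes/no use case described above (gpt2 is a placeholder checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder causal LM
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Is the sky blue? Answer:", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=1,
    return_dict_in_generate=True,
    output_logits=True,     # raw logits, before processing/warping
)
# out.logits holds one tensor per generated token.
probs = torch.softmax(out.logits[0], dim=-1)
yes_id = tokenizer(" yes", add_special_tokens=False).input_ids[0]
no_id = tokenizer(" no", add_special_tokens=False).input_ids[0]
print(float(probs[0, yes_id] / probs[0, no_id]))    # relative yes/no likelihood
```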
-
NielsRogge authored
* Add resource
* Add more resources
* Add resources
* Apply suggestions from code review
* Remove mention
* Remove pipeline tags

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Arthur authored
* Change version
* Nuke
* This doesn't make sense
* Update some requirements.py
* Revert + no main
* Nits
* Change cache number
* More pin
* Revert

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Jay Zhou authored
-
Lysandre Debut authored
Fix test
-
Winton Davies authored
The link in the evaluation docs was missing a hyphen between "post" and "processing". Fixed for English only; someone with the ability to do a global search and replace should fix the other languages (if they have the same issue).
-
Younes Belkada authored
Update test_mixed_int8.py
-
Younes Belkada authored
* Add PEFT support for AWQ
* Update src/transformers/quantizers/quantizer_awq.py
* Fix

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
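A hedged sketch of what this enables: attaching a LoRA adapter (via the peft library) to an AWQ-quantized checkpoint. The model id is a placeholder, and loading it requires autoawq and a GPU:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-v0.1-AWQ",  # placeholder AWQ checkpoint
    device_map="auto",
)
lora_config = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)  # LoRA on top of the quantized base
model.print_trainable_parameters()
```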
-
- 16 Feb, 2024 13 commits
-
Aaron Jimenez authored
* Add task_summary to es/_toctree.yml
* Add task_summary.md to docs/es
* Change title of task_summary.md
* Translate first paragraphs
* Translate middle paragraphs
* Translate the rest of the doc
* Edit first paragraph
-
Matt authored
* Add chat support to text generation pipeline
* Better handling of single elements
* Deprecate ConversationalPipeline
* Stash commit
* Add missing add_special_tokens kwarg
* Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline
* Add TF tests
* @require_tf
* Add type hint
* Add specific deprecation version
* Remove unnecessary do_sample
* Remove todo - the discrepancy has been resolved
* Update src/transformers/tokenization_utils_base.py
* Update src/transformers/pipelines/text_generation.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
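A minimal sketch of the new chat input path (the checkpoint is a placeholder chat model):

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")
chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain beam search in one sentence."},
]
out = pipe(chat, max_new_tokens=64)
# For chat inputs, generated_text is the chat extended with the model's reply.
print(out[0]["generated_text"][-1])
```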
-
Zach Mueller authored
* Fix trainer test
* Update tests/trainer/test_trainer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Sean (Seok-Won) Yi authored
* Added option to set tracking URI for MLflowCallback.
* Changed to in docstring.
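A sketch, assuming the callback reads the URI from the MLFLOW_TRACKING_URI environment variable like its other options (the URI is a placeholder):

```python
import os

os.environ["MLFLOW_TRACKING_URI"] = "http://localhost:5000"  # placeholder

from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", report_to="mlflow")
# A Trainer built with these args would now log runs to that tracking server.
```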
-
Richard Lee authored
* Pass through trust_remote_code for dynamically loading unregistered tokenizers specified by config; add test
* Change directories back to previous directory after test
* Fix ruff check
* Add a note to that block for the future, in case we want to remove it later

Co-authored-by: Matt <rocketknight1@gmail.com>
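A hedged sketch of the now-working path (the repo id is a placeholder for a repository that ships its own tokenizer code):

```python
from transformers import AutoTokenizer

# The tokenizer class lives in the repo itself and is referenced only by the
# config; trust_remote_code opts in to executing that repo's code.
tokenizer = AutoTokenizer.from_pretrained(
    "some-org/model-with-custom-tokenizer",  # placeholder repo id
    trust_remote_code=True,
)
```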
-
Sourab Mangrulkar authored
Update trainer.py
-
Sourab Mangrulkar authored
-
Jonathan Mamou authored
* Fix heuristic num_assistant_tokens_schedule
* Update src/transformers/generation/configuration_utils.py
* Update src/transformers/generation/candidate_generator.py
* Update utils.py: check that candidate_generator.assistant_model exists, since some speculation methods (like ngram and PLD) don't have an assistant_model attribute
* Update src/transformers/generation/candidate_generator.py
* Update tests/generation/test_utils.py
* Make fixup
* Merge conflict
* Fix docstring

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
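A minimal sketch of assisted generation with the heuristic schedule this fix targets; both checkpoints are placeholders (the assistant must share the main model's tokenizer):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")     # placeholder
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
assistant = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
assistant.generation_config.num_assistant_tokens_schedule = "heuristic"

inputs = tokenizer("The capital of France is", return_tensors="pt")
out = model.generate(**inputs, assistant_model=assistant, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```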
-
Tanmay patil authored
* Made changes for object detection models
* Added support for segmentation models
* Made changes for segmentation models
* Changed import statements
* Resolved conflicts
* Fix: pixel_mask_value set to False
-
Raushan Turganbay authored
* Fix max_length for inputs_embeds
* Make style
* Update src/transformers/generation/utils.py
* Static Cache: load models with MQA or GQA (#28975)
* Fix
* Fix tests
* Update src/transformers/generation/utils.py
* More fixes
* Make style

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
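A sketch of the inputs_embeds path the fix covers (gpt2 is a placeholder; any decoder-only model that supports generating from embeddings works):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is", return_tensors="pt")
inputs_embeds = model.get_input_embeddings()(inputs.input_ids)
# The generated length is now counted consistently when starting from embeddings.
out = model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=inputs.attention_mask,
    max_new_tokens=10,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```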
-
Lysandre Debut authored
-
Lysandre Debut authored
* Script & manual edition
* Update
-
Titus authored
-
- 15 Feb, 2024 8 commits
-
Sadra Barikbin authored
Update utils.py
-
Andrei Panferov authored
removed redundant field
-
amyeroberts authored
* Patch to skip currently failing tests
* Whoops - wrong place
-
Younes Belkada authored
Update modeling_utils.py
-
amyeroberts authored
-
Donggeun Yu authored
* Update ms_deform_attn_cuda.cu
* Update ms_deform_attn_cuda.cuh
* Update modeling_deformable_detr.py
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
* python utils/check_copies.py --fix_and_overwrite
* Fix dtype mismatch error
* Update test_modeling_deformable_detr.py
* Update modeling_deformable_detr.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Sangbum Daniel Choi authored
* Enable gradient checkpointing in DetaObjectDetection
* Fix missing part in original DETA
* Make style
* Make fix-copies
* Revert "make fix-copies" (reverts commit 4041c86c29248f1673e8173b677c20b5a4511358)
* Remove fix-copies of DetaDecoder
* Enable Swin gradient checkpointing
* Fix gradient checkpointing in donut_swin
* Add tests for DETA/Swin/Donut
* Revert "fix gradient checkpointing in donut_swin" (reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d)
* Change supports_gradient_checkpointing pipeline to PreTrainedModel
* Revert "add tests for deta/swin/donut" (reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b)
* Revert "Revert "fix gradient checkpointing in donut_swin"" (reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f)
* Simple revert
* Enable Deformable DETR gradient checkpointing
* Add gradient in encoder
* Add cuda_custom_kernel function in MSDA
* Make style and fix input of DetaMSDA
* Make fix-copies
* Remove n_levels in input of DetaMSDA
* Minor changes
* Refactor custom_cuda_kernel like the YOSO format: https://github.com/huggingface/transformers/blob/0507e69d34f8902422eb4977ec066dd6bef179a0/src/transformers/models/yoso/modeling_yoso.py#L53
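For context, the switch being wired up here is the standard one on PreTrainedModel; a sketch (the checkpoint is a placeholder):

```python
from transformers import AutoModelForObjectDetection

model = AutoModelForObjectDetection.from_pretrained("jozhang97/deta-swin-large")  # placeholder
model.gradient_checkpointing_enable()   # trade recompute for activation memory
assert model.is_gradient_checkpointing
```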
-
Arthur authored
* Wow, I was scared!
* Fix everything
* Nits
* Make it BC?
* Add todo
* Nits
* is_tracing should still be used to pass tracing tests
* Nits
* Some nits to make sure generation works with static cache uncompiled
* Fix SDPA
* Fix FA2 for both static and dynamic in a better way?
* Style
* Fix-copies
* Fix fix-copies
* Fix sequential beam search
* Style
* Use `keys_to_ignore`
* Nit
* Correct dtype inference when init
* :( the fix for FA2 is still not optimal, to investigate!
* Styling
* Nits
* Nit
* This might work better
* Add comment
* Update src/transformers/models/llama/modeling_llama.py
* "position_ids" -> "cache_position"
* Style
* Nit
* Remove changes that should not be propagated just yet
* Apply suggestions from code review
* Styling
* Make sure we raise an error for static cache with FA2 enabled
* Move to the bottom of the signature
* Style
* Update src/transformers/models/llama/modeling_llama.py
* Nit in the name

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
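A hedged sketch of opting in to the static KV cache discussed above (the checkpoint is a placeholder, and the exact opt-in may differ between versions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
)
model.generation_config.cache_implementation = "static"  # fixed-size KV buffers

inputs = tokenizer("My favourite condiment is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```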
-