1. 06 Feb, 2024 9 commits
  2. 05 Feb, 2024 9 commits
  3. 02 Feb, 2024 9 commits
  4. 01 Feb, 2024 8 commits
  5. 31 Jan, 2024 5 commits
    • [docs] Correct the statement in the docstring of compute_transition_scores in generation/utils.py (#28786) · 7b2bd1fb
      Shichao Song authored
      
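For context, `compute_transition_scores` maps the per-step scores returned by `generate` back onto the tokens that were actually generated. A minimal sketch of the API whose docstring this commit corrects (the gpt2 checkpoint and prompt are illustrative, not from the PR):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; any causal LM works the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The weather today is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=5,
    return_dict_in_generate=True,  # needed to get per-step scores back
    output_scores=True,
)

# Per-token transition scores (log-probabilities when normalize_logits=True).
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)
print(transition_scores.shape)  # (batch_size, number of generated tokens)
```
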
    • Split daily CI using 2 level matrix (#28773) · 47358661
      Yih-Dar authored
      
      * update / add new workflow files
      
      * Add comment
      
      * Use env.NUM_SLICES
      
      * use scripts
      
      * use scripts
      
      * use scripts
      
      * Fix
      
      * using one script
      
      * Fix
      
      * remove unused file
      
      * update
      
      * fail-fast: false
      
      * remove unused file
      
      * fix
      
      * fix
      
      * use matrix
      
      * inputs
      
      * style
      
      * update
      
      * fix
      
      * fix
      
      * no model name
      
      * add doc
      
      * allow args
      
      * style
      
      * pass argument
      
      ---------
      
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
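
The two-level idea: a first matrix dimension fans out over slices of the model list (sized by `env.NUM_SLICES`), and a second fans out over the models inside each slice. A hedged sketch of the slicing step; this helper is illustrative, not the actual script from the PR:

```python
import json
import os

def split_into_slices(items: list[str], num_slices: int) -> list[list[str]]:
    """Partition `items` into at most `num_slices` roughly equal chunks."""
    chunk = -(-len(items) // num_slices)  # ceiling division
    return [items[i : i + chunk] for i in range(0, len(items), chunk)]

if __name__ == "__main__":
    models = ["bert", "gpt2", "llama", "mistral", "t5", "vit"]  # placeholder list
    num_slices = int(os.environ.get("NUM_SLICES", "2"))
    # First-level matrix: one job per slice. Second level: the downstream
    # workflow fans out over the models inside the slice it received.
    print(json.dumps(split_into_slices(models, num_slices)))
```
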
    • Add artifact name in job step to maintain job / artifact correspondence (#28682) · 95346e9d
      Yih-Dar authored
      
      * avoid using job name
      
      * apply to other files
      
      ---------
      
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • DeepSpeed: hardcode `torch.arange` dtype on `float` usage to avoid incorrect initialization (#28760) · beb2a096
      Joao Gante authored
      
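The underlying pitfall: float16 cannot represent every integer above 2048, so a `torch.arange` that ends up in half precision (as can happen when DeepSpeed initializes modules under a half-precision default dtype) silently corrupts the larger position values. A hedged illustration of the pattern the fix enforces; the 4096-position example is ours, not from the PR:

```python
import torch

# Positions built directly in float16: integers above 2048 are spaced 2 apart
# in fp16, so some values are silently rounded.
fragile = torch.arange(4096, dtype=torch.float16)
print(fragile[3001].item())  # rounds away from 3001.0

# Hardcoding an exact integer dtype and casting afterwards keeps values exact.
robust = torch.arange(4096, dtype=torch.int64).float()
print(robust[3001].item())  # 3001.0
```
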
    • Flax mistral (#26943) · f7076cd3
      Kian Sierra McGettigan authored

      * direct copy from llama work
      
      * mistral modules forward pass working
      
      * flax mistral forward pass with sliding window
      
      * added tests
      
      * added layer collection approach
      
      * Revert "added layer collection approach"
      
      This reverts commit 0e2905bf2236ec323163fc1a9f0c016b21aa8b8f.
      
      * Revert "Revert "added layer collection approach""
      
      This reverts commit fb17b6187ac5d16da7c461e1130514dc3d137a43.
      
      * fixed attention outputs
      
      * added mistral to init and auto
      
      * fixed import name
      
      * fixed layernorm weight dtype
      
      * freeze initialized weights
      
      * make sure conversion considers bfloat16
      
      * added backend
      
      * added docstrings
      
      * added cache
      
      * fixed sliding window causal mask
      
      * passes cache tests
      
      * passed all tests
      
      * applied make style
      
      * removed commented out code
      
      * applied fix-copies, ignored other model changes
      
      * applied make fix-copies
      
      * removed unused functions
      
      * passed generation integration test
      
      * slow tests pass
      
      * fixed slow tests
      
      * changed default dtype from jax.numpy.float32 to float32 for docstring check
      
      * skip cache test for FlaxMistralForSequenceClassification since, if pad_token_id is in input_ids, it doesn't score previous input_ids
      
      * updated checkpoint since from_pt not included
      
      * applied black style
      
      * removed unused args
      
      * Applied styling and fixup
      
      * changed checkpoint for doc back
      
      * fixed ref after adding it to hf hub
      
      * Add dummy ckpt
      
      * applied styling
      
      * added tokenizer to new ckpt
      
      * fixed slice format
      
      * fix init and slice
      
      * changed ref for placeholder TODO
      
      * added copies from Llama
      
      * applied styling
      
      * applied fix-copies
      
      * fixed docs
      
      * update weight dtype reconversion for sharded weights
      
      * removed Nullable input ids
      
      * Removed unnecessary output attentions in Module
      
      * added embedding weight initialization
      
      * removed unused past_key_values
      
      * fixed deterministic
      
      * Fixed RMS Norm and added copied from
      
      * removed input_embeds
      
      * applied make style
      
      * removed nullable input ids from sequence classification model
      
      * added copied from GPTJ
      
      * added copied from Llama on FlaxMistralDecoderLayer
      
      * added copied from to FlaxMistralPreTrainedModel methods
      
      * fix test deprecation warning
      
      * freeze gpt neox random_params and fix copies
      
      * applied make style
      
      * fixed doc issue
      
      * skipped docstring test to align # copied from
      
      * applied make style
      
      * removed FlaxMistralForSequenceClassification
      
      * removed unused padding_idx
      
      * removed more sequence classification
      
      * removed sequence classification
      
      * applied styling and consistency
      
      * added copied from in tests
      
      * removed sequence classification test logic
      
      * applied styling
      
      * applied make style
      
      * removed freeze and fixed copies
      
      * undo test change
      
      * changed repeat_kv to tile
      
      * fixed to key value groups
      
      * updated copyright year
      
      * split causal_mask
      
      * empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest
      
      * went back to 2023 for tests_pr_documentation_tests
      
      * went back to 2024
      
      * changed tile to repeat
      
      * applied make style
      
      * empty for retry on Wav2Vec2
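
Taken together, the PR lands FlaxMistralModel and FlaxMistralForCausalLM (the sequence-classification head was dropped along the way). A minimal usage sketch; the checkpoint name and the `from_pt=True` flag are assumptions about what is published, not taken from the PR:

```python
from transformers import AutoTokenizer, FlaxMistralForCausalLM

# Assumed checkpoint; from_pt=True converts PyTorch weights on the fly if no
# Flax weights are published for it.
model = FlaxMistralForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1", from_pt=True)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

inputs = tokenizer("Hello, my name is", return_tensors="np")
outputs = model.generate(inputs["input_ids"], max_new_tokens=10)
print(tokenizer.batch_decode(outputs.sequences, skip_special_tokens=True))
```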