- 26 Nov, 2024 11 commits
-
-
Fanli Lin authored
use full path for run_qa.py
-
Fanli Lin authored
add device-agnostic API

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
-
Ahmed Almaghz authored
* Add docs/source/ar/benchmarks.md to Add_docs_source_ar_benchmarks.md
* Update docs/source/ar/benchmarks.md (repeated 11 times in the original squashed message, once per applied review suggestion)
* Update _toctree.yml
* Update benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
-
vansin authored
Update Python Version
-
Joshua Lochner authored
* Fix torch.onnx.export of Qwen2-VL vision encoder

This PR fixes ONNX export support for the vision encoder of Qwen2-VL, whose code converts `cu_seqlens` to `torch.int32`, leading to errors later on when the values are used for slicing. https://github.com/huggingface/transformers/blob/c57eafdaa119eecae8557be4c626629bc1adc0fd/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L1044-L1046

## Error:
```
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Slice, node name: /blocks.0/attn/Slice_4): axes has inconsistent type tensor(int64)
```

## Code to reproduce issue:
```py
import requests
from PIL import Image
import torch
from transformers import (
    AutoProcessor,
    Qwen2VLForConditionalGeneration,
)

# Constants
VISION_MODEL_NAME = "vision_encoder.onnx"

# Load model and processor
model_id = "hf-internal-testing/tiny-random-Qwen2VLForConditionalGeneration"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id).eval()
processor = AutoProcessor.from_pretrained(model_id)

# Prepare inputs
url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    },
]
images = [image]
text_prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=[text_prompt], images=images, padding=True, return_tensors="pt")

## Vision model
vision_inputs = dict(
    pixel_values=inputs["pixel_values"],
    grid_thw=inputs["image_grid_thw"],
)
vision_inputs_positional = tuple(vision_inputs.values())
vision_outputs = model.visual.forward(*vision_inputs_positional)  # Test forward pass

torch.onnx.export(
    model.visual,
    args=vision_inputs_positional,
    f=VISION_MODEL_NAME,
    export_params=True,
    opset_version=14,
    do_constant_folding=True,
    input_names=list(vision_inputs.keys()),
    output_names=["image_features"],
    dynamic_axes={
        "pixel_values": {
            0: "batch_size * grid_t * grid_h * grid_w",
            1: "channel * temporal_patch_size * patch_size * patch_size",
        },
        "grid_thw": {0: "batch_size"},
        "image_features": {0: "batch_size * grid_t * grid_h * grid_w"},
    },
)

# Load and check the exported model
import onnx

model = onnx.load(VISION_MODEL_NAME)
onnx.checker.check_model(model, full_check=True)
inferred = onnx.shape_inference.infer_shapes(model, check_type=True)
```

* Formatting
* [run-slow] qwen2_vl
-
Matt authored
* Initial draft
* Add .jinja file loading for processors
* Add processor saving of naked chat template files
* make fixup
* Add save-load test for tokenizers
* Add save-load test for tokenizers
* stash commit
* Try popping the file
* make fixup
* Pop the arg correctly
* Pop the arg correctly
* Add processor test
* Fix processor code
* stash commit
* Processor clobbers child tokenizer's chat template
* Processor clobbers child tokenizer's chat template
* make fixup
* Split processor/tokenizer files to avoid interactions
* fix test
* Expand processor tests
* Rename arg to "save_raw_chat_template" across all classes (see the sketch after this entry)
* Update processor warning
* Move templates to single file
* Move templates to single file
* Improve testing for processor/tokenizer clashes
* Improve testing for processor/tokenizer clashes
* Extend saving test
* Test file priority correctly
* make fixup
* Don't pop the chat template file before the slow tokenizer gets a look
* Remove breakpoint
* make fixup
* Fix error
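A hedged sketch of the flag this entry introduces: `save_raw_chat_template` is the argument name given in the bullets above, but the output filename (assumed to be chat_template.jinja) and availability depend on the transformers version that includes this change.

```py
from transformers import AutoProcessor

# Hedged sketch: "save_raw_chat_template" is the argument named in this entry;
# the output filename (assumed here to be chat_template.jinja) and exact
# behavior depend on the transformers version that includes this change.
processor = AutoProcessor.from_pretrained(
    "hf-internal-testing/tiny-random-Qwen2VLForConditionalGeneration"
)
processor.save_pretrained("saved_processor", save_raw_chat_template=True)
```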
-
Yuxuan.Zhang authored
* change apply_rotary_pos_emb
* upload for glm-edge
* remove useless part
* follow the suggestion
* fix
* format
* format
* test
* format again
* format again
* remove modular change
* remove modular change
* does this apply_rotary_pos_emb need modification?
* fix with this
* format
* format
* ruff check
* modify modular_glm failed
* remove partial_rotary_factor of function partial_rotary_factor
* fix wrong change of examples/research_projects
* revert
* remove line 118
* use q_rot
-
Vladislav Bronzov authored
add base TP (tensor parallelism) support
-
eustlb authored
* fix test_tiny_timestamp_generation
* fix test_large_timestamp_generation
* fix test_whisper_shortform_single_batch_prev_cond
* fix test_whisper_shortform_multi_batch_hard_prev_cond
* return_timestamps is necessary with long-form generation (see the sketch after this entry)
* fix test_default_multilingual_transcription_long_form
* fix test_tiny_token_timestamp_generation_longform
* fix test_whisper_longform_multi_batch_hard
* Update tests/models/whisper/test_modeling_whisper.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* fix typo
* do not expect special tokens
* fix test_whisper_longform_single_batch_beam
* fix test_whisper_longform_multi_batch_hard_prev_cond
* update test_whisper_longform_multi_batch_hard_prev_cond
* update test_whisper_longform_multi_batch_hard_prev_cond
* these tests do not make sense anymore
* this test does not make sense anymore
* make fixup
* suggested nits
* add test with forced_decoder_ids
* this test does not make sense anymore
* change assert for unittest test cases
* make fixup
* test with prompt_ids and task and language
* fix unittest test case call
* fix test_tiny_generation
* fix test_tiny_en_generation
* fix test_tiny_en_batched_generation
* fix test_tiny_longform_timestamps_generation
* fix test_tiny_timestamp_generation
* fix test_large_generation
* fix test_large_batched_generation
* fix test_large_generation_multilingual
* fix test_large_timestamp_generation
* fix test_large_timestamp_generation
* fix test_tiny_token_timestamp_generation_longform
* fix test_tiny_en_batched_generation
* make fixup
* [run-slow] whisper

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
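A minimal sketch of the requirement flagged in the marked bullet above: Whisper inputs longer than the 30-second short-form window must be decoded with `return_timestamps=True`. The model id and the random stand-in features are illustrative assumptions, not part of this PR.

```py
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Minimal sketch: long-form (>30s) Whisper decoding needs return_timestamps=True.
# The model id and the random stand-in features are illustrative only.
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

# 6000 mel frames is roughly 60 seconds, beyond the 30-second short-form window.
input_features = torch.randn(1, 80, 6000)
generated_ids = model.generate(input_features, return_timestamps=True)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```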
-
Mohamed Mekkouri authored
-
Raushan Turganbay authored
add default values
-
- 25 Nov, 2024 25 commits
-
-
Yoni Gozlan authored
* Fix import structure image_processor_fast
* update to new inits
-
xuzifei-dmatrix authored
* making gpt2 fx traceable
* running make fix-copies
* Revert "running make fix-copies"

  This reverts commit 5a3437cb5b63799243bceae7d21a2aed8d0418c7.
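A hedged sketch of what this enables, assuming the standard `transformers.utils.fx` entry point is the intended way to trace GPT-2 after this change:

```py
from transformers import GPT2LMHeadModel
from transformers.utils.fx import symbolic_trace

# Hedged sketch, assuming transformers' fx utilities are the intended entry
# point for the tracing this commit enables.
model = GPT2LMHeadModel.from_pretrained("gpt2")
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
print(traced.graph)  # the captured torch.fx graph
```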
-
Viktor Scherbakov authored
* Updated documentation and added conversion utility
* Update docs/source/en/tiktoken.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tiktoken.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Moved util function to integration folder + allow for str
* Update formatting Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Updated formatting
* style changes

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Mohamed Mekkouri authored
fix_test
-
Mohamed Mekkouri authored
* Upgrade Torch 2.5
* uncomment
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
jiqing-feng authored
* fix gptj data type mismatch
* add low precision static cache tests
* fix format
* fix low-precision static cache tests
* fix format
* avoid config change
* change data type conversion in cache copy
* fix comment
* cast key value after k v out

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
Benjamin Bossan authored
The old AWQ version is failing with the latest (unreleased) transformers, giving the error:

> ImportError: cannot import name 'shard_checkpoint' from 'transformers.modeling_utils'

This has been resolved in AWQ v0.2.7: https://github.com/casper-hansen/AutoAWQ/pull/644
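A minimal sketch of a guard against the broken combination described above, assuming the `awq` package exposes `__version__` (an assumption, not verified here):

```py
# Minimal sketch: refuse to run with an AWQ release older than the fix.
# Assumes the awq package exposes __version__; adjust if it does not.
from packaging import version
import awq

if version.parse(awq.__version__) < version.parse("0.2.7"):
    raise ImportError(
        "autoawq >= 0.2.7 is required: older releases import the removed "
        "transformers.modeling_utils.shard_checkpoint"
    )
```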
-
Mohamed Mekkouri authored
* fix_tests_bitnet
* fix format
-
Shane A authored
* Rename/move OLMo Nov files to OLMo2
* Rename Olmo1124 and its variants to Olmo2
-
dependabot[bot] authored
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Jacky Lee authored
* fix: qwen2 model ids
* fix: line
* fix: more format
* update: reformat
-
Tom Aarsen authored
* Given that self.active_adapter is deprecated, avoid using it
* Remove misleading comment: `self.active_adapter` is not used (and deprecated)
-
Donald Szeto authored
* Fix convert_tokens_to_string when decoder is None
* revert unrelated changes

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
-
wanxiangchwng authored
Signed-off-by: wanxiangchwng <cui.shuang@foxmail.com>
-
dependabot[bot] authored
Bump tornado in /examples/research_projects/visual_bert

Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Meliksah Turker authored
contiguous() is now called before view() for key and value within the prepare_fa2_from_position_ids function, since view() requires a contiguous tensor (see the sketch after this entry).
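A minimal sketch of why the call order matters; the tensor shapes are illustrative, not those of the actual function:

```py
import torch

# Illustrative shapes only: .view() requires contiguous memory, so a transposed
# (non-contiguous) key/value tensor must be made contiguous first.
key = torch.randn(2, 4, 8).transpose(0, 1)   # non-contiguous after transpose
# key.view(-1, 8)                            # would raise a RuntimeError
flat = key.contiguous().view(-1, 8)          # valid once memory is contiguous
print(flat.shape)  # torch.Size([8, 8])
```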
-
VictorAtIfInsurance authored
* allow unused parameter passthrough when chunking in asr pipelines
* format code
* format
* run fixup
* update tests
* update parameters to pipeline in test
* update parameters in tests
* change spelling in gitignore
* revert .gitignore to main
* add git ignore of devcontainer folder
* assert asr output follows expected inference output type
* run fixup
* Remove .devcontainer from .gitignore
* remove compliance check
-
kang sheng authored
* sum gathered input tokens
* ruff line-length is 119, format the code

Co-authored-by: kangsheng <kangsheng@meituan.com>
-
Raushan Turganbay authored
fix base prefix
-
Arthur authored
* some modification for roadmap
* revert some changes
* yups
* weird
* make it work
* settling
* fix-copies
* fixup
* renaming
* more fix-copies
* move stuff around
* remove torch script warnings
* ignore copies
* revert bad changes
* woops
* just styling
* nit
* revert
* style fixup
* nits configuration style
* fixup
* nits
* will this fix the tf pt issue?
* style
* ???????
* update
* eval?
* update error message
* updates
* style
* grumble grumble
* update
* style
* nit
* skip torch fx tests that were failing
* style
* skip the failing tests
* skip another test and make style
-
Raushan Turganbay authored
* fix blip generation
* don't remove it yet
* Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments
* modular

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Raushan Turganbay authored
* fix
* fix tests
* fix copies
* add docs
* Revert "add docs"

  This reverts commit 32d35634f12ba02781d2ebdee0c8dcfbe992a7b9.
* qwen move deltas
* mllama can potentially fullgraph compile
* enable mllama compile and fix tests
* remove mllama fixes
-
Dmitry Rogozhkin authored
Starting from version 2.4, PyTorch introduces a stricter check for the objects that can be loaded with torch.load(). Starting from version 2.6, loading with weights_only=True requires allowlisting of such objects. This commit adds an allowlist of some numpy objects used to load model checkpoints. Usage is restricted by a context manager; users can still additionally call torch.serialization.add_safe_globals() to add other objects to the safe-globals list. The Accelerate library ran into the same problem and addressed it with PR-3036.

Fixes: #34631
See: https://github.com/pytorch/pytorch/pull/137602
See: https://pytorch.org/docs/stable/notes/serialization.html#torch.serialization.add_safe_globals
See: https://github.com/huggingface/accelerate/pull/3036

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
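A hedged sketch of the allowlisting pattern this commit describes; the exact numpy objects transformers allowlists may differ from the examples shown, and the checkpoint path is a placeholder:

```py
import numpy as np
import torch

# Hedged sketch of the pattern above (PyTorch >= 2.5); the exact numpy objects
# transformers allowlists may differ from these examples.
allowed = [np.core.multiarray._reconstruct, np.ndarray, np.dtype]

# Scoped via context manager, as the commit describes ("checkpoint.bin" is a placeholder):
with torch.serialization.safe_globals(allowed):
    state = torch.load("checkpoint.bin", weights_only=True)

# Or registered globally for the rest of the process:
torch.serialization.add_safe_globals(allowed)
```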
-
jeongin601 authored
* modeling nemotron kv caching bugfix
* test file deleted
* code refinement
* remove unused variables
* import block sorted
* removed deprecation warning
* removed support for tuple shape past_key_values
* Update conditional statement for cache initialization Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Signed-off-by: jeongin601 <0200angela@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 22 Nov, 2024 4 commits
-
-
Yoni Gozlan authored
* add fix and examples
* fix camel case naming
-
Mohamed Mekkouri authored
small test fix
-
Benjamin Bossan authored
* CI Skip EETQ tests while package is broken

  EETQ tries to import the shard_checkpoint function from transformers, but the function has been removed. Therefore, trying to use EETQ currently results in an import error. This fix results in EETQ tests being skipped if there is an import error. The issue has been reported to EETQ: https://github.com/NetEase-FuXi/EETQ/issues/34
* Raise helpful error when trying to use eetq
* Forgot to raise the error in the else clause
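A minimal sketch of the skip-on-broken-import pattern described above; the test class and the availability flag are illustrative, not the repository's actual helpers:

```py
import unittest

# Illustrative pattern only: skip tests when the package cannot even be imported.
try:
    import eetq  # noqa: F401
    EETQ_IMPORT_OK = True
except ImportError:
    EETQ_IMPORT_OK = False

@unittest.skipUnless(EETQ_IMPORT_OK, "EETQ import is broken (NetEase-FuXi/EETQ#34)")
class EetqIntegrationTest(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)
```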
-
Andrés Marafioti authored
* smol improvements to support more flexible usage
* ruff
-