- 28 Nov, 2024 5 commits
- 27 Nov, 2024 3 commits
- 26 Nov, 2024 13 commits
-
blueingman authored
* Add translation for tiktoken documentation
* Update tiktoken.md
* Update tiktoken.md
-
谭九鼎 authored
-
Fanli Lin authored
use full path for run_qa.py
-
Fanli Lin authored
add device-agnostic API
Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
-
Ahmed Almaghz authored
* Add docs/source/ar/benchmarks.md to Add_docs_source_ar_benchmarks.md
* Update docs/source/ar/benchmarks.md (repeated across 11 review passes)
* Update _toctree.yml
* Update benchmarks.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
-
vansin authored
Update Python Version
-
Joshua Lochner authored
* Fix torch.onnx.export of Qwen2-VL vision encoder

This PR fixes ONNX export support for the vision encoder of Qwen2-VL, which converts the `cu_seqlens` to `torch.int32`, leading to errors later on when using the values for slicing.
https://github.com/huggingface/transformers/blob/c57eafdaa119eecae8557be4c626629bc1adc0fd/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L1044-L1046

## Error:
```
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Slice, node name: /blocks.0/attn/Slice_4): axes has inconsistent type tensor(int64)
```

## Code to reproduce issue:
```py
import requests
from PIL import Image
import torch
from transformers import (
    AutoProcessor,
    Qwen2VLForConditionalGeneration,
)

# Constants
VISION_MODEL_NAME = "vision_encoder.onnx"

# Load model and processor
model_id = "hf-internal-testing/tiny-random-Qwen2VLForConditionalGeneration"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id).eval()
processor = AutoProcessor.from_pretrained(model_id)

# Prepare inputs
url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    },
]
images = [image]
text_prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=[text_prompt], images=images, padding=True, return_tensors="pt")

## Vision model
vision_inputs = dict(
    pixel_values=inputs["pixel_values"],
    grid_thw=inputs["image_grid_thw"],
)
vision_inputs_positional = tuple(vision_inputs.values())
vision_outputs = model.visual.forward(*vision_inputs_positional)  # Test forward pass

torch.onnx.export(
    model.visual,
    args=vision_inputs_positional,
    f=VISION_MODEL_NAME,
    export_params=True,
    opset_version=14,
    do_constant_folding=True,
    input_names=list(vision_inputs.keys()),
    output_names=["image_features"],
    dynamic_axes={
        "pixel_values": {
            0: "batch_size * grid_t * grid_h * grid_w",
            1: "channel * temporal_patch_size * patch_size * patch_size",
        },
        "grid_thw": {0: "batch_size"},
        "image_features": {0: "batch_size * grid_t * grid_h * grid_w"},
    },
)

# Load and check the exported model
import onnx

model = onnx.load(VISION_MODEL_NAME)
onnx.checker.check_model(model, full_check=True)
inferred = onnx.shape_inference.infer_shapes(model, check_type=True)
```

* Formatting
* [run-slow] qwen2_vl
-
Matt authored
* Initial draft
* Add .jinja file loading for processors
* Add processor saving of naked chat template files
* make fixup
* Add save-load test for tokenizers
* Add save-load test for tokenizers
* stash commit
* Try popping the file
* make fixup
* Pop the arg correctly
* Pop the arg correctly
* Add processor test
* Fix processor code
* stash commit
* Processor clobbers child tokenizer's chat template
* Processor clobbers child tokenizer's chat template
* make fixup
* Split processor/tokenizer files to avoid interactions
* fix test
* Expand processor tests
* Rename arg to "save_raw_chat_template" across all classes
* Update processor warning
* Move templates to single file
* Move templates to single file
* Improve testing for processor/tokenizer clashes
* Improve testing for processor/tokenizer clashes
* Extend saving test
* Test file priority correctly
* make fixup
* Don't pop the chat template file before the slow tokenizer gets a look
* Remove breakpoint
* make fixup
* Fix error
-
Yuxuan.Zhang authored
* change apply_rotary_pos_emb
* upload for glm-edge
* remove useless part
* follow the suggestion
* fix
* format
* format
* test
* format again
* format again
* remove modular change
* remove modular change
* this apply_rotary_pos_emb need modify?
* fix with this
* format
* format
* ruff check
* modify modular_glm failed
* remove partial_rotary_factor of function partial_rotary_factor
* fix wrong change of examples/research_projects
* revert
* remove line 118
* use q_rot
-
Vladislav Bronzov authored
add base tp support
-
eustlb authored
* fix test_tiny_timestamp_generation
* fix test_large_timestamp_generation
* fix test_whisper_shortform_single_batch_prev_cond
* fix test_whisper_shortform_multi_batch_hard_prev_cond
* return_timestamps necessary with long form
* fix test_default_multilingual_transcription_long_form
* fix test_tiny_token_timestamp_generation_longform
* fix test_whisper_longform_multi_batch_hard
* Update tests/models/whisper/test_modeling_whisper.py
* fix typo
* do not expect special tokens
* fix test_whisper_longform_single_batch_beam
* fix test_whisper_longform_multi_batch_hard_prev_cond
* update test_whisper_longform_multi_batch_hard_prev_cond
* update test_whisper_longform_multi_batch_hard_prev_cond
* these tests do not make sense anymore
* this test does not make sense anymore
* make fixup
* suggested nits
* add test with forced_decoder_ids
* this test does not make sense anymore
* change assert for unittest test cases
* make fixup
* test with prompt_ids and task and language
* fix unittest test case call
* fix test_tiny_generation
* fix test_tiny_en_generation
* fix test_tiny_en_batched_generation
* fix test_tiny_longform_timestamps_generation
* fix test_tiny_timestamp_generation
* fix test_large_generation
* fix test_large_batched_generation
* fix test_large_generation_multilingual
* fix test_large_timestamp_generation
* fix test_large_timestamp_generation
* fix test_tiny_token_timestamp_generation_longform
* fix test_tiny_en_batched_generation
* make fixup
* [run-slow] whisper
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
-
Mohamed Mekkouri authored
-
Raushan Turganbay authored
add default values
-
- 25 Nov, 2024 19 commits
-
Yoni Gozlan authored
* Fix import structure image_processor_fast
* update to new inits
-
xuzifei-dmatrix authored
* making gpt2 fx tracable
* running make fix-copies
* Revert "running make fix-copies"

This reverts commit 5a3437cb5b63799243bceae7d21a2aed8d0418c7.
-
Viktor Scherbakov authored
* Updated documentation and added conversion utility
* Update docs/source/en/tiktoken.md
* Update docs/source/en/tiktoken.md
* Moved util function to integration folder + allow for str
* Update formatting
* Updated formatting
* style changes
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Mohamed Mekkouri authored
fix_test
-
Mohamed Mekkouri authored
* Upgrade Torch 2.5
* uncomment
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
jiqing-feng authored
* fix gptj data type mismatch
* add low precision static cache tests
* fix format
* fix low-precision static cache tests
* fix format
* avoid config change
* change data type convert in cache copy
* fix comment
* cast key value after k v out
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
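The final approach above ("cast key value after k v out", "avoid config change") can be sketched in isolation. This is a hedged, generic PyTorch illustration of casting at cache-copy time, with hypothetical tensor names, not the actual transformers GPT-J cache code:

```python
import torch

# Hypothetical sketch: a static cache allocated in float16 while the model
# computes key states in float32. Casting the states to the cache dtype at
# copy time resolves the mismatch without touching the model config.
k_cache = torch.zeros(2, 4, dtype=torch.float16)    # low-precision static cache slot
key_states = torch.ones(2, 4, dtype=torch.float32)  # freshly computed keys

# Cast the key states to the cache dtype as they are copied in
k_cache.copy_(key_states.to(k_cache.dtype))

assert k_cache.dtype == torch.float16
assert k_cache.sum().item() == 8.0
```

The same cast would apply symmetrically to the value states.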
-
Benjamin Bossan authored
The old AWQ version fails with the latest (unreleased) transformers, giving the error:

> ImportError: cannot import name 'shard_checkpoint' from 'transformers.modeling_utils'

This has been resolved in awq v0.2.7: https://github.com/casper-hansen/AutoAWQ/pull/644
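A minimal way to pick up the fix is to raise the version floor in the requirements (assuming the package is installed from PyPI under the name `autoawq`; adjust to however the CI actually installs it):

```
autoawq>=0.2.7
```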
-
Mohamed Mekkouri authored
* fix_tests_bitnet
* fix format
-
Shane A authored
* Rename/move OLMo Nov files to OLMo2
* Rename Olmo1124 and its variants to Olmo2
-
dependabot[bot] authored
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)
---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Jacky Lee authored
* fix: qwen2 model ids
* fix: line
* fix: more format
* update: reformat
-
Tom Aarsen authored
* Given that self.active_adapter is deprecated, avoid using it
* Remove misleading comment - `self.active_adapter` is not used (and deprecated)
-
Donald Szeto authored
* Fix convert_tokens_to_string when decoder is None
* revert unrelated changes
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
-
wanxiangchwng authored
Signed-off-by: wanxiangchwng <cui.shuang@foxmail.com>
-
dependabot[bot] authored
Bump tornado in /examples/research_projects/visual_bert

Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)
---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Meliksah Turker authored
contiguous() is called before view() for key and value within the prepare_fa2_from_position_ids function
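The reason for calling contiguous() first can be sketched in isolation: view() requires contiguous memory, so flattening a non-contiguous tensor (e.g. one produced by a transpose) raises a RuntimeError until contiguous() materializes a fresh buffer. This is a generic PyTorch illustration, not the actual prepare_fa2_from_position_ids code:

```python
import torch

x = torch.arange(6).reshape(2, 3)
t = x.transpose(0, 1)  # a non-contiguous view over x's storage

# view() requires contiguous memory, so this raises a RuntimeError
try:
    t.view(-1)
    raised = False
except RuntimeError:
    raised = True

# contiguous() copies the data into a contiguous buffer, after which
# view() can reinterpret it without moving any elements
flat = t.contiguous().view(-1)

assert raised
assert flat.tolist() == [0, 3, 1, 4, 2, 5]
```

reshape() would also work here (it copies only when needed), but an explicit contiguous().view() makes the copy visible at the call site.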
-
VictorAtIfInsurance authored
* allow unused parameter passthrough when chunking in asr pipelines
* format code
* format
* run fixup
* update tests
* update parameters to pipeline in test
* update parameters in tests
* change spelling in gitignore
* revert .gitignore to main
* add git ignore of devcontainer folder
* assert asr output follows expected inference output type
* run fixup
* Remove .devcontainer from .gitignore
* remove compliance check
-
kang sheng authored
* sum gathered input tokens
* ruff line-length is 119, format the code
Co-authored-by: kangsheng <kangsheng@meituan.com>
-