1. 26 Feb, 2025 1 commit
  2. 25 Feb, 2025 17 commits
  3. 24 Feb, 2025 8 commits
  4. 21 Feb, 2025 6 commits
    • Fix exploitable regexes in Nougat and GPTSan/GPTJNeoXJapanese (#36121) · 92c5ca9d
      Matt authored
      
      * Fix potential regex catastrophic backtracking in NougatTokenizerFast
      
      The original regex pattern in tokenization_nougat_fast.py was vulnerable to
      catastrophic backtracking due to greedy quantifiers and nested alternations.
      This commit replaces it with a more efficient pattern that:
      
      1. Uses explicit character classes instead of dot (.)
      2. Handles whitespace more precisely
      3. Avoids unnecessary backtracking
      4. Supports both lowercase and uppercase roman numerals
      5. Maintains the same functionality while being more robust
      
      * Try another regex
      
      * Trying deepseek's answer
      
      * Start with a simplification
      
      * Another simplification
      
      * Just rewrite the whole function myself
      
      * Fix gptneox and gptsan
      
      * Simplify the regex even further
      
      * Tighten up the price regex a little
      
      * Add possessive version of the regex
      
      * Fix regex
      
      * Much cleaner regexes
      
      ---------
      
      Co-authored-by: openhands <openhands@all-hands.dev>
      92c5ca9d
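      A minimal, hypothetical sketch of the kind of hardening described in the commit above; the patterns below are illustrative only and are not the ones shipped in tokenization_nougat_fast.py:

      ```python
      import re

      # Illustrative only: nested quantifiers such as (?:\w+\s*)+ can backtrack
      # exponentially on near-miss inputs (catastrophic backtracking).
      backtracking_prone = re.compile(r"^[ivxIVX]+\.\s*(?:\w+\s*)+$")

      # Safer shape: explicit character classes, no nested quantifiers, and both
      # lowercase and uppercase roman numerals handled in a single class.
      hardened = re.compile(r"^[ivxlcdmIVXLCDM]+\.[ \t]*[^\r\n]*$")

      print(bool(hardened.match("iv. Results and Discussion")))  # True
      print(bool(hardened.match("IX. Conclusion")))              # True
      ```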
    • Uses Collection in transformers.image_transforms.normalize (#36301) · 547911e7
      CalOmnie authored
      * Uses Collection instead of Sequence in transformers.image_transforms.normalize
      
      * Uses collections.abc.Collection in lieu of deprecated typing one
      547911e7
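      A hedged sketch of why the type-hint change matters; the simplified signature below is an assumption, not the actual transformers.image_transforms.normalize definition:

      ```python
      from collections.abc import Collection  # replaces the deprecated typing.Collection
      from typing import Union

      import numpy as np

      # Assumed, simplified signature -- the real function accepts more arguments.
      # Typing mean/std as Collection[float] admits lists, tuples and other
      # collection types, not only Sequence types.
      def normalize(image: np.ndarray, mean: Union[float, Collection[float]], std: Union[float, Collection[float]]) -> np.ndarray:
          num_channels = image.shape[-1]
          mean = np.asarray(mean) if isinstance(mean, Collection) else np.full(num_channels, mean)
          std = np.asarray(std) if isinstance(std, Collection) else np.full(num_channels, std)
          return (image - mean) / std

      out = normalize(np.zeros((2, 2, 3)), mean=(0.5, 0.5, 0.5), std=[0.5, 0.5, 0.5])
      ```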
    • [tests] make quanto tests device-agnostic (#36328) · 7c5bd24f
      Fanli Lin authored
      * make device-agnostic
      
      * name change
      7c5bd24f
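      The usual shape of a device-agnostic test, sketched under the assumption that the accelerator is resolved at runtime; resolve_torch_device below is a hypothetical helper, not the transformers testing utility:

      ```python
      import torch

      # Hypothetical helper (not the transformers testing utility): pick whatever
      # accelerator is present instead of hard-coding "cuda" in the tests.
      def resolve_torch_device() -> str:
          if torch.cuda.is_available():
              return "cuda"
          if hasattr(torch, "xpu") and torch.xpu.is_available():
              return "xpu"
          return "cpu"

      torch_device = resolve_torch_device()
      dummy_input = torch.ones(1, 8, device=torch_device)  # runs on CUDA, XPU or CPU alike
      ```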
    • Joao Gante authored
    • Add SigLIP 2 (#36323) · a957b791
      Pavel Iakubovskii authored
      * Docs
      
      * Inits
      
      * Auto classes
      
      * Add siglip base
      
      * Add base tests
      
      * Fix Siglip V1 for fixed-res version
      
      * Add image processor
      
      * Update conversion
      
      * Experimenting with vectorized embeddings
      
      * Fixup
      
      * Add modular Siglip2Processor
      
      * Add modular configuration
      
      * Rename num patches
      
      * Correct image and text features merging
      
      * Working conversion script
      
      * Refactoring conversion script
      
      * Remove unused code in conversion script
      
      * Shorten dict a bit
      
      * Refactoring conversion
      
      * Done conversion refactoring
      
      * Fixup
      
      * Modular siglip2
      
      * Make model exportable and compilable without graph breaks
      
      * Remove position_ids from image_processor
      
      * Remove position ids from modeling file
      
      * Update modular
      
      * Type hint
      
      * Fixup
      
      * Set defaults to processor
      
      * Add integration test
      
      * Revert spatial shapes back to tensor
      
      * Change order
      
      * Fix most of the tests
      
      * Fix docstring
      
      * Remove interpolate_pos_encoding arg (not needed)
      
      * Update docs
      
      * Standardize processing
      
      * Fix attention_mask in vision head
      
      * Siglip v1: remove double transpose in FA2
      
      * Update modular file
      
      * Update FA2 test
      
      * Update expected logits
      
      * Fix interpolation for siglip2 image processor
      
      * Skip init test
      
      * Skip dispatch on flash test
      
      * Fix modeling tests
      
      * Fixup
      
      * Add dummy objects
      
      * Fix some docstrings
      
      * Add siglip2 in index.md
      
      * Fix consistency
      
      * Add docs
      
      * Remove size and data format
      
      * Add image processor tests
      
      * Fix
      
      * Add fast image processor
      
      * Fix style
      
      * Fix
      
      * Docs
      
      * Set lowercase for tokenizer
      
      * Adjust head size for Siglip v1
      
      * Update siglip2 for consistency with siglip1
      
      * Update siglip2 conversion
      
      * Update pipeline
      
      * Update checkpoints in tests
      
      * Update checkpoint name
      
      * Fix pooling for image classification model
      
      * Fix FA2 test
      
      * Update processor
      
      * Fix check repo
      
      * Update docs
      
      * Fix typos
      
      * Fix docstring for fast image processor
      
      * Add siglip2 to FA2 docs
      
      * Fix fast ip tests
      
      * Fix consistency
      
      * Fix tokenizer class for siglip v1
      
      * Fix missing header
      
      * Refactor scaling for clip, siglip, siglip2
      
      * Remove unused imports
      
      * Make fast IP default for siglip2
      
      * Update docs
      
      * Update checkpoints
      
      * Update modular
      
      * Update paper link
      
      * Fixup
      
      * Fix name in toctree
      
      * Fix test
      v4.49.0-SigLIP-2
      a957b791
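      A hedged usage sketch for the newly added SigLIP 2 model; the checkpoint id is an assumption and may differ from the weights actually published:

      ```python
      import requests
      import torch
      from PIL import Image
      from transformers import AutoModel, AutoProcessor

      ckpt = "google/siglip2-base-patch16-224"  # assumed checkpoint id
      model = AutoModel.from_pretrained(ckpt)
      processor = AutoProcessor.from_pretrained(ckpt)

      image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
      texts = ["a photo of 2 cats", "a photo of a dog"]

      inputs = processor(text=texts, images=image, padding="max_length", return_tensors="pt")
      with torch.no_grad():
          outputs = model(**inputs)

      # SigLIP-style models are trained with a sigmoid loss, so per-pair
      # probabilities come from a sigmoid over the image-text logits, not a softmax.
      probs = torch.sigmoid(outputs.logits_per_image)
      ```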
    • VLMs: even more clean-up (#36249) · 14552cbd
      Raushan Turganbay authored
      * squash
      
      * style
      14552cbd
  5. 20 Feb, 2025 8 commits
    • Cyan authored
    • [smolvlm] make CI green (#36306) · 27d17075
      Joao Gante authored
      * add smolvlm to toctree
      
      * add requirements
      
      * dev-ci
      
      * no docker changes
      
      * dev-ci
      
      * update torch-light.dockerfile
      
      * derp
      
      * dev-ci
      27d17075
    • fix: prevent second save in the end of training if last step was saved already (#36219) · effaef33
      Nosimus authored
      
      * fix: prevent second save in the end of training
      
      * fix: prevent second save in the end of training
      
      * test: added test for no duplicate save on epoch save strategy
      
      * fix: removed TrainerControl
      
      * chore: style formatting
      
      ---------
      
      Co-authored-by: JaktensTid <jaktenstid1@gmail.com>
      effaef33
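      A minimal sketch of the guard this fix describes, with hypothetical names; it is not the actual Trainer code:

      ```python
      from typing import Optional

      # Hypothetical helper: skip the final save when the last optimization step
      # already produced a checkpoint (e.g. save_strategy="epoch" and training
      # ends exactly on an epoch boundary), so the same state is not written twice.
      def should_save_at_end_of_training(global_step: int, last_checkpoint_step: Optional[int]) -> bool:
          return last_checkpoint_step != global_step

      assert should_save_at_end_of_training(1000, 1000) is False  # already saved at step 1000
      assert should_save_at_end_of_training(1000, 900) is True    # final state not yet saved
      ```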
    • Fix typo in Pixtral example (#36302) · 5412ff1a
      12v authored
      Fix typo
      5412ff1a
    • SmolVLM2 (#36126) · 4397dfcb
      Orr Zohar authored
      
      * smolvlm init
      
      * updates
      
      * fixing bugs
      
      * minimal run, no checks
      
      * minimal run, no checks
      
      * passing first check + adding url support
      
      * updating video dataloading logic
      
      * fixing image logic
      
      * trying modular, but fails
      
      * modular is working, changing processor to match PR comments and general transformers logic
      
      * fixing kwargs
      
      * offloading video loading logic to  image_util
      
      * fixing circleci code formatting errors (repeated 14 times)
      
      * update
      
      * add idefics3-based tests
      
      * add keyword to all
      
      * add PreTrainedModel
      
      * updating video loading logic
      
      * working inference
      
      * updates for PR comments
      
      * updates for PR comments
      
      * moving SmolVLMPretrainedModel higher to fix import error
      
      * CI test pass
      
      * CI test pass
      
      * removing lambda
      
      * CI test pass (repeated 6 times)
      
      * processor tests
      
      * add example in docs
      
      * typo
      
      * fix copies
      
      * skip compile tests - sdpa for VisionTransformer
      
      * fix init
      
      * raise import error for num2words
      
      * update doc for FA2
      
      * more doc fix
      
      * CI
      
      * updates for PR comments
      
      * Update docs/source/en/model_doc/smolvlm.md
      
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/model_doc/smolvlm.md
      
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/model_doc/smolvlm.md
      
      Co-authored-by: Joshua Lochner <admin@xenova.com>
      
      * Update docs/source/en/model_doc/smolvlm.md
      
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/model_doc/smolvlm.md
      
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * fixing processor -- tokenizer was not defined properly (gpt2 tokenizer) and was missing attributes such as the fake image token
      
      * adding smolvlm to VQA models
      
      * removing vqa auto class
      
      * Update src/transformers/models/smolvlm/processing_smolvlm.py
      
      Co-authored-by: Joshua Lochner <admin@xenova.com>
      
      * removing smolvlmvisiontransformer from index.md
      
      * my bad, video processing had typos
      
      * fixing docs
      
      * renaming params in SmolVLMModel.inputs_merger
      
      * removing un-needed dtype/device in model forward
      
      * ruff for CI
      
      * update docs
      
      * Update docs/source/en/model_doc/smolvlm.md
      
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * return cache position
      
      * return cache position
      
      * return cache also in modular
      
      * needed to run modular again
      
      * fix training tests
      
      * push vectorized inputs merger
      
      * format
      
      * format
      
      * reduce number of mappings
      
      * addressing PR comments
      
      * happy CI, happy me :)
      
      * skip non-nested images
      
      * adjust integration test for smaller GPUs
      
      * format
      
      * fix kwargs in chat template apply
      
      * skip this for now
      
      ---------
      
      Co-authored-by: raushan <raushan@huggingface.co>
      Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      Co-authored-by: Joshua Lochner <admin@xenova.com>
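      A hedged usage sketch for SmolVLM2; the checkpoint id is an assumption and the exact processor behaviour may differ from the released model:

      ```python
      import torch
      from transformers import AutoModelForImageTextToText, AutoProcessor

      ckpt = "HuggingFaceTB/SmolVLM2-2.2B-Instruct"  # assumed checkpoint id
      processor = AutoProcessor.from_pretrained(ckpt)
      model = AutoModelForImageTextToText.from_pretrained(ckpt)

      messages = [
          {
              "role": "user",
              "content": [
                  {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
                  {"type": "text", "text": "Describe this image."},
              ],
          }
      ]

      # Build model inputs from the chat template, then generate a short answer.
      inputs = processor.apply_chat_template(
          messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
      )
      with torch.no_grad():
          generated = model.generate(**inputs, max_new_tokens=64)
      print(processor.batch_decode(generated, skip_special_tokens=True)[0])
      ```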
    • Yih-Dar authored
      f2ab182d
    • Fix broken CI on release branch due to missing conversion files (#36275) · e8531a0e
      Yih-Dar authored
      
      * fix
      
      * fix
      
      ---------
      
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      e8531a0e
    • Make cache traceable (#35873) · 5e2183f3
      Ilyas Moutawwakil authored
      simply make cache traceable
      5e2183f3
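      A generic toy illustration of what "traceable" usually requires, under the assumption that the goal is a cache whose update path is pure tensor ops; this is not the transformers Cache implementation:

      ```python
      import torch

      # Toy cache, NOT the transformers Cache: keeping the update path to tensor
      # ops (index_copy_) with no data-dependent Python branching lets torch.fx
      # capture it in a single graph.
      class ToyStaticCache(torch.nn.Module):
          def __init__(self, max_len: int, head_dim: int):
              super().__init__()
              self.register_buffer("keys", torch.zeros(max_len, head_dim))

          def forward(self, positions: torch.Tensor, new_keys: torch.Tensor) -> torch.Tensor:
              self.keys.index_copy_(0, positions, new_keys)
              return self.keys

      cache = ToyStaticCache(max_len=8, head_dim=4)
      traced = torch.fx.symbolic_trace(cache)
      _ = traced(torch.tensor([0, 1]), torch.randn(2, 4))
      ```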