1. 12 Feb, 2025 17 commits
  2. 11 Feb, 2025 10 commits
    • [docs] update awq doc (#36079) · 11afab19
      Fanli Lin authored
      
      * update awq doc
      
      * Update docs/source/en/quantization/awq.md
      
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/quantization/awq.md
      
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/quantization/awq.md
      
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/quantization/awq.md
      
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * add note for inference
      
      ---------
      
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      11afab19
    • [docs] minor doc fix (#36127) · 9b69986e
      Fanli Lin authored
      fix
      9b69986e
    • Make `output_dir` Optional in `TrainingArguments` #27866 (#35735) · 1b57de8d
      Sambhav Dixit authored
      
      * make output_dir optional
      
      * initiated a basic testing module to validate and verify the changes
      
      * Test output_dir defaults to 'tmp_trainer' when unspecified.
      
      * test existing functionality of output_dir.
      
      * test that output dir only created when needed
      
      * final check
      
      * added doc string and changed the tmp_trainer to trainer_output
      
      * make style fixes to test file.
      
      * another round of fixup
      
      ---------
      
      Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
      1b57de8d
    • update tiktoken integ to use converted (#36135) · 03534a92
      Arthur authored
      03534a92
    • Fix CI issues (#35662) · 3a5c328f
      Pablo Montalvo authored
      * make explicit gpu dep
      
      * [run-slow] bamba
      3a5c328f
    • Fix max size deprecated warning (#34998) · 775252ab
      Hicham Tala authored
      * Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
      
      * Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
      
      * Remove deprecated warnings and eliminate `max_size` usage
      
      * Test use `int` as argument for `size`
      Add a test to ensure test can pass successfully and backward compatibility
      
      * The test pipelines still use `max_size`
      Remove `max_size` from test pipelines and replace it with `size` given as a `Dict` with `'shortest_edge'` and `'longest_edge'` as keys
      
      * Reformatting
      
      * Reformatting
      
      * Revert "Reformatting"
      
      This reverts commit c3040acee75440357cffd1f60c9d29ff5b2744b8.
      
      * Revert "Reformatting"
      
      This reverts commit ac4522e5c9a02d2d0c298295026db68ea26453df.
      
      * Revert "The test pipelines still use `max_size`"
      
      This reverts commit eaed96f041ffc32459536e1524d87f7a12ddee29.
      
      * Revert "Test use `int` as argument for `size`"
      
      This reverts commit 1925ee38c7c5eabb11832316712df1d4ba8043d0.
      
      * Revert "Remove deprecated warnings and eliminate `max_size` usage"
      
      This reverts commit d8e7e6ff9025931468fc1f3827cda1fa391003d5.
      
      * Change version `4.26` to "a future version"
      
      * Reformatting
      
      * Revert "Change version `4.26` to "a future version""
      
      This reverts commit 2b53f9e4
      775252ab
    • 湛露先生's avatar
      5489fea5
    • fix: typos in documentation files (#36122) · 76048be4
      Maxim Evtush authored
      * Update tools.py
      
      * Update text_generation.py
      
      * Update question_answering.py
      76048be4
    • Add common test for `torch.export` and fix some vision models (#35124) · f42d46cc
      Pavel Iakubovskii authored
      * Add is_torch_greater_or_equal test decorator
      
      * Add common test for torch.export
      
      * Fix bit
      
      * Fix focalnet
      
      * Fix imagegpt
      
      * Fix seggpt
      
      * Fix swin2sr
      
      * Enable torch.export test for vision models
      
      * Enable test for video models
      
      * Remove json
      
      * Enable for hiera
      
      * Enable for ijepa
      
      * Fix detr
      
      * Fix conditional_detr
      
      * Fix maskformer
      
      * Enable test maskformer
      
      * Fix test for deformable detr
      
      * Fix custom kernels for export in rt-detr and deformable-detr
      
      * Enable test for all DPT
      
      * Remove custom test for deformable detr
      
      * Simplify test to use only kwargs for export
      
      * Add comment
      
      * Move compile_compatible_method_lru_cache to utils
      
      * Fix beit export
      
      * Fix deformable detr
      
      * Fix copies data2vec<->beit
      
      * Fix typos, update test to work with dict
      
      * Add seed to the test
      
      * Enable test for vit_mae
      
      * Fix beit tests
      
      * [run-slow] beit, bit, conditional_detr, data2vec, def...
      f42d46cc
    • Fix nightly CIs: missing atols (#35903) · 1779f518
      Arthur authored
      fix some missing atols
      1779f518
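The `output_dir` change listed above (#35735) can be pictured with a small stand-in. This is only a sketch under stated assumptions, not the code from that commit: `SketchTrainingArguments` and `DEFAULT_OUTPUT_DIR` are hypothetical names invented for illustration, and the only detail taken from the log is that an unspecified `output_dir` ends up as `trainer_output`.

```python
from dataclasses import dataclass
from typing import Optional

# Default directory name, taken from the commit message
# ("changed the tmp_trainer to trainer_output").
DEFAULT_OUTPUT_DIR = "trainer_output"


@dataclass
class SketchTrainingArguments:
    """Hypothetical stand-in; this is NOT transformers.TrainingArguments."""

    # Optional: callers may omit it entirely.
    output_dir: Optional[str] = None

    def __post_init__(self) -> None:
        # Fall back to the default only when nothing was provided,
        # so an explicit value is never overwritten.
        if self.output_dir is None:
            self.output_dir = DEFAULT_OUTPUT_DIR


implicit = SketchTrainingArguments()
explicit = SketchTrainingArguments(output_dir="my_run")
```

Resolving the default in `__post_init__` rather than in the field declaration keeps `None` available as an explicit "not set" value for callers.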
  3. 10 Feb, 2025 13 commits
    • ivarflakstad's avatar
      1feebb5b
    • [generate] shape checks in tests compatible with fixed-length caches (+ some minor fixes) (#35993) · be2ac091
      Joao Gante authored
      * shape checks compatible with static cache
      
      * add test
      
      * tmp
      
      * manually turn on eager attn when we want to output attn
      
      * typo
      
      * generalize to encoder-decoder models
      
      * force compilation on cpu
      
      * tmp commit
      
      * fix static cache shape checks
      
      * models with odd caches
      
      * fix copies
      
      * shorter cache search loop
      
      * use decoder_past_key_values everywhere
      
      * better test variable names and comments
      
      * signature
      
      * rename _check_outputs into _check_generate_outputs
      
      * add comments
      
      * HybridCache future test note
      be2ac091
    • fix bnb warning (#36116) · 9510ae39
      Marc Sun authored
      fix
      9510ae39
    • kkscilife's avatar
      09261ccf
    • Revert checkpoint tmp dir (#36112) · d4a6b409
      Marc Sun authored
      * Revert "Fix OS err (#36094)"
      
      This reverts commit ba29a439.
      
      * Revert "Save checkpoint to temporary directory to handle partial saves during failures (#35580)"
      
      This reverts commit 20d17358.
      d4a6b409
    • Refactor OPT model (#36101) · 0baf0039
      jiqing-feng authored
      
      * remove cross attention
      
      Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
      
      * remove is_decoder
      
      Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
      
      * fix pkv
      
      Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
      
      ---------
      
      Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
      0baf0039
    • Remove Multi-threaded image conversion for fast image processors (#36105) · 924f1c71
      Yoni Gozlan authored
      
      remove multithreaded image conversion
      
      Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
      924f1c71
    • Enable pytest live log and show warning logs on GitHub Actions CI runs (#35912) · 3897f2ca
      Yih-Dar authored
      
      * fix
      
      * remove
      
      * fix
      
      ---------
      
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      3897f2ca
    • Support constant lr with cooldown (#35453) · 48a309d0
      Jingze Shi authored
      
      * Add support for constant learning rate with cooldown
      
      * Add support for constant learning rate with cooldown
      
      * Add support for constant learning rate with cooldown
      
      * Add support for constant learning rate with cooldown
      
      * Add support for constant learning rate with cooldown
      
      * Add support for constant learning rate with cooldown
      
      * Add support for constant learning rate with cooldown
      
      * Add more warmup and cooldown methods to 'get_wsc_schedule'
      
      * Add more warmup and cooldown methods to 'get_wsc_schedule'
      
      * Add more warmup and cooldown methods to 'get_wsc_schedule'
      
      * Add more warmup and cooldown methods to 'get_wsc_schedule'
      
      * Add more warmup and decay methods to 'get_wsd_schedule'
      
      * support num_training_steps and num_stable_steps for get_wsd_schedule
      
      * support num_training_steps and num_stable_steps for get_wsd_schedule
      
      * get wsd scheduler before the `num_training_steps` decision
      
      * fix code_quality
      
      * Update stable branch logic
      
      * fix code_quality
      
      * Move stable stage decide to `get_wsd_schedule`
      
      * Update docstring of `get_wsd_schedule`
      
      * Update `num_train_steps` to optional
      
      * Update `num_train_steps` to optional
      
      * Update docstring of `get_wsd_schedule`
      
      * Update src/transformers/optimization.py
      
      Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
      
      ---------
      
      Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
      48a309d0
    • Add Apple's Depth-Pro for depth estimation (#34583) · 9a6be63f
      Armaghan Shakir authored
      
      * implement config and model building blocks
      
      * refactor model architecture
      
      * update model outputs
      
      * update init param to include use_fov_model
      
      * update param name in config
      
      * fix hidden_states and attentions outputs for fov
      
      * sort config
      
      * complete minor todos
      
      * update patching
      
      * update config for encoder
      
      * fix config
      
      * use correct defaults in config
      
      * update merge for compatibility with different image size
      
      * restructure encoder for custom configuration
      
      * make fov model compatible with custom config
      
      * replace word "decoder" with "fusion"
      
      * weight conversion script
      
      * fix fov squeeze
      
      * update conversion script (without test)
      
      * upload ruff image processing
      
      * create fast image processing
      
      * use torch interpolation for image processing
      
      * complete post_process_depth_estimation
      
      * config: fix imports and sort args
      
      * apply inference in weight conversion
      
      * use mllama script instead for weight conversion
      
      * clean weight conversion script
      
      * add depth-pro status in other files
      
      * fill docstring in config
      
      * formatting
      
      * more formatting
      
      * formatting with ruff
      
      * formatting with style
      
      * fix copied classes
      
      * add examples; update weight convert script
      
      * fix using check_table.py and isort
      
      * fix config docstring
      
      * add depth pro to sdpa docs
      
      * undo unintentional changes in configuration_gemma.py
      
      * minor fixes
      
      * test image processing
      
      * fixes and tests
      
      * more fixes
      
      * use output states from image_encoder instead
      
      * Revert "use output states from image_encoder instead"
      
      This reverts commit 2408ec54e4f27d2abbecdb8374e58f34d91d8e96.
      
      * make embeddings dynamic
      
      * reshape output hidden states and attentions as part of computation graph
      
      * fix ruff formatting
      
      * fix docstring failure
      
      * use num_fov_head_layers in tests
      
      * update doc
      
      * check consistency with config
      
      * ruff formatting
      
      * update test case
      
      * fix ruff formatting
      
      * add tests for fov
      
      * use interpolation in postprocess
      
      * run and fix slow tests locally
      
      * use scaled_images_features for image and fov encoder
      
      * return fused_hidden_states in fusion stage
      
      * fix example
      
      * fix ruff
      
      * fix copyright license for all files
      
      * add __all__ for each file
      
      * minor fixes
      - fix download spell
      - add push_to_hub option
      - fix Optional type hinting
      - apply single loop for DepthProImageProcessor.preprocess
      
      * return list in post_process_depth_estimation
      
      * minor fixes
      - capitalize start of docstring
      - use ignore copy
      - fix examples
      - move docstring templates and custom output classes to top
      - remove "-> None" typehinting from __init__
      - type hinting for forward passes
      - fix docstrings for custom output classes
      
      * fix "ruff check"
      
      * update upsample and projection
      
      * major changes: (image size and merge optimization)
      - add support for images of any size
      - optimize merge operation
      - remove image_size from config
      - use full names instead of B, C, H, W
      - remove interpolation from fusion stage
      - add interpolation after merge
      - move validations to config
      - update integration test
      - add type hints for functions
      
      * fix push_to_hub option in weights conversion
      
      * remove image_size in weights conversion
      
      * major changes in the architecture
      - remove all DepthProViT modules and support different backbones using the AutoModel API
      - set default use_fov_model to False
      - validate parameters in configuration
      - update interpolate function: use "nearest" for faster computation
      - update reshape_feature function: remove all special tokens, possible from different backbones
      - update merge function: use padding from config instead of merge_out_size
      - remove patch_to_batch and batch_to_patch conversions for now
      - calculate out_size dynamically in the encoder
      - leave head_mask calculation to the backbone
      - fix bugs with merge
      - add more comments
      - update tests
      
      * placeholder for unused config attributes
      
      * improve docs amid review
      
      * minor change in docs
      
      * further optimize merge
      
      * fix formatting
      
      * remove unused patch/batch conversion functions
      
      * use original F.interpolate
      
      * improve function naming
      
      * minor changes
      - use torch_int instead of int
      - use proper for newly initialized tensors
      - use user provided return_dict for patch_encoder
      - use if-else block instead in self.use_fov_model
      
      * rearchitect upsample block for improved modularity
      
      * update upsample keys in weight conversion
      
      * improve padding in merge_patches
      
      * use double-loop for merge
      
      * update comments
      
      * create feature_extractor, reduce some forward code
      
      * introduce config.use_mask_token in dinov2
      
      * minor fixes
      
      * minor fixes for onnx
      
      * update __init__ to latest format
      
      * remove DepthProConfig.to_dict()
      
      * major changes in backbone
      
      * update config in weight conversion
      
      * formatting
      
      * converted model is fp32
      
      * improve naming and docs for feature_extractor->reconstruct_feature_maps
      
      * minor fixes; amid review
      
      * create intermediate vars in func call
      
      * use torch.testing.assert_close
      
      * use ModuleList instead of Sequential and ModuleDict
      
      * update docs
      
      * include fov in integration tests
      
      * update docs
      
      * improve initialization of convolution layers
      
      * fix unused fov keys
      
      * update tests
      
      * ruff format
      
      * fix test, amid kaiming initialization
      
      * add depthpro to toctree
      
      * add residual layer to _no_split_modules
      
      * architecture rework
      
      * Update src/transformers/models/depth_pro/image_processing_depth_pro.py
      
      Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      
      * Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py
      
      Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      
      * update docs
      
      * improve merge_patches
      
      * use flatten with fov_output
      
      * ruff formatting
      
      * update resources section in docs
      
      Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      
      * fix typo "final_kernal_size"
      
      Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      
      * fix output typehint for DepthProDepthEstimator
      
      Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      
      * residual operation in 2 steps
      
      Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      
      * use image_size instead of global patch_size in interpolation
      
      * replace all Sequential with ModuleList
      
      * update fov
      
      * update heads
      
      * fix and update conversion script for heads
      
      * ruff formatting
      
      * remove float32 conversion
      
      * use "Fov" instead of "FOV" in class names
      
      * use "Fov" instead of "FOV" in config docs
      
      * remove prune_heads
      
      * update fusion stage
      
      * use device in examples
      
      * update processor
      
      * ruff fixes
      
      * add do_rescale in image_processor_dict
      
      * skip test: test_fast_is_faster_than_slow
      
      * ruff formatting
      
      * DepthProImageProcessorFast in other files
      
      * revert antialias removal
      
      * add antialias in BaseImageProcessorFast
      
      * Revert "revert antialias removal"
      
      This reverts commit 5caa0bd8f9f7463b98410c04e6cfe8fef3adee18.
      
      * Revert "add antialias in BaseImageProcessorFast"
      
      This reverts commit 3ae1134780ae236872985523d9c0a444eabcc179.
      
      * update processor for grouping and antialias
      
      * try test_fast_is_faster_than_slow without "skip" or "flaky"
      
      * update checkpoint
      
      * update checkpoint
      
      * use @is_flaky for processor test
      
      * update checkpoint to "apple/DepthPro-hf"
      
      ---------
      
      Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      9a6be63f
    • Paligemma: revert #36084 (#36113) · c3999219
      Raushan Turganbay authored
      * revert
      
      * type check
      c3999219
    • Chat template: update for processor (#35953) · eebd2c97
      Raushan Turganbay authored
      * update
      
      * we need batched nested input to always process correctly
      
      * update a bit
      
      * fix copies
      eebd2c97
    • Processors: allow tuples of images when checking (#36084) · 5bd76947
      Raushan Turganbay authored
      allow tuples of images
      5bd76947
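The "Support constant lr with cooldown" entry above (#35453) concerns a warmup-stable-decay learning-rate schedule. The sketch below shows the general shape of such a schedule as a plain multiplier function. It is an illustration under assumptions, not the `get_wsd_schedule` implementation: `wsd_multiplier` and its parameter names are hypothetical, linear warmup and cosine decay are just one possible pairing, and deriving the stable phase from `num_training_steps` is a guess at what the log means by moving the stable-stage decision into the scheduler.

```python
import math
from typing import Optional


def wsd_multiplier(
    step: int,
    num_warmup_steps: int,
    num_decay_steps: int,
    num_training_steps: Optional[int] = None,
    num_stable_steps: Optional[int] = None,
    min_lr_ratio: float = 0.0,
) -> float:
    """Return the factor to multiply the base learning rate by at `step`.

    Hypothetical helper for illustration; not part of transformers.
    """
    if num_stable_steps is None:
        if num_training_steps is None:
            raise ValueError("Pass num_training_steps or num_stable_steps.")
        # Assumption: the stable phase fills whatever warmup and decay leave over.
        num_stable_steps = num_training_steps - num_warmup_steps - num_decay_steps
    if num_stable_steps < 0:
        raise ValueError("Warmup and decay exceed the total number of steps.")

    # Phase 1: linear warmup from 0 to 1.
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)

    # Phase 2: constant ("stable") learning rate.
    if step < num_warmup_steps + num_stable_steps:
        return 1.0

    # Phase 3: cosine cooldown from 1 down to min_lr_ratio.
    decay_step = step - num_warmup_steps - num_stable_steps
    if decay_step < num_decay_steps:
        progress = decay_step / max(1, num_decay_steps)
        cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
        return min_lr_ratio + (1.0 - min_lr_ratio) * cosine

    # After the cooldown, stay at the floor.
    return min_lr_ratio
```

A function of this shape could be handed to a lambda-based scheduler, for example `torch.optim.lr_scheduler.LambdaLR`, by fixing every argument except `step` with `functools.partial`.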