1. 22 Oct, 2024 9 commits
  2. 21 Oct, 2024 5 commits
  3. 18 Oct, 2024 6 commits
    • 
      Only cast logits to float when computing loss (#34147) · 816f4424
      Matthew Hoffman authored
      * Only cast logits to float when computing loss
      
      Some misses from #31292 and #33902
      
      * Move logits.float() into existing if labels is not None branch
      816f4424
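      A minimal sketch of the pattern this commit applies, using NumPy as a stand-in (the real change is in torch model code; the `forward` helper here is hypothetical): keep logits in their reduced-precision compute dtype and upcast to float32 only inside the `if labels is not None` branch where the loss is computed.

      ```python
      import numpy as np

      # Toy forward pass illustrating the dtype pattern from the commit;
      # illustrative only -- the real code uses torch tensors.
      def forward(hidden, weight, labels=None):
          logits = hidden @ weight  # stays float16: no eager upcast
          loss = None
          if labels is not None:
              # upcast only when a loss is actually computed
              float_logits = logits.astype(np.float32)
              loss = float(np.mean((float_logits - labels) ** 2))
          return logits, loss

      hidden = np.ones((2, 4), dtype=np.float16)
      weight = np.ones((4, 3), dtype=np.float16)
      logits, loss = forward(hidden, weight)  # inference path: float16 kept
      ```

      The point of the fix is that inference-only calls never pay for a float32 copy of the logits.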
    • 
      Fix UDOP dtype issue (#34180) · e46e3bc1
      Matt authored
      * Trigger UDOP tests
      
      * Try forcing dtype in LayoutLMV3
      
      * Do checks to see where uint8 is getting in
      
      * Do checks to see where uint8 is getting in
      
      * Found it!
      
      * Add .astype(np.float32)
      
      * Remove forced check, make fixup
      
      * Checking where exactly the uint8 creeps in
      
      * More checking on the uint8 issues
      
      * Manually upcast in rescale()
      
      * Remove UDOP trigger
      e46e3bc1
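      The upcast these commits converge on can be sketched as follows (a simplified, hypothetical `rescale`; the real one lives in the image-processor code): converting to float32 before scaling guarantees no uint8 values leak into later preprocessing steps.

      ```python
      import numpy as np

      def rescale(image: np.ndarray, scale: float = 1 / 255) -> np.ndarray:
          # explicit upcast so downstream ops never see uint8
          return image.astype(np.float32) * scale

      img = np.full((2, 2, 3), 255, dtype=np.uint8)
      out = rescale(img)  # float32 image scaled to [0, 1]
      ```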
    • 
      add Glm (#33823) · 66047640
      Cyril Vallez authored
      * Create modular_glm.py
      
      * Update modular_glm.py
      
      * Finalize architecture without all attentions
      
      * Add all attentions modules
      
      * Finalize modular
      
      * Update given last version
      
      * Last update
      
      * Finalize model
      
      * Finalize converter
      
      * Update convert_glm_weights_to_hf.py
      
      * style
      
      * style
      
      * Create __init__.py
      
      * Add all inits
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Correct the rotary embeddings
      
      * Remove apply_residual_connection_post_layernorm (always false)
      
      * remove use_rms_norm (always true)
      
      * remove past_layer_norm (always true)
      
      * Update __init__.py
      
      * Update config and license
      
      * start adding tests and doc
      
      * Add doc + style
      
      * Update test_modeling_glm.py
      
      * Add dummies
      
      * Apply correct modeling
      
      * Refactor attention to follow llama
      
      * Update __init__.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Correct bias
      
      * remove linear_bias and pdrop (never used)
      
      * apply modular
      
      * Simplify converter
      
      * remove dummies + style
      
      * add model_input_names
      
      * Add pretraining_tp to config for when eager attention is used
      
      * Update modular to remove all pretraining_tp
      
      * Update test_modeling_glm.py
      
      * Update the __all__
      
      * Update __all__
      
      * Update __init__.py
      
      * Update test_modeling_glm.py
      
      * add revisions
      
      * Add the correct repos and revisions
      
      * style
      
      * Update __init__.py
      
      * update exports
      
      * remove import of modular files
      
      * style
      
      * Apply Llama changes + refine converter
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * Update convert_glm_weights_to_hf.py
      
      * style
      
      * Use new modular converter
      
      * add pretrainedmodel to init
      
      * style
      
      * Update test_modeling_glm.py
      
      * Move config outside modular to please CI about docstrings
      
      * Add dummies to please CI
      
      * Update glm.md
      
      * Update glm.md
      66047640
    • 
      Informative 2 (#34154) · e95ea479
      Lysandre Debut authored
      
      * Informative
      
      * style
      
      * Informative 2
      
      * Apply suggestions from code review
      
      Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
      
      ---------
      
      Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
      e95ea479
    • 
      Fix broken test decorator `require_torch_up_to_2_accelerators` (#34201) · 0437d6cd
      byi8220 authored
      * fix broken require_torch_up_to_2_accelerators
      
      * make style
      0437d6cd
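      A decorator like this one typically wraps `unittest.skipUnless` around a device-count check; a hedged sketch of the shape (here `ACCELERATOR_COUNT` is a stand-in constant, whereas the real decorator queries the torch backend at runtime):

      ```python
      import unittest

      ACCELERATOR_COUNT = 1  # stand-in; the real decorator asks torch for the count

      def require_torch_up_to_2_accelerators(test_case):
          """Skip the decorated test when more than 2 accelerators are visible."""
          return unittest.skipUnless(
              ACCELERATOR_COUNT < 3, "test requires 0, 1, or 2 accelerators"
          )(test_case)

      @require_torch_up_to_2_accelerators
      def test_small_world():
          pass
      ```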
    • 
      BLIP: fix input expansion logic (#34225) · 5a5b590d
      Raushan Turganbay authored
      fix
      5a5b590d
  4. 17 Oct, 2024 13 commits
  5. 16 Oct, 2024 7 commits
    • 
      Revert "Fix FSDP resume Initialization issue" (#34193) · 3f06f95e
      Marc Sun authored
      Revert "Fix FSDP resume Initialization issue (#34032)"
      
      This reverts commit 4de1bdbf.
      3f06f95e
    • 
      Avoid using torch's Tensor or PIL's Image in chat template utils if not available (#34165) · 3a10c619
      Reza Rahemtola authored
      
      * fix(utils): Avoid using torch Tensor or PIL Image if not available
      
      * Trigger CI
      
      ---------
      
      Co-authored-by: Matt <rocketknight1@gmail.com>
      3a10c619
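      The usual shape of such an optional-dependency guard, sketched with torch (hypothetical helper name; PIL gets the same treatment): only reference `torch.Tensor` when the import succeeded, so the utils still work in torch-free environments.

      ```python
      # Guarded optional dependency: referencing torch.Tensor only when torch
      # is importable keeps chat-template utils usable without torch installed.
      try:
          import torch
          _torch_available = True
      except ImportError:
          _torch_available = False

      def is_torch_tensor(obj) -> bool:
          # short-circuits to False when torch is absent, never raising NameError
          return _torch_available and isinstance(obj, torch.Tensor)
      ```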
    • 
      Fix wrong name for llava onevision and qwen2_vl in tokenization auto (#34177) · bd5dc10f
      Yoni Gozlan authored
      * nit fix wrong llava onevision name in tokenization auto
      
      * add qwen2_vl and fix style
      bd5dc10f
    • 
      Revert `accelerate` error caused by `46d09af4` (#34197) · cc7d8b87
      steveepreston authored
      Revert `accelerate` bug
      cc7d8b87
    • 
      [fix] fix token healing tests and usage errors (#33931) · 98bad9c6
      alpertunga-bile authored
      * auto-gptq requirement is removed & model is changed & tokenizer pad token is assigned
      
      * values func is changed with extensions & sequence key value bug is fixed
      
      * map key value check is added in ExtensionsTree
      
      * empty trimmed_ids bug is fixed
      
      * tail_id IndexError is fixed
      
      * empty trimmed_ids bug fix is updated for failed test
      
      * too much specific case for specific tokenizer is removed
      
      * input_ids check is updated
      
      * require auto-gptq import is removed
      
      * key error check is changed with empty list check
      
      * empty input_ids check is added
      
      * empty trimmed_ids fix is checked with numel function
      
      * usage change comments are added
      
      * test changes are commented
      
      * comment style and quality bugs are fixed
      
      * test comment style and quality bug is fixed
      98bad9c6
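      Token healing backs off the last (possibly partial) token and constrains the next generation step to tokens that extend it. A toy, hedged sketch of the candidate search, including the empty-trim edge case several of these commits harden against (`healing_candidates` and the toy vocab are illustrative, not the library's API):

      ```python
      def healing_candidates(vocab, trimmed: str):
          # Empty trim means nothing was backed off, so there is nothing to
          # heal -- the "empty trimmed_ids" edge case fixed above.
          if not trimmed:
              return []
          return [tok for tok in vocab if tok.startswith(trimmed) and tok != trimmed]

      vocab = ["http", "https", "http://", "hello"]
      candidates = healing_candidates(vocab, "http")  # strict extensions only
      ```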
    • 
      Moshi integration (#33624) · 9ba021ea
      Yoach Lacombe authored
      
      * clean mimi commit
      
      * some nits suggestions from Arthur
      
      * make fixup
      
      * first moshi WIP
      
      * converting weights working + configuration + generation configuration
      
      * finalize converting script - still missing tokenizer and FE and processor
      
      * fix saving model w/o default config
      
      * working generation
      
      * use GenerationMixin instead of inheriting
      
      * add delay pattern mask
      
      * fix right order: moshi codes then user codes
      
      * unconditional inputs + generation config
      
      * get rid of MoshiGenerationConfig
      
      * blank user inputs
      
      * update convert script: fix conversion, add tokenizer, feature extractor and bf16
      
      * add and correct Auto classes
      
      * update modeling code, configuration and tests
      
      * make fixup
      
      * fix some copies
      
      * WIP: add integration tests
      
      * add dummy objects
      
      * propose better readability and code organisation
      
      * update tokenization tests
      
      * update docstrings, eval and modeling
      
      * add .md
      
      * make fixup
      
      * add MoshiForConditionalGeneration to ignore Auto
      
      * revert mimi changes
      
      * re
      
      * further fix
      
      * Update moshi.md
      
      * correct md formatting
      
      * move prepare causal mask to class
      
      * fix copies
      
      * fix depth decoder causal
      
      * fix and correct some tests
      
      * make style and update .md
      
      * correct config checkpoint
      
      * Update tests/models/moshi/test_tokenization_moshi.py
      
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update tests/models/moshi/test_tokenization_moshi.py
      
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * make style
      
      * Update src/transformers/models/moshi/__init__.py
      
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fixup
      
      * change firm in copyrights
      
      * update config with nested dict
      
      * replace einsum
      
      * make style
      
      * change split to True
      
      * add back split=False
      
      * remove tests in convert
      
      * Update tests/models/moshi/test_modeling_moshi.py
      
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * add default config repo + add model to FA2 docstrings
      
      * remove logits float
      
      * fix some tokenization tests and ignore some others
      
      * make style tokenization tests
      
      * update modeling with sliding window + update modeling tests
      
      * [run-slow] moshi
      
      * remove prepare for generation from CausalLM
      
      * isort
      
      * remove copied from
      
      * ignore offload tests
      
      * update causal mask and prepare 4D mask aligned with recent changes
      
      * further test refine + add back prepare_inputs_for_generation for depth decoder
      
      * correct conditional use of prepare mask
      
      * update slow integration tests
      
      * fix multi-device forward
      
      * remove previous solution to device_map
      
      * save_load is flaky
      
      * fix generate multi-devices
      
      * fix device
      
      * move tensor to int
      
      ---------
      
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: Marc Sun <marc@huggingface.co>
      9ba021ea
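      The "delay pattern mask" commit refers to the audio-codebook trick (familiar from MusicGen-style models) where codebook k is shifted right by k steps so codebooks can be predicted jointly. A hedged toy sketch, assuming that interpretation (`build_delay_pattern` is an illustrative helper, not the library's function):

      ```python
      import numpy as np

      def build_delay_pattern(codes: np.ndarray, pad_id: int) -> np.ndarray:
          """Shift codebook row k right by k steps, filling gaps with pad_id."""
          n_q, seq_len = codes.shape
          out = np.full((n_q, seq_len + n_q - 1), pad_id, dtype=codes.dtype)
          for k in range(n_q):
              out[k, k : k + seq_len] = codes[k]
          return out

      codes = np.array([[1, 2], [3, 4]])
      delayed = build_delay_pattern(codes, pad_id=0)
      # row 0 unchanged, row 1 delayed by one step
      ```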
    • 
      IDEFICS: support inputs embeds (#34043) · d087165d
      Raushan Turganbay authored
      * support embeds
      
      * use cache from config
      
      * style...
      
      * fix tests after rebase
      d087165d
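      Supporting `inputs_embeds` follows the usual transformers contract: the caller passes exactly one of token ids or precomputed embeddings, and the model looks the embeddings up only when given ids. A toy sketch of that contract (the table and `forward` here are illustrative stand-ins, not IDEFICS code):

      ```python
      import numpy as np

      EMBED = np.arange(20, dtype=np.float32).reshape(10, 2)  # toy embedding table

      def forward(input_ids=None, inputs_embeds=None):
          # exactly one of the two inputs must be provided
          if (input_ids is None) == (inputs_embeds is None):
              raise ValueError("pass exactly one of input_ids or inputs_embeds")
          if inputs_embeds is None:
              inputs_embeds = EMBED[input_ids]  # look up only when ids are given
          return inputs_embeds.sum()

      ids = np.array([1, 3])
      # both entry points reach the same computation
      same = forward(input_ids=ids) == forward(inputs_embeds=EMBED[ids])
      ```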