    Add jamba (#29943) · 3f20877d
    tomeras91 authored
    * Add jamba arch
    
    * apply "make fix-copies" changes
    
    * fix link to model in JambaConfig docstring
    
    * Add n_ctx to the modeling file because the repo-consistency check requires it
    
    * Add jamba to flash attention and sdpa documentation
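
      A minimal usage sketch of the documented attention backends (the checkpoint name is taken from the model page linked further down):

      ```python
      import torch
      from transformers import AutoModelForCausalLM

      # load Jamba with the flash attention 2 backend; "sdpa" is the other documented option
      model = AutoModelForCausalLM.from_pretrained(
          "ai21labs/Jamba-v0.1",
          torch_dtype=torch.bfloat16,
          attn_implementation="flash_attention_2",  # or "sdpa"
      )
      ```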
    
    * mamba dt_proj quant fix now works for LoRA as well
    
    * override test_left_padding_compatibility and use a more permissive tolerance. left padding numerical differences are accentuated by mamba layers
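
      An illustrative sketch of the override, not the actual test body (the helper names are hypothetical): with left padding the last-token logits should match the unpadded run, but mamba layers amplify the numerical drift, hence the looser tolerance.

      ```python
      import torch

      def test_left_padding_compatibility(self):
          # hypothetical helpers standing in for the real test setup
          model, inputs, pad_token_id = self._get_model_and_inputs()
          padded_inputs = self._left_pad(inputs, pad_token_id)

          last_logits = model(**inputs).logits[:, -1, :]
          padded_last_logits = model(**padded_inputs).logits[:, -1, :]

          # example values for a more permissive tolerance than the generic test
          torch.testing.assert_close(last_logits, padded_last_logits, atol=3e-3, rtol=1e-3)
      ```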
    
    * add jamba to tokenization auto
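
      After this, the auto classes resolve the jamba model type directly; a quick usage sketch (assuming the checkpoint ships a Llama-style tokenizer):

      ```python
      from transformers import AutoTokenizer

      tok = AutoTokenizer.from_pretrained("ai21labs/Jamba-v0.1")
      print(type(tok).__name__)  # a Llama-based fast tokenizer is expected here (assumption)
      ```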
    
    * fix shape comments (PR #24 on the model page: https://huggingface.co/ai21labs/Jamba-v0.1/discussions/24)
    
    * simple PR fixes
    
    * remove unnecessary kwargs from JambaAttentionDecoderLayer and JambaMambaDecoderLayer
    
    * remove the LoRA hack for the mamba dt_proj bias. It was solved in huggingface/peft#1530 (https://github.com/huggingface/peft/pull/1530)
    
    * Add "Copied from" comment on JambaMLP (it's the same as MixtralMLP)
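
      A sketch of the "Copied from" marker that `make fix-copies` checks; the exact source path is an assumption based on the commit message:

      ```python
      from torch import nn


      # Copied from transformers.models.mixtral.modeling_mixtral.MixtralMLP with Mixtral->Jamba
      class JambaMLP(nn.Module):
          # the body is kept identical to the Mixtral MLP by the fix-copies check
          ...
      ```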
    
    * remove padding_mask warnings. It's not supported anymore
    
    * fix docstring. Float instead of int...