    [ `gemma`] Adds support for Gemma 💎 (#29167) · 594c1277
    Arthur authored
    * initial commit
    
    * update
    
    * update conversion checkpoint
    
    * update conversion script
    
    * nits
    
    * some fixes
    
    * nits
    
    * merge
    
    * fix permute
    
    * nits
    
    * fix
    
    * nits
    
    * nits
    
    * nits
    
    * fix rope
    
    * fix both rope
    
    * nits
    
    * style
    
    * make sure flax works
    
    * fix flax init code
    
    * fix forward
    
    * nits
    
    * print flax generation out
    
    * current code
    
    * nits
    
    * SIIIIIIIIIIIIIIIIIII
    
    * update
    
    * add new tokenizer
    
    * correct fast tokenizer
    
    * fix conversion
    
    * more comments
    
    * fix modeling and conversion
    
    * nits and nits
    
    * nits testing
    
    * add some tokenization tests
    
    * add some edge cases
    
    * add slow tests and fix them
    
    * fixup
    
    * fix copies for modeling
    
    * fix copies
    
    * add 7B slow tests
    
    * fix
    
    * fix
    
    * fix tests
    
    * make tokenizer CIs go green
    
    * styling
    
    * last tokenizer nits
    
    * update jax tests
    
    * fix flax for 7b
    
    * add jit testing 🤗
    
    
    
    * cleanups
    
    * isolated nit, inv_freq for rotary_emb.inv_freq
    
    * propagate to jax
    
    * Apply suggestions from code review
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    
    * adjust test
    
    * fix conversion script
    
    * change name
    
    * correct file names
    
    * update conversion script
    
    * Fix bos and eos token ids in the model configuration (#3)
    
    * update modelling
    
    * update conversion script
    
    * add static cache for gemma
    
    * fix sdpa generate
    
    * fix batched
    
    * multiple fixes
    
    * fix FA2
    
    * final fix
    
    * Rename a few missing strings and filenames (#4)
    
    * merge with upstream main
    
    * fix copies
    
    * fix copies
    
    * fix fixup
    
    * fix fixup
    
    * fix
    
    * fix
    
    * final tests
    
    * fix fx gemma tests
    
    * fix fx bf16/fp16 tests
    
    * update slow fx tests
    
    * fx slow tests: one logits, one generation
    
    * move jit test standalone
    
    * Apply suggestions from code review
    
    * nits
    
    * tokenizer updates
    
    * more tokenization updates: custom GemmaSentencepieceExtractor
    
    * style
    
    * Update src/transformers/cache_utils.py
    
    * Update src/transformers/models/gemma/__init__.py
    
    * Update tests/models/gemma/test_modeling_flax_gemma.py
    
    * small nits
    
    * style
    
    * update tokenization test
    
    * fix the rotary embedding
    
    * with style
    
    * fix slow tests
    
    * WARNING: this commit might be very important for precision
    
    * Update tests/models/gemma/test_modeling_flax_gemma.py
    
    * Update src/transformers/models/gemma/configuration_gemma.py
    
    Co-authored-by: Lysandre Debut <hi@lysand.re>
    
    * Update src/transformers/models/gemma/modeling_flax_gemma.py
    
    Co-authored-by: Lysandre Debut <hi@lysand.re>
    
    * small nits here and there!
    
    * forgotten nit
    
    * remove on the fly computation of inv_freq
    
    * revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float
    
    * Apply suggestions from code review
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_flax_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_tokenization_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_tokenization_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_tokenization_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_tokenization_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * Update tests/models/gemma/test_modeling_gemma.py
    
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * nit conversion script link
    
    * fix some tests
    
    * add not doctest and pr doctest
    
    * repo consistency
    
    * fix last CIs 🚀
    
    
    
    * update all readmes
    
    ---------
    
    Co-authored-by: younesbelkada <younesbelkada@gmail.com>
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
    Co-authored-by: Lysandre Debut <hi@lysand.re>