    [`Llava`] Add Llava to transformers (#27662) · 44b5506d
    Younes Belkada authored
    * add model like
    
    * logits match
    
    * minor fixes
    
    * fixes
    
    * up
    
    * up
    
    * add todo
    
    * llava processor
    
    * keep the processor simple
    
    * add conversion script
    
    * fixup
    
    * fix copies
    
    * up
    
    * add to index
    
    * fix config + logits
    
    * fix
    
    * refactor
    
    * more refactor
    
    * more refactor
    
    * fix copies
    
    * add authors
    
    * v1 tests
    
    * add `LlavaProcessor` in init
    
    * remove unneeded import
    
    * up
    
    * up
    
    * docs
    
    * up
    
    * fix CI
    
    * fix CI
    
    * add attention mask in test
    
    * make fixup
    
    * remove the vision model
    
    * that's the dirty way to do it
    
    * nits
    
    * nits
    
    * updates
    
    * add more tests
    
    * add input tests
    
    * fixup
    
    * more styling
    
    * nits
    
    * updates and cleanup
    
    * fixup the generation expected results
    
    * fix the testing script
    
    * some cleanup and simplification which does not work yet but almost there!
    
    * make correct dispatch operations
    
    * vectorize works for batch of images and text
    
    * last todos
    
    * nits
    
    * update test and modeling code
    
    * remove useless function for now
    
    * fix few issues
    
    * fix generation
    
    * some nits
    
    * add bakllava
    
    * nits
    
    * remove duplicated code
    
    * finish merge
    
    * cleanup
    
    * missed this line
    
    * fill the todos
    
    * add left padding offset
    
    * add left and right padding logic
    
    * bool to properly index
    
    * make sure
    
    * more cleanups
    
    * batch is fixed 😉
    
    
    
    * add correct device for tensor creation
    
    * fix some dtype mismatch
    
    * ruff
    
    * update conversion script
    
    * Update src/transformers/__init__.py
    
    * fa 2 support + fix conversion script
    
    * more
    
    * correct reshaping
    
    * fix test dict
    
    * fix copies by ignoring
    
    * fix nit
    
    * skip clip vision model
    
    * fixup
    
    * fixup
    
    * LlavaForVisionText2Text -> LlavaForCausalLM
    
    * update
    
    * fix
    
    * raise correct errors
    
    * fix
    
    * docs
    
    * nuke for now
    
    * nits here and there
    
    * fixup
    
    * fix remaining tests
    
    * update LlavaForConditionalGeneration instead of CausalLM
    
    * fixups
    
    * pipeline support
    
    * slow and pipeline tests
    
    * supports batch
    
    * nits
    
    * cleanup
    
    * fix first integration tests
    
    * add pad token where needed
    
    * correct tests
    
    * fixups
    
    * update pipeline test
    
    * fix quality
    
    * nits
    
    * revert unneeded change
    
    * nit
    
    * use BatchFeature
    
    * from ...feature_extraction_utils import BatchFeature
    
    * nits
    
    * nits
    
    * properly update
    
    * more f*** nits
    
    * fix copies
    
    * comment
    
    * keep slow test slow
    
    * Update src/transformers/models/llava/processing_llava.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * add pipeline example
    
    * add pixel values in docstring
    
    * update pr doctest
    
    * fix
    
    * fix slow tests
    
    * remove hack
    
    * fixup
    
    * small note
    
    * forward contrib credits from PR25789
    
    * forward contrib credits from original implementation and work
    
    * add arthur
    
    * Update src/transformers/models/llava/processing_llava.py
    
    Co-authored-by: Lysandre Debut <hi@lysand.re>
    
    * update docstring
    
    * nit
    
    * move to not doctested because of timeout issues
    
    * fixup
    
    * add description
    
    * more
    
    * fix-copies
    
    * fix docs
    
    * add beam search
    
    * add more comments
    
    * add typehints on processor
    
    * add speedup plot
    
    * update slow tests and docs
    
    * push test
    
    * push batched test
    
    * fix batched generation with different number of images
    
    * remove benchmark due to a bug
    
    * fix test
    
    * fix copies
    
    * add gcolab demo
    
    ---------
    
    Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Co-authored-by: shauray8 <shauray8@users.noreply.github.com>
    Co-authored-by: haotian-liu <haotian-liu@users.noreply.github.com>
    Co-authored-by: Lysandre Debut <hi@lysand.re>