1. 18 Apr, 2024 1 commit
    • tomeras91's avatar
      Add jamba (#29943) · 3f20877d
      tomeras91 authored
      * Add jamba arch
      
      * apply "make fix-copies" changes
      
      * fix link to model in JambaConfig docstring
      
      * Add n_ctx in modeling file because repo-consistency wants that
      
      * Add jamba to flash attention and sdpa documentation
      
      * mamba dt_proj quant fix now works for LoRA as well
      
      * override test_left_padding_compatibility and use a more permissive tolerance. left padding numerical difference are accentuated by mamba layers
      
      * add jamba to tokenization auto
      
      * fix comments of shape (PR #24 in the model page: https://huggingface.co/ai21labs/Jamba-v0.1/discussions/24)
      
      * simple PR fixes
      
      * remove unnecessary kwargs from JambaAttentionDecoderLayer and JambaMambaDecoderLayer
      
      * remove the LoRA hack for the mamba dt_proj bias. It was solved in huggingface/peft#1530 (https://github.com/huggingface/peft/pull/1530)
      
      * Add copied comment on JambaMLP (it's the same as MixtralMLP)
      
      * remove padding_mask warnings. It's not supported anymore
      
      * fix docstring. Float instead of int
      
      * A few more minor PR fixes
      
      * (1) lowercase names for mamba layernorms (2) remove _apply_inner_layernorms and do it directly in the forward pass
      
      * Return None attention weights from mamba layers. Append to all attentions only if not None.
      
      * remove some leftover jamba archive lists
      
      * Better separation between expert vs non-expert layers. non-expert layers return None as router_logits, and it is not concatenated to all_router_logits returned from JambaModel
      
      * no need to take router_logits at config.expert_layer_offset anymore. result.router_logits now holds results only for expert layers
      
      * Add Jamba paper on READMEs
      
      * (1) rename n_ctx -> max_position_embeddings (2) don't use it in the modeling file since it's not needed (set it as an exception to check_config_attributes)
      
      * Add copied from comment
      
      * remove the code path for apply_inner_layernorms=False. Jamba always has the inner mamba layernorms
      
      * clearer docstring for _convert_to_standard_cache
      
      * style fixes
      
      * Change calc_logits_for_entire_prompt (bool) to num_logits_to_keep (int). Adapt assisted decoding code tp use it. Also small change in low memory beam search decoding path to support this new int value in model_inputs
      
      * rename test so it still overrides what its meant to override
      
      * draft
      
      * oups
      
      * nit
      
      * remove more complexe logic
      
      * fix names used in config
      
      * fix fix fix
      
      * style
      
      * fix some more failing tests
      
      * generate did not init the cache 🙃
      
      
      
      * more small nits
      
      * typo
      
      * config.mamba_expand * config.hidden_size for the intermediate size of the mamba shapes
      
      * fix init of pkv with torch.tensor()
      
      * empty tensor
      
      * fix some init issues
      
      * stupid changes required by generate because it does not even support it's own DynamicCache class
      
      * more fixes
      
      * fix general assisted gen cache_position bug
      
      * tests passing
      
      * Add offsets and periods as SPECIAL_CASES_TO_ALLOW in check_config_attributes.py
      
      * fix reorder_cache to reorder mamba states and override some more functions in HybridMambaAttentionDynamicCache
      
      * no need to override test_past_key_values_format() and _check_past_key_values_for_generate() in tests anymore
      
      * fix docstrings and typehints for past_key_values
      
      * style fixes
      
      * fix docs
      
      * change typehint due to copy from Mixtral
      
      * forgot import
      
      * import order
      
      * Add configuration_jamba and modeling_jamba to not_doctested because the model is too big to download (in docstring of JambaForCausalLM.forward)
      
      * Add integration test with tiny tandom Jamba model on hub
      
      * fix flash attention cache shapes
      
      * bring back forgotten hidden states
      
      * rename HybridMambaAttentionDynamicCache.seqlen_offset to has_previous_state (and make bool) and bugfix - it should be set to True after a finished forward pass of the entire model
      
      * align integration test after modeling fixes
      
      * bugfix - mamba can use precomputed states only of forward pass is on a single token
      
      * bugfix - mamba can use precomputed states only if they match the batch size
      
      * typo
      
      * remove making _prepare_4d_causal_attention_mask a leaf function
      
      * stop using past_seq_len.get_seq_length(). Use cache positions instead. Adjust test (test_decoder_model_past_with_large_inputs) accordingly
      
      ---------
      
      Co-authored-by: default avatarArthur Zucker <arthur.zucker@gmail.com>
      Co-authored-by: default avatarJoao Gante <joao@huggingface.co>
      3f20877d
  2. 17 Apr, 2024 1 commit
    • Shane A's avatar
      Add OLMo model family (#29890) · e4ea19b9
      Shane A authored
      * Add OLMo using add-new-model-like with Llama
      
      * Fix incorrect tokenizer for OLMo
      
      * Copy-paste relevant OLMo methods and their imports
      
      * Add OLMo config
      
      * Modify OLMo config to follow HF conventions
      
      * Remove unneeded Llama code from OLMo model
      
      * Add ability for OLMo model to output attentions
      
      * Add OLMoPreTrainedModel and OLMoModel
      
      * Add OLMoForCausalLM
      
      * Minor fixes to OLMo model for style and missing functions
      
      * Implement OLMo tokenizer
      
      * Implement OLMo to HF conversion script
      
      * Add tests for OLMo model
      
      * Add tests for OLMo fast tokenizer
      
      * Add auto-generated dummy objects
      
      * Remove unimplemented OLMo classes from auto and init classes and re-format
      
      * Add README and associated auto-generated files
      
      * Use OLMo names for common properties
      
      * Run make fixup
      
      * Remove `|` from OLMo typing
      
      * Remove unneeded tokenization_olmo.py
      
      * Revert model, config and converter to add-new-model-like Llama
      
      * Move logic for adding bos/eos token into GPTNeoxTokenizerFast
      
      * Change OLMoConfig defaults to match OLMo-7B
      
      * Use GPTNeoXToknizerFast in OLMo tokenizer tests
      
      * Modify auto-generated OLMoModelTests to work for OLMo
      
      * Add non-parametric layer norm OLMoLayerNorm
      
      * Update weight conversion script for OLMo
      
      * Fix __init__ and auto structure for OLMo
      
      * Fix errors from make fixup
      
      * Remove OLMoTokenizerFast from documentation
      
      * Add missing 'Copied from' for OLMoModel._update_causal_mask
      
      * Run make fix-copies
      
      * Rearrange string replacements in OLMoForCausalLM Copied from
      
      * Move OLMo and Llama CausalLM.forward example into global constants
      
      * Fix OLMO_GENERATION_EXAMPLE doc string typo
      
      * Add option for qkv clipping to OLMo
      
      * Rearrange OLMoConfig kwargs in convert_olmo_weights_to_hf
      
      * Add clip_qkv to OLMoConfig in convert_olmo_weights_to_hf
      
      * Fix OLMo tokenization bug using conversion script
      
      * Keep model in full precision after conversion
      
      * Do not add eos token automatically
      
      * Update references to OLMo model in HF Hub
      
      * Do not add eos token during encoding by default
      
      * Fix Llama generation example
      
      * Run make fixup
      
      * OLMo 7B integration test fix
      
      * Remove unneeded special case for OLMoConfig
      
      * OLMo 7B Twin 2T integration test fix
      
      * Fix test_model_7b_greedy_generation
      
      * Remove test_compile_static_cache
      
      * Fix OLMo and Llama generation example
      
      * Run make fixup
      
      * Revert "OLMo 7B integration test fix"
      
      This reverts commit 4df56a4b150681bfa559846f40e9b7b7f97d7908.
      
      * Revert "OLMo 7B Twin 2T integration test fix"
      
      This reverts commit 9ff65a4a294ace89ab047b793ca55e623a9ceefc.
      
      * Ungate 7B integration tests and fix greedy generation test
      
      * Add retries for flaky test_eager_matches_sdpa_generate
      
      * Fix output of doc example for OLMoForCausalLM.forward
      
      * Downsize OLMo doc test for OLMoForCausalLM.forward to 1B model
      
      * Try fix incorrect characters in OLMoForCausalLM.forward doct test
      
      * Try fix incorrect characters in OLMoForCausalLM.forward doc test using end quotes
      
      * Remove pretraining_tp from OLMo config and model
      
      * Add missing 'Copied from' instances
      
      * Remove unneeded causal_mask from OLMoModel
      
      * Revert Llama changes
      
      * Ignore copy for OLMoForCausalLM.forward
      
      * Change 'OLMo' to 'Olmo' in classes
      
      * Move minimal OLMo tokenization tests to model tests
      
      * Add missed 'Copied from' for repeat_kv
      e4ea19b9
  3. 15 Apr, 2024 1 commit
    • amyeroberts's avatar
      Add Idefics2 (#30253) · 6b78360e
      amyeroberts authored
      
      * Initial add model additions
      
      * Test
      
      * All weights loading
      
      * Can perform full forward pass
      
      * Local and remote the same
      
      * Matching local and remote
      
      * Fixup
      
      * Idefics2Model importable; fixup docstrings
      
      * Don't skip by default
      
      * Remove deprecated use_resampler arg
      
      * Remove self.config
      
      * DecoupledLinear takes config
      
      * Tidy up
      
      * Enable eager attention and tidy up
      
      * Most tests passing
      
      * Update for batch of processed images
      
      * Add image processor
      
      * Update doc pages
      
      * Update conversion script
      
      * Remove erroneous breakpoint
      
      * Remove accidendtal spelling change
      
      * Update to reflect changes on hub - make generate work
      
      * Fix up
      
      * Image processor tests
      
      * Update tests
      
      * Add a processor
      
      * Add a processor
      
      * Update convert script
      
      * Update modeling file - remove fixmes
      
      * Bug fix
      
      * Add processing test
      
      * Use processor
      
      * Fix up
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Fix test
      
      * Update config - PR comments and defaults align with checkpoint
      
      * Reviewer comments
      
      * Add copied froms for flahs attention
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Apply suggestions from code review
      
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Remove qk_layer_norm and freeze_layers functionality
      
      * Fix
      
      * Remove freeze_layer options from config
      
      * Sync with upstream main
      
      * Fix attention shapes siglip
      
      * Remove Llava-next refs - TO REBASE
      
      * Use AutoModel for text model
      
      * Add comment to explain vision embeddings
      
      * Fix issue with tie_word_embeddings
      
      * Address review comments
      
      * Fix and fix up
      
      * Chat templates for idefics
      
      * Fix copies
      
      * Fix
      
      * Add layer norms to FA2
      
      * Fix tests
      
      * Apply suggestions from code review
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Fix
      
      * Review comments
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Update inputs merger
      
      * Merge weights in correct order
      
      * Update convert script
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Update template
      
      * Model code examples (fix idefics too)
      
      * More review comments
      
      * Tidy up
      
      * Update processing
      
      * Fix attention mask preparation
      
      * Update inputs_merger inputs
      
      * Vectorize inputs_merger
      
      * Update src/transformers/models/idefics2/__init__.py
      
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      * Review comments
      
      * saying bye to the `qk_layer_norms`
      
      * Simplify
      
      * Update latents
      
      * Remove erroneuous readme changes
      
      * Return images when applying chat template
      
      * Fix bug - prompt images are for a single sample
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      * image splitting
      
      * fix test
      
      * some more comment
      
      * some comment
      
      * Apply suggestions from code review
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/idefics2/image_processing_idefics2.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update processor
      
      * Update model tests
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Don't add BOS in template
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Remove index in examples
      
      * Update tests to reflect #13
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * PR comment - consistent typing
      
      * Update readme and model doc
      
      * Update docs
      
      * Update checkpoint references
      
      * Update examples
      
      * Fix and update tests
      
      * Small addition
      
      * Update tests - remove copied from as no ignore placement copy could be found
      
      * Update example
      
      * small fixes
      
      * Update docs/source/en/model_doc/idefics2.md
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Update docs/source/en/model_doc/idefics2.md
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Update README.md
      
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      
      * Connector model as bridge
      
      * Fix up
      
      * Fix up
      
      * Don't pass model inputs for generation kwargs update
      
      * IDEFICS-2 -> Idefics2
      
      * Remove config archive name
      
      * IDEFICS-2 -> Idefics2
      
      * Add back llava-next
      
      * Update readmes
      
      * Add requirements for processor tester
      
      * Use custom convert_to_rgb to avoid possible BC
      
      * Fix doc example
      
      * Fix doc example
      
      * Skip model doc tests - as model to large
      
      * More doc example - account for image splitting
      
      * Update src/transformers/image_transforms.py
      
      * Fix config doctest
      
      ---------
      
      Co-authored-by: default avatarPablo Montalvo <39954772+molbap@users.noreply.github.com>
      Co-authored-by: default avatarArthurZucker <arthur.zucker@gmail.com>
      Co-authored-by: default avatarVictor SANH <victorsanh@gmail.com>
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      6b78360e
  4. 11 Apr, 2024 1 commit
    • Eduardo Pacheco's avatar
      Adding grounding dino (#26087) · b752ad30
      Eduardo Pacheco authored
      
      * Fixed typo when converting weigths to GroundingDINO vision backbone
      
      * Final modifications on modeling
      
      * Removed unnecessary class
      
      * Fixed convert structure
      
      * Added image processing
      
      * make fixup partially completed
      
      * Now text_backbone_config has its own class
      
      * Modified convert script
      
      * Removed unnecessary config attribute
      
      * Added new function to generate sub sentence mask
      
      * Renamed parameters with gamma in the name as it's currently not allowed
      
      * Removed tokenization and image_processing scripts since we'll map from existing models
      
      * Fixed some issues with configuration
      
      * Just some modifications on conversion script
      
      * Other modifications
      
      * Copied deformable detr
      
      * First commit
      
      * Added bert to model
      
      * Bert validated
      
      * Created Text and Fusion layers for Encoder
      
      * Adapted Encoder layer
      
      * Fixed typos
      
      * Adjusted Encoder
      
      * Converted encoder to hf
      
      * Modified Decoder Layer
      
      * Modified main decoder class
      
      * Removed copy comments
      
      * Fixed forward from GroundingDINOModel and GroundingDINODecoder
      
      * Added all necessary layers, configurations and forward logic up to GroundingDINOModel
      
      * Added all layers to convertion
      
      * Fixed outputs for GroundingDINOModel and GroundingDINOForObjectDetection
      
      * Fixed mask input to encoders and fixed nn.MultiheadAttention batch first and attn output
      
      * Fixed forward from GroundingDINOTextEnhancerLayer
      
      * Fixed output bug with GroundingDINODeformableLayer
      
      * Fixed bugs that prevent GroundingDINOForObjectDetection to run forward method
      
      * Fixed attentions to be passed correctly
      
      * Passing temperature arg when creating Sine position embedding
      
      * Removed copy comments
      
      * Added temperature argument for position embedding
      
      * Fixed typo when converting weigths to GroundingDINO vision backbone
      
      * Final modifications on modeling
      
      * Removed unnecessary class
      
      * Fixed convert structure
      
      * Added image processing
      
      * make fixup partially completed
      
      * Now text_backbone_config has its own class
      
      * Modified convert script
      
      * Removed unnecessary config attribute
      
      * Added new function to generate sub sentence mask
      
      * Renamed parameters with gamma in the name as it's currently not allowed
      
      * Removed tokenization and image_processing scripts since we'll map from existing models
      
      * Fixed some issues with configuration
      
      * Just some modifications on conversion script
      
      * Other modifications
      
      * Fix style
      
      * Improve fixup
      
      * Improve conversion script
      
      * Improve conversion script
      
      * Add GroundingDINOProcessor
      
      * More improvements
      
      * Return token type ids
      
      * something
      
      * Fix more tests
      
      * More improvements
      
      * More cleanup
      
      * More improvements
      
      * Fixed tests, improved modeling and config
      
      * More improvements and fixing tests
      
      * Improved tests and modeling
      
      * Improved tests and added image processor
      
      * Improved tests inference
      
      * More improvements
      
      * More test improvements
      
      * Fixed last test
      
      * Improved docstrings and comments
      
      * Fix style
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Better naming
      
      * Better naming
      
      * Added Copied statement
      
      * Added Copied statement
      
      * Moved param init from GroundingDINOBiMultiHeadAttention
      
      * Better naming
      
      * Fixing clamp style
      
      * Better naming
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/configuration_grounding_dino.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Improving conversion script
      
      * Improved config
      
      * Improved naming
      
      * Improved naming again
      
      * Improved grouding-dino.md
      
      * Moved grounding dino to multimodal
      
      * Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py
      
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      
      * Fixed docstrings and style
      
      * Fix docstrings
      
      * Remove timm attributes
      
      * Reorder imports
      
      * More improvements
      
      * Add Grounding DINO to pipeline
      
      * Remove model from check_repo
      
      * Added grounded post_process to GroundingDINOProcessor
      
      * Fixed style
      
      * Fixed GroundingDINOTextPrenetConfig docstrings
      
      * Aligned inputs.keys() when both image and text are passed with model_input_names
      
      * Added tests for GroundingDINOImageProcessor and GroundingDINOProcessor
      
      * Testing post_process_grounded_object_detection from GroundingDINOProcessor at test_inference_object_detection_head
      
      * Fixed order
      
      * Marked test with require_torch
      
      * Temporarily changed repo_id
      
      * More improvements
      
      * Fix style
      
      * Final improvements
      
      * Improve annotators
      
      * Fix style
      
      * Add is_torch_available
      
      * Remove type hints
      
      * vocab_tokens as one liner
      
      * Removed print statements
      
      * Renamed GroundingDINOTextPrenetConfig to GroundingDINOTextConfig
      
      * remove unnecessary comments
      
      * Removed unnecessary tests on conversion script
      
      * Renamed GroundingDINO to camel case GroundingDino
      
      * Fixed GroundingDinoProcessor docstrings
      
      * loading MSDA kernels in the modeling file
      
      * Fix copies
      
      * Replace nn.multiheadattention
      
      * Replace nn.multiheadattention
      
      * Fixed inputs for GroundingDinoMultiheadAttention & order of modules
      
      * Fixed processing to avoid messing with inputs
      
      * Added more tips for GroundingDino
      
      * Make style
      
      * Chaning name to align with SAM
      
      * Replace final nn.multiheadattention
      
      * Fix model tests
      
      * Update year, remove GenerationTesterMixin
      
      * Address comments
      
      * Address more comments
      
      * Rename TextPrenet to TextModel
      
      * Rename hidden_states
      
      * Address more comments
      
      * Address more comments
      
      * Address comment
      
      * Address more comments
      
      * Address merge
      
      * Address comment
      
      * Address comment
      
      * Address comment
      
      * Make style
      
      * Added layer norm eps to layer norms
      
      * Address more comments
      
      * More fixes
      
      * Fixed equivalence
      
      * Make fixup
      
      * Remove print statements
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Add comment
      
      * Address comment
      
      * Remove overwriting of test
      
      * Fix bbox_embed
      
      * Improve decoder_bbox_embed_share
      
      * Simplify outputs
      
      * Updated post_process_grounded_object_detection
      
      * Renamed sources to feature_maps
      
      * Improved tests for Grounding Dino ImageProcessor and Processor
      
      * Fixed test requirements and imports
      
      * Fixed image_processing
      
      * Fixed processor tests
      
      * Fixed imports for image processing tests
      
      * Fix copies
      
      * Updated modeling
      
      * Fix style
      
      * Moved functions to correct position
      
      * Fixed copy issues
      
      * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
      
      Co-authored-by: default avatarSangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarSangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avatarSangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
      
      * Keeping consistency custom cuda kernels for MSDA
      
      * Make GroundingDinoProcessor logic clearer
      
      * Updated Grounding DINO checkpoints
      
      * Changed tests to correct structure
      
      * Updated gpu-cpu equivalence test
      
      * fix copies
      
      * Update src/transformers/models/grounding_dino/processing_grounding_dino.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/processing_grounding_dino.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/grounding_dino/configuration_grounding_dino.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Fixed erros and style
      
      * Fix copies
      
      * Removed inheritance from PreTrainedModel from GroundingDinoTextModel
      
      * Fixed GroundingDinoTextModel
      
      * Fixed type of default backbone config
      
      * Fixed missing methods for GroundingDinoTextModel and Added timm support for GroundingDinoConvEncoder
      
      * Addressed comments
      
      * Addressed batched image processing tests
      
      * Addressed zero shot test comment
      
      * Addressed tip comment
      
      * Removed GroundingDinoTextModel from check_repo
      
      * Removed inplace masking
      
      * Addressed comments
      
      * Addressed comments
      
      * Addressed comments
      
      * Fix copies
      
      * Fixing timm test
      
      * Fixed batching equivalence test
      
      * Update docs/source/en/model_doc/grounding-dino.md
      
      Co-authored-by: default avatarTianqi Xu <40522713+dandansamax@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/grounding-dino.md
      
      Co-authored-by: default avatarTianqi Xu <40522713+dandansamax@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/grounding-dino.md
      
      Co-authored-by: default avatarTianqi Xu <40522713+dandansamax@users.noreply.github.com>
      
      * Addressed more comments
      
      * Added a new comment
      
      * Reduced image size
      
      * Addressed more comments
      
      * Nits
      
      * Nits
      
      * Changed the way text_config is initialized
      
      * Update src/transformers/models/grounding_dino/processing_grounding_dino.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      ---------
      
      Co-authored-by: default avatarNiels <niels.rogge1@gmail.com>
      Co-authored-by: default avatarRafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: default avatarEduardo Pacheco <eduardo.pacheco@limehome.com>
      Co-authored-by: default avatarSangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: default avatarTianqi Xu <40522713+dandansamax@users.noreply.github.com>
      b752ad30
  5. 10 Apr, 2024 1 commit
    • Arthur's avatar
      Add recurrent gemma (#30143) · 0fe44059
      Arthur authored
      
      * Fork.
      
      * RecurrentGemma initial commit.
      
      * Updating __init__.py.
      
      * Minor modification to how we initialize the cache.
      Changing how the config specifies the architecture.
      
      * Reformat code to 4 spaces.
      Fixed a few typos.
      
      * Fixed the forward pass.
      Still unclear on the cache?
      
      * Fixed the RecurrentGemmaForCausalLM
      
      * Minor comment that we might not need attention_mask and output_attention arguments.
      
      * Now cache should work as well.
      
      * Adding a temporary example to check whether the model generation works.
      
      * Adding the tests and updating imports.
      
      * Adding the example file missing in the previous commit.
      
      * First working example.
      
      * Removing .gitignore and reverting parts of __init__.
      
      * Re-add .gitignore.
      
      * Addressing comments for configuration.
      
      * Move mask creation to `_prepare_inputs_for_generation`.
      
      * First try at integration tests:
      1. AttributeError: 'GriffinCausalLMOutput' object has no attribute 'attentions'.
      2. `cache_position` not passed
      
      * Transfoering between machines.
      
      * Running normal tests.
      
      * Minor fix.
      
      * More fixes.
      
      * Addressing more comments.
      
      * Minor fixes.
      
      * first stab at cleanup
      
      * more refactoring
      
      * fix copies and else
      
      * renaming and get init to work
      
      * fix causal mask creation
      
      * update
      
      * nit
      
      * fix a hell lot of things
      
      * updates
      
      * update conversion script
      
      * make all keys importable
      
      * nits
      
      * add auto mappings
      
      * properly convert ffw_up and down
      
      * add scaling
      
      * fix generations
      
      * for recurrent dtype
      
      * update
      
      * fix going beyong window
      
      * fixup
      
      * add missing files
      
      * current updates to remove last einops
      
      * finish modeling refactor
      
      * TADA
      
      * fix compile
      
      * fix most failing testt ? ?
      
      * update tests
      
      * refactor and update
      
      * update
      
      * nits, fixup and update tests
      
      * more fixup
      
      * nits
      
      * fix imports
      
      * test format
      
      * fixups
      
      * nits
      
      * tuple typing
      
      * fix code quality
      
      * add model card
      
      * fix doc
      
      * skip most generation tests
      
      * nits
      
      * style
      
      * doc fixes
      
      * fix pr and check_copies?
      
      * last nit
      
      * oupsy
      
      * Apply suggestions from code review
      
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      
      * update
      
      * Update src/transformers/models/recurrent_gemma/convert_recurrent_gemma_to_hf.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * update based on review
      
      * doc nit
      
      * fix quality
      
      * quality
      
      * fix slow test model path
      
      * update default dype
      
      * ignore attributes that can be safely ignored in check config attributes
      
      * 0lallalala come on
      
      * save nit
      
      * style
      
      * remove to dict update
      
      * make sure we can also run in float16
      
      * style
      
      ---------
      
      Co-authored-by: default avatarPablo Montalvo <39954772+molbap@users.noreply.github.com>
      Co-authored-by: default avatarAleksandar Botev <botev@google.com>
      Co-authored-by: default avatarLeonard Berrada <lberrada@users.noreply.github.com>
      Co-authored-by: default avataranushanf <anushanf@google.com>
      Co-authored-by: default avatarbotev <botevmg@gmail.com>
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      0fe44059
  6. 30 Mar, 2024 1 commit
  7. 27 Mar, 2024 1 commit
    • Bo Zheng's avatar
      Add Qwen2MoE (#29377) · 1c39974a
      Bo Zheng authored
      
      * add support for qwen2 MoE models
      
      * update docs
      
      * add support for qwen2 MoE models
      
      * update docs
      
      * update model name & test
      
      * update readme
      
      * update class names & readme & model_doc of Qwen2MoE.
      
      * update architecture name
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fix style
      
      * fix test when there are sparse and non sparse layers
      
      * fixup
      
      * Update README.md
      
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fixup
      
      * fixup
      
      * add archive back
      
      * add support for qwen2 MoE models
      
      * update docs
      
      * update model name & test
      
      * update readme
      
      * update class names & readme & model_doc of Qwen2MoE.
      
      * update architecture name
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fixup
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * fix style
      
      * fix test when there are sparse and non sparse layers
      
      * fixup
      
      * add archive back
      
      * fix integration test
      
      * fixup
      
      ---------
      
      Co-authored-by: default avatarbozheng-hit <dsoul0621@gmail.com>
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      1c39974a
  8. 20 Mar, 2024 2 commits
    • NielsRogge's avatar
      Add LLaVa-1.6, bis (#29586) · d91fd7f9
      NielsRogge authored
      
      * First draft
      
      * Fix tests, add docs
      
      * Improve docstrings
      
      * Fix test
      
      * Address comments
      
      * Address comments
      
      * Remove vocab_size attribute
      
      * Remove batch_size
      
      * Address comment
      
      * Add image processor tests
      
      * Support fx
      
      * Update docstring
      
      * Add support for 34b
      
      * Convert 34b model
      
      * Add integration tests
      
      * Update checkpoints
      
      * Convert vicuna-13b, remove doc tests
      
      * Remove script
      
      * Remove file
      
      * Address comments
      
      * Improve docstrings
      
      * Deprecate vocab_size
      
      * Remove aspect_ratio_setting
      
      * Address comments
      
      * Update READMEs
      
      * Add tips about chat templates
      
      * Fix tests
      
      * Deprecate vocab_size safely
      
      * Update tests
      
      ---------
      
      Co-authored-by: default avatarAmy Roberts <22614925+amyeroberts@users.noreply.github.com>
      d91fd7f9
    • Arthur Zucker's avatar
      v4.40.0.dev.0 · 1248f092
      Arthur Zucker authored
      1248f092
  9. 19 Mar, 2024 2 commits
  10. 18 Mar, 2024 1 commit
    • Yoach Lacombe's avatar
      Add MusicGen Melody (#28819) · c43b380e
      Yoach Lacombe authored
      
      * first modeling code
      
      * make repository
      
      * still WIP
      
      * update model
      
      * add tests
      
      * add latest change
      
      * clean docstrings and copied from
      
      * update docstrings md and readme
      
      * correct chroma function
      
      * correct copied from and remove unreleated test
      
      * add doc to toctree
      
      * correct imports
      
      * add convert script to notdoctested
      
      * Add suggestion from Sanchit
      
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * correct get_uncoditional_inputs docstrings
      
      * modify README according to SANCHIT feedback
      
      * add chroma to audio utils
      
      * clean librosa and torchaudio hard dependencies
      
      * fix FE
      
      * refactor audio decoder -> audio encoder for consistency with previous musicgen
      
      * refactor conditional -> encoder
      
      * modify sampling rate logics
      
      * modify license at the beginning
      
      * refactor all_self_attns->all_attentions
      
      * remove ignore copy from causallm generate
      
      * add copied from for from_sub_models
      
      * fix make copies
      
      * add warning if audio is truncated
      
      * add copied from where relevant
      
      * remove artefact
      
      * fix convert script
      
      * fix torchaudio and FE
      
      * modify chroma method according to feedback-> better naming
      
      * refactor input_values->input_features
      
      * refactor input_values->input_features and fix import fe
      
      * add input_features to docstrigs
      
      * correct inputs_embeds logics
      
      * remove dtype conversion
      
      * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation
      
      * change warning for chroma length
      
      * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
      
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * change way to save wav, using soundfile
      
      * correct docs and change to soundfile
      
      * fix import
      
      * fix init proj layers
      
      * remove line breaks from md
      
      * fix issue with docstrings
      
      * add FE suggestions
      
      * improve is in logics and remove useless imports
      
      * remove custom from_pretrained
      
      * simplify docstring code
      
      * add suggestions for modeling tests
      
      * make style
      
      * update converting script with sanity check
      
      * remove encoder attention mask from conditional generation
      
      * replace musicgen melody checkpoints with official orga
      
      * rename ylacombe->facebook in checkpoints
      
      * fix copies
      
      * remove unecessary warning
      
      * add shape in code docstrings
      
      * add files to slow doc tests
      
      * fix md bug and add md to not_tested
      
      * make fix-copies
      
      * fix hidden states test and batching
      
      ---------
      
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      c43b380e
  11. 15 Mar, 2024 1 commit
    • Saurabh Dash's avatar
      Cohere Model Release (#29622) · 0e4a1c34
      Saurabh Dash authored
      
      * Cohere Model Release (#1)
      
      Cohere Model Release
      
      * Remove unnecessary files and code (#2)
      
      Some cleanup
      
      * Delete cohere-model directory (#3)
      
      * Make Fix (#5)
      
      * Pr fixes (#6)
      
      * fixes for pr
      
      * pr fixes for the format
      
      * pr fixes for the format
      
      * src/transformers/models/auto/tokenization_auto.py
      
      * Tokenizer test (#8)
      
      * tokenizer test
      
      * format fix
      
      * Adding Docs and other minor changes (#7)
      
      * Add modeling tests (#9)
      
      * Smol Fix (#11)
      
      * tokenization tests are fixed
      
      * format fixes
      
      * fix pr doc tests
      
      * fix pr doc tests
      
      * fix pr doc tests
      
      * fix pr style check
      
      * small changes in cohere.md
      
      * FIX: Address final comments for transformers integration (#13)
      
      * fix modeling final nits and add proper test file
      
      * for now leave empty tests
      
      * add integration test
      
      * push new test
      
      * fix modeling cohere (#14)
      
      * Update chat templates to use the new API (#15)
      
      ---------
      
      Co-authored-by: default avatarahmetustun <ahmetustun89@gmail.com>
      Co-authored-by: default avatarYounes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      0e4a1c34
  12. 14 Mar, 2024 1 commit
  13. 11 Mar, 2024 1 commit
  14. 26 Feb, 2024 1 commit
  15. 23 Feb, 2024 1 commit
  16. 16 Feb, 2024 1 commit
  17. 09 Feb, 2024 1 commit
  18. 06 Feb, 2024 1 commit
    • Klaus Hipp's avatar
      [Docs] Add missing language options and fix broken links (#28852) · 1c31b7aa
      Klaus Hipp authored
      * Add missing entries to the language selector
      
      * Add links to the Colab and AWS Studio notebooks for ONNX
      
      * Use anchor links in CONTRIBUTING.md
      
      * Fix broken hyperlinks due to spaces
      
      * Fix links to OpenAI research articles
      
      * Remove confusing footnote symbols from author names, as they are also considered invalid markup
      1c31b7aa
  19. 29 Jan, 2024 1 commit
  20. 11 Jan, 2024 1 commit
  21. 04 Jan, 2024 1 commit
  22. 28 Nov, 2023 1 commit
  23. 10 Nov, 2023 1 commit
    • Susnato Dhar's avatar
      Add Phi-1 and Phi-1_5 (#26170) · e1c3ac25
      Susnato Dhar authored
      * only dir not even init
      
      * init
      
      * tokenizer removed and reference of codegen added
      
      * modeling file updated a lot remaining app_rotary_emb
      
      * conversion script done
      
      * conversion script fixed, a lot of factoring done and most tests pass
      
      * added token_clf and extractive_QA_head
      
      * integration tests pass
      
      * flash attn tests pass!
      
      * config done
      
      * more docs in modeling file
      
      * some style fix
      
      * style and others
      
      * doc test error fix
      
      * more doc fix
      
      * some attention fixes
      
      * most fixes
      
      * style and other fixes
      
      * docs fix and config
      
      * doc fix
      
      * some comments
      
      * conversion script updated
      
      * conversion script updated
      
      * Revert "conversion script updated"
      
      This reverts commit e92378c54084ec0747041b113083d1746ecb6c7f.
      
      * final comments
      
      * add Phi to language_modeling.md
      
      * edit phi.md file
      
      * rebase and fix
      
      * removed phi-1.5 example
      
      * changed model_type from 'phi'->'mixformer-sequential'
      
      * small change
      
      * small change
      
      * revert \small change
      
      * changed mixformer-sequential->phi
      
      * small change
      
      * added phi-1.5 example instead of phi-1
      
      * doc test might pass now
      
      * rebase and small change
      
      * added the dropout layer
      
      * more fixes
      
      * modified .md file
      
      * very very small doc change
      e1c3ac25
  24. 31 Oct, 2023 1 commit
  25. 27 Oct, 2023 1 commit
  26. 19 Oct, 2023 1 commit
  27. 18 Oct, 2023 1 commit
    • Pablo Montalvo's avatar
      Add fuyu model (#26911) · caa0ff0b
      Pablo Montalvo authored
      
      * initial commit
      
      * add processor, add fuyu naming
      
      * add draft processor
      
      * fix processor
      
      * remove dropout to fix loading of weights
      
      * add image processing fixes from Pedro
      
      * fix
      
      * fix processor
      
      * add basic processing fuyu test
      
      * add documentation and TODO
      
      * address comments, add tests, add doc
      
      * replace assert with torch asserts
      
      * add Mixins and fix tests
      
      * clean imports
      
      * add model tester, clean imports
      
      * fix embedding test
      
      * add updated tests from pre-release model
      
      * Processor: return input_ids used for inference
      
      * separate processing and model tests
      
      * relax test tolerance for embeddings
      
      * add test for logit comparison
      
      * make sure fuyu image processor is imported in the init
      
      * fix formattingh
      
      * more formatting issues
      
      * and more
      
      * fixups
      
      * remove some stuff
      
      * nits
      
      * update init
      
      * remove the fuyu file
      
      * Update integration test with release model
      
      * Update conversion script.
      
      The projection is not used, as confirmed by the authors.
      
      * improve geenration
      
      * Remove duplicate function
      
      * Trickle down patches to model call
      
      * processing fuyu updates
      
      * remove things
      
      * fix prepare_inputs_for_generation to fix generate()
      
      * remove model_input
      
      * update
      
      * add generation tests
      
      * nits
      
      * draft leverage automodel and autoconfig
      
      * nits
      
      * fix dtype patch
      
      * address comments, update READMEs and doc, include tests
      
      * add working processing test, remove refs to subsequences
      
      * add tests, remove Sequence classification
      
      * processing
      
      * update
      
      * update the conversion script
      
      * more processing cleanup
      
      * safe import
      
      * take out ModelTesterMixin for early release
      
      * more cl;eanup
      
      * more cleanup
      
      * more cleanup
      
      * and more
      
      * register a buffer
      
      * nits
      
      * add postprocessing of generate output
      
      * nits
      
      * updates
      
      * add one working test
      
      * fix test
      
      * make fixup works
      
      * fixup
      
      * Arthur's updates
      
      * nits
      
      * update
      
      * update
      
      * fix processor
      
      * update tests
      
      * passe more fixups
      
      * fix
      
      * nits
      
      * don't import torch
      
      * skip fuyu config for now
      
      * fixup done
      
      * fixup
      
      * update
      
      * oups
      
      * nits
      
      * Use input embeddings
      
      * no buffer
      
      * update
      
      * styling processing fuyu
      
      * fix test
      
      * update licence
      
      * protect torch import
      
      * fixup and update not doctested
      
      * kwargs should be passed
      
      * udpates
      
      * update the impofixuprts in the test
      
      * protect import
      
      * protecting imports
      
      * protect imports in type checking
      
      * add testing decorators
      
      * protect top level import structure
      
      * fix typo
      
      * fix check init
      
      * move requires_backend to functions
      
      * Imports
      
      * Protect types
      
      ---------
      
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      Co-authored-by: default avatarArthurZucker <arthur.zucker@gmail.com>
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: default avatarLysandre <lysandre@huggingface.co>
      caa0ff0b
  28. 25 Sep, 2023 1 commit
  29. 22 Sep, 2023 1 commit
  30. 19 Sep, 2023 1 commit
    • NielsRogge's avatar
      Add ViTMatte (#25843) · 7d6354e0
      NielsRogge authored
      * First draft
      
      * Simplify image processor
      
      * Fix rebase
      
      * Address comments
      
      * Address more comments
      
      * Address more comments
      
      * Address more comments
      
      * Address more comments
      
      * Improve pad_image
      
      * Add tests
      
      * Update integration test
      
      * Fix image processor tests
      
      * Fix model tests
      
      * Convert checkpoints
      
      * Fix doc tests
      
      * Remove file
      
      * Apply suggestions
      
      * Address comments
      
      * Fix typing hint
      
      * Add batch_norm_eps
      
      * Address comments
      
      * Fix style
      7d6354e0
  31. 14 Sep, 2023 1 commit
    • Jinho Park's avatar
      Add BROS (#23190) · 17fdd354
      Jinho Park authored
      
      * add Bros boilerplate
      
      * copy and pasted modeling_bros.py from official Bros repo
      
      * update copyright of bros files
      
      * copy tokenization_bros.py from official repo and update import path
      
      * copy tokenization_bros_fast.py from official repo and update import path
      
      * copy configuration_bros.py from official repo and update import path
      
      * remove trailing period in copyright line
      
      * copy and paste bros/__init__.py from official repo
      
      * save formatting
      
      * remove unused unnecessary pe_type argument - using only crel type
      
      * resolve import issue
      
      * remove unused model classes
      
      * remove unnecessary tests
      
      * remove unused classes
      
      * fix original code's bug - layer_module's argument order
      
      * clean up modeling auto
      
      * add bbox to prepare_config_and_inputs
      
      * set temporary value to hidden_size (32 is too low because of the of the
      Bros' positional embedding)
      
      * remove decoder test, update create_and_check* input arguemnts
      
      * add missing variable to model tests
      
      * do make fixup
      
      * update bros.mdx
      
      * add boilerate plate for no_head inference test
      
      * update BROS_PRETRAINED_MODEL_ARCHIVE_LIST (add naver-clova-ocr prefix)
      
      * add prepare_bros_batch_inputs function
      
      * update modeling_common to add bbox inputs in Bros Model Test
      
      * remove unnecessary model inference
      
      * add test case
      
      * add model_doc
      
      * add test case for token_classification
      
      * apply fixup
      
      * update modeling code
      
      * update BrosForTokenClassification loss calculation logic
      
      * revert logits preprocessing logic to make sure logits have original shape
      
      * - update class name
      
      * - add BrosSpadeOutput
      - update BrosConfig arguments
      
      * add boilerate plate for no_head inference test
      
      * add prepare_bros_batch_inputs function
      
      * add test case
      
      * add test case for token_classification
      
      * update modeling code
      
      * update BrosForTokenClassification loss calculation logic
      
      * revert logits preprocessing logic to make sure logits have original shape
      
      * apply masking on the fly
      
      * add BrosSpadeForTokenLinking
      
      * update class name
      put docstring to the beginning of the file
      
      * separate the logits calculation logic and loss calculation logic
      
      * update logic for loss calculation so that logits shape doesn't change
      when return
      
      * update typo
      
      * update prepare_config_and_inputs
      
      * update dummy node initialization
      
      * update last_hidden_states getting logic to consider when return_dict is False
      
      * update box first token mask param
      
      * bugfix: remove random attention mask generation
      
      * update keys to ignore on load missing
      
      * run make style and quality
      
      * apply make style and quality of other codes
      
      * update box_first_token_mask to bool type
      
      * update index.md
      
      * apply make style and quality
      
      * apply make fix-copies
      
      * pass check_repo
      
      * update bros model doc
      
      * docstring bugfix fix
      
      * add checkpoint for doc, tokenizer for doc
      
      * Update README.md
      
      * Update docs/source/en/model_doc/bros.md
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update bros.md
      
      * Update src/transformers/__init__.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/bros.md
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Apply suggestions from code review
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * apply suggestions from code review
      
      * apply suggestions from code review
      
      * revert test_processor_markuplm.py
      
      * Update test_processor_markuplm.py
      
      * apply suggestions from code review
      
      * apply suggestions from code review
      
      * apply suggestions from code review
      
      * update BrosSpadeELForTokenClassification head name to entity linker
      
      * add doc string for config params
      
      * update class, var names to more explicit and apply suggestions from code review
      
      * remove unnecessary keys to ignore
      
      * update relation extractor to be initialized with config
      
      * add bros processor
      
      * apply make style and quality
      
      * update bros.md
      
      * remove bros tokenizer, add bros processor that wraps bert tokenizer
      
      * revert change
      
      * apply make fix-copies
      
      * update processor code, update itc -> initial token, stc -> subsequent token
      
      * add type hint
      
      * remove unnecessary condition branches in embedding forward
      
      * fix auto tokenizer fail
      
      * update docstring for each classes
      
      * update bbox input dimension as standard 2 points and convert them to 4
      points in forward pass
      
      * update bros docs
      
      * apply suggestions from code review : update Bros -> BROS in bros.md
      
      * 1. box prefix var -> bbox
      2. update variable names to be more explicit
      
      * replace einsum with torch matmul
      
      * apply style and quality
      
      * remove unused argument
      
      * remove unused arguments
      
      * update docstrings
      
      * apply suggestions from code review: add BrosBboxEmbeddings, replace
      einsum with classical matrix operations
      
      * revert einsum update
      
      * update bros processor
      
      * apply suggestions from code review
      
      * add conversion script for bros
      
      * Apply suggestions from code review
      
      * fix readme
      
      * apply fix-copies
      
      ---------
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      17fdd354
  32. 12 Sep, 2023 1 commit
  33. 07 Sep, 2023 1 commit
    • Muskan Kumar's avatar
      Added HerBERT to README.md (#26020) · 02c4a77f
      Muskan Kumar authored
      * Added HerBERT to README.md
      
      * Update README.md to contain HerBERT (#26016)
      
      * Resolved #26016: Updated READMEs and index.md to contain Herbert
      
      Updated READMEs and ran make fix-copies
      02c4a77f
  34. 04 Sep, 2023 1 commit
  35. 01 Sep, 2023 1 commit
    • Matthijs Hollemans's avatar
      add VITS model (#24085) · 4ece3b94
      Matthijs Hollemans authored
      
      * add VITS model
      
      * let's vits
      
      * finish TextEncoder (mostly)
      
      * rename VITS to Vits
      
      * add StochasticDurationPredictor
      
      * ads flow model
      
      * add generator
      
      * correctly set vocab size
      
      * add tokenizer
      
      * remove processor & feature extractor
      
      * add PosteriorEncoder
      
      * add missing weights to SDP
      
      * also convert LJSpeech and VCTK checkpoints
      
      * add training stuff in forward
      
      * add placeholder tests for tokenizer
      
      * add placeholder tests for model
      
      * starting cleanup
      
      * let the great renaming begin!
      
      * use config
      
      * global_conditioning
      
      * more cleaning
      
      * renaming variables
      
      * more renaming
      
      * more renaming
      
      * it never ends
      
      * reticulating the splines
      
      * more renaming
      
      * HiFi-GAN
      
      * doc strings for main model
      
      * fixup
      
      * fix-copies
      
      * don't make it a PreTrainedModel
      
      * fixup
      
      * rename config options
      
      * remove training logic from forward pass
      
      * simplify relative position
      
      * use actual checkpoint
      
      * style
      
      * PR review fixes
      
      * more review changes
      
      * fixup
      
      * more unit tests
      
      * fixup
      
      * fix doc test
      
      * add integration test
      
      * improve tokenizer tests
      
      * add tokenizer integration test
      
      * fix tests on GPU (gave OOM)
      
      * conversion script can handle repos from hub
      
      * add conversion script for all MMS-TTS checkpoints
      
      * automatically create a README for the converted checkpoint
      
      * small changes to config
      
      * push README to hub
      
      * only show uroman note for checkpoints that need it
      
      * remove conversion script because code formatting breaks the readme
      
      * make WaveNet layers configurable
      
      * rename variables
      
      * simplifying the math
      
      * output attentions and hidden states
      
      * remove VitsFlip in flow model
      
      * also got rid of the other flip
      
      * fix tests
      
      * rename more variables
      
      * rename tokenizer, add phonemization
      
      * raise error when phonemizer missing
      
      * re-order config docstrings to match method
      
      * change config naming
      
      * remove redundant str -> list
      
      * fix copyright: vits authors -> kakao enterprise
      
      * (mean, log_variances) -> (prior_mean, prior_log_variances)
      
      * if return dict -> if not return dict
      
      * speed -> speaking rate
      
      * Apply suggestions from code review
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * update fused tanh sigmoid
      
      * reduce dims in tester
      
      * audio -> output_values
      
      * audio -> output_values in tuple out
      
      * fix return type
      
      * fix return type
      
      * make _unconstrained_rational_quadratic_spline a function
      
      * all nn's to accept a config
      
      * add spectro to output
      
      * move {speaking rate, noise scale, noise scale duration} to config
      
      * path -> attn_path
      
      * idxs -> valid idxs -> padded idxs
      
      * output values -> waveform
      
      * use config for attention
      
      * make generation work
      
      * harden integration test
      
      * add spectrogram to dict output
      
      * tokenizer refactor
      
      * make style
      
      * remove 'fake' padding token
      
      * harden tokenizer tests
      
      * ron norm test
      
      * fprop / save tests deterministic
      
      * move uroman to tokenizer as much as possible
      
      * better logger message
      
      * fix vivit imports
      
      * add uroman integration test
      
      * make style
      
      * up
      
      * matthijs -> sanchit-gandhi
      
      * fix tokenizer test
      
      * make fix-copies
      
      * fix dict comprehension
      
      * fix config tests
      
      * fix model tests
      
      * make outputs consistent with reverse/not reverse
      
      * fix key concat
      
      * more model details
      
      * add author
      
      * return dict
      
      * speaker error
      
      * labels error
      
      * Apply suggestions from code review
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/vits/convert_original_checkpoint.py
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * remove uromanize
      
      * add docstrings
      
      * add docstrings for tokenizer
      
      * upper-case skip messages
      
      * fix return dict
      
      * style
      
      * finish tests
      
      * update checkpoints
      
      * make style
      
      * remove doctest file
      
      * revert
      
      * fix docstring
      
      * fix tokenizer
      
      * remove uroman integration test
      
      * add sampling rate
      
      * fix docs / docstrings
      
      * style
      
      * add sr to model output
      
      * fix outputs
      
      * style / copies
      
      * fix docstring
      
      * fix copies
      
      * remove sr from model outputs
      
      * Update utils/documentation_tests.txt
      
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add sr as allowed attr
      
      ---------
      
      Co-authored-by: default avatarsanchit-gandhi <sanchit@huggingface.co>
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      4ece3b94
  36. 29 Aug, 2023 2 commits
  37. 25 Aug, 2023 1 commit
    • Arthur's avatar
      [`CodeLlama`] Add support for `CodeLlama` (#25740) · 015f8e11
      Arthur authored
      
      * add all
      
      * Revert "Delete .github directory"
      
      This reverts commit 9b0ff7b052e2b20b629a26fb13606b78a42944d1.
      
      * make conversion script backward compatible
      
      * fixup
      
      * more styling
      
      * copy to llama changes
      
      * fix repo consistency
      
      * nits
      
      * document correct classes
      
      * updates
      
      * more fixes
      
      * nits
      
      * update auto mappings
      
      * add readmes
      
      * smallupdates
      
      * llama-code replace with llama_code
      
      * make fixup
      
      * updates to the testsing suite
      
      * fix fast nits
      
      * more small fixes
      
      * fix decode
      
      * fix template processing
      
      * properly reset the normalizer
      
      * nits processor
      
      * tokenization tests pass
      
      * styling
      
      * last tests
      
      * additional nits
      
      * one test is left
      
      * nits
      
      Co-authored-by faabian <faabian@users.noreply.github.com>
      
      * update failing test
      
      * fixup
      
      * remove decode infilling users should handle it on their onw after generation, padding can be a problem
      
      * update
      
      * make test slow and more meaningfull
      
      * fixup
      
      * doc update
      
      * fixup
      
      * Apply suggestions from code review
      
      * add kwargs doc
      
      * tokenizer requires `requires_backend`
      
      * type requires_backends
      
      * CodeLlama instead of LlamaCode
      
      * more name cahnges
      
      * nits
      
      * make doctests happy
      
      * small pipeline nits
      
      * last nit
      
      * Apply suggestions from code review
      
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * update
      
      * add codellama to toctree
      
      ---------
      
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      015f8e11