    Add new meta w2v2-conformer BERT-like model (#28165) · d2cdefb9
    Yoach Lacombe authored
    
    
    * first commit
    
    * correct default value for non-causal
    
    * update config and modeling code
    
    * update checkpoint conversion
    
    * clean modeling and fix tests
    
    * make style
    
    * add new config parameters to docstring
    
    * fix copied from statements
    
    * Apply suggestions from code review
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    
    * make position_embeddings_type docstrings clearer
    
    * clean converting script
    
    * remove unused function
    
    * clean modeling file
    
    * apply suggestion for test file + add convert script to not_doctested
    
    * modify tests according to review - cleaner logic and more tests
    
    * Apply nit suggestions from code review
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * add checker of valid position embeddings type
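
    (A minimal sketch of what such a validity check typically looks like; the accepted values and attribute name are illustrative, not the merged diff.)

        SUPPORTED_POSITION_EMBEDDINGS = ("rotary", "relative", "relative_key", None)

        def check_position_embeddings_type(config):
            # fail fast at config time rather than deep inside the attention layers
            if config.position_embeddings_type not in SUPPORTED_POSITION_EMBEDDINGS:
                raise ValueError(
                    f"`position_embeddings_type` must be one of {SUPPORTED_POSITION_EMBEDDINGS}, "
                    f"got {config.position_embeddings_type!r}"
                )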
    
    * instantiate new layer norm layer with the right eps
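
    (Illustrative only: the fix amounts to threading the config epsilon into the layer norm constructor instead of relying on the PyTorch default.)

        from torch import nn

        hidden_size, layer_norm_eps = 1024, 1e-5            # stand-ins for config values
        layer_norm = nn.LayerNorm(hidden_size, eps=layer_norm_eps)  # eps from the config, not the default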
    
    * fix freeze_feature_encoder since it can be None in some cases
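
    (A hedged sketch of the guard; Wav2Vec2-BERT consumes precomputed features, so there may be no conv feature encoder to freeze. The attribute name below is an assumption.)

        def freeze_feature_encoder(model):
            # the feature encoder can be None when the model consumes precomputed features
            feature_encoder = getattr(model, "feature_encoder", None)
            if feature_encoder is not None:
                for param in feature_encoder.parameters():
                    param.requires_grad = False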
    
    * add test same output in convert script
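
    (Sketch of the kind of parity check a convert script adds; `original_model` and `hf_model` are placeholders for the source and converted models.)

        import torch

        with torch.no_grad():
            original_out = original_model(input_features)               # placeholder: source implementation
            converted_out = hf_model(input_features).last_hidden_state  # placeholder: converted HF model

        # conversion only counts as correct if both implementations agree within tolerance
        assert torch.allclose(original_out, converted_out, atol=1e-3), "conversion mismatch"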
    
    * restore wav2vec2conformer and add new model
    
    * create processor and FE + clean
    
    * add new model code
    
    * fix convert script and set default config parameters
    
    * correct model id paths
    
    * make style
    
    * make fix-copies and clean files
    
    * fix copied from statements
    
    * complete .md and fix copies
    
    * clean convert script argument defaults
    
    * fix config parameters docstrings
    
    * fix config docstring
    
    * add copied from and enrich FE tests
    
    * fix copied from and repo-consistency
    
    * add autotokenizer
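
    (With the auto mapping registered, loading resolves to the Wav2Vec2 CTC tokenizer; the checkpoint id below is a placeholder.)

        from transformers import AutoTokenizer

        # resolves to the Wav2Vec2 CTC tokenizer through the new auto mapping
        tokenizer = AutoTokenizer.from_pretrained("path/to/wav2vec2-bert-ctc-checkpoint")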
    
    * make test input length shorter and change docstring code
    
    * fix docstrings and copied from
    
    * add add_adapter to ASR training example
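
    (Roughly what enabling the adapter in the ASR example looks like; `vocab_size` here is a placeholder, normally taken from the tokenizer.)

        from transformers import Wav2Vec2BertForCTC

        model = Wav2Vec2BertForCTC.from_pretrained(
            "facebook/w2v-bert-2.0",
            add_adapter=True,   # insert the adapter layers between encoder and CTC head
            vocab_size=32,      # placeholder; normally len(tokenizer)
        )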
    
    * make testing of adapters more robust
    
    * adapt to multi adapter layers
    
    * refactor input_values->input_features and remove w2v2-bert feature extractor
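
    (Context for the rename, sketched: the model takes log-mel `input_features` rather than raw `input_values`, and the existing SeamlessM4T feature extractor is reused instead of a dedicated Wav2Vec2-BERT one.)

        import numpy as np
        from transformers import SeamlessM4TFeatureExtractor

        feature_extractor = SeamlessM4TFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
        raw_speech = np.zeros(16000, dtype=np.float32)   # 1 s of dummy 16 kHz audio
        inputs = feature_extractor(raw_speech, sampling_rate=16000, return_tensors="pt")
        print(inputs.input_features.shape)               # (batch, frames, feature_dim)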
    
    * remove pretraining model
    
    * remove deprecated features and useless lines
    
    * add copied from and ignore statements to modeling tests
    
    * remove pretraining model #2
    
    * change import in convert script
    
    * change default in convert script
    
    * update readme and remove useless line
    
    * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * refactor BERT to Bert for consistency
    
    * remove useless ignore copy statement
    
    * add persistent to buffer in rotary
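
    (Sketch of the buffer registration; marking it non-persistent keeps the precomputed tensor out of the checkpoint's state_dict. That the merged code passes `persistent=False` is inferred, and the names are illustrative.)

        import torch
        from torch import nn

        class RotaryEmbeddingSketch(nn.Module):
            def __init__(self, dim, base=10000):
                super().__init__()
                inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
                # persistent=False: rebuilt at init time, never serialized into the state_dict
                self.register_buffer("inv_freq", inv_freq, persistent=False)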
    
    * add eps in LayerNorm init and remove copied from
    
    * add adapter activation parameters and add copied from statements
    
    * Fix copied from statements and add unittest.skip reasons
    
    * add copied statement in test_processor
    
    * refactor processor
    
    * make style
    
    * replace numpy random by torch rand
    
    * remove expected output CTC
    
    * improve converting script with processor class
    
    * Apply suggestions from code review
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * remove gumbel class
    
    * remove tests related to previously deleted class
    
    * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * correct typos
    
    * remove unused parameters
    
    * update processor to take both text and audio
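
    (Sketch of the updated processor call; a single call can now produce both `input_features` from audio and `labels` from text. The checkpoint id is a placeholder.)

        import numpy as np
        from transformers import Wav2Vec2BertProcessor

        processor = Wav2Vec2BertProcessor.from_pretrained("path/to/wav2vec2-bert-checkpoint")
        speech = np.zeros(16000, dtype=np.float32)   # dummy 16 kHz audio
        batch = processor(audio=speech, text="a transcription", sampling_rate=16000, return_tensors="pt")
        # batch now carries input_features (audio side) and labels (text side)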
    
    * update checkpoints
    
    * update expected output and add ctc expected output
    
    * add label_attention_mask
    
    * replace pt with np in processor tests
    
    * fix typo
    
    * revert to behaviour with labels_attention_mask
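
    (The usual pattern this behaviour supports, sketched: label positions flagged as padding are masked to -100 so the loss ignores them. Tensor values and the `labels_attention_mask` key are illustrative assumptions.)

        import torch

        labels = torch.tensor([[12, 7, 9, 0, 0]])               # 0 = tokenizer pad id (illustrative)
        labels_attention_mask = torch.tensor([[1, 1, 1, 0, 0]])
        labels = labels.masked_fill(labels_attention_mask.ne(1), -100)  # loss ignores -100
        print(labels)   # tensor([[  12,    7,    9, -100, -100]])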
    
    ---------
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>