    Add Qwen2MoE (#29377) · 1c39974a
    Bo Zheng authored
    
    
    * add support for qwen2 MoE models
    
    * update docs
    
    * add support for qwen2 MoE models
    
    * update docs
    
    * update model name & test
    
    * update readme
    
    * update class names & readme & model_doc of Qwen2MoE.
    
    * update architecture name
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * update modeling_qwen2_moe.py
    
    * fix model architecture
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * update modeling_qwen2_moe.py
    
    * fix model architecture
    
    * fix style
    
    * fix test when there are sparse and non sparse layers
    
    * fixup
    
    * Update README.md
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * fixup
    
    * fixup
    
    * add archive back
    
    * add support for qwen2 MoE models
    
    * update docs
    
    * update model name & test
    
    * update readme
    
    * update class names & readme & model_doc of Qwen2MoE.
    
    * update architecture name
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * update modeling_qwen2_moe.py
    
    * fix model architecture
    
    * fixup
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * fix style
    
    * fix test when there are sparse and non sparse layers
    
    * fixup
    
    * add archive back
    
    * fix integration test
    
    * fixup
    
    ---------
    
    Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>