lxy
PRA24-Convolution
Branches
commit/11-04-580.4203 · f0dbd026 · Merge Optimization on transforms · Nov 04, 2024
implicit_gemm · 88c39510 · Mixed up the code; can pass the final case, but errors when c is not a multiple of 16 · Nov 04, 2024
dev/matrix-core-fp16fp32-128x64x16-Akxm_Bnxk_Cnxm · daeedbed · Revert "merge input and filter transform of 4x3" · Nov 04, 2024
dev/matrix-core-fp16fp32-128x64x16-Akxm_Bnxk_Cnxm-double-buffering · 61f6be3d · Feat: double buffering · Nov 05, 2024
dev/matrix-core-fp16fp32-256x64x16-Akxm_Bnxk_Cnxm-double-buffering · 096bc939 · not finished yet · Nov 05, 2024
dev/matrix-core-fp16fp32-Template-Akxm_Bnxk_Cnxm-double-buffering · fe6a2356 · Feat: double buffering template function · Nov 05, 2024
dev/matrix-core-fp16fp32-Template-Amxk_Bnxk_Cnxm · 1f9b7b07 · WIP: finished gemm, not validated · Nov 06, 2024
dev/matrix-core-fp16fp32-128x64x16-Akxm_Bnxk_Cnxm-winograd-2x3 · 71929978 · WIP: Use vector type to read GMEM · Nov 07, 2024
dev/matrix-core-fp16fp32-32x32x16-fuse-winograd2x3 · cc78ca3a · Feat: eliminated lds operation of image tile · Nov 07, 2024
dev/32x32x16-fused-VUY-as-Union · 6c092e3c · Feat: shrink lds size from 32KiB to 16KiB · Nov 08, 2024
dev/matrix-core-fp16fp32-Template-Akxm_Bnxk_Cnxm · df7d5c47 · Feat: multi-warp gemm_batched_kernel. · Nov 08, 2024
dev/matrix-core-fp16fp32-nonfused-input-transform · 0cd870c5 · add the template of output transform · Nov 09, 2024