lxy
PRA24-Convolution
Tags
Tags give the ability to mark specific points in history as being important
commit-11-09-462.5048-final
08d315c4
·
Revert (Add merged transform kernels in F(2x2,3x3) back).
·
Nov 09, 2024
commit-11-09-462.9656
7401e6df
·
Feat: transform per image on F(4x4,3x3).
·
Nov 09, 2024
commit-11-09-482.9532
0c9c4476
·
Trying to eliminate bank conflicts.
·
Nov 09, 2024
commit-11-08-526.3507
468e197d
·
Merge GMEM read optimization of image and filter in winograd_2x3_fused.cpp from cc78ca3a
·
Nov 07, 2024
commit-11-05-566.3518
14c900c3
·
Delete comments & unused functions and add performance numbers for the last commit.
·
Nov 05, 2024
commit-11-05-565.2064
b2442f98
·
Revert "Feat: Merge Input & Filter transforms in F(4x4,3x3)"
·
Nov 05, 2024
commit-11-04-574.4697
6ac6c4f8
·
Add 8 warp 128x64x16 gemm back.
·
Nov 04, 2024
commit-11-03-584.8657
5ddb5a6b
·
Feat: Merge F(2x2,3x3) non-fused kernel
·
Nov 03, 2024
Select Archive Format
Download source code
zip
tar.gz
tar.bz2
tar
commit-11-03-655.8940
f7176627
·
Manual merge 03bb27b9 (dev/matrix-core-fp16fp32-64x64x16-Akxm_Bnxk_Cnxm)
·
Nov 03, 2024
32x64x16-fuse-winograd2x3
62644190
·
Just rename some variables and use macro for s_nop
·
Nov 02, 2024
64x64x16-fuse-2x3
a6f7e65a
·
Can't pass validation and slow
·
Nov 02, 2024
other-mxmxk-blocked-gemm
f1fbc179
·
add the batch kernel of 128x64x16
·
Nov 02, 2024
commit-11-01-745.2700
583d053a
·
Fix: lost error++ in main()
·
Nov 01, 2024
test-100-iter-stability
27c60d1b
·
Test 100 iterations to verify stability
·
Nov 01, 2024
max-relative-error
f4385591
·
Add: max relative error; remember not to commit this
·
Nov 01, 2024
16x16x16-fuse-winograd2x3
def6b7a1
·
Delete __syncthreads in if() at filter&img trans
·
Nov 01, 2024
16x16x16-fuse
5871a2b9
·
Feat: now can handle different m, n, k
·
Oct 31, 2024
32x32x16-AkxmBnxkCnxm_trans_per_img
16facd88
·
Bad Performance
·
Oct 30, 2024
32x32x16-AkxmBnxkCnxm_grid_mnk_batch
d1e415b6
·
Slow, and the DIY atomicAdd is wrong
·
Oct 30, 2024
32x32x16-double-buffer
9dd3aa08
·
Tried double buffering
·
Oct 24, 2024