lxy / PRA24-Convolution / Repository / Branches
  • commit/11-04-580.4203
    f0dbd026 · Merge Optmization on tranforms · Nov 04, 2024
  • implicit_gemm
    88c39510 · Mixed up the code, can pass the final case, but error when c is not the multiple of 16 · Nov 04, 2024
  • dev/matrix-core-fp16fp32-128x64x16-Akxm_Bnxk_Cnxm
    daeedbed · Revert " merge input and filter transform of 4x3" · Nov 04, 2024
  • dev/matrix-core-fp16fp32-128x64x16-Akxm_Bnxk_Cnxm-double-buffering
    61f6be3d · Feat: double buffering · Nov 05, 2024
  • dev/matrix-core-fp16fp32-256x64x16-Akxm_Bnxk_Cnxm-double-buffering
    096bc939 · not finished yet · Nov 05, 2024
  • dev/matrix-core-fp16fp32-Template-Akxm_Bnxk_Cnxm-double-buffering
    fe6a2356 · Feat: double buffering template function · Nov 05, 2024
  • dev/matrix-core-fp16fp32-Template-Amxk_Bnxk_Cnxm
    1f9b7b07 · WIP: finished gemm, not validated · Nov 06, 2024
  • dev/matrix-core-fp16fp32-128x64x16-Akxm_Bnxk_Cnxm-winograd-2x3
    71929978 · WIP: Use vertor type for read GMEM · Nov 07, 2024
  • dev/matrix-core-fp16fp32-32x32x16-fuse-winograd2x3
    cc78ca3a · Feat: eliminated lds operation of image tile · Nov 07, 2024
  • dev/32x32x16-fused-VUY-as-Union
    6c092e3c · Feat: shrink lds size from 32KiB to 16KiB · Nov 08, 2024
  • dev/matrix-core-fp16fp32-Template-Akxm_Bnxk_Cnxm
    df7d5c47 · Feat: multi-warp gemm_batched_kernel. · Nov 08, 2024
  • dev/matrix-core-fp16fp32-nonfused-input-transform
    0cd870c5 · add the template of output transform · Nov 09, 2024