A plan for getting involved in an open source project

I have a little experience with open source, but I’m currently neither associated with nor active in any open source project. I want to be, and I have a plan for getting there. Based on my experience, I believe it’s a good plan, so it might help other people who want to get involved but are struggling to. Keep in mind that this is my plan and it may not suit everyone.


While many guides paint a rosy picture, keep in mind two key points:

CUDA Matmul with wmma


There are several ways to do matmul in CUDA.

  • Use libraries
    • CUTLASS
    • cuBLAS
  • Implement your own
    • naive way
    • tiled matmul
    • you can also choose whether to use shared memory

That’s not all: yet another way to implement matmul is to use wmma. This accro...


Learning with chatGPT

Recently I’ve been learning things with ChatGPT. I’ve found it extremely useful for learning something new.

Some Pros:

  • I can dive into a codebase first, with little prior knowledge. ChatGPT isn’t very good at generating code yet, but it is good at explaining concepts like “Vector” in MLIR.
  • It enables semantic search rather than keyword matching. For example, I can ask something like “what is that doing …”; this kind of query yields bad answers with Google search, mo...

Tiled Matmul 101

I’m extremely poor at thinking about matrices. I’ve seen many people think graphically and draw matrix strides, multiplications, etc. Yet as a person who works in ML, I think I should understand matmul, at least as a high-level concept. This post is about my struggle to understand matmul in an HPC environment.

For convenience, I use the following notation:

  • A is an M by K matrix
  • B is a K by N matrix
  • C = A @ B is an M by N matrix
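
To make the notation concrete, here is a minimal pure-Python sketch of the tiled idea. It is only an illustration under my own assumptions, not the implementation from the full post: the names `matmul_naive` and `matmul_tiled` and the `tile` parameter are made up for this example, and plain nested lists stand in for real GPU buffers.

```python
def matmul_naive(A, B):
    """Reference version: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    M, K, N = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(K)) for j in range(N)]
            for i in range(M)]


def matmul_tiled(A, B, tile=2):
    """Compute C = A @ B block by block (hypothetical helper, tile size assumed)."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for i0 in range(0, M, tile):          # first row of the current C tile
        for j0 in range(0, N, tile):      # first column of the current C tile
            for k0 in range(0, K, tile):  # walk the shared K dimension tile by tile
                # min(...) handles edge tiles when a size is not a multiple of `tile`
                for i in range(i0, min(i0 + tile, M)):
                    for j in range(j0, min(j0 + tile, N)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + tile, K)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C


A = [[1, 2, 3], [4, 5, 6]]        # M = 2, K = 3
B = [[7, 8], [9, 10], [11, 12]]   # K = 3, N = 2
C = matmul_tiled(A, B)            # M by N result
```

Both functions do the same arithmetic; the tiled one only changes the order in which the partial products are visited, so that each small block of A and B is reused while it is still "hot". On a GPU that reuse is where the benefit comes from, which plain Python does not show.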


Flash Attention Idea

This post is basically a commentary on, or a more introductory version of, this gist. I added some material that helped me understand the article, though it may also introduce more verbosity and confusion. Thanks to Kunwar Grover for making a great tutorial on FlashAttention and mlir.linalg.

The goal of this post is to provide a conceptual understanding of FlashAttention for peop...
