Bryce Adelstein Lelbach’s Post

We just announced cuTile, a tile programming model for CUDA! It's an array-based paradigm where the compiler automates memory movement, pipelining, and tensor core utilization, making GPU programming easier and more portable. It's built on top of a powerful new compiler stack and MLIR dialect called Tile IR. I've been lucky enough to work with the amazing team behind this over the last few months. Tile programming is a big deal and will be transformative for CUDA!

Wow! This is awesome. How can I begin to explore its awesome capabilities?

Interesting. Hopefully it will be faster than calling cudaMalloc on the fly for dynamic arrays.

Looking forward to checking it out, and hoping there are general-purpose applications for this beyond ML.

When can we expect APL symbols to make an appearance? 😅 More seriously, this looks quite elegant. Awesome job!

Is it an NVIDIA version of Triton? It seems both of them work at the "CTA tile" level.

Any chance of these coming to open standards for GPU programming, like Vulkan?

I hope the compiler stack is open source, so that the cuTile paradigm can eventually be ported to CUDA-like GPU stacks such as AMD HIP and MUSA.
