Hacker News

jaberjaber23, last Tuesday at 2:43 PM (3 replies)

We’re RightNow AI. We built a tool that automatically profiles CUDA code, detects bottlenecks, and generates optimized kernels using AI.

If you’ve written CUDA before, you know how it goes. You spend hours tweaking memory access, digging through profiler dumps, swapping out intrinsics, and praying it’ll run faster. Most of the time, you're guessing.

We got tired of it. So we built something that just works.

What RightNow AI Actually Does

Prompt-based CUDA Kernel Generation: Describe what you want in plain English. Get fast, optimized CUDA code back. No need to know the difference between global and shared memory layouts.

Serverless GPU Profiling: Run your code on real GPUs without having local hardware. Get detailed reports about where it's slow and why.

Performance Optimizations That Deliver: Not vague advice like “try more threads.” We return rewritten code. Our users are seeing 2x to 4x improvements out of the box. Some hit 20x.

Why We Built It

We needed it for our own work. Our ML stack was bottlenecked by GPU code we didn’t have time to optimize. Existing tools felt ancient. The workflow was slow, clunky, and filled with trial and error.

We thought: what if we could just say "optimize this kernel for A100" and get something useful?

So we built it.

RightNow AI is live. You can try it for free: https://www.rightnowai.co/

If you use it and hit something rough, tell us. We’ll fix it.


Replies

paulirish, yesterday at 8:55 PM

What does one of the GPU profiling reports look like?

Edit: oh is it this? https://youtu.be/b-yh3FFpSX8?t=28

3abiton, today at 5:23 AM

How is this different from what Unsloth is doing?
