Modal is great. They even published a deep dive on the LP solver behind how they're able to get GPUs so quickly (and cheaply).
Coiled is another option worth looking at if you're a Python developer. It's not nearly as fast on cold start as Modal, but it's similarly easy to use and great for spinning up GPU-backed VMs for bursty workloads. Everything runs in your own cloud account. The built-in package sync is also pretty nice: it auto-installs CUDA drivers and Python dependencies from your local dev context.
(Disclaimer: I work with Coiled, but I genuinely think it's a good option for GPU serverless-ish workflows.)