logoalt Hacker News

reilly3000today at 12:43 AM1 replyview on HN

If you’re already using GCP, Vertex AI is pretty good. You can run lots of models on it:

https://docs.cloud.google.com/vertex-ai/generative-ai/docs/m...

Lambda.ai used to offer per-token pricing but they have moved up market. You can still rent a B200 instance for sub $5/hr which is reasonable for experimenting with models.

https://app.hyperbolic.ai/models Hyperbolic offers both GPU hosting and token pricing for popular OSS models. It’s easy with token based options because usually are a drop-in replacement for OpenAI API endpoints.

You have you rent a GPU instance if you want to run the latest or custom stuff, but if you just want to play around for a few hours it’s not unreasonable.


Replies

verdvermtoday at 12:49 AM

GCloud and Hyperbolic have been my go-to as well