I gave Replicate a shot but needed to run on my own GPUs, so I initially used Cog to port the workload.
I quickly realized Cog was an obstacle rather than an accelerator. I replaced it with a lightweight FastAPI layer (rough sketch after the list), which immediately unblocked me:
1. Native I/O with Google Cloud Storage.
2. Freedom to use the latest Torch and Nvidia Docker images without abstraction overhead.
3. Running Torch and TensorFlow in parallel (legacy model constraints that Cog struggled with).
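For anyone wondering what "lightweight FastAPI layer" means in practice, this is roughly the shape, heavily simplified: run_model, the single /predict endpoint, and the field names are just placeholders here, not the real pipeline.

    from fastapi import FastAPI
    from google.cloud import storage
    from pydantic import BaseModel
    import torch

    app = FastAPI()
    gcs = storage.Client()

    class Job(BaseModel):
        bucket: str       # GCS bucket holding the input object
        input_blob: str   # path to the input
        output_blob: str  # where to write the result

    def run_model(data: bytes) -> bytes:
        # Placeholder for the real Torch (or legacy TensorFlow) pipeline
        return data

    @app.post("/predict")
    def predict(job: Job):
        bucket = gcs.bucket(job.bucket)
        # Native GCS I/O: read the input straight from the bucket
        data = bucket.blob(job.input_blob).download_as_bytes()
        with torch.inference_mode():
            result = run_model(data)
        # Write the result back to GCS and return a pointer, not the payload
        bucket.blob(job.output_blob).upload_from_string(result)
        return {"output": f"gs://{job.bucket}/{job.output_blob}"}

Point uvicorn at that inside whatever CUDA base image you want and you're done: no framework-imposed image, no prescribed I/O conventions.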
It forces the question: What is Replicate's value proposition for a startup where the founders are competent engineers? If you aren't afraid of a Dockerfile, the "ease of use" premium evaporates. The answer to that question is likely this acquisition.
The standalone AI middleware market is precarious; the landscape shifts too fast and technical founders will eventually outgrow the training wheels.
Folding into Cloudflare gives the team a sustainable home to leverage the platform's scale, rather than competing solely on a container abstraction layer.
Wish them the best. Cloudflare’s infrastructure is likely the right environment to turn this into a high-leverage product.
Me too. After trying it out I found Cog to be super frustrating to use, and the only use case for me ended up being trying out a new model through the web UI occasionally.