Yes, you can. There are multiple inference providers out there. The problem is, it’s hard to beat the Chinese providers in cost. And you also have to compete with frontier model providers’ subsidized offerings.
They charge the exact same prices. So many people in these comments have no idea what they're talking about. Even if they did charge less, nobody is going to deal with the latency of sending requests to China.
edit: Actually, American inference providers are cheaper for Chinese models. There's way more competition here because the Chinese aren't idiots who invest every last dollar they have into data centers for LLMs that don't make money.