Maybe a really lightweight, fast LLM could moderate messages in real time. Not sure how pricey that would get though.
Extremely, if it isn't sitting behind substantially cheaper anti-spam measures.
Further, it may still make sense to use human reports to gate some of the automation, even if that slows the response.
No matter how cheap it is per request, someone will figure out a way to DoS that endpoint, and it will be extremely pricey unless you have effective rate limiting.
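To make the rate-limiting point concrete, here is a rough sketch of a per-sender token bucket sitting in front of the model call. Everything named here is hypothetical: `ModerationGate`, `TokenBucket`, and the `classify` callable stand in for whatever LLM endpoint you would actually use, and the capacity and refill numbers are placeholders, not recommendations.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TokenBucket:
    """Classic token bucket: holds up to `capacity` tokens, refilled over time."""
    capacity: float
    refill_per_sec: float
    tokens: float
    last: float

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ModerationGate:
    """Only forwards a message to the (expensive) classifier if the sender
    still has budget. `classify` is a hypothetical stand-in for the LLM call."""

    def __init__(
        self,
        classify: Callable[[str], str],
        capacity: float = 5.0,
        refill_per_sec: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._classify = classify
        self._capacity = capacity
        self._refill = refill_per_sec
        self._clock = clock
        # Note: this dict grows with the number of senders; a real service
        # would need eviction or an external store.
        self._buckets: Dict[str, TokenBucket] = {}

    def check(self, sender_id: str, message: str) -> str:
        now = self._clock()
        bucket = self._buckets.get(sender_id)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._capacity, now)
            self._buckets[sender_id] = bucket
        if not bucket.allow(now):
            # Cheap rejection: the model is never called for this message.
            return "rate_limited"
        return self._classify(message)
```

The idea is just that the rejection path costs a dictionary lookup and some arithmetic, so a flood from one sender never reaches the model. Per-sender limits alone won't stop an attacker who rotates identities, so you would likely still want a global cap or other cheap filters in front of it.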