Then they made it wrong. For example, "What the actual fuck?" is not getting flagged, neither is "What the *fuck*".
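To make the miss concrete, here is a minimal sketch. The real pattern isn't quoted in this thread, so both `LITERAL` and `LOOSE` are hypothetical stand-ins, not the actual implementation: a literal phrase match skips both examples, while a slightly looser pattern picks them up.

```python
import re

# Hypothetical pattern, NOT the actual one under discussion: matches only the
# literal phrase, case-insensitively.
LITERAL = re.compile(r"\bwhat the fuck\b", re.IGNORECASE)

# Hypothetical looser variant: allows one optional intervening word and
# Markdown emphasis markers (* or _) around the last word.
LOOSE = re.compile(r"\bwhat the (?:\w+ )?[*_]*fuck[*_]*", re.IGNORECASE)


def is_flagged(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    samples = ["What the fuck?", "What the actual fuck?", "What the *fuck*"]
    for sample in samples:
        print(sample, is_flagged(LITERAL, sample), is_flagged(LOOSE, sample))
```

Whether the extra coverage is worth the added pattern complexity depends on how often those variants actually show up in practice.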
Classic over-engineering. Their approach is just fine 90% of the time for the use case it’s intended for.
They evidently ran a statistical analysis and determined that virtually no one uses those phrases as a quick retort to a model's unsatisfying answer... so they don't need to optimize for them.
It is exceedingly obvious that the goal here is to catch at least 75-80% of negative sentiment, not to be exhaustive and pedantic about every possible way someone could express themselves.