Sure, but if you're making a point about LLMs in general, you need to use examples from best-in-class models. Otherwise your examples of how these models fail are meaningless. It would be like complaining that smartphone cameras are inherently terrible when none of your bad photos are labeled with the phone that took them. How can anyone infer anything meaningful from that?