Aren't the conceptual relations you describe still, at their core, just search (even if that's extremely reductive)? We know models can interpolate well, but it's still the same probabilistic pattern matching. They identify conceptual relationships based on associations seen in vast training data. It's my understanding that models are still not at all good at extrapolation, handling data "way outside" of their training set.
Also, I was under the impression LLM's can replicate irony and humor simply because that text has specific stylistic properties, and they've been trained on it.
I don't know honestly, I think really the only big hole the current models have is if you have tokens that never get exposed enough to have a good learned embedding value. Those can blow the system out of the water because they cause activation problems in the low layers.
Other than that the model should be able to learn in context for most things based on the component concepts. Similar to how you learn in context.
There aren't a lot of limits in my experience. Rarely you'll hit patterns that are too powerful where it is hard for context to alter behavior, but those are pretty rare.
The models can mix and match concepts quite deeply. Certainly, if it is a completely novel concept that can't be described by a union or subtraction between similar concepts, than the model probably wouldn't handle it. In practice, a completely isolated concept is pretty rare.