Has anyone recently compared something like ModernBERT plus a classifier vs. full or LoRA fine-tuning of a small LM like Qwen?