For those who have homebrewed a base model: does your output show the same AI-isms, like overusing em dashes? Either way, what dataset did you use?
Does yours also use the Oxford comma, and more commas in general?
AFAIK, those are mostly a consequence of post-training.
That is a post-training artifact.
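If anyone wants to compare numbers rather than impressions, here is a rough sketch for measuring these habits in sampled text. Everything in it is an assumption for illustration: the function name `punctuation_rates` is made up, words are counted by plain whitespace splitting, and the Oxford-comma check is only a regex heuristic (a comma, some words, then ", and" or ", or"), so it will miss and miscount some cases.

```python
import re

EM_DASH = "\u2014"  # the em dash character, written as an escape

# Heuristic only: an earlier comma in the same clause, followed by ", and" / ", or".
OXFORD_RE = re.compile(r",[^,.!?;]+,\s+(?:and|or)\b")


def punctuation_rates(text: str) -> dict:
    """Return em dash, comma, and Oxford-comma counts per 1,000 words.

    Hypothetical helper: words are whitespace-separated tokens, so a
    free-standing dash also counts as a word.
    """
    words = len(text.split())
    if words == 0:
        return {"em_dash": 0.0, "comma": 0.0, "oxford_comma": 0.0}
    scale = 1000 / words
    return {
        "em_dash": text.count(EM_DASH) * scale,
        "comma": text.count(",") * scale,
        "oxford_comma": len(OXFORD_RE.findall(text)) * scale,
    }
```

Running it over completions for the same prompts from a base checkpoint and a post-trained one should show whether the rates actually differ, though the sample needs to be large enough for the per-1,000-word figures to mean anything.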