This is so cool! I just learned about this last week. For reference, I do molecular dynamics (my own engine, in Rust), and measuring temperature is an important part of the simulation (so you can nudge it toward a target temperature, for example). An important component of this calculation is the number of degrees of freedom of the system, which depends on your model. For example, are you representing atoms that can each move on their own? Rigid molecules of multiple atoms that can rotate? Are you removing center-of-mass velocity from the system?
This DOF component is also why the general, measurable concept of temperature can apply both to real systems and to simple point-atom models (or coarser ones). It is, not surprisingly, at the heart of why negative temperature exists!
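To make the DOF point concrete, here is a minimal sketch of the usual kinetic temperature estimate, T = 2 * KE / (N_dof * k_B). It is not the engine described above: the function names, the SI units, and the simple "3 per point atom, minus constraints, minus 3 for center-of-mass removal" counting are all assumptions for illustration. A rigid-molecule model would count its degrees of freedom differently.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K


def degrees_of_freedom(n_atoms, n_constraints=0, remove_com=True):
    """Count DOF for point atoms (hypothetical helper).

    3 per atom, minus any holonomic constraints, minus 3 if the
    center-of-mass velocity is removed from the system.
    """
    dof = 3 * n_atoms - n_constraints
    if remove_com:
        dof -= 3
    return dof


def kinetic_temperature(masses, velocities, dof):
    """Instantaneous temperature from T = 2 * KE / (dof * k_B).

    masses: kg, one per atom; velocities: (vx, vy, vz) tuples in m/s.
    """
    kinetic_energy = sum(
        0.5 * m * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses, velocities)
    )
    return 2.0 * kinetic_energy / (dof * K_B)
```

The same kinetic energy gives a different temperature depending on the DOF count, which is why the count has to match the model.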
Negative temperature in this case is a sampling thing. When you sample from a table of tokens, the probability of token i is p_i = exp(logit_i/T) / sum_j(exp(logit_j/T)), so a negative T flips the ranking and the lowest-logit tokens become the most likely.
Not really related to molecular dynamics temperature except superficially, in terms of phenomenology (higher temperature makes it easier to cross activation barriers in the joint probability landscape). Negative temperature makes no sense in MD.
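A rough sketch of the sampling formula quoted above, in plain Python. The function name is made up for illustration, and subtracting the maximum before exponentiating is just a standard numerical-stability trick that does not change the result.

```python
import math


def softmax_with_temperature(logits, temperature):
    """p_i = exp(logit_i / T) / sum_j exp(logit_j / T) for any nonzero T."""
    if temperature == 0:
        raise ValueError("temperature must be nonzero")
    scaled = [x / temperature for x in logits]
    # Shift by the max so exp() never overflows; softmax is shift-invariant.
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
print(softmax_with_temperature(logits, 1.0))   # highest logit is most likely
print(softmax_with_temperature(logits, -1.0))  # ordering is reversed
```

With T = -1 the probabilities for these logits are exactly the T = 1 probabilities in reverse order, since dividing by a negative T just negates the logits.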
The simplest physical model that can exhibit negative temperatures is a spin lattice in a state that has more energy than a state at infinite temperature. Adding more energy to such a system reduces the entropy.
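Here is a small sketch of that spin picture, assuming the simplest version: N non-interacting two-level spins, where each excited spin costs an energy epsilon. The entropy is S/k_B = ln C(N, n) for n excited spins, and 1/T = dS/dE is estimated with a finite difference. The function names and the choice of units (k_B = 1, epsilon = 1) are assumptions for illustration.

```python
import math


def entropy(n_spins, n_excited):
    """S / k_B = ln C(N, n): log of the number of ways to pick the excited spins."""
    return math.log(math.comb(n_spins, n_excited))


def inverse_temperature(n_spins, n_excited, epsilon=1.0):
    """Estimate 1/T = dS/dE by a central difference, with E = n * epsilon.

    Requires 1 <= n_excited <= n_spins - 1. Units: k_B / epsilon.
    """
    ds = entropy(n_spins, n_excited + 1) - entropy(n_spins, n_excited - 1)
    return ds / (2.0 * epsilon)


for n in (25, 50, 75):
    print(n, inverse_temperature(100, n))
```

Below half filling, adding energy increases the entropy and 1/T is positive. At exactly half filling (the infinite-temperature state) the slope is zero. Above half filling, adding energy reduces the entropy, so 1/T and therefore T are negative.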