An important thing to understand about this kind of model is that it is not a "chat model" in the way we're used to with GPT-4 or Llama.
https://www.latent.space/p/o1-skill-issue
The post above gives a good conceptual model for thinking about this kind of model. In particular, really exploit the large context window.
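One way to picture "exploiting the large context window" is to send a single long, self-contained brief rather than a back-and-forth chat. The sketch below is only an illustration under stated assumptions: `ContextDoc`, `build_brief`, the section headings, and the `max_chars` budget are all hypothetical names and values, not taken from the linked post or from any model's API. Real limits are counted in tokens and vary by model.

```python
from dataclasses import dataclass


@dataclass
class ContextDoc:
    """A named piece of background material (hypothetical helper type)."""
    name: str
    text: str


def build_brief(goal: str, docs: list[ContextDoc], output_spec: str,
                max_chars: int = 400_000) -> str:
    """Assemble one long prompt: goal, desired output, then as much context as fits.

    max_chars is an assumed, arbitrary character budget; it only
    approximates a token limit and ignores the separators between sections.
    """
    parts = [f"# Goal\n{goal}", f"# Desired output\n{output_spec}"]
    used = sum(len(p) for p in parts)
    for doc in docs:
        section = f"# Context: {doc.name}\n{doc.text}"
        if used + len(section) > max_chars:
            break  # stop adding documents once the budget is exhausted
        parts.append(section)
        used += len(section)
    return "\n\n".join(parts)


if __name__ == "__main__":
    brief = build_brief(
        goal="Explain why the nightly export job fails intermittently.",
        docs=[
            ContextDoc("job config", "schedule: nightly\nretries: 0"),
            ContextDoc("recent logs", "02:00 start\n02:07 timeout"),
        ],
        output_spec="A ranked list of likely causes with evidence.",
    )
    print(brief)
```

The resulting string would be sent as one message; the point of the sketch is the shape of the input (everything relevant up front), not the specific headings.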