The rambly speech is how it "reasons". An LLM can only compute each token from the tokens that came before it, and every token it emits immediately becomes part of that input. So a more traditional chat model has to compute the answer straight from your question, in essentially one pass. A model trained like this can lay down a long "train of thought" first, and those intermediate tokens buy it extra sequential computation, which can make the final answer easier to compute.
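
Here's a minimal sketch of what I mean (pure toy code; `next_token` just stands in for a real model's forward pass, and the token names are made up):

```python
def next_token(context: list[str]) -> str:
    # Stand-in for one forward pass: the ONLY input is the prefix so far.
    # Fake policy: "think" for a few tokens, then commit to an answer.
    if context.count("<thought>") < 3:
        return "<thought>"
    return "<answer>"

def generate(prompt: list[str], max_tokens: int = 16) -> list[str]:
    context = list(prompt)
    for _ in range(max_tokens):
        tok = next_token(context)  # computed purely from earlier tokens
        context.append(tok)        # ...and immediately feeds the next step
        if tok == "<answer>":
            break
    return context

print(generate(["what", "is", "12", "*", "13", "?"]))
# ['what', 'is', '12', '*', '13', '?',
#  '<thought>', '<thought>', '<thought>', '<answer>']
```

The point of the loop is that each `<thought>` token gets appended to the context, so by the time the answer token is computed, the model is conditioning on a longer, richer prefix than the bare question.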