Take a look at the harmony repo which specifies the internal OpenAI format - the effort level is specified in the context after the <|start|> tag - https://github.com/openai/harmony
Note that inference libs also have parsers that put hard limits on reasoning tokens with separate counters (similar to how you can put a limit on token generation per completion versus waiting for an <eos>). For that, take a look at vllm reasoning docs.
Examples with inference of different reasoning effort levels is in the OpenAI docs as well - https://developers.openai.com/cookbook/articles/openai-harmo...
https://docs.vllm.ai/en/latest/features/reasoning_outputs/#a...
https://developers.openai.com/api/docs/guides/reasoning