The word "thread" is confusing things. In computer science a thread represents a flow of execution; in concrete terms, where execution is a series of function calls, that means a program counter and a stack.
There are many ways to implement and manage threads. In Unix-like and Windows systems a "thread" is the above, plus a bunch of kernel context, plus implicit preemptive context switching. Because Unix and Windows added threads to their architectures relatively late in their development, each thread has to behave somewhat like its own process, capable of running all the pre-existing software that was thread-agnostic. That is why they have implicit scheduling, large userspace stacks, and so on.
But nothing about "thread" requires it to be implemented or behave exactly like "OS threads" do in popular operating systems. People wax on about Async Rust and state machines. Well, a thread is already a state machine, too. Async Rust has to nest a bunch of state machine contexts along with space for the data each function manipulates; that's called a stack. So Async Rust is one layer of threading built atop another layer of threading. And it did this not because it's better, but primarily because of legacy FFI concerns and interoperability with non-Rust software that depended on the pre-existing ABIs for stack and scheduling management.
Go largely went in the opposite direction, embracing threads as a first-class concept in a way that makes them no less scalable or cheap than Rust futures, notwithstanding that Go, too, had to deal with legacy OS APIs and semantics, which it abstracted and modeled with its G (goroutine), M (machine), P (processor) architecture.