Are there any details around how the round-trip and exchange of data (CPU<->GPU) is implemented in order to not be a big (partially-hidden) performance hit?
e.g. this code seems like it would entirely run on the CPU?
use std::io::Write; // needed for flush()
print!("Enter your name: ");
let _ = std::io::stdout().flush();
let mut name = String::new();
std::io::stdin().read_line(&mut name).unwrap();
But what if we concatenated a number that was calculated on the GPU to the string, or if we take a number as input:
print!("Enter a number: ");
[...] // the input string has to be parsed into a float and sent to the GPU
// Some calculations with that number performed on the GPU
print!("The result is: {}", the_result); // the number needs to be sent back to the CPU
Or maybe I am misunderstanding how this is supposed to work?
I'm confused about this: As the article outlines well, std Rust (over core) buys you GPOS-provided things. For example:
- file system
- network interfaces
- dates/times
- Threads, e.g. for splitting across CPU cores
The main relevant one I can think of that applies is an allocator.
I do a lot of GPU work with Rust: graphics in WGPU, and CUDA kernels + cuFFT mediated by Cudarc (a thin FFI lib). I guess running the std lib on the GPU isn't something I understand. What would be cool is the dream that's been building for decades about parallel computing abstractions, where you write what looks like normal single-threaded CPU code but it automagically runs on SIMD instructions or the GPU. I think this and CubeCL may be working towards that? (I'm using Burn on the GPU as well, but that's abstracted over.)
Of note: Rayon sort of is that dream for CPU thread pools!
How different is it from rust-gpu effort?
UPDATE: Oh, that's a post from the maintainers of rust-gpu.
I feel like the title is a bit misleading. I think it should be something like "Using Rust's Standard Library from the GPU". The stdlib code doesn't execute on the GPU; it is just a remote function call that is executed on the CPU, and then the response is returned. Very neat, but not the same as executing on the GPU itself, as the title implies.
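To make the "remote function call" reading concrete, here is a minimal sketch of that shape using only std threads and channels. It is not how the project implements it: the "device" is just an ordinary CPU thread standing in for a GPU kernel, and `HostCall`, `host_loop`, `device_kernel` and `run_demo` are hypothetical names made up for illustration. The only point is the pattern, where the computation stays on one side and the std call is forwarded to the other side and answered with a reply.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical request type: what the "device" asks the host to do for it.
enum HostCall {
    // Format a number with std's formatting machinery and send the text back.
    FormatResult(f32, mpsc::Sender<String>),
}

// Host side (CPU): services forwarded calls until every sender is dropped.
// Returns how many calls were served.
fn host_loop(rx: mpsc::Receiver<HostCall>) -> usize {
    let mut served = 0;
    for call in rx {
        match call {
            HostCall::FormatResult(value, reply) => {
                // The std-dependent work happens here, on the host.
                let _ = reply.send(format!("The result is: {}", value));
                served += 1;
            }
        }
    }
    served
}

// "Device" side: a plain thread standing in for a GPU kernel (assumption).
// It does the arithmetic locally and forwards the std call to the host.
fn device_kernel(input: f32, host: &mpsc::Sender<HostCall>) -> String {
    let result = input * 2.0 + 1.0; // stand-in for the GPU computation
    let (reply_tx, reply_rx) = mpsc::channel();
    host.send(HostCall::FormatResult(result, reply_tx)).unwrap();
    reply_rx.recv().unwrap() // block until the host answers: one round trip
}

// Wires the two sides together and returns the reply plus the call count.
fn run_demo(input: f32) -> (String, usize) {
    let (tx, rx) = mpsc::channel();
    let host = thread::spawn(move || host_loop(rx));
    let line = device_kernel(input, &tx);
    drop(tx); // closing the channel lets host_loop finish
    let served = host.join().unwrap();
    (line, served)
}

fn main() {
    let (line, served) = run_demo(20.5);
    println!("{line} (host calls served: {served})");
}
```

Every forwarded call in this sketch costs one blocking round trip, which is the same cost the first question in the thread is asking about.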