Sorry, I'm not the CUDA expert you should be looking for. My efforts are in ML and I only dabble in CUDA and am friends with systems people. I'd suggest reaching out to system people.
I'd suggest you use that NVIDIA connection and reach out to the HPC teams there. Anyone working on CUTLASS, TensorRT, cuTensor, or maybe event the CuPy team could give you a lot better advice than me.
Sorry, I'm not the CUDA expert you should be looking for. My efforts are in ML and I only dabble in CUDA and am friends with systems people. I'd suggest reaching out to system people.
I'd suggest you use that NVIDIA connection and reach out to the HPC teams there. Anyone working on CUTLASS, TensorRT, cuTensor, or maybe event the CuPy team could give you a lot better advice than me.