Isn't it a bit of a gamble if random experts get offloaded to the CPU?
Or is offloading only done by layer, which would affect all experts in those layers?
Pretty sure all partial offload systems I’ve seen work by layers, but there might be something else out there.
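Roughly what I mean by "by layers", as a toy sketch. The function names and the "first N layers on the GPU" rule are my own assumptions for illustration, not any particular library's API. The point is just that when the split is per layer, every expert inside an offloaded layer ends up on the CPU together, so nothing is picked at random.

```python
from typing import Dict, List, Tuple


def assign_devices(num_layers: int, gpu_layers: int) -> List[str]:
    """Hypothetical layer-wise split: the first `gpu_layers` layers go to the GPU, the rest to the CPU."""
    gpu_layers = max(0, min(gpu_layers, num_layers))  # clamp to a valid range
    return ["gpu"] * gpu_layers + ["cpu"] * (num_layers - gpu_layers)


def expert_devices(
    num_layers: int, experts_per_layer: int, gpu_layers: int
) -> Dict[Tuple[int, int], str]:
    """Map (layer, expert) to a device; each expert simply inherits its layer's device."""
    layer_dev = assign_devices(num_layers, gpu_layers)
    return {
        (layer, expert): layer_dev[layer]
        for layer in range(num_layers)
        for expert in range(experts_per_layer)
    }


# Toy model: 4 layers with 8 experts each, 3 layers kept on the GPU.
placement = expert_devices(num_layers=4, experts_per_layer=8, gpu_layers=3)
```

With this kind of split, all 8 experts of the last layer sit on the CPU and all experts of the first three layers sit on the GPU, regardless of which experts the router picks for a given token.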