1. /r/localllama is unanimous in not liking the Spark for running models,
2. and for CUDA dev it's not worth the crazy price when you can dev on a cheap RTX and then rent a GH or GB server for a couple of days if you need to sort out compatibility and scaling.
What are GH and GB servers?