That's kinda what NeRFs (neural radiance fields) are. They actually preceded this Gaussian story, with Gaussians coming in and outperforming them. Maybe they'll merge later into something even better; I don't know enough about them.
NeRFs have significantly higher image quality than 3D Gaussian Splatting or more recent similar techniques, though they are much slower to render.
Sure, but NeRFs were trying to match your input photos and their camera poses, not some arbitrary prompt, if I understand correctly.