Star Control II was released in 1992, a year before Doom, and it was able to play beautiful, rich 4-channel MOD music via the PC speaker, on an 80386 @ 40 MHz.
Scream Tracker, a music composition program, was able to pull off the same feat, 4 channels of 8-bit voices, in 1990.
However cool and useful the PC speaker output was, it was the hand-soldered "Covox" lookalike, a passive DAC built out of a resistor ladder and attached to the printer port, which you actually connected to your hi-fi amplifier.
No, RealSound was not a Covox-like hardware dongle. It was PC speaker only. Play the first few minutes of Mean Streets or Martian Memorandum or Countdown in DOSBox and you'll hear it.
Fast Tracker 2, admittedly a bit later than 1990, could route playback of however many channels you used to the PC speaker.
Worth noting that the quality in these cases was pretty good. A bit staticky, but still well above the Wolfenstein 3D sound effects most people associate with the PC speaker (Covox-less).