It's a very nice write-up, but this part makes me uneasy:
> So long as all the computation in the loop finishes before the next quantum, the timing requirements [...] are met.
Seems like we are back to cycle counting then? But instead of having just 32 1-IPC instructions, we have up to 4K instructions with varying latencies, and there is a C compiler in the mix too, so even if you have enough cycles in your budget now, things might break when the compiler is upgraded.
I am wondering whether the original PIO approach would still be salvageable if binary compatibility were not a goal. Because while co-processors are useful, people did some amazing things with PIO, like fully-software DVI.
It's easier because you don't need a precise cycle count, just a worst-case threshold.
At this level, yes, you are always cycle counting just to make sure you can make your guarantees.
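To illustrate the difference (all numbers below are made up for the sketch, not taken from any real part): the guarantee reduces to a single worst-case inequality against the quantum, rather than an exact per-run cycle count.

```python
# Hypothetical numbers: a 1000-cycle quantum and a loop body whose
# instructions have known worst-case latencies, in cycles.
QUANTUM_CYCLES = 1000
worst_case_latencies = [1, 1, 3, 1, 12, 1, 2]  # per-instruction worst cases

# The timing guarantee only needs the pessimistic sum to fit within the
# quantum; the actual cycle count on any given run may be lower.
worst_case_total = sum(worst_case_latencies)
assert worst_case_total <= QUANTUM_CYCLES, "loop may overrun its quantum"
print(worst_case_total)  # 21
```

The point is that a compiler upgrade only breaks things if it pushes the worst case over the threshold, which is a much coarser condition than matching an exact count.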
The previous sentence already answers this:
> Here, we leverage the “quantum” feature to get exact pulse timings without resorting to cycle-counting
This is just a hard-real-time constraint that already exists in today’s computers and other devices.
For example: audio playback and processing are day-to-day operations where hard-real-time guarantees are necessary for uninterrupted playback, and every digital audio device already meets them. If filling the buffer is too slow, you get playback glitches.
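To make the audio analogy concrete (the sample rate and buffer size below are typical example values, not from the write-up): the hard deadline is just the buffer length divided by the sample rate, and the processing callback must always finish within it.

```python
SAMPLE_RATE_HZ = 48_000  # common playback rate
BUFFER_FRAMES = 256      # typical callback buffer size

# Hard deadline: the next buffer must be ready before the hardware
# finishes consuming the current one, or playback underruns.
deadline_ms = BUFFER_FRAMES / SAMPLE_RATE_HZ * 1000
print(f"{deadline_ms:.2f} ms per callback")  # 5.33 ms per callback
```

Missing that deadline even once is audible, which is exactly the shape of the quantum constraint being discussed.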