logoalt Hacker News

dspilletttoday at 11:44 AM0 repliesview on HN

> The original Pentium I believe introduced a second pipeline that required a compiler to optimize for it to achieve maximum performance.

It wasn't a full pipeline, but large parts of the integer ALU and related circuitry were duplicated so that complex (time-consuming) instructions like multiply could directly follow each other without causing a pipeline bubble. Things were still essentially executed entirely in-order but the second MUL (or similar) could start before the first was complete, if it didn't depend upon the result of the first, and the Pentium line had a deeper pipeline than previous Intel chips to take most advantage of this.

The compiler optimisations, and similar manual code changes with the compiler wasn't bright enough, were to reduce the occurrence of instructions depending on the results of the instructions before, which would make the pipeline bubble come back as the subsequent instructions couldn't be started until the current one was complete. This was also a time when branch prediction became a major concern, and further compiler optimisations (and manual coding tricks) were used to help here too, because aborting a deep pipeline because of a branch (or just stalling the pipeline at the conditional branch point until the decision is made) causes quite a performance cost.