Hacker News

jcranmer yesterday at 9:25 PM

I don't know that 7-bit bytes were ever used. Computer word sizes have historically been multiples of 6 or 8 bits, and while I can't say why particular values were chosen, I would hypothesize that multiples of 6 and 8 work well for representation in octal and hexadecimal respectively. For many of these early machines, sub-word addressability wasn't really a thing, so the question of 'byte' is somewhat academic.

For the representation of text of an alphabetic language, you need to hit 6 bits if your script doesn't have case and 7 bits if it does have case. ASCII ended up encoding English into 7 bits and EBCDIC chose 8 bits (as it's based on a binary-coded decimal scheme which packs a decimal digit into 4 bits). Early machines did choose to use the unused high bit of an ASCII character stored in 8 bits as a parity bit, but most machines have instead opted to extend the character repertoire in a variety of incompatible ways, which eventually led to Unicode.


Replies

cardiffspaceman yesterday at 10:30 PM

On the DEC-10 the word size is 36 bits. There was an optional set of special instructions that let you pack bytes of any given size into words: five 7-bit bytes per word, for example, with one bit wasted in each word.

I wouldn’t be surprised if other machines had something like this in hardware.
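
To make the arithmetic concrete, here is a minimal sketch of packing five 7-bit characters into one 36-bit word with a single spare bit. It does not emulate the actual DEC-10 byte instructions; the function names and the choice to left-justify the bytes and leave the low-order bit unused are assumptions made for illustration.

```python
WORD_BITS = 36
BYTE_BITS = 7
BYTES_PER_WORD = WORD_BITS // BYTE_BITS            # 5 bytes fit in a word
SPARE_BITS = WORD_BITS - BYTES_PER_WORD * BYTE_BITS  # 1 bit is left over


def pack_word(chars):
    """Pack up to five 7-bit characters into one 36-bit integer (hypothetical helper)."""
    if len(chars) > BYTES_PER_WORD:
        raise ValueError("at most 5 characters fit in one word")
    word = 0
    for i in range(BYTES_PER_WORD):
        # Pad short input with zero bytes.
        code = ord(chars[i]) if i < len(chars) else 0
        if code >= (1 << BYTE_BITS):
            raise ValueError("character does not fit in 7 bits")
        word = (word << BYTE_BITS) | code
    # Assumed layout: bytes are left-justified, spare bit is the lowest one.
    return word << SPARE_BITS


def unpack_word(word):
    """Recover the characters from a packed word, dropping zero padding."""
    word >>= SPARE_BITS
    mask = (1 << BYTE_BITS) - 1
    codes = [
        (word >> (BYTE_BITS * (BYTES_PER_WORD - 1 - i))) & mask
        for i in range(BYTES_PER_WORD)
    ]
    return "".join(chr(c) for c in codes if c)
```

For example, `unpack_word(pack_word("HELLO"))` gives back `"HELLO"`, and the packed value always fits in 36 bits with its lowest bit clear.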

int_19h today at 1:43 AM

> For the representation of text of an alphabetic language, you need to hit 6 bits if your script doesn't have case

Only if you assume a 1:1 mapping. But e.g. the original Baudot code was 5-bit, with codes reserved to switch between letters and "everything else". When ASCII was designed, some people wanted to keep the same arrangement.
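
A rough sketch of how shift codes let a 5-bit code cover more than 32 symbols. The code table here is made up for illustration and is not the real Baudot or ITA2 assignment; `LTRS`, `FIGS`, `SPACE`, and both tables are assumptions.

```python
# Hypothetical 5-bit code with two shift states (not the real Baudot table).
LTRS, FIGS, SPACE = 31, 27, 28

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
FIGURES = "0123456789.,?!-+/()'"

LETTER_CODES = {ch: i + 1 for i, ch in enumerate(LETTERS)}
FIGURE_CODES = {ch: i + 1 for i, ch in enumerate(FIGURES)}
LETTER_CHARS = {code: ch for ch, code in LETTER_CODES.items()}
FIGURE_CHARS = {code: ch for ch, code in FIGURE_CODES.items()}


def encode(text):
    """Encode text as 5-bit codes, emitting a shift code only when the set changes."""
    out = []
    state = "L"  # assume both ends start in the letters state
    for ch in text.upper():
        if ch == " ":
            out.append(SPACE)  # space is available in both states
        elif ch in LETTER_CODES:
            if state != "L":
                out.append(LTRS)
                state = "L"
            out.append(LETTER_CODES[ch])
        elif ch in FIGURE_CODES:
            if state != "F":
                out.append(FIGS)
                state = "F"
            out.append(FIGURE_CODES[ch])
        else:
            raise ValueError(f"cannot encode {ch!r}")
    return out


def decode(codes):
    """Decode 5-bit codes, tracking the current shift state."""
    out = []
    state = "L"
    for code in codes:
        if code == LTRS:
            state = "L"
        elif code == FIGS:
            state = "F"
        elif code == SPACE:
            out.append(" ")
        else:
            table = LETTER_CHARS if state == "L" else FIGURE_CHARS
            out.append(table[code])
    return "".join(out)
```

The same code value means different things depending on the state: `encode("A1")` yields `[1, 27, 2]`, where 27 is the switch to figures. The cost is that the meaning of a code depends on everything sent before it, so one corrupted shift code garbles the text until the next shift.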

goku12 today at 3:19 AM

I wasn't asking about word sizes in particular, and had ASCII in mind. Nevertheless, your answer is in the right direction.

dboreham today at 4:00 AM

Quick note that parity was never used in "characters stored". It was only ever used in transmission, and checked/removed by hardware[1].

[1] Yes, I remember you could bit-bang a UART in software, but still the parity bit didn't escape the serial decoding routine.
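
A small sketch of that flow, assuming even parity carried in the eighth bit: the sender adds the bit, the receiver checks it and strips it, and only the 7-bit code is ever stored. The function names are hypothetical and this stands in for what the serial hardware or decoding routine would do.

```python
def add_even_parity(code7):
    """Sender side: set bit 7 so the total number of 1 bits is even."""
    if not 0 <= code7 < 128:
        raise ValueError("expected a 7-bit code")
    parity = bin(code7).count("1") & 1
    return code7 | (parity << 7)


def check_and_strip(byte8):
    """Receiver side: verify even parity, then return only the 7-bit code."""
    if bin(byte8).count("1") & 1:
        raise ValueError("parity error")
    return byte8 & 0x7F


def receive(line_bytes):
    """Decode a transmitted sequence; the stored text never contains parity bits."""
    return "".join(chr(check_and_strip(b)) for b in line_bytes)
```

For instance, `receive([add_even_parity(ord(c)) for c in "ASCII"])` returns `"ASCII"`, while flipping any single bit of a transmitted byte makes `check_and_strip` raise.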