Hacker News

goku12 · yesterday at 8:20 PM

> It is amazing that big endian is almost dead.

I wish the same applied to written numbers in LTR scripts. Arithmetic operations would be a lot easier to do that way on paper or even mentally. I also wish that the world would settle on a sane date-time format like the ISO 8601 or RFC 3339 (both of which would reverse if my first wish is also granted).
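
A minimal sketch of the arithmetic argument (my own illustration, not part of the comment): column addition starts at the least significant digit, so little-endian digit order lets you work in plain left-to-right reading order.

    # Hypothetical helper: add two numbers given as little-endian digit lists,
    # e.g. 472 is stored as [2, 7, 4]. The loop walks left to right, exactly
    # the direction an LTR reader would scan.
    def add_little_endian(a, b):
        result, carry = [], 0
        for i in range(max(len(a), len(b))):
            da = a[i] if i < len(a) else 0
            db = b[i] if i < len(b) else 0
            carry, digit = divmod(da + db + carry, 10)
            result.append(digit)
        if carry:
            result.append(carry)
        return result

    # 472 + 58 = 530, little-endian: [2, 7, 4] + [8, 5] -> [0, 3, 5]
    print(add_little_endian([2, 7, 4], [8, 5]))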

> It will be relegated to the computing dustbin like non-8-bit bytes and EBCDIC.

I never really understood those non-8-bit bytes, especially the 7-bit byte. If you consider the multiplexer and demux/decoder circuits that are used heavily in CPUs, FPGAs and custom digital circuits, the only number that really makes sense is 8: it's what you get from a 3-bit selector code, with the nearby values being 4 and 16. Why did they go for 7 bits instead of 8? I assume it was a design choice made long before I was even born. Does anybody know the rationale?
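
For what it's worth, the selector arithmetic can be sketched like this (my own toy model, not anything from the thread): an n-bit selector addresses 2**n inputs, so 8 is what a 3-bit selector gives you, with 4 and 16 as the neighbouring options.

    # Toy multiplexer model: an n-bit selector picks one of 2**n inputs.
    def mux(inputs, selector_bits):
        index = int("".join(str(b) for b in selector_bits), 2)  # MSB first
        return inputs[index]

    inputs = list("ABCDEFGH")        # 8 inputs -> 3-bit selector
    print(mux(inputs, [1, 0, 1]))    # 'F' (index 5)
    # A 2-bit selector covers 4 inputs, a 4-bit selector 16: the neighbours of 8.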


Replies

idoubtit · yesterday at 8:43 PM

> I also wish that the world would settle on a sane date-time format like the ISO 8601

IIRC, in most countries the native format is D-M-Y (with varying separators), but some Asian countries use Y-M-D. Since those formats are easy to distinguish, that's no problem. That's why Y-M-D is spreading in Europe for official or technical documents.

There's mainly one country which messes things up...

pavon · today at 4:41 AM

There are a lot of computations where 256 is too small a range but 65536 is overkill. When the designers of early computers were working out how many digits of precision their calculations needed for their intended purpose, 12 bits commonly ended up being a sweet spot.
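
The ranges in question, for reference (my own quick illustration):

    # Unsigned range per word size: 8 bits is often too coarse, 16 wasteful,
    # and 12 lands in between.
    for bits in (8, 12, 16):
        print(f"{bits:>2} bits -> 0..{2**bits - 1}")
    # 8 bits -> 0..255
    # 12 bits -> 0..4095
    # 16 bits -> 0..65535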

When your RAM is vacuum tubes or magnetic-core memory, you don't want 25% of it to go unused just to round your word size up to a power of two.

jcranmer · yesterday at 9:25 PM

I don't know that 7-bit bytes were ever used. Computer word sizes have historically been multiples of 6 or 8 bits, and while I can't say why particular values were chosen, I would hypothesize that multiples of 6 and 8 work well for representation in octal and hexadecimal respectively. For many of these early machines, sub-word addressability wasn't really a thing, so the question of 'byte' is somewhat academic.
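
A small illustration of that octal/hexadecimal fit (my own sketch, not from the comment): a 6-bit group prints as exactly two octal digits and an 8-bit group as exactly two hex digits, so word sizes that are multiples of 6 or 8 dump cleanly.

    # 6 bits = 2 octal digits, 8 bits = 2 hex digits.
    print(f"{0b111111:02o}")     # 77
    print(f"{0b11111111:02x}")   # ff
    # A 36-bit word is 12 octal digits; a 32-bit word is 8 hex digits.
    print(f"{2**36 - 1:012o}")   # 777777777777
    print(f"{2**32 - 1:08x}")    # ffffffff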

For the representation of text of an alphabetic language, you need to hit 6 bits if your script doesn't have case and 7 bits if it does have case. ASCII ended up encoding English into 7 bits and EBCDIC chose 8 bits (as it's based on a binary-coded decimal scheme which packs a decimal digit into 4 bits). Early machines did choose to use the unused high bit of an ASCII character stored in 8 bits as a parity bit, but most machines have instead opted to extend the character repertoire in a variety of incompatible ways, which eventually led to Unicode.
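
As an aside on the parity scheme mentioned above, here's a minimal sketch (mine, hypothetical, not from the comment) of packing a 7-bit ASCII code into 8 bits with an even-parity high bit:

    # Hypothetical example: store 7-bit ASCII in 8 bits, using the spare
    # high bit as even parity over the whole byte.
    def with_even_parity(ch):
        code = ord(ch)
        assert code < 128, "only 7-bit ASCII"
        parity = bin(code).count("1") % 2   # 1 if the 7 data bits have odd popcount
        return code | (parity << 7)         # set bit 7 so the byte's popcount is even

    print(f"{with_even_parity('A'):08b}")   # 01000001 ('A' = 65 has two 1-bits)
    print(f"{with_even_parity('C'):08b}")   # 11000011 ('C' = 67 has three 1-bits)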

creshal · today at 7:27 AM

> both of which would reverse if my first wish is also granted

But why? The brilliance of 8601/3339 is that string sorting is also correct datetime sorting.
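
A quick demonstration of that property (my own sketch): because the fields run from most to least significant and are zero-padded, lexicographic order is chronological order.

    # Sorting ISO 8601 / RFC 3339 strings lexicographically sorts them in time order.
    stamps = [
        "2023-11-05T09:30:00Z",
        "2021-01-15T23:59:59Z",
        "2023-02-28T00:00:00Z",
    ]
    print(sorted(stamps))
    # ['2021-01-15T23:59:59Z', '2023-02-28T00:00:00Z', '2023-11-05T09:30:00Z']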

blahedo · yesterday at 8:54 PM

I believe that 10- and 12-bit bytes were also attested in the early days. As for "why": the tradeoffs were different at the scale any computer was at in the '60s and '70s. While I can't speak to the specific reasons for such a choice, I do know that nobody was worrying about scaling up to billions of memory locations, and that using particular bit combinations to signal "special" values was a lot more common in older systems, so I imagine both were at play.

formerly_proven · yesterday at 8:52 PM

Computers never used 7-bit bytes, much as 5-bit bytes were rare, but both 6-bit and 8-bit bytes were common in their respective eras.

globular-toast · today at 7:03 AM

In Britain the standard way to write a date has always been e.g. "12th March 2023", or 12/3/2023 for short. I don't think there's a standard for where to put the time, though; I can imagine it both before and after.

Doing numbers little-endian does make more sense. It's weird that we switch to RTL when doing arithmetic. Amusingly, the Wikipedia page for the Hindu-Arabic numeral system claims that their RTL scripts switch to LTR for numbers. Nope... the inventors of our numeral system used little-endian and we forgot to reverse it for our LTR scripts...

Edit: I had to pull out Knuth here (vol. 2). So apparently the original Hindu scripts were LTR, like Latin, and Arabic is RTL. According to Knuth the earliest known Hindu manuscripts have the numbers "backwards", meaning most significant digit at the right, but soon switched to most significant at the left. So I read that as starting in little-endian but switching to big-endian.

These were later translated to Arabic (RTL), but the order of writing numbers remained the same, so became little-endian ("backwards").

Later still the numerals were introduced into Latin but, again, the order remained the same, so becoming big-endian again.
