Hacker News

What's up with all those equals signs anyway?

388 points by todsacerdoti today at 9:37 AM | 114 comments

Comments

ruhith today at 11:56 AM

The real punchline is that this is a perfect example of "just enough knowledge to be dangerous." Whoever processed these emails knew enough to know emails aren't plain text, but not enough to know that quoted-printable decoding isn't something you hand-roll with find-and-replace. It's the same class of bug as manually parsing HTML with regex: it works right up until it doesn't, and then you get congressional evidence full of mystery equals signs.
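As a rough illustration of that point, here is a minimal Python sketch comparing the standard-library `quopri` decoder with a hand-rolled find-and-replace. The sample bytes and the `naive` replacement chain are made up for illustration; nothing here is taken from the tool that actually processed the emails.

```python
import quopri

# Made-up sample: CRLF line endings, "=3D" for a literal equals sign,
# and "=" + CRLF as a soft line break in the middle of "wrapped".
wire = b"2+2=3D4, and this line gets wra=\r\npped.\r\n"

# The standard-library decoder handles escapes and soft breaks together.
decoded_crlf = quopri.decodestring(wire)

# It also copes with input whose line endings were already converted to LF.
unix = wire.replace(b"\r\n", b"\n")
decoded_lf = quopri.decodestring(unix)

# A hypothetical find-and-replace that only knows the CRLF form does nothing
# to the soft break once the line endings have changed, so "=" stays in the text.
naive = unix.replace(b"=\r\n", b"").replace(b"=3D", b"=")

print(decoded_crlf)
print(decoded_lf)
print(naive)
```

The library call gives the same text for both line-ending styles, while the hand-rolled version leaves a stray `=` and a line break inside the word.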

kstrauser today at 4:11 PM

For context, this is the Lars Ingebrigtsen who wrote the manual for Gnus[0], a common Emacs package for reading email and Usenet. It’s clever, funny, and wildly informative. Lars has probably forgotten more about email parsing than 99% of us here will ever have learned.

The manual itself says[1]:

> Often when I read the manual, I think that we should take a collection up to have Lars psycho-analysed.

0: https://www.gnu.org/software/emacs/manual/html_mono/gnus.htm...

1: https://www.gnus.org/manual.html

TazeTSchnitzel today at 3:23 PM

The most interesting thing to me wasn't the equals signs, which I knew are from quoted-printable, but the fact that when an equals sign appears, a letter that should have been preceding or following it is missing. It's as if an off-by-one error has occurred, where instead of getting rid of the equals sign, it's gotten rid of part of the actual text. Perhaps the CRLF/LF thing is part of it.
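One way an off-by-one like this could arise, sketched in Python under loud assumptions: `strip_soft_breaks_fixed_width` is a hypothetical decoder that treats every soft break as exactly three bytes (`=`, CR, LF). It is a guess at the class of bug, not a reconstruction of the real tool, and it only models the missing letter, not the retained equals sign. The sample text is made up.

```python
def strip_soft_breaks_fixed_width(data: bytes) -> bytes:
    """Hypothetical buggy decoder: assumes a soft break is always '=' CR LF."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] == b"=" and data[i + 1:i + 2] in (b"\r", b"\n"):
            # Skip three bytes: correct for '=' CR LF, one too many for '=' LF.
            i += 3
            continue
        out.append(data[i])
        i += 1
    return bytes(out)


crlf = b"non-=\r\ncloven hoofs"
lf = crlf.replace(b"\r\n", b"\n")

print(strip_soft_breaks_fixed_width(crlf))  # b'non-cloven hoofs'
print(strip_soft_breaks_fixed_width(lf))    # b'non-loven hoofs'
```

With CRLF input the fixed-width skip is harmless; after conversion to LF the same skip swallows the first letter of the next line.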

tiborsaas today at 11:09 AM

> We see that that’s a quite a long line. Mail servers don’t like that

Why do mail servers care about how long a line is? Why don't they just let the client reading the mail worry about wrapping the lines?

heikkilevanto today at 11:42 AM

I thought the article would be about the various meanings of operators like = == === .=. <== ==> <<== ==>> (==) => =~=

xg15 today at 12:22 PM

I'm just wondering why this problem shows up now. Why do lots of people suddenly post their old emails with a defective QP decoder?

> For some reason or other, people have been posting a lot of excerpts from old emails on Twitter over the last few days.

At the risk of having missed the latest meme or social media drama: does anyone know what this "some reason or other" is?

Edit: Question answered.

thedanbob today at 12:19 PM

I wrote my own email archiving software. The hardest part was dealing with all the weird edge cases in my 20+ year collection of .eml files. For being so simple conceptually, email is surprisingly complicated.

beejiu today at 10:44 AM

> So what’s happened here? Well, whoever collected these emails first converted from CRLF (i.e., “Windows” line ending coding) to “NL” (i.e., “Unix” line ending coding). This is pretty normal if you want to deal with email. But you then have one byte fewer:

I think there is a second possible conclusion, which is that the transformation happened historically. Everyone assumes these emails are an exact dump from Gmail, but isn't it possible that Epstein was syncing emails from Gmail to a third party mail server?

Since the Stack Overflow post details the exact situation in 2011, I think we should be open to the idea that we're seeing data collected from a secondary mail server, not Gmail directly.

Do we have anything to discount this?

(If I'm not mistaken, I think you can also see the "=" issue simply by applying the Quoted-Printable encoding twice, not just by mishandling the line-endings, which also makes me think two mail servers. It also explains why the "=" symbol is retained.)
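The double-encoding idea can be checked with a small Python sketch using the standard-library `quopri` module. The 100-character input line is made up, and this only shows that encoding twice and decoding once leaves `=` at line ends; it says nothing about what actually happened to these emails.

```python
import quopri

# Made-up input: one line long enough that the encoder must wrap it.
original = b"abcdefghij" * 10 + b"\n"

# First pass: the long line is wrapped with a soft break ("=" + newline).
once = quopri.encodestring(original)

# Second pass: the "=" of that soft break is itself escaped as "=3D",
# and the newline after it becomes an ordinary line ending.
twice = quopri.encodestring(once)

# Decoding only once undoes the second pass, so the first-pass soft
# breaks remain visible as "=" at the end of a line.
half_decoded = quopri.decodestring(twice)

print(half_decoded)
print(quopri.decodestring(half_decoded) == original)
```

A reader who decodes only once would see the text with `=` at each wrap point; a second decode recovers the original line.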

JKCalhoun today at 1:44 PM

(The title of the blog reminded me of the late Bob Pease [1], who had the signature "What's all this XXX stuff, anyhow?" [2], where XXX might be "noise gain", "capacitor leakage"…)

[1] https://en.wikipedia.org/wiki/Bob_Pease

[2] https://www.qsl.net/n9zia/pease/index.html

quibono today at 10:19 AM

CRLF vs LF strikes again. Partly at least.

I wonder why even have a max line length limit in the first place? I.e. is this for a technical reason or just display related?

jojomodding today at 10:00 AM

https://web.archive.org/web/20260203094902/https://lars.inge...

Did the site get the HN kiss of death?

maartin0 today at 12:29 PM

Fun how the archive.today article near the top has this exact issue

https://pastes.io/correspond

https://news.ycombinator.com/item?id=46843805

lordnacho today at 9:52 AM

I love how HN always floats up the answers to questions that were in my mind, without occupying my mind.

I, too, was reading about the new Epstein files, wondering what text artifact was causing things to look like that.

voxelghost today at 12:07 PM

My main takeaway from this article is that I want to know what happened to the modified pigs with non-cloven hoofs.

lucb1e today at 12:07 PM

    cat title | sed 's/anyway/in email/'
would save a click for those already familiar with =20 etc.

noduerme today at 11:24 AM

Great. Can't wait for equal signs to be the next (((whatever this is))). Maybe it's a secret code. j/k

On a side note: There are actually products marketed as kosher bacon (it's usually beef or turkey). And secular Jews frequently make jokes like this about our kosher bros who aren't allowed to eat the real stuff for some dumb reason like it has too many toes.

MarginalGainz today at 1:24 PM

It’s a fascinating case of 'Abstraction Leak'.

We’ve become so accustomed to modern libraries handling encoding transparently that when raw data surfaces (like in these dumps), we often lack the 'Digital Archeology' skills to recognize basic Quoted-Printable.

These artifacts (=20, =3D) are effectively fossils of the transport layer. It’s a stark reminder that underneath our modern AI/React/JSON world, the internet is still largely held together by 7-bit ASCII constraints and protocols from the 1980s.

seydor today at 10:18 AM

TLDR "=\r\n" was converted to "=\n"

brador today at 11:14 AM

Could be worsened by inaccurate optical character recognition in some cases.

Back in those days optical scanners were still used.

zabzonk today at 12:04 PM

People posting Excel formulae?

ccppurcell today at 11:19 AM

Rock dots? You mean diacritics? Yeah someone invented them: the ancient Greeks, idiöt.
