Hacker News

The Perils of ISBN

84 points by evakhoury yesterday at 5:34 PM | 44 comments

Comments

amiga386 yesterday at 9:56 PM

This reminds me of MusicBrainz, whose database stores "release groups", e.g. the album Nevermind by Nirvana is one, which can have hundreds of "releases", as different media (tape, CD, LP, promo, ...), different countries, later re-issues, etc. [0]

Sometimes these have different catalogue numbers or barcodes to distinguish them, sometimes they don't but they're still different. I've seen releases where the only difference is the label in the centre of the LP, or the back of the CD case has a two-column tracklisting vs a one-column tracklisting. The music publisher uses the same code and says they're identical, and yet they're clearly not.

Then there's the "recordings" on an album, which even if they're never re-recorded can still end up chopped up, bleeped or remastered. They're not the same sound. MusicBrainz likes to track when they are exactly the same recording (e.g. the LP recording of a song appearing on a compilation album verbatim) and when they're not (e.g. radio edits of the LP recording). And if we're going beyond recordings by one artist of "their" song, i.e. cover versions, or just plain standards, those are "works", with composers, lyricists, and can be recorded thousands of times by different artists...

I greatly appreciate the pedantry and flexibility for noting down when creative works are the same versus where they differ, in relational database form.

[0] https://musicbrainz.org/release-group/1b022e01-4da6-387b-865...
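A minimal Python sketch of the entity split described above. The class and field names are illustrative assumptions, not MusicBrainz's actual schema, and the song, country codes, and empty barcodes are placeholder data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Work:
    """The composition itself, independent of any performance."""
    title: str
    writers: list[str] = field(default_factory=list)


@dataclass
class Recording:
    """One specific piece of audio; edits and remasters are separate recordings."""
    work: Work
    description: str


@dataclass
class Release:
    """One concrete product: a medium, a country, maybe a barcode."""
    medium: str
    country: str
    barcode: Optional[str]
    tracks: list[Recording] = field(default_factory=list)


@dataclass
class ReleaseGroup:
    """The album as a concept, grouping all of its releases."""
    title: str
    artist: str
    releases: list[Release] = field(default_factory=list)


# Placeholder data: the song, countries, and missing barcodes are made up.
song = Work(title="Example Song", writers=["Example Writer"])
album_take = Recording(work=song, description="album version")
radio_edit = Recording(work=song, description="radio edit")

cd = Release(medium="CD", country="US", barcode=None, tracks=[album_take])
lp = Release(medium="LP", country="GB", barcode=None, tracks=[album_take])
promo = Release(medium="CD", country="US", barcode=None, tracks=[radio_edit])

nevermind = ReleaseGroup(title="Nevermind", artist="Nirvana", releases=[cd, lp, promo])

# The CD and LP share the very same recording object;
# the promo carries a different recording of the same work.
same_audio = cd.tracks[0] is lp.tracks[0]
same_work = promo.tracks[0].work is cd.tracks[0].work
```

Sharing one `Recording` object between releases is how this sketch expresses "exactly the same recording"; a relational schema would presumably do the same with a foreign key.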

nomdep today at 1:58 AM

A good time to remember that the Open Library came to be thanks to the initial work of Brewster Kahle (founder of the Internet Archive) and Aaron Swartz (RIP): http://www.aaronsw.com/weblog/openlibrary

idoubtit today at 12:58 AM

Wikidata is a FRBR-compatible public database of books. I don't know if it's good enough for the kind of books the author wants, but in recent years the quality of Wikidata has greatly increased for the books that I deal with (about 1000 items).

BTW, they misunderstood their own example of "Hotel Iris" by Yoko Ogawa when they wrote "the same work is duplicated four times." In fact, those four entries in the list point to distinct works.

One of these is a French publication by the publisher Actes Sud. Translations are not the same work as the original. They are derived works.

But it's true this list is a mess. Another entry has 3 editions, one in English and two in Spanish, so it's obviously an error that mixes two distinct works.

saithir yesterday at 11:22 PM

Sometimes we definitely want 'items' though. For example, I am in a physical bookstore and see a book I might be interested in, so I buy it, only to find out later back home that I already have the very same book - and edition. It's a scenario that anyone with a fair number of books has surely encountered multiple times; I know I've done it myself a few times. :)

The ability to search my collection by ISBN would have helped me in this case - scanning a barcode is an easy enough task.

And even if I had a different edition, the resulting title from searching for a different edition would be enough to help me figure out that I should not buy a book I already own.
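As a rough sketch of the barcode check described above: normalise whatever was scanned to an ISBN-13 and test membership in the collection. The function names and the `my_books` set are hypothetical, and the sample ISBN is just a value whose check digits validate.

```python
def isbn13_check_digit(first12: str) -> str:
    """Check digit for an ISBN-13, given its first 12 digits (weights 1, 3, 1, 3, ...)."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def to_isbn13(raw: str) -> str:
    """Normalise a scanned or typed ISBN-10/ISBN-13 to bare ISBN-13 digits."""
    s = raw.replace("-", "").replace(" ", "").upper()
    if len(s) == 13 and s.isdigit():
        if isbn13_check_digit(s[:12]) != s[12]:
            raise ValueError(f"bad ISBN-13 check digit: {raw!r}")
        return s
    if len(s) == 10 and s[:9].isdigit():
        # The ISBN-10 check digit is dropped and recomputed; it is not validated here.
        body = "978" + s[:9]
        return body + isbn13_check_digit(body)
    raise ValueError(f"not an ISBN-10 or ISBN-13: {raw!r}")


def already_owned(scanned: str, collection: set[str]) -> bool:
    """True if the scanned ISBN matches an edition already in the collection."""
    return to_isbn13(scanned) in collection


# Hypothetical collection, stored as normalised ISBN-13 strings.
my_books = {to_isbn13("0-306-40615-2")}
print(already_owned("978-0-306-40615-7", my_books))  # True
```

An exact ISBN match only catches the same edition. Spotting a different edition of a book already owned would need a title lookup or an ISBN-to-work mapping on top of this.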

millicentricism yesterday at 9:40 PM

This also fails to take into account that ISBNs contain the publisher ID. So identical copies of a book could have different ISBNs depending on which markets they are sold in.

rahimnathwani yesterday at 9:34 PM

I'm not sure we always want 'works'. Sometimes different 'expressions' of the same work are different enough that they don't have the same value.

For example, compare the most recent edition of 'Straight and crooked thinking' with the one published in 1930.

jdranczewski yesterday at 11:01 PM

If anyone in the comments is in a similar predicament to the author and would like a book logging app, I will say that I disagree with their judgement of StoryGraph. I've found it a pretty decent interface, the search function is very good, and the (anti)features mentioned in the footnote are incredibly easy to not use, as the creators seem to understand that many of their users have a very strong preference to avoid AI bloat.

mmooss today at 2:59 AM

> there’s a distinction between the work (the book The Last Unicorn), the expression (a given edition of the book), a manifestation (a given physical format for an expression, such as paperback or hardcover), and an item (an individual object in a collection)

The author misunderstands 'work', as far as I know: A work is "intellectual or artistic content of a distinct creation. It refers to a very abstract idea of a creation e.g. Shakespeare's Romeo and Juliet and not a specific expression."[0]

In contrast, an "expression" is an "intellectual or artistic realization of a work. The realization may take the form of text, sound, image, object, movement, etc., or any combination of such forms."[0]

The Last Unicorn story is the work, "the book The Last Unicorn" is an expression as would be the film version or the computer game, etc.

[0] https://www.ifla.org/references/best-practice-for-national-b... (as of a few years ago)
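A small Python sketch of the four levels as this comment reads them, with the novel and the film both modelled as expressions of one work. The class names, fields, and sample values are illustrative assumptions, not an official FRBR schema, and other readings of where "work" ends are possible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str          # the abstract creation


@dataclass(frozen=True)
class Expression:
    work: Work
    form: str           # e.g. "novel text", "film"


@dataclass(frozen=True)
class Manifestation:
    expression: Expression
    format: str         # e.g. "paperback", "hardcover"


@dataclass(frozen=True)
class Item:
    manifestation: Manifestation
    owner: str          # one individual copy in someone's collection


def work_of(item: Item) -> Work:
    """Walk up from a single copy to the abstract work it embodies."""
    return item.manifestation.expression.work


story = Work("The Last Unicorn")
novel = Expression(story, "novel text")
film = Expression(story, "film")
paperback = Manifestation(novel, "paperback")
hardcover = Manifestation(novel, "hardcover")

# Two copies in different formats still collapse to one work.
shelf = [Item(paperback, "me"), Item(hardcover, "me")]
distinct_works = {work_of(item) for item in shelf}
print(len(distinct_works))  # 1
```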

jiggawatts yesterday at 10:23 PM

My state had a reading competition that listed books by ISBN, which was a real challenge for students to track down. Each library had different editions and even different cover art, so if you “found” the book you might not recognise it on the shelf, etc…

I worked on the library systems and one of my innovations was to use the ISBN mapping database of WorldCat to find books with identical content but different ISBNs to help kids find the books on the list.

Over ten years that one SQL join in the code made the kids read an extra million books they wouldn’t have otherwise.

My biggest “bang for buck” in my career!
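To illustrate the kind of join described above (not the commenter's actual code), here is a sketch using Python's built-in `sqlite3`. The table names, column names, and identifiers are all made up; `isbn_map` stands in for an ISBN equivalence mapping such as the one the comment attributes to WorldCat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reading_list (isbn TEXT, title TEXT);
    CREATE TABLE isbn_map (isbn TEXT, content_id INTEGER);  -- same content_id = same text
    CREATE TABLE holdings (isbn TEXT, branch TEXT);

    -- Placeholder identifiers, not real ISBNs.
    INSERT INTO reading_list VALUES ('LISTED-1', 'Example Title');
    INSERT INTO isbn_map VALUES ('LISTED-1', 1), ('OTHER-EDITION-1', 1), ('UNRELATED-2', 2);
    INSERT INTO holdings VALUES ('OTHER-EDITION-1', 'Central'), ('UNRELATED-2', 'Central');
    """
)

# Matching on the listed ISBN alone misses the copy the library actually holds.
exact = conn.execute(
    """
    SELECT r.title, h.isbn, h.branch
    FROM reading_list r
    JOIN holdings h ON h.isbn = r.isbn
    """
).fetchall()

# Going through the mapping finds any held edition with the same content.
via_mapping = conn.execute(
    """
    SELECT r.title, h.isbn, h.branch
    FROM reading_list r
    JOIN isbn_map listed ON listed.isbn = r.isbn
    JOIN isbn_map other ON other.content_id = listed.content_id
    JOIN holdings h ON h.isbn = other.isbn
    ORDER BY r.title, h.isbn
    """
).fetchall()

print(exact)        # []
print(via_mapping)  # [('Example Title', 'OTHER-EDITION-1', 'Central')]
```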

gerdesj yesterday at 11:28 PM

When you delve into real domain specific knowledge, surprises often surface and it turns out that what you might think is a simple thing is actually rather complicated.

I'm mildly surprised at exactly how successful ISBNs are. I worked in a book wholesaler's warehouse 35-odd years ago and the ISBN was used as the product code by the "system". I'd get a series of picking lists for pallets on good old green "staved" fan fold. I'd whizz around the warehouse with my trolley and pick from paper packets of books. The product lines had the rack and bay, last four from the SBN, quantity, title and full SBN. The packets of books had the rack/bay/last four from SBN printed large on a label, with other details in small print. I got very good at optimising my course around the warehouse and could pick at a right old rate, whilst listening to my mini cassette player. It's pretty boring work so you might as well game it!

Sometimes an individual book might fall off my trolley and be dumped in the big cardboard "skip" for rejects. For some reason casualties around me generally involved subjects like maths, material sciences, geology, surveying, hydrology. Oh and fractals!

I graduated in civil engineering.

Anyway. Surely all of us here know that really getting to grips with defining what it is that you are cataloguing/indexing/numbering/whatever and why can be quite tricky.

Both Dewey and SBNs catalogue "books" but for very different reasons. Both systems are extremely successful. You might think that in our world of LLMs n that, that books, Dewey and SBNs will go the way of the dodo.

Perhaps, but I doubt it.

Right, bugger all this old school nonsense. I've got a C64 (it rocks an SD card interface and an HDMI out (via SCART - must sort that out)) blinking away on my telly in the sitting room and some mutant camels need a bloody good kicking.

toomuchtodo yesterday at 8:45 PM

If the author sees this comment, https://news.ycombinator.com/item?id=43168838 might be relevant as it relates to catalogue completeness. OpenLibrary is very good, but Anna's Archive is potentially more complete.

CodesInChaos yesterday at 9:50 PM

I read that it's much worse than that, and there are ISBNs that were reused for completely different books.

bell-cot yesterday at 9:33 PM

The first few paragraphs of https://en.wikipedia.org/wiki/ISBN are a better summary of the issue.

tl;dr: The ISBN is intended to be a physical part number within the book business, where "hardcover, or paperback, or trade paperback, or large print, or revised edition, or ..." very much matters.

KPGv2 yesterday at 11:55 PM

> why isn't there a letterboxd for books

There is. https://hardcover.app

I used Letterboxd a lot before kids. I used Goodreads until the Trump inauguration when I de-Amazon'd myself as much as possible (Amazon owns Goodreads). I switched to Hardcover, which is a much better interface. There are ways to improve, but overall I prefer it over Goodreads.

NoMoreNicksLeft yesterday at 11:54 PM

> Uh-oh. Why do we have so many distinct versions of The Last Unicorn? Well, each distinct format of a work has its own ISBN (so a hardcover, paperback, and eBook all have different ISBNs),

This isn't even the half of it. On some digital books, I'll find a dozen ISBNs in the front matter. Of course there's the hardback, the clothbound (not always the same as the hardback), the alk. paper variant, paperback, trade paperback, epub, pdf, "Adobe digital", and "master digital e-book" (no idea what that even is myself). And that's all just issued together. If they reprint, it won't get a new ISBN, but if the rights convey to another publisher, that one will get a whole 'nother set again. Some popular titles likely have low hundreds of ISBNs, and keep in mind that these have only been a thing since the late 1960s (9 digit ISBNs, technically just SBNs back then). Then with the now dead paperback trade, you could go through a dozen different covers for the most popular books (King, etc) but they'd all use the same ISBN.
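On the 9-digit SBNs mentioned above: an SBN becomes a valid ISBN-10 by prefixing a zero, because the leading digit carries weight 10 in the check sum and a zero adds nothing to it. A minimal Python sketch, with hypothetical function names and a sample SBN chosen only because its check digit validates:

```python
def isbn10_is_valid(isbn10: str) -> bool:
    """Weighted sum (weights 10 down to 1) must be divisible by 11; 'X' means 10."""
    if len(isbn10) != 10 or not isbn10[:9].isdigit():
        return False
    last = isbn10[9]
    if not (last.isdigit() or last == "X"):
        return False
    values = [int(c) for c in isbn10[:9]] + [10 if last == "X" else int(last)]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def sbn_to_isbn10(sbn: str) -> str:
    """Convert a 9-character SBN to an ISBN-10 by prefixing a zero."""
    digits = sbn.replace("-", "").replace(" ", "").upper()
    if len(digits) != 9:
        raise ValueError(f"expected 9 characters, got {sbn!r}")
    isbn10 = "0" + digits
    if not isbn10_is_valid(isbn10):
        raise ValueError(f"check digit does not validate: {sbn!r}")
    return isbn10


print(sbn_to_isbn10("340-01381-8"))  # 0340013818
```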

Then, and this one bites me the most... if archive.org scans in a hardback with its ISBN, what do I use for the scanned pdf? I've decided that for lack of a better alternative I have to use the hardback's ISBN, but if the publisher made their own pdf (even just scanning the hardback), then they are supposed to issue a new ISBN for it.

Cataloging my own library, I've had to use a hodgepodge of unique ids: ASINs, ISBNs, WorldCat's OCLC numbers, Open Library IDs, and a few others besides. And it still comes up short. The number of oddball publishers and pamphlets and so forth that have never been cataloged anywhere is enormous.
