Hacker News

thegrim33 yesterday at 9:29 PM

I decided to test it out myself.

Went to the website, typed in "Jeep Wrangler JK engine bay with components labeled" (since I'm intimately familiar with JK engine bays). Seems like a pretty analogous test to what you did, and if anything an even easier one.

Let's see what we get .. a very nice looking diagram of a Wrangler engine bay with components labeled, looks good.

But wait ..

- The brake fluid reservoir is on the wrong side of the engine bay

- Where the brake fluid reservoir is, it's labeled as the coolant overflow tank, and while the actual coolant overflow tank does exist in the diagram, it has no label.

- The battery is on the wrong side of the engine bay.

- The top of the front grill is labeled as the "oil filter cap".

- The oil fill cap is in the wrong place.

- Half of the battery is labeled as the fuse box, when the fuse box is correctly shown, but unlabeled, on the other side of the engine bay.

- It shows two different windshield washer reservoirs next to each other.

I could keep going on ...

Now I tried clicking on the incorrectly labeled coolant overflow reservoir and it switches to a new page which now shows a completely different looking coolant overflow, but now it's at least located in the correct place in the engine bay.

But of course it doesn't look remotely like the actual coolant overflow container. It also shows the radiator cap as on the top of the coolant reservoir, when in reality it is very much on the top of the radiator itself.

Like .. I can find fault with every aspect of it. But of course, if you didn't actually know much about the topic it'd all look fairly believable. The story of LLMs basically.


Replies

dugidugout yesterday at 10:08 PM

It does poorly on creative concepts as well.

I attempted to explore the works of Kinoko Nasu/TYPE-MOON through its characters and the relationships across works and it was mostly nonsense. Sure it had some broad relations correct, but it presented a tiny set of meaningful characters and only attempted to touch Fate/Stay-Night and Tsukihime.

Even more damning was that it produced garbled text for a few of the textual representations, and often, even when the lettering was clean, the grammar was off.

toraway yesterday at 9:47 PM

I had a tab on nuclear reactors open, so I typed in "Pressurized Water Reactor". The result, while very visually appealing, is completely nonsensical (it connected the high- and low-pressure coolant loops together) and would definitely explode.

https://imgur.com/a/DEb3oD4

jazzypants yesterday at 11:04 PM

Do we ever simply accept that LLMs weren't made for this kind of detail-oriented work? I can't imagine something like this ever being anything other than a toy which can't be trusted.

Will Silicon Valley executives ever accept this reality? If we acquiesce and admit that LLMs are a good tool for prototyping and boilerplate reduction, but not for finished products, is that when the bubble finally bursts?

macprothrowaway yesterday at 9:39 PM

I also replied because I asked it about a Mac Pro case I had right in front of me. Mostly right words, totally wrong visuals. And while I see what you mean by 'story of LLMs', I often ask LLMs about things I know, and for the last 12 months they've been pretty dang accurate. This AI visual example is the strongest 'it's just guessing' I've seen in years. For a demo, still pretty cool though. Not sure why OP exaggerated, or simply doesn't know his car as well as he thinks he does.

ofjcihen today at 12:41 AM

Does it make sense that maybe it has a model of the vehicle it can pull from its corpus wholesale but then the “guess the next letter” portion takes over for labeling and just guesses poorly?