
milesvp • yesterday at 1:20 AM

Yes, you have to be very careful when querying LLMs. You have to assume they're giving you something like the average answer to a question. I find them very good at telling me how people commonly solve a problem. I'm lucky in that the space I've been working in has had a lot of good forum training data, so the average solution tends to be on the more correct side. But you still have to validate nearly everything it tells you. It's also funny to watch the tokenization "fails": when you ask about things like register names, you can see it choose nonexistent tokens. Atmel libraries have a lot of things like this in them:

#define PA17_EIC_LINE PIN_PA17A_EIC_EXTINT_NUM
#define PA17_EIC_BIT PORT_PA17A_EIC_EXTINT1
#define PA17_PMUX_INDEX 8 //pa17 17/2
#define PA17_PMUX_TYPE MUX_PA17A_EIC_EXTINT1

And the output will be almost correct code, but instead of the answer being:

PORT_PA17A_EIC_EXTINT1

you'll get:

PORT_PA17A_EIC_EXTINT_NUM

and you can tell that it diverged while trying to use similar tokens. Since _ sometimes follows EXTINT, it's a "valid" token to try, and once it has EXTINT_, NUM is the most likely thing to follow.
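
One cheap way to catch this kind of near-miss is to check every identifier the model emits against the names the headers actually define. Here's a minimal sketch of that idea, not anything from a vendor tool: the list is hard-coded from the snippet above (in practice you'd presumably generate it from the real headers), and the function names is_known and closest_by_prefix are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names the headers really define. Hard-coded here from the snippet above
   as an assumption; a real list would be extracted from the vendor headers. */
static const char *const known_names[] = {
    "PIN_PA17A_EIC_EXTINT_NUM",
    "PORT_PA17A_EIC_EXTINT1",
    "MUX_PA17A_EIC_EXTINT1",
};
static const size_t known_count = sizeof known_names / sizeof known_names[0];

/* Hypothetical helper: returns 1 if name matches a known name exactly, else 0. */
int is_known(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < known_count; i++) {
        if (strcmp(name, known_names[i]) == 0)
            return 1;
    }
    return 0;
}

/* Number of leading characters that a and b have in common. */
static size_t common_prefix_len(const char *a, const char *b)
{
    size_t n = 0;
    while (a[n] != '\0' && a[n] == b[n])
        n++;
    return n;
}

/* Hypothetical helper: returns the known name sharing the longest prefix
   with name. Ties go to the earliest entry in the list. */
const char *closest_by_prefix(const char *name)
{
    assert(name != NULL);
    const char *best = NULL;
    size_t best_len = 0;
    for (size_t i = 0; i < known_count; i++) {
        size_t len = common_prefix_len(name, known_names[i]);
        if (best == NULL || len > best_len) {
            best = known_names[i];
            best_len = len;
        }
    }
    return best;
}
```

With this list, is_known("PORT_PA17A_EIC_EXTINT_NUM") returns 0, and closest_by_prefix on the same string hands back PORT_PA17A_EIC_EXTINT1, since the two agree right up to the point where the model went for _ instead of 1. Longest-prefix matching is a crude choice, but it fits this failure mode, where the output is right until it diverges.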

That said, it's massively sped up the project I'm working on, especially since Microchip effectively shut down the forums that ChatGPT was trained on.
