
thenewguy077 yesterday at 8:40 PM

This looks like AI-generated code. Is it?


Replies

carlovalenti yesterday at 9:15 PM

It's not, except for some utilities (the JSON parser and picture handling). Please refer to the README for full details. The main purpose was to understand the internals of transformer models by coding an engine from the ground up, so using AI to generate the machine-learning code would have made the whole project pointless. The most exciting moments: when I first successfully ran a decode; when I managed to fine-tune a Gemma model by having it learn things about me; and when PaliGemma correctly boxed a bee in a picture I presented to it.