Hacker News

jdspiral · 04/23/2025 · 1 reply · view on HN

Yes, tokenization and embeddings are exactly how LLMs process input: the text is broken into tokens, and each token is mapped to a vector. POS tags and SVO triples aren't part of the model pipeline, but they help visualize structures the models learn implicitly.
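A minimal sketch of the tokenize-then-embed step described above. Real LLMs use a learned subword vocabulary (e.g. BPE) and a trained embedding matrix; the tiny vocabulary, the whitespace tokenizer, and the random vectors here are stand-ins for illustration only.

```python
import numpy as np

# Toy vocabulary; real models have tens of thousands of subword entries.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}

# Random 4-dim embeddings stand in for a learned embedding matrix.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 4))

def tokenize(text):
    # Whitespace split stands in for a real subword tokenizer.
    return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

token_ids = tokenize("The cat sat")
vectors = embedding_matrix[token_ids]  # one embedding row per token

print(token_ids)      # [0, 1, 2]
print(vectors.shape)  # (3, 4)
```

From here the model operates only on the vectors; nothing downstream sees POS tags or SVO structure explicitly.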
