Hacker News

davidw · yesterday at 10:19 PM

It started off nicely, but before long you get

"The MLP (multilayer perceptron) is a two-layer feed-forward network: project up to 64 dimensions, apply ReLU (zero out negatives), project back to 16"

Which starts to feel pretty owly indeed.
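For what it's worth, the quoted step is small enough to sketch directly. This is a minimal NumPy illustration of "project up to 64, ReLU, project back to 16", assuming a model width of 16; the weights are random placeholders, not trained parameters from the explainer:

```python
import numpy as np

# Dimensions taken from the quote: width 16, hidden size 64.
rng = np.random.default_rng(0)
d_model, d_hidden = 16, 64

W1 = rng.standard_normal((d_model, d_hidden))  # project up: 16 -> 64
b1 = np.zeros(d_hidden)
W2 = rng.standard_normal((d_hidden, d_model))  # project back: 64 -> 16
b2 = np.zeros(d_model)

def mlp(x):
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU: zero out negatives
    return h @ W2 + b2                # back to the original width

x = rng.standard_normal(d_model)
y = mlp(x)
print(y.shape)  # (16,)
```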

I think the whole thing could be expanded to cover some more of it in greater depth.


Replies

tibbar · today at 1:43 AM

I think the big frustration I've had in learning modern ML is that the entire owl is just so complicated. A poor explainer reads like "black box is black boxing the other black box", completely undecipherable. A mediocre-to-above-average explanation will be like "(loosely introduced concept) is (doing something that sounds meaningful) to black box", which is a little better. However, when explanations start getting more accurate, you run into the sheer volume of concepts/data transforms taking place in a transformer, and there's too much information to be useful as a pedagogical device.

growingswe · today at 2:01 AM

I tried to include tooltips in some places that go into more depth, but I understand there's a jump. I'm not sure what the best way to go about it will be, tbh.
