Hacker News

brookst today at 6:55 AM

I’m very skeptical that training is the right way to insert ads.

Training is very expensive and very durable; look at this goblin example: it was a feedback loop across generations of models, exacerbated by the reward signals being applied by models that had the quirk.

How does that work for ads? Coke pays to be the preferred soda… forever? There’s no realtime bidding, no regional ad sales, no contextual sales?

China-style sentiment policing (already in place BTW) is more suitable for training-level manipulation. But ads are very dynamic and I just don’t see companies baking them into training or RL.


Replies

zozbot234 today at 8:04 AM

> Training is very expensive and very durable;

This is true of pretraining, way less so of supervised fine-tuning. This feature was generated via SFT.

> Coke pays to be the preferred soda… forever?

That's essentially what a sponsorship is. Obviously it costs more than a single ad.

actionfromafar today at 7:03 AM

Ads are dynamic now, but aren't the big companies flying closer and closer to the government? Maybe Coke can be the government-blessed soda for the coming 5-year plan?