I gave a talk at PyData Berlin on how to build your own TikTok recommendation algorithm. The TikTok personalized recommendation engine is the world's most valuable AI. It's TikTok's differentiation. It updates recommendations within 1 second of your click - at human-perceivable latency. If your AI recommender has poor feature freshness, it will be perceived as slow, not intelligent - no matter how good the recommendations are.
TikTok's recommender is partly built on European technology (Apache Flink for real-time feature computation), along with Kafka and distributed model training infrastructure. The Monolith paper is misleading in suggesting that 'online training' is the key. It is not. The key is that your clicks are made available as features for predictions in less than 1 second. You need a per-event stream processing architecture for this (like Flink - Feldera would be my modern choice as an incremental streaming engine).
* https://www.youtube.com/watch?v=skZ1HcF7AsM
* Monolith paper - https://arxiv.org/pdf/2209.07663
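To make the feature-freshness point concrete, here is a minimal sketch of per-event feature updates. It is not TikTok's pipeline and uses neither Flink nor Feldera; `ClickFeatureStore`, `on_click`, `features`, and the 300-second window are all hypothetical names and values chosen for illustration. The only thing it shows is that each click is applied the moment it arrives, so the next feature read already includes it.

```python
from collections import defaultdict, deque


class ClickFeatureStore:
    """Hypothetical in-memory stand-in for a per-event streaming job plus feature store."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        # user_id -> deque of (timestamp, category) click events, oldest first
        self._events = defaultdict(deque)

    def on_click(self, user_id, category, ts):
        """Apply one click immediately, so the very next read reflects it."""
        self._events[user_id].append((ts, category))

    def features(self, user_id, now):
        """Return per-category click counts within the sliding window ending at `now`."""
        events = self._events[user_id]
        # Evict events that have fallen out of the window.
        while events and events[0][0] < now - self.window_seconds:
            events.popleft()
        counts = defaultdict(int)
        for _, category in events:
            counts[category] += 1
        return dict(counts)


store = ClickFeatureStore(window_seconds=300.0)
store.on_click("u1", "dogs", ts=100.0)
store.on_click("u1", "dogs", ts=100.5)
store.on_click("u1", "cooking", ts=101.0)

# Read 0.2 s after the last click: all three clicks are already visible.
fresh = store.features("u1", now=101.2)
# Read long after the window has passed: the clicks have been evicted.
stale = store.features("u1", now=500.0)
```

A micro-batch job would instead buffer clicks and only expose them after the next batch completes, which is where the perceived slowness comes from.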
I noticed YouTube Shorts also seems to update the feed based on how long you watched the last video. If you're scrolling quickly and then stop to watch a dog video for long enough, the next one is likely to be another dog video.
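A rough sketch of how that kind of behaviour could arise, purely as a guess and not how YouTube actually does it: keep a per-category affinity score, decay it on every view, and add the fraction of the video that was watched. The names `update_affinity` and `next_category`, the `decay=0.5` value, and the "popular" fallback are all made up for illustration.

```python
def update_affinity(affinity, category, watch_seconds, video_seconds, decay=0.5):
    """Decay all existing scores, then add the watched fraction for this category."""
    watched_fraction = min(watch_seconds / video_seconds, 1.0)
    updated = {c: score * decay for c, score in affinity.items()}
    updated[category] = updated.get(category, 0.0) + watched_fraction
    return updated


def next_category(affinity, default="popular"):
    """Pick the category with the highest current affinity, or a fallback."""
    return max(affinity, key=affinity.get) if affinity else default


affinity = {}
affinity = update_affinity(affinity, "cooking", watch_seconds=1.0, video_seconds=30.0)  # swiped away
affinity = update_affinity(affinity, "news", watch_seconds=2.0, video_seconds=20.0)     # swiped away
affinity = update_affinity(affinity, "dogs", watch_seconds=25.0, video_seconds=25.0)    # watched fully
pick = next_category(affinity)
```

With these inputs the fully watched dog video dominates the two quick swipes, so `pick` is `"dogs"`.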
Flink is too slow for this.
If by features you mean tracking state per user, that can also be tracked insanely fast without Flink, using Redis.
If you're saying they don't have to load data to update the state, I don't see how these states are massive enough to require in-memory updates, and if they are, you could just do in-memory updates without Flink.
Similarly, any consumer will have to deal with batches of users and pipelining.
Flink is just a bottleneck.
If they actually use Flink for this, it's not the moat.
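As a sketch of what I mean by in-memory state plus batching, assuming per-user state is just a set of category counters: group a batch of events by user, then do one write per user. `apply_batch` and the event shape are hypothetical; with Redis, each per-user write could presumably become a pipelined `HINCRBY` call instead of a dict update.

```python
from collections import Counter, defaultdict


def apply_batch(state, batch):
    """Fold a batch of (user_id, category) events into per-user counters.

    Returns the number of distinct users touched, i.e. the number of writes.
    """
    # Group the batch by user first so each user's state is written once.
    grouped = defaultdict(Counter)
    for user_id, category in batch:
        grouped[user_id][category] += 1
    for user_id, delta in grouped.items():
        state.setdefault(user_id, Counter()).update(delta)
    return len(grouped)


state = {}
batch = [("u1", "dogs"), ("u2", "cats"), ("u1", "dogs"), ("u1", "cars")]
touched = apply_batch(state, batch)
```

Four events collapse into two per-user writes here, which is the pipelining any consumer ends up doing regardless of the framework around it.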
Thanks for the Feldera shoutout Jim.
For anyone else, if you want to try out Feldera and IVM for feature-engineering (it gives you perfect offline-online parity), you can start here: https://docs.feldera.com/use_cases/fraud_detection/
TikTok's differentiation is the userbase of all teenagers in the world.
It is not only the recommender, though. These guys [1] seem to be able to react pretty quickly without creating addicts along the way ;(
[1] https://recombee.com