Are there video "thumbprints" like those that exist for audio (used by SoundHound etc.), i.e. a compressed set of features that can reliably be linked to unique content? I would expect that to be possible, and a much faster lookup even at 2 frames a second. If that's the case, "your device is taking a snapshot every 30 seconds" sounds a lot worse than what is actually happening (not defending it; it's still something I hope can be legislated away, but something can be bad and still be exaggerated by the media).
I've been led to believe those video thumbprints exist, but I know that a hash of the perceived audio is often all that's needed to match what is currently being presented (a movie, a commercial advert, music-as-music rather than background music, ...).
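For a sense of how compact an audio match can be, here's a rough sketch of the general Shazam-style "landmark" approach: spectrogram peaks are paired into small integer hashes that can be looked up in an index. This is purely illustrative, assuming NumPy/SciPy, and is not any vendor's actual pipeline; the function and parameter names are made up.

```python
# Illustrative sketch of constellation-style audio fingerprinting.
import numpy as np
from scipy.signal import spectrogram
from scipy.ndimage import maximum_filter

def audio_landmarks(samples: np.ndarray, sample_rate: int,
                    fan_out: int = 5) -> list[tuple[int, int]]:
    """Return (hash, anchor_time_bin) pairs derived from spectrogram peaks."""
    freqs, times, spec = spectrogram(samples, fs=sample_rate, nperseg=2048)
    log_spec = np.log1p(spec)

    # Keep only local maxima that stand out from their neighborhood.
    local_max = maximum_filter(log_spec, size=(20, 20)) == log_spec
    peaks = np.argwhere(local_max & (log_spec > log_spec.mean()))
    peaks = peaks[np.argsort(peaks[:, 1])]  # sort by time bin

    landmarks = []
    for i, (f1, t1) in enumerate(peaks):
        # Pair each anchor peak with a few later peaks ("fan out").
        for f2, t2 in peaks[i + 1:i + 1 + fan_out]:
            dt = t2 - t1
            if 0 < dt < 200:
                # Pack (freq1, freq2, time delta) into one integer hash.
                h = (int(f1) << 20) | (int(f2) << 8) | int(dt)
                landmarks.append((h, int(t1)))
    return landmarks
```

Matching then amounts to looking these hashes up in a database and checking that many hits agree on the time offset between the query and a reference recording, which is why only a small amount of data ever needs to leave the device.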
There are perceptual hashing algorithms for images, video, and audio (both DSP-based and ML-based) that could work for that.
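A minimal sketch of the image side, assuming Pillow: a difference hash ("dHash") reduces a frame to 64 bits, and near-duplicate frames land within a small Hamming distance of each other. The function names are just for illustration.

```python
# Illustrative dHash: shrink, grayscale, compare adjacent pixels per row.
from PIL import Image

def dhash(image: Image.Image, hash_size: int = 8) -> int:
    """Compute a 64-bit perceptual hash of an image."""
    # Resize to (hash_size + 1) x hash_size so each row gives hash_size comparisons.
    small = image.convert("L").resize((hash_size + 1, hash_size), Image.LANCZOS)
    pixels = list(small.getdata())

    bits = 0
    for row in range(hash_size):
        for col in range(hash_size):
            left = pixels[row * (hash_size + 1) + col]
            right = pixels[row * (hash_size + 1) + col + 1]
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; small values suggest visually similar frames."""
    return bin(a ^ b).count("1")

# Usage: hash two sampled frames and compare.
# h1 = dhash(Image.open("frame_0001.png"))
# h2 = dhash(Image.open("frame_0002.png"))
# print(hamming_distance(h1, h2))  # small distance => likely the same content
```

At a couple of frames a second, that's on the order of 16 bytes per second of hash data, which is why fingerprint lookup is so much cheaper than shipping actual snapshots.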