logoalt Hacker News

wavemodelast Wednesday at 8:51 PM0 repliesview on HN

Data leakage is an eval problem, not an accuracy problem.

That is, the problem is not that the AI is wrong X% of the time. The problem is that, in the presence of a data leak, there is no way of knowing what the value of X even is.

This problem is recursive - in the presence of a data leak, you also cannot know for sure the quantity of data that has leaked.