Hacker News

ranger_danger yesterday at 5:36 PM

But you also relied on people giving away too much personal information about themselves... which won't always be the case.


Replies

majorchord yesterday at 5:40 PM

Yeah, my first thought was "of course an LLM can do that, we didn't need a paper to tell us". I would be more impressed if it could do it without that information, such as by analyzing writing styles and other cues that aren't direct PII.

DalasNoin yesterday at 5:53 PM

I agree that these accounts probably still contain, on average, more information than the average pseudonymous account. I think we could try using the LLM to ablate progressively more information and see how its performance decays. To be clear, we already heavily remove such information; see Table 2 in the appendix. But I don't expect that to change the basic conclusions.

famouswaffles yesterday at 5:43 PM

Over a large enough timeframe (often a couple years at most), almost everyone online gives too much information about themselves. A seemingly innocuous statement can pin you to an exact city and so on.
