I looked at the original study [1], and it seems to be a very well-supported piece of research. All the necessary pieces are there, as you would expect from a Nature publication. And overall, I am convinced there's an effect.
However, I'm still skeptical about the size of the effect. First, a point that applies in particular to the Massachusetts ballot measure on psychedelics: quantifying views as percentages and getting accurate results from political polls are notoriously difficult tasks [2]. The estimated size of any effect therefore inherits whatever confounding variables make those tasks difficult.
Second, there could be some level of Hawthorne effect [3] at play here, such that participants may report being (more) convinced because that's what (they think) is expected of them. I'm not familiar with the recruiting platforms they used, but if they specialize in paid or otherwise professional surveys, I wonder whether participants feel an obligation to perform well.
Third, and somewhat related to the above, participants could state they'd vote Y after initially reporting a preference for X because they know it's a low-cost, no-commitment claim. In other words, they can claim they'd now vote for Y without fear of judgement, since it's a lab environment and an anonymous activity, but they can always return to their original position when the actual vote happens. To establish the size of the effect relative to real voting behavior, researchers would have to raise the stakes, or follow up with participants after the vote and find out if/why they changed their minds (again).
Fourth, if a single conversation with a chatbot, averaging six minutes, could convince an average voter, I wonder how much they knew about the issue/candidate being voted on. More cynically for the study, there may be much more at play in actual vote preference than a single dialectical presentation of facts: salient events in the period leading up to the election, emotional connection with the issue/candidate, personal experiences.
Still, none of this makes the study flawed for not covering everything. We can learn a lot from this work, and kudos to the authors for publishing it.
[1] https://www.nature.com/articles/s41586-025-09771-9
[2] For example: https://www.brookings.edu/articles/polling-public-opinion-th...
While I'm as paranoid about LLMs as the next HN'er, there are some silver linings to this research:
1) the LLMs mostly used factual information to influence people (vs., say, emotional or social influence), and 2) the facts were mostly accurate.
I'm not saying we shouldn't worry. But I expected the results to be worse.
Overall, the interesting finding here is that political opinions can be changed by new information at all. I'm curious how this effect would compare to comparably informed human discussions. I would not be surprised if the LLMs were more effective, for at least two reasons:
1) Cost-efficiency, in terms of the knowledge, effort, and skill required to provide personalized arguments. 2) Reduction in the emotional barrier to changing your mind: people don't want to "lose" by being wrong about politics to someone else, but perhaps a machine doesn't trigger this social/tribal response.
Cited papers:
https://www.nature.com/articles/s41586-025-09771-9
https://www.science.org/doi/10.1126/science.aea3884