Hacker News

brookst · 07/30/2025 · 19 replies

I’ve been kicking around the idea of something like Lean (maybe even just Lean?) to rewrite news and other non-fiction articles, treating statements as theorems that need to be proven. Proofs could include citations, and could be compound things like “this is a fact if three of my approved sources asserted it as a fact”

It should then be possible to get a marked-up version of any document with highlighting for “proven” claims.

Sure, it’s not perfect, but I think it would be an interesting way to apply the rigor that used to be the job of publications.
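
For what it's worth, the quorum rule ("a fact if three of my approved sources asserted it") is easy to picture as a tiny predicate. A hypothetical sketch in Python; the source names, `APPROVED_SOURCES`, and `is_proven` are all invented for illustration, not a real system:

```python
# Hypothetical sketch of the "N approved sources" proof rule.
APPROVED_SOURCES = {"reuters", "ap", "afp", "bbc"}

def is_proven(asserted_by, quorum=3):
    """A claim counts as 'proven' if at least `quorum` approved
    sources independently asserted it."""
    corroborating = set(asserted_by) & APPROVED_SOURCES
    return len(corroborating) >= quorum

claim = {"text": "X happened on Tuesday",
         "asserted_by": ["reuters", "ap", "bbc", "some-blog"]}
print(is_proven(claim["asserted_by"]))  # three approved sources -> True
```

A marked-up document would then just be each sentence annotated with the result of such a check.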


Replies

bubblyworld · 07/31/2025

Formalising natural language statements is a minefield of difficulties, for (imo) essentially the same reasons that writing code which interacts with the real world is so difficult. Concepts you take for granted like identity, time, causality... all of that stuff needs to be fleshed out carefully and precisely in the formalism for facts to be relatable to each other (or even expressible in the first place).

Not to discourage you - it's a cool problem! OpenCog comes to mind as a project that tried to take this all the way, and there's a field devoted to this stuff in academia called KRR (knowledge representation and reasoning). The IJCAI journal is full of research on similar topics.

(also see all the different logics philosophers use to formalise different kinds of argument involving time/modality/probability - there are a lot, but they aren't very "modular" and can't easily be mixed unless someone has solved that since I last looked into it)

bjackman · 07/31/2025

I think it's quite rare that the most important beliefs you should derive from the news can be justified by a collection of absolute statements.

I think you'd be better served by a tool for calculating chains of Bayesian reasoning. I've seen a tool that does this for numerical estimations.
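
The chain-of-updates idea can be sketched in a few lines. This is just textbook Bayes' rule applied repeatedly, assuming the pieces of evidence are conditionally independent; the likelihood numbers are invented:

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """One step of Bayes' rule: returns the posterior P(H | E)."""
    numer = p_e_given_h * prior
    denom = numer + p_e_given_not_h * (1 - prior)
    return numer / denom

# Chain several pieces of evidence (assumed conditionally independent).
belief = 0.5  # start agnostic
for p_e_given_h, p_e_given_not_h in [(0.9, 0.3), (0.7, 0.4), (0.8, 0.5)]:
    belief = bayes_update(belief, p_e_given_h, p_e_given_not_h)
print(round(belief, 3))  # -> 0.894
```

The hard part, of course, is where the likelihoods come from, not the arithmetic.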

t_mann · 07/31/2025

Careful, such an approach could easily end up giving an aura of logical objectivity to arbitrarily radical and nonsensical ideas. The political views of one of the fathers of modern logic may serve as a cautionary tale [0].

[0] https://en.m.wikipedia.org/wiki/Gottlob_Frege#Political_view...

magicalhippo · 07/31/2025

> I’ve been kicking around the idea of something like Lean (maybe even just Lean?) to rewrite news and other non-fiction articles, treating statements as theorems that need to be proven.

I found I got much better at writing non-fiction after taking math at uni. I would help proofread essays and other hand-ins by my SO and sister, applying the kind of rigor you mention. Stuff like "you show C follows from B here, but you haven't actually given an argument for why B follows from A, so you can't then claim C follows from A".

It's tempting to say that with LLMs it seems plausible to turn this into a program, but the hallucination issue puts a damper on that scheme.

Tainnor · 07/31/2025

I think for that kind of thing, Prolog would be sufficient. Lean is more powerful, but it requires you to supply proofs - Prolog can just automatically run inferences.

(Although I wouldn't be surprised if somebody had already recreated something like Prolog in Lean with a tactic that does more or less what the Prolog interpreter does)
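
For a flavour of what "automatically run inferences" means here, a deliberately tiny propositional forward-chainer (real Prolog does backward chaining with unification over terms, so this is only a stand-in; the rule names are made up):

```python
def forward_chain(facts, rules):
    """Derive the closure of `facts` under propositional Horn `rules`.
    Each rule is (premises, conclusion); fire rules until nothing new."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and premises <= facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [({"reuters_says_p", "ap_says_p", "bbc_says_p"}, "proven_p"),
         ({"proven_p"}, "highlight_p")]
derived = forward_chain({"reuters_says_p", "ap_says_p", "bbc_says_p"}, rules)
print("highlight_p" in derived)  # True
```

The point is that once the rules are written down, no human has to supply the proof steps.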

CobrastanJorji · 07/31/2025

I think it'd be more interesting to map out entire trees of arguments about a topic. Start with something big, like "is there a God," then come up with all of the arguments for or against, then come up with all of the arguments against those arguments, then the counters to those arguments. Really explore it as a sort of debate space. Then bring in citations not as backing but for historical context: "Plato made this argument in such and such." The idea wouldn't be so much to decide a winner as to prevent us from going around in circles by making a map.
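
The debate map could start as nothing fancier than a tree of claims and counter-claims. A hypothetical sketch; the `Argument` type and the example arguments are illustrative only:

```python
from dataclasses import dataclass, field

@dataclass
class Argument:
    claim: str
    # Counter-arguments attack their parent; the whole tree is the debate space.
    counters: list = field(default_factory=list)

def depth(node):
    """How many levels deep the argument/counter-argument chain goes."""
    return 1 + max((depth(c) for c in node.counters), default=0)

root = Argument("There is a God", counters=[
    Argument("Problem of evil", counters=[
        Argument("Free-will defence"),
    ]),
    Argument("Argument from parsimony"),
])
print(depth(root))  # -> 3
```

Attaching historical citations ("Plato made this argument in such and such") would just be one more field on each node.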

atoav · 07/31/2025

As a programmer who has studied philosophy: This is the approach I take more or less in my head when reading articles, however the problem is the ambiguity of natural language.

E.g. let's say the statement is about a president's promise to end a conflict within 24 hours of coming into office. One could reach a conclusion pretty quickly if the conflict hasn't ended 24 hours after they entered office.

But what does "end the conflict" mean exactly? If the conflict ended, how long does it need to remain ended to earn the label "ended"? What if said president has a history that recontextualizes the meaning of that seemingly simple claim, because he is known to define the word "ended" a little differently than the rest of us — do you now judge by his definition or by ours? What if the conflict has ended but a small pocket of fighting remains; at what size do we consider the conflict still going on?

I know some of that has official definitions, but not all of it does. In the end a lot of this will come down to interpretation and which definition you choose. But yeah, I support your idea; just spelling it out and having a machine-readable chain of reasoning might help already.

atomicnature · 07/31/2025

Proof != evidence. With evidence, we corroborate, collate, add more sources, weigh, and judge. Proof is a totally different process. Only in mathematics does one prove something; everywhere else we build up evidence, corroborate, etc.

viraptor · 08/01/2025

Check out https://argdown.org/ for a much simpler version of that idea. No proofs, just supporting/conflicting evidence and arguments. But even that is hard enough to produce consistently.

poulpy123 · 07/31/2025

News and non-fiction articles are not math and cannot be treated as math. At best you could build a tool that checks for the most glaring contradictions (like a typo changing a number), and I'm not even sure that can be done consistently without building software that understands language. At worst you would build a tool that spits out bullshit that millions of people would treat as gospel.

fancy_pantser · 07/31/2025

Claimify from MS research aims in this direction. There's a paper and video explainer from a few months ago.

https://www.microsoft.com/en-us/research/blog/claimify-extra...

guyomes · 07/31/2025

For the form, you might be interested in Ethica, by Spinoza [1]. On the other hand, for fact checking, the key concept seems to be trust in sources rather than logical consistency.

[1]: https://en.wikipedia.org/wiki/Spinoza%27s_Ethics

jeffhuys · 07/31/2025

Look at Arguman, now defunct, but you can probably find some good videos/screenshots.

Basically crowd-sourced statements, rebuttals, agreements, etc, in a nice interface.

We need that nowadays.

michael1999 · 08/01/2025

I think you could just use Lean, and progressively elaborate the inference graph.

But you might find a historical/journalistic/investigative angle more profitable.

I heard a software product pitch from a small law practice who were outside investigators for abuse/policy-violation claims in a unionized workforce. They had a solid procedure: start with the main event. Interview all witnesses, and identify other interesting events. Then cross-reference all witnesses against all alleged/interesting events. That gives you a full matrix to identify who is a reliable witness, what sort of motivations are in play, and what stories are being told, patterns of behaviour, etc. Their final report might actually focus on another event entirely that had better evidence around it.

In that sort of historical investigation, a single obviously false claim undermines all claims from that witness. Pure logic doesn't have that bi-directionality of inference. Sometimes your sources just flat out lie! That's half the work.

The key difference: an investigation wants a full matrix of claims vs sources/evidence, while good math is all about carefully plotting a sparse inference graph that traverses the space. What is called "penetrating minimalism" in math is called "cherry picking" in history or journalism.

We passed on the pitch. Didn't see how it scaled to a product worth building, but their process was appealing. Much better than an internal popularity contest or witch-hunt. Probably better than the average police investigation too.
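
The matrix idea can be sketched directly: cross-reference every witness against every event, and once one event is independently verified, apply the "one provably false claim undermines the witness" heuristic across the whole matrix. All names, data, and the scoring rule here are made up:

```python
# testimony[witness][event] -> what the witness said (True/False, or absent = not asked)
testimony = {
    "alice": {"e1": True,  "e2": True,  "e3": False},
    "bob":   {"e1": True,  "e2": False, "e3": False},
    "carol": {"e1": False, "e2": True,  "e3": True},
}
ground_truth = {"e3": False}  # one event later verified independently

def reliable(witness):
    """A witness is reliable only if none of their statements
    contradicts an independently verified event."""
    return all(testimony[witness].get(e) in (v, None)
               for e, v in ground_truth.items())

print({w: reliable(w) for w in testimony})  # -> {'alice': True, 'bob': True, 'carol': False}
```

Note the bi-directionality the comment describes: one verified cell propagates doubt across an entire row, which a purely forward proof system wouldn't capture.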

wredcoll · 07/31/2025

I think a more practical method would be to trace the provenance of a claim, in a way that lets you evaluate how likely it is for the person making it to actually know the facts.

A simple scenario would be reading a news article about American politics. You see an unusual claim, so you start tracing it and find that the journalist got it from person X, who got it from person Y, who got it from Donald Trump. Trump is famous for lying constantly, so unless there's a second source making the claim, you could disregard it with high probability.
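
One naive way to score that: walk the chain back to its origin and multiply per-link reliability estimates, so a weak origin dominates the product. The reliability numbers here are invented purely for illustration:

```python
# Hypothetical per-source reliability estimates (0..1), invented for the example.
reliability = {"journalist": 0.9, "person_x": 0.8,
               "person_y": 0.8, "origin": 0.2}

def chain_credibility(chain, reliability):
    """Multiply reliabilities along the provenance chain;
    a single unreliable link drags the whole product down."""
    cred = 1.0
    for source in chain:
        cred *= reliability[source]
    return cred

chain = ["journalist", "person_x", "person_y", "origin"]
print(round(chain_credibility(chain, reliability), 3))  # -> 0.115
```

A second independent chain for the same claim would then be combined rather than multiplied — which is where it starts shading into the Bayesian approach mentioned upthread.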

BlarfMcFlarf · 07/31/2025

What about proven facts that get disproven? Is there room to rethink your priors?

ants_everywhere · 07/31/2025

I don't know why you're getting downvoted because I think this is an interesting idea.

Completely impossible but still interesting and fun to explore.

But you don't need the power of Lean to do something like this. I would recommend starting with something like Prolog or RDF triples.
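
A triple store really can be this small to start with. A sketch using plain tuples rather than an RDF library; the predicates and claim IDs are made up:

```python
# Minimal RDF-style triple store: (subject, predicate, object) tuples.
triples = {
    ("reuters",   "asserts", "claim_1"),
    ("ap",        "asserts", "claim_1"),
    ("bbc",       "asserts", "claim_1"),
    ("some_blog", "asserts", "claim_2"),
}

def query(s=None, p=None, o=None):
    """Match triples against a pattern; None acts as a wildcard."""
    return [(ts, tp, to) for ts, tp, to in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

print(len(query(p="asserts", o="claim_1")))  # -> 3 sources assert claim_1
```

The quorum rule from the original post then becomes a one-line query over this store.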
