Hacker News

01100011 · 04/03/2025 · 12 replies

People are sticking up for LLMs here and that's cool.

I wonder, what if you did the opposite? Take a project of moderate complexity and convert it from code back to natural language using your favorite LLM. Does it provide you with a reasonable description of the behavior and requirements encoded in the source code without losing enough detail to recreate the program? Do you find the resulting natural language description is easier to reason about?

I think there's a reason most of the vibe-coded applications we see people demonstrate are rather simple. There is a level of complexity and precision that is hard to manage. Sure, you can define it in plain English, but is the resulting description extensible, understandable, or more descriptive than a precise language? I think there is a reason why legalese is not plain English, and it goes beyond mere gatekeeping.


Replies

drpixie · 04/03/2025

> Do you find the resulting natural language description is easier to reason about?

An example from a different field: aviation weather forecasts and notices are published in a strongly abbreviated and codified form. For example, the weather at Sydney, Australia right now is:

  METAR YSSY 031000Z 08005KT CAVOK 22/13 Q1012 RMK RF00.0/000.0

It's almost universal that new pilots ask "why isn't this in words?". And, indeed, most flight planning apps will convert the code to prose.

But professional pilots (and ATC, etc.) universally prefer the coded format. It is compact (one line instead of a whole paragraph), the format is well defined (I know exactly where to look for the one piece I need), and it's unambiguous.
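That coded format is mechanical enough that a few lines can unpack it. A rough sketch, not a full METAR grammar (real decoders handle many more groups and edge cases), with output field names of my own choosing:

```python
import re

# Rough sketch of decoding a few METAR fields. Not a full METAR grammar;
# real decoders handle variable winds, gusts, cloud groups, remarks, etc.
METAR_RE = re.compile(
    r"METAR (?P<station>\w{4}) (?P<time>\d{6})Z "
    r"(?P<wind_dir>\d{3})(?P<wind_kt>\d{2})KT "
    r"(?P<vis>CAVOK|\d{4}) "
    r"(?P<temp>M?\d{2})/(?P<dew>M?\d{2}) "
    r"Q(?P<qnh>\d{4})"
)

def decode_metar(raw: str) -> dict:
    m = METAR_RE.match(raw)
    if not m:
        raise ValueError("unrecognized METAR")
    d = m.groupdict()
    return {
        "station": d["station"],
        "day": int(d["time"][:2]),
        "time_utc": d["time"][2:] + "Z",
        "wind": f'{d["wind_dir"]} deg at {int(d["wind_kt"])} kt',
        # "M" prefix in METAR temperatures means minus
        "temp_c": int(d["temp"].replace("M", "-")),
        "dewpoint_c": int(d["dew"].replace("M", "-")),
        "visibility": "CAVOK" if d["vis"] == "CAVOK" else f'{int(d["vis"])} m',
        "qnh_hpa": int(d["qnh"]),
    }

print(decode_metar("METAR YSSY 031000Z 08005KT CAVOK 22/13 Q1012 RMK RF00.0/000.0"))
```

Notice how much longer the decoded version is than the one-line original: that verbosity is exactly the cost the coded format avoids.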

Same for maths and coding - once you reach a certain level of expertise, the complexity and redundancy of natural language is a greater cost than benefit. This seems to apply to all fields of expertise.

show 11 replies
fluidcruft · 04/03/2025

I'm not so sure it's about precision rather than working memory. My presumption is that people struggle to understand sufficiently large prose versions for the same reason an LLM would: people have limited working memory. The time needed to reload info from prose is significant. People reading large text works will start highlighting, taking notes, and inventing shorthand in their notes. Compact forms and abstractions reduce the demands on working memory and information search. So I'm not sure it's about language precision.

show 2 replies
eightysixfour · 04/03/2025

Language can carry tremendous amounts of context. For example:

> I want a modern navigation app for driving which lets me select intersections that I never want to be routed through.

That sentence is low complexity but encodes a massive amount of information. You are probably thinking of a million implementation details that you need to get from that sentence to an actual working app but the opportunity is there, the possibility is there, that that is enough information to get to a working application that solves my need.

And just as importantly, if that is enough to get it built, then “can I get that in cornflower blue instead” is easy and the user can iterate from there.
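As a toy sketch of how much machinery hides behind that one sentence, the "never route me through these intersections" feature reduces to shortest-path search over a road graph with the excluded nodes skipped (graph, names, and weights all invented for illustration):

```python
import heapq

def shortest_path(graph, start, goal, excluded=frozenset()):
    """Dijkstra over {node: [(neighbor, cost), ...]}, skipping excluded nodes."""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            # Reconstruct the route by walking predecessors back to start.
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return d, path[::-1]
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph.get(node, []):
            if nbr in excluded:
                continue  # the user never wants to pass through here
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return float("inf"), []

roads = {
    "home":     [("5th&Main", 2), ("park", 4)],
    "5th&Main": [("office", 2)],
    "park":     [("office", 3)],
}
print(shortest_path(roads, "home", "office"))                        # -> (4, ['home', '5th&Main', 'office'])
print(shortest_path(roads, "home", "office", excluded={"5th&Main"})) # -> (7, ['home', 'park', 'office'])
```

The algorithm is the easy part; the "million implementation details" are the map data, traffic, UI, and so on.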

show 2 replies
Affric · 04/03/2025

Sure, but we build (leaky) abstractions, and this even happens in legal texts.

Asking an LLM to build a graphical app in assembly from an ISA and a driver for the display would give you nothing.

But with a mountain of abstractions it can probably do it.

This is not so much to defend LLMs as to say that by providing the right abstractions (reusable components), I do think they will get you a lot closer.

show 2 replies
jimmydddd · 04/03/2025

> I think there is a reason why legalese is not plain English

This is true. Part of the precision of legalese is that the meanings of some terms have already been more precisely defined by the courts.

show 2 replies
jsight · 04/03/2025

I've thought about this quite a bit. I think a tool like that would be really useful. I can imagine asking questions like "I think this big codebase exposes a rest interface for receiving some sort of credit check object. Can you find it and show me a sequence diagram for how it is implemented?"

The challenge is that the codebase is likely much larger than what would fit into a single context window. IMO, the LLM really needs to be taught to consume the project incrementally and build up a sort of "mental model" of it to really make this useful. I suspect that a combination of tool usage and RL could produce an incredibly useful tool for this.
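One hedged sketch of that incremental "mental model" idea is hierarchical summarization: summarize each file, then summarize the summaries, so no single step exceeds the context window. Here `summarize` is a stand-in for a real LLM call, and all names are hypothetical:

```python
# Toy sketch: fold a codebase into one context-sized note, level by level.
def summarize(text: str, limit: int = 80) -> str:
    """Placeholder for an LLM summarization call; here it just truncates."""
    return text[:limit]

def mental_model(files: dict[str, str], batch: int = 4) -> str:
    # Level 0: one note per file.
    notes = [f"{name}: {summarize(src)}" for name, src in files.items()]
    # Fold notes together in small batches until a single note remains.
    while len(notes) > 1:
        notes = [summarize("\n".join(notes[i:i + batch]))
                 for i in range(0, len(notes), batch)]
    return notes[0]
```

A real version would also need tool use (grep, call-graph queries) so the model can re-expand the parts of the tree a question touches, rather than relying on lossy summaries alone.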

soulofmischief · 04/03/2025

What you're describing is decontextualization. A sufficiently powerful transformer would theoretically be able to recontextualize a sufficiently descriptive natural language specification. Likewise, the same or an equivalently powerful transformer should be able to fully capture the logic of a complicated program. We just don't have sufficient transformers yet.

I don't see why a complete description of the program's design philosophy as well as complete descriptions of each system and module and interface wouldn't be enough. We already produce code according to project specification and logically fill in the gaps by using context.

show 2 replies
1vuio0pswjnm7 · 04/03/2025

"Sure, you can define it in plain english, but is the resulting description extensible, understandable, or more descriptive than a precise language? I think there is a reason why legalese is not plain English, and it goes beyond mere gatekeeping."

Is this suggesting the reason for legalese is to make documents more "extensible, understandable or descriptive" than if written in plain English?

What is this reason that the parent thinks legalese is used for that "goes beyond gatekeeping"?

Plain English can be every bit as precise as legalese.

It is also unclear that legalese exists for the purpose of gatekeeping. For example, it may be an artifact that survives based on familiarity and laziness.

Law students are taught to write in plain English.

https://www.law.columbia.edu/sites/default/files/2021-07/pla...

In some situations, e.g., drafting SEC filings, use of plain English is required by law.

https://www.law.cornell.edu/cfr/text/17/240.13a-20

show 1 reply
nsonha · 04/03/2025

Isn't that just Copilot "explain", one of the earliest Copilot capabilities? It's definitely helpful for understanding new codebases at a high level.

> there is a reason why legalese is not plain English, and it goes beyond mere gatekeeping.

unfortunately they're not in any kind of formal language either

show 2 replies
cyanydeez · 04/04/2025

Vibe coding seems a lot like the dream of using UML, but headed in a distinctly different direction. In theory (and occasional practice) you can create a two-way street, but most often these things are one-way conversions. And while we all want some level of two-way dependency and continual integration to keep certain aspects of coding (documentation, testing) up to date, the reality is that the generative-code aspect always breaks, and you're left with the raw products of these tools; it's rarely going to be a cycle of code -> tool -> code. And thus the ultimate value beyond the bootstrap is lost.

We're still going to have AI tools, but seriously complex applications, the ones we pay money for, aren't going to yield to many LLM-based curation strategies. There will probably be some great documentation and testing tools, but the architectural-code paradigm isn't going to yield any time soon.

vonneumannstan · 04/03/2025

I think you can basically make the same argument for programming directly in machine code since programming languages are already abstractions.
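One layer of that abstraction stack is easy to peek at from Python itself: the same one-line function, viewed one level down as bytecode (just an illustration; opcode names vary between interpreter versions, and below the bytecode sit the VM, machine code, and the ISA):

```python
import dis

def add(a, b):
    return a + b

# Show the bytecode abstraction beneath the source line above.
dis.dis(add)
print(add(2, 3))
```

Nobody argues we should hand-write the lower layers just because they exist; the question is whether natural language is a usable layer *above* source code, not whether layers are possible.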

VoodooJuJu · 04/03/2025

[dead]