Hacker News

jmalicki yesterday at 3:49 PM

As a side gig, I write novel software that solves problems no existing software does, that existing LLMs have difficulty reproducing, purely for the purpose of existing as LLM training data.

There are journalists being hired to write Atlantic-worthy articles that exist only as LLM training data, because they're getting paid more than the Atlantic would pay them for it.

It's insane.

Yes, they are hiring the experts themselves. To create new knowledge above and beyond what's on the internet. To be locked away as LLM training data.

The defining characteristic of all this new data is that it is targeted at LLMs' weak points.

It's not just more data, it's custom tutorials built for what LLMs struggle with.


Replies

MattRogish yesterday at 4:36 PM

I'm not saying they aren't trying. I'm saying we're inventing new problems faster than any lab can:

1) Identify the gaps

2) Determine how to fix them

3) Implement a fix (especially if that fix is: identify and find experts)

4) Judge the result

How do they know [person] is an expert in [some field]? How do they find that person? How many experts are necessary to give the right information? How do we evaluate the results, especially if it's novel?

You can find a lot of people who disagree on many topics, and those turtles go all the way down.

I don't disagree that your work will help reduce hallucinations and improve model performance! It will.

I predict (I hope I'm wrong!) that we're going to hit some asymptote that is not at 0% hallucinations. I would even put a substantial nonzero probability on the "overall" hallucination rate bottoming out at some minimum and then slowly growing, because we just can't keep up with the new garbage we throw at it.

ayewo yesterday at 3:58 PM

1. How did you land the side gig? Mercor or a lesser-known brand?

2. What qualifications do such vendors typically require?

victorbjorklund yesterday at 3:55 PM

What kind of programs? Can you give an example of the tasks?

giardini yesterday at 5:58 PM

jmalicki says many things, among them:

"As a side gig, I write novel software that solves problems no existing software does,"

and

"Yes, they are hiring the experts themselves. To create new knowledge above and beyond what's on the internet. To be locked away as LLM training data."

More likely you're joking and/or paranoid! 8-))
