Hacker News

serial_dev last Wednesday at 7:06 AM

The main barriers for me would be:

1. Why? Who would use that? What’s the problem with the other search engines? How will it be paid for?

2. Potential legal issues.

The technical barriers are at least challenging and interesting.

Providing a service that needs significant upfront investment, has no product or service vision, and that I’ll likely be sued over a couple of times a year, probably losing, with who knows what kind of punishment… I’ll have to pass, unfortunately.


Replies

bbor last Wednesday at 1:58 PM

1. It'd be for the scientific community (broadly construed). Converting media that is currently completely un-indexed into plaintext and offering a suite of search features for finding content within it would be a game-changer, IMO! If you've ever done a lit review for any field other than ML, I'm guessing you know how reliant many fields are on relatively old books and articles (read: PDFs at best, paper-only at worst) that you can basically only encounter via a) citation chains, b) following an author, or c) encyclopedias/textbooks.

2. I really don't see how this could ever lead to any kind of legal issue. You're not hosting any of the content itself, just offering a search feature for it. GoodReads doesn't need legal permission to index popular books, for example.

In general I get the sense that your comment is written from the perspective of an entrepreneur/startup mindset. I'm sure that's brought you meaning and maybe even some wealth, but it's not a universal one! Some of us are more interested in making something to advance humanity than something likely to make a profit, even if we might look silly in the process.

1vuio0pswjnm7 last Wednesday at 5:24 PM

But he did not mention anything about creating a "service"

It could be his own copy for personal use

What if computers continue to become faster and storage continues to become cheaper; what if "large" amounts of data continue to become more manageable?

The data might seem large today, but it might not seem large or unmanageable in the future

namlem last Wednesday at 7:46 AM

It would be incredible for LLMs. Searching it, using it as training data, etc. Would probably have to be done in Russia or some other country that doesn't respect international copyright though.

carlosjobim last Wednesday at 1:41 PM

> 1. Why? Who would use that?

Rather, who would use a traditional search engine instead of a book search engine, when the quality of the results from the latter will be much superior?

People who need or want the highest quality information available will pay for it. I'd easily pay for it.