Hacker News

ajb · last Friday at 11:04 PM

Here's what may seem like an unrelated question in response: how can we get 10^7+ bits of information out of the human body every day?

There are a lot of companies right now trying to apply AI to health, but what they are ignoring is that there are orders of magnitude less health data per person than there are cat pictures. (My phone probably contains 10^10 bits of cat pictures and my health record probably 10^3 bits, if that). But it's not wrong to try to apply AI, because we know that all processes leak information, including biological ones; and ML is a generic tool for extracting signal from noise, given sufficient data.
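
As a back-of-envelope illustration of that gap, and of what a 10^7 bits/day sensor would change, here is a quick sketch in Python (the figures are the rough estimates from the comment, not measurements):

    # Rough estimates from the comment above, not measured values.
    cat_pictures_bits = 10**10      # bits of cat pictures on a phone
    health_record_bits = 10**3      # bits in a typical personal health record
    sensor_bits_per_day = 10**7     # the hypothetical cheap sensor's daily output

    gap = cat_pictures_bits / health_record_bits
    one_year = sensor_bits_per_day * 365

    print(f"cat pictures vs. health record: {gap:.0e}x more data")   # ~1e+07x
    print(f"hypothetical sensor, one year:  {one_year:.1e} bits")    # ~3.7e+09 bits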

But our health information gathering systems are engineered to deal with individual, very specific hypotheses generated by experts, which require high quality measurements of specific individual metrics that some expert, such as yourself, has figured may be relevant. So we get high quality data, in very small quantities - a few bits per measurement.

Suppose you invent a new cheap sensor for extracting large (10^7+ bits/day) quantities of information about human biochemistry, perhaps from excretions, or blood. You run a longitudinal study collecting this information from a cohort and start training a model to predict every health outcome.

What are the properties of the bits collected by such a sensor that would make such a process likely to work out? The bits need to be "sufficiently heterogeneous" (but not necessarily independent) and their indexes need to be sufficiently stable (in some sense). What is not required is for specific individual data items to be measured with high quality, because some information about the quantity we're actually interested in (even though we don't know exactly what it is) will leak into the other measurements.
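
A toy sketch of why that can work (synthetic data, plain numpy; none of this is from the comment): a latent "health state" leaks weakly into thousands of cheap, noisy, correlated measurements, no single one of which is reliable, yet a simple linear model over all of them recovers the state well:

    import numpy as np

    rng = np.random.default_rng(0)
    n_subjects, n_features = 4000, 2000

    z = rng.normal(size=n_subjects)                    # latent health state (unknown to us)
    leak = rng.normal(scale=0.05, size=n_features)     # how strongly each measurement leaks z
    X = np.outer(z, leak) + rng.normal(size=(n_subjects, n_features))  # cheap, low quality measurements
    y = z + 0.5 * rng.normal(size=n_subjects)          # the health outcome we want to predict

    train, test = slice(0, 3000), slice(3000, None)

    # Ridge regression in closed form: w = (X'X + lam*I)^-1 X'y
    lam = 3000.0
    A = X[train].T @ X[train] + lam * np.eye(n_features)
    w = np.linalg.solve(A, X[train].T @ y[train])
    pred = X[test] @ w

    best_single = max(abs(np.corrcoef(X[test][:, j], y[test])[0, 1]) for j in range(n_features))
    combined = np.corrcoef(pred, y[test])[0, 1]
    print(f"best single measurement vs outcome: r = {best_single:.2f}")  # weak
    print(f"all measurements combined:          r = {combined:.2f}")     # much stronger

No individual column of X is a high quality measurement of anything; the prediction works because the signal is smeared across many of them.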

I predict that designs for such sensors, which cheaply perform large numbers of low quality measurements, would result in breakthroughs in detection and treatment, by allowing ML to be applied to the problem effectively.


Replies

standingca · last Friday at 11:37 PM

Or perhaps even routine bloodwork could incorporate some form of sequencing and longitudinal data banking. Deep sequencing, which may still be too expensive, generates tons of data that can be useful for things we don't even know to look for today; capturing this data could let us retroactively identify meaningful biomarkers or early signals when we have better techniques. That way, each time models/methods improve, prior data becomes newly valuable. Perhaps the same could be said of raw data/readings from instruments running standard tests as well (as opposed to just the final results).
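
A minimal sketch of what that kind of banking could look like (names and fields are made up for illustration, not an existing standard):

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class BankedReading:
        subject_id: str          # pseudonymous subject identifier
        collected_at: datetime   # when the sample/reading was taken
        instrument: str          # e.g. "sequencer_run" or "cbc_analyzer"
        raw_payload_uri: str     # pointer to the full raw output (FASTQ, waveform, ...)
        pipeline_version: str    # how it was processed at the time, if at all

    # Years later, a new model or biomarker can be evaluated retroactively by
    # re-reading the stored raw payloads, without drawing new blood.
    def reanalyze(bank: list[BankedReading], new_model) -> dict[str, float]:
        return {r.subject_id: new_model(r.raw_payload_uri) for r in bank}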

I'd be really curious to see how longitudinal results of sequencing + data banking, plus other routine bloodwork, could lead to early detection and better health outcomes.

rscho · last Friday at 11:16 PM

Last time someone tried to inject chips into the bloodstream, public opinion didn't handle it too well. It's the same as how we would learn a lot by being crueler to research animals, but most people have other priorities. Good or bad? Who knows? Research meets social constructs.

gleenn · last Friday at 11:10 PM

Someone should add extra sensing to all those diabetes sensors people have in their arms all day and collect general info. It would obviously bias towards diabetics, but that's like half the US population anyway, so maybe it wouldn't matter that much.

melagonster · yesterday at 5:30 AM

If you can make the detector that cheap, doctors will love you!

im3w1l · yesterday at 3:00 AM

I think it's a very interesting approach and I highly support such an initiative. The easiest way to get a lot of data out of the body is probably to tap the body's own monitoring system - the sensory nerves.

A chemosensor also sounds like a useful thing; it should give concentration over time. The minimally invasive option would be to monitor breath, though the signal would be better in blood.