mptest, today at 5:34 AM

No expert, more a hobbyist, but my understanding is that most serious people with longer timelines believe "embodiment" training data, i.e. data from robots operating in the world, is what they need to make the next step change in the growth of these models.

How best to get masses of data from robots operating in the real world is debated. Can you get there with Sim2Real, where, if you can create a physically sound enough sim, you can train your robots in the virtual world much more easily than in ours? See ... Eureka? DrEureka? I forget the main paper. Hand spinning a pen. A robot dog balancing on a rolling yoga ball. After a billion robots train for a million "years" in your virtual world, you just transfer the "brain" to a physical robot.

Jim Fan of Nvidia is one to follow there. Then there are the tele-operation believers. Then there are the mass-deployment-and-iterate believers (Musk's "self driving" rollout). There's also, IIRC, research that believes video games and video interpretation will be able to confer some of that data from operating in the real world, similar to how it's said transformers used the implicit structure of language to learn from unclean data; even properly ordered text has meaning embedded in its relative positional values.

That's just my summary of what I've seen from researchers who agree that scaling text and train time is old news. I mostly see them trying to figure out how to scale "embodied" AI data collection, or derive a VLA (vision-language-action) model in fancy ways (bigger training sets of robotic behavior around a standard robot form factor, maybe?). There are all types of avenues, but yes, most serious people recognize the need for "embodied" data, at least from what I've read.