This is cool as hell.
I remember like seventeen years ago, Microsoft had "PhotoSynth", which would build 3D environments from a bunch of images, and seventeen-year-old tombert thought it was one of the most amazing things ever done on a computer.
Doing this with just one image makes this at least an order of magnitude cooler. I will be playing with this over the weekend.
I wonder if there is something similar but for creating isometric sprites? I burned through $30 yesterday realizing that I can't just get image gen to give me consistent isometric static/animated sprites... even the best image gen can't do this, and I'm just baffled by how much harder isometric sprites are than 3D mesh gen.
I'm at a crossroads: do I opt for 3D mesh isometrics, with higher hardware requirements on mobile phones, or stick to isometric sprites, which nobody seems to be generating reliably via AI? (Happy to be corrected here if anybody has found a way.)
I see it uses World Labs. I've tested it quite a bit and the results were not really usable; it hallucinated so many parts outside of the wall that made no sense. Hallucination would be fine if the result made sense, but if it doesn't, I'm not sure what the point of inputting a single image is. I've actually had better luck using gpt image 2 instead.
Is this in the same vein as TRELLIS?
https://github.com/Microsoft/TRELLIS
I've been trying to use this to generate 3d character models from images. I am enjoying 3d printing these models to mess with my kids.
Not much of what I've found runs on local models but I'm always on the lookout. Meshy.ai (mentioned here) offers really nice generation but the cost adds up quickly.
Curious about the actual architecture. From the outside it looks like Gaussian splatting anchored to roughly one viewpoint, since the moment you wander outside the original frame or behind an object, it becomes messy. But Ben Mildenhall is one of the co-founders and a NeRF co-author (https://arxiv.org/abs/2003.08934), so I'm betting that whatever they're doing is more interesting than naive splatting. Curious if OP can share anything about the pipeline.
So Blade Runner's Esper photo analysis went from ruining the suspension of disbelief to reality quicker than most magic.
My team is working in the character animation space which might complement this: https://uthana.com/
Example: https://uthana.com/app/preview/cXi2eAP19XwQ/mH7opbcqZE4P
What about creating 3d meshes from multiple photos of the same object?
Very cool.
May I ask if Claude is the only option for using the tool?
Sol Roth
I’m ready to make a game with this, or something similar. Open to suggestions on tooling and asset pipelines that utilize AI, if anyone has any suggestions or guides.
If you haven't tried AI modeling pipelines in the last year you'll be surprised.
The star of the show here is https://platform.worldlabs.ai/ (author works there, I don't) which is really good. There's also Meshy.ai (which this repo doesn't seem to use?) for non-scene stuff that's right up there in quality. There's texturing, auto-rigging, etc.
The latest VLMs (vision-language models) have true pixel-level image grounding, which means you can ask your AI for the pixel coordinates of things in an image, so you get 3D perception for edits and anything else you need.
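To make the grounding idea concrete, here's a minimal sketch of the consumer side: assuming you've prompted a VLM to answer with normalized coordinates as JSON (the prompt format, reply shape, and `parse_points` helper are all hypothetical, not any particular model's API), you can scale its answer back to pixel coordinates for use in an editing pipeline.

```python
import json
import re


def parse_points(vlm_reply: str, width: int, height: int):
    """Extract normalized [0, 1] points from a hypothetical VLM grounding
    reply and scale them to pixel coordinates in a width x height image.

    Assumes the model was prompted to include JSON of the form:
        {"points": [{"x": 0.42, "y": 0.17, "label": "door"}]}
    """
    # Models often wrap the JSON in chatty text, so pull out the first
    # {...} span before parsing.
    match = re.search(r"\{.*\}", vlm_reply, re.DOTALL)
    if not match:
        return []
    data = json.loads(match.group(0))
    return [
        (round(p["x"] * width), round(p["y"] * height), p.get("label"))
        for p in data.get("points", [])
    ]


reply = 'Sure! {"points": [{"x": 0.5, "y": 0.25, "label": "window"}]}'
print(parse_points(reply, 1024, 768))  # [(512, 192, 'window')]
```

Normalized coordinates are worth insisting on in the prompt: they stay valid if you resize the image before sending it to the model, and the scaling back to pixels is one multiply per axis.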
I'm actually surprised I don't see this stuff being used more. I think it's because most pipelines are hard-baked with the assumption that your 3D assets are files you get from an artist, not something you can imagine up in minutes with a script. The technology is moving faster than the industry can keep up with.