For those curious (and still locked out), here’s a direct comparison of Sora vs. the open-source leaders (HunyuanVideo, Mochi and LTX):
https://app.checkbin.dev/snapshots/1f0f3ce3-6a30-4c1a-870e-2...
Pros:
- Some of the Sora results are absolutely stunning. Check out the detail on the lion, for example!
- The landscapes and aerial shots are absolutely incredible.
- Quality is much better than Mochi & LTX out of the box. Mochi/LTX seem to require specifically optimized workflows (I've seen great img2vid LTX results on Reddit that start with Flux image generations, for example). Hunyuan seems comparable to Sora!
Cons:
- Still nearly impossible to access Sora despite the “launch”. My generations today were in the 2000s, implying that it’s only open to a very small number of people. There’s no API yet, so it’s not an option for developers.
- Sora struggles with physical interactions. Watch the dancers moonwalk, or the ball go through the dog. HunyuanVideo seems to be a bit better in this regard.
- Can't run it locally (obviously).
- I haven't tested this, but I think it's safe to assume Sora will be censored extensively. HunyuanVideo is surprisingly open (I've seen NSFW generations!)
- I’m getting weird camera angles from Sora, but that could likely be solved with better prompting.
Overall, I’d say it’s the best model I've played with, though I haven’t spent much time on other non-open-source ones. Hunyuan gives it a run for its money, though!
Every day that passes I grow fonder of Google's decision to delay or otherwise keep a lot of this under wraps.
The other day I was scrolling down on YouTube Shorts and a couple of videos evoked an uncanny valley response from me (I think it was a clip of an unrealistically large snake covering some hut), which was somehow fascinating and strange and captivating. Then, scrolling down a few more, again I saw something kind of "unbelievable"... I saw a comment or two saying it's fake, and upon closer inspection: yeah, there were enough AI-esque artifacts that one could confidently conclude it's fake.
We'd known about AI slop permeating Facebook -- usually a Jesus figure made out of an unlikely set of things (like shrimp!) -- and we'd known that it grips eyeballs. And I don't even know in which box to categorize this; in my mind it conjures the image of those people on slot machines, mechanically and soullessly pulling levers because they are addicted. It's just so strange.
I can imagine now some of the conversations that might have happened at Google when they chose to keep a lot of innovations related to genAI under wraps (I'm being charitable here about their motives), and I can't help but agree.
And I can't help but be saddened by OpenAI's decision to unload a lot of this before recognizing the results of unleashing it on humanity, because I'm almost certain it'll be used more for bad things than good things, and I'm certain its application to bad things will secure more eyeballs than its application to good things.
What I desperately need is a model that generates perfectly made PowerPoint slides. I have to create many presentations for management, and it’s a very time-consuming task. It’s easy to outline my train of thought and let an LLM write the full text, but then creating a convincing presentation slide by slide takes days.
I know there is Beautiful.ai or Copilot for PowerPoint, but none of the existing tools really work for me because the results and the user flow aren’t convincing.
Wow this is bad. And by bad I mean worse than leading open source and existing alternatives.
Is it me or does it seem like OpenAI revolutionized with both ChatGPT and Sora, but they've completely hit the ceiling?
Honestly a bit surprised it happened so fast!
Their example videos of the doors opening (https://openai.com/sora/) are hilarious.
1. The first set of doors doesn't have any doorknobs or handles. https://ibb.co/PwqfzBq
2. The second set of doors has handles, and some very large/random hinges on the left door. https://ibb.co/JkDtc6r
3. The third set doesn't have any handles, but I can forgive that, because we're in a spaceship now. The problem is that the inside of the doors seems to have windows, but the outside of the doors doesn't have any windows. https://ibb.co/nwpXmtq & https://ibb.co/wr6v2g1
4. The best/most hilarious part for me. The doors have handles, but they are on the hinge side of the door. No idea how this would work. https://ibb.co/gWXDcfr
I feel like there is a sweet spot for AI generation of images and videos that I would describe as "charmingly bad", like the stuff we got from the old CLIP+VQGAN models. I feel like Sora has jumped past that into the valley of "unappealingly bad".
Technically it's amazing that this is possible at all. Yet I don't see how the world is better off for it on net. Aside from eliminating jobs in FX/filming/acting/set design/etc, what do we really gain? Amateur filmmakers can be more powerful? How about we put the same money into a fund for filmmakers to access. The negatives are plentiful, from the mundane reduction of our media to monolithic simulacra to putting the nail in the coffin for truth to exist unchallenged, let alone the 'fine tunes' that will continue to come for deepfakes that are literal (sexual) harassment.
Humans are not built for this power to be in the hands of everyone with low friction.
Not available in France yet. I'd be interested to know if it's a matter of progressive rollout, or some form of legislation (EU or otherwise?) that's making OpenAI cautious. Something like the EU AI Act [1]?
In a sane world, any video produced by Sora would be required to have a form of watermarking that's on par with what intellectual property owners require.
We've put people in jail for sharing copyrighted movies, and I don't see why we would refrain from mandating that AI-generated videos have some caption that says, I don't know, "This video was generated with AI"?
People would not respect the mandate, and we would consider that illegal, and use the monopoly on force to take money out of their bank account.
I know, it sounds mad and soooo 20th century - maybe that's why OpenAI overlords are not deeming peasants in France worthy of "a cat in a suit drinking coffee in an office" and "you'll never believe what the other candidate is doing to your kids".
[1] https://www.imatag.com/blog/ai-act-legal-requirement-to-labe...
EDIT: apparently some form of watermarking is built in (but it's not obvious in the examples, for some reason.)
> While imperfect, we’ve added safeguards like visible watermarks by default, and built an internal search tool that uses technical attributes of generations to help verify if content came from Sora.
Link should be the announcement post: https://openai.com/index/sora-is-here/
> We’re introducing our video generation technology now to give society time to explore its possibilities and co-develop norms and safeguards that ensure it’s used responsibly as the field advances.
That's an interesting way of saying "we're probably gonna miss some stuff in our safety tools, so hopefully society picks up the slack for us". :)
A bit off-topic, but how much does a 4-letter (or shorter) .com go for these days? I wonder if they bought this via an intermediary so that the seller wouldn't see "OpenAI" and tack on a few zeros.
edit: previously, this thread pointed to sora.com
For the $20/month subscription: you get 50 generations a month. So it is included in your subscription already! Nice.
For the Pro $200/month subscription: you get unlimited generations a month (on a slower queue).
Who is the audience for this product? A lot of people like video because it's a way of experiencing something they currently cannot for one reason or another. People don't want to see arbitrary fake worlds or places on earth that aren't real, unless it's a video game or something. But I see this product being used primarily to trick Facebook users.
I guess the CGI industry implications are interesting, but look at the waves behind the AI-generated man. They don't break so much as dissolve into each other. There's always a tell. These aren't GPU-generated versions of reality with thought behind the effects.
Raises billions of dollars, claims AGI by 2025, cannot handle new user sign-up traffic.
“Sora is here”
No it’s not. I’ve been trying to access all day: “Sora account creation is temporarily unavailable. We're currently experiencing heavy traffic and have temporarily disabled Sora account creation. If you've never logged into Sora before, please check back again soon.”
A little worried about how young children watching these videos may develop inaccurate impressions of physics in nature.
For instance, that ladybug looks pretty natural, but there's a little glitch in there that an unwitting observer, who's never seen a ladybug move before, may mistake as being normal. And maybe it is! And maybe it isn't?
The sailing ship - are those water movements correct?
The sinking of the elephant into snow - how deep is too deep? Should there be snow on the elephant or would it have melted from body heat? Should some of the snow fall off during movement or is it maybe packed down too tightly already?
There's no way to know because they aren't actual recordings, and if you don't know that, and this tech improves leaps and bounds (as we know it will), it will eventually become published and will be taken at face value by many.
Hopefully I'm just overthinking it.
Many people say:
> these things will get bigger and better much faster than we can learn to discern
I would like to ask “Why?”
Clearly, these models are just one case of “NN can learn to map anything from one domain to another” and with enough training/overfitting they can approximate reality to a high degree.
But, why would it get better to any significant extent?
Because we can collect an infinite amount of video? Because we can train models to the point where they become generative video compression algorithms that have seen it all?
As there was no mention of an API for either Sora or o1 Pro, I think this launch further marks OpenAI’s transition from an infrastructure company to a product company.
“Right before the TikTok ban goes into effect” is incredible market timing for the release of a tool that is useless for anything other than terrible TikTok spam videos
Who legitimately asked for, or wants this? It's cool on its face, sure.
What legitimate problem does it solve? Isn't AI supposed to make our lives easier, or is that just "not what it's supposed to be bro", or whatever. I've lost track at this point with all the hallucinations and poor/bad/really fucking bad responses. It's not 100% of the time, but that's the point of companies like OpenAI releasing stuff like this to the public... to be helpful and believable.
Deep fakes were bad enough. Shit like this is not helpful when given to the largely ignorant public. It's not going to be used for anything helpful, conducive, or otherwise beneficial.
It's impressive. Sure. I just fail to see what it's the solution to.
Not available in
> the United Kingdom, Switzerland and the European Economic Area. We are working to expand access further in the coming months
Excellent to announce this lack of access after the launch of Pro. At least I have no business reason for Sora, so it's not much of a loss there, but it's annoying nonetheless.
Here's something I find interesting: We have multiple paid accounts with OpenAI. In other words, we are paying customers. I have yet to see a single announcement or new development that we learn about through email. In most cases we learn these things when they get covered by some online outfit, posted on HN, etc.
OpenAI isn't the only company that seems to act in this manner. I find this interesting. Your paying customers actively want to know about what you are doing and, more than likely, would love to get a heads-up before the word goes out to the world. Hearing about things from third parties can make you feel like a company takes your business for granted or does not deem it important enough to feed you news when it happens.
Another example of this is Kickstarter, although, their problem is different. I have only ever backed technology projects on KS. That's all I am interested in. And yet, every single email they send is full of projects that don't even begin to approach my profile (built over dozens of backed projects). As a result of this, KS emails have become spam to be deleted without even reading them. This also means I have not backed projects I would have seriously considered and I don't frequent the site as much as I used to.
Getting back on topic: It will be interesting to see how Sora usage evolves.
“I've come up with a set of rules that describe our reactions to technologies:
1. Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.
2. Anything that's invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.
3. Anything invented after you're thirty-five is against the natural order of things.”
― Douglas Adams, The Salmon of Doubt: Hitchhiking the Galaxy One Last Time
It will not be available in the EU for now. I always feel disadvantaged when I read that sentence
From "12 Days of OpenAI: Day 3"
https://www.youtube.com/watch?v=2jKVx2vyZOY (live as of this comment)
In the not-so-distant future we might need some sort of regulation that forces uploaders (or content creators) to declare whether videos have been generated with AI tech, and depending on the content such a declaration might carry legal consequences. On the other side, hosting platforms should clearly display whether such content was declared AI-generated. Right now I can't see any other solution as simple and effective as this one for mitigating the spread of malicious content.
This is actually a different version from what they had before. What they released today is Sora Turbo.
There’s an ongoing related livestream[0].
Curious, what kinds of things are you all gonna make with Sora?
Personally, I think I'll just be making weird memes to send to my friends!
If we take HunyuanVideo, which is similar to Sora, as an example, they state that generating a 5-second video requires 5 minutes on 8xH100 GPUs. Therefore, if 10,000 users simultaneously want to generate a 5-second video within the same 5-minute window, you would need 80,000 H100 GPUs, which would cost around 2 billion USD in GPUs alone.
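To sanity-check that arithmetic, here is a rough back-of-the-envelope sketch. It only uses the figures quoted above (8 H100s for 5 minutes per 5-second clip, 10,000 simultaneous users, roughly 2 billion USD). The per-GPU price is not a quoted figure; it is simply what the stated total implies, and all variable names are made up for illustration.

```python
# Figures taken from the comment above; names are illustrative only.
GPUS_PER_JOB = 8                        # 8xH100 per 5-second video
MINUTES_PER_JOB = 5                     # wall-clock minutes per video
CONCURRENT_USERS = 10_000               # users generating in the same 5-minute window
QUOTED_TOTAL_COST_USD = 2_000_000_000   # "around 2 billion USD"


def gpus_needed(users: int, gpus_per_job: int) -> int:
    """GPUs required if every user's job runs at the same time."""
    return users * gpus_per_job


def gpu_minutes_per_video(gpus_per_job: int, minutes_per_job: int) -> int:
    """Total GPU time consumed by a single video."""
    return gpus_per_job * minutes_per_job


total_gpus = gpus_needed(CONCURRENT_USERS, GPUS_PER_JOB)
# Assumption: the per-GPU price is derived from the quoted total, not sourced.
implied_price_per_gpu = QUOTED_TOTAL_COST_USD / total_gpus

print(f"GPUs needed: {total_gpus:,}")
print(f"GPU-minutes per video: {gpu_minutes_per_video(GPUS_PER_JOB, MINUTES_PER_JOB)}")
print(f"Implied price per H100: ${implied_price_per_gpu:,.0f}")
```

So the 2 billion USD estimate works out to about $25,000 per H100, and each 5-second clip costs 40 GPU-minutes under these numbers. Whether Sora's actual serving cost looks anything like HunyuanVideo's is a separate question.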
Can't even log in. I get "Unexpected token '<', "<!DOCTYPE "... is not valid JSON"
God, I hate this crap. Every damn thing on the internet now is some variety of AI slop. Every image is some generative garbage, half the text is the kind of stilted half-accurate wordy gpt garbage, and now I get to dodge a gazillion generated videos of things that would be interesting if they were real and followed the laws of physics and continuity, and I get to do all that while living in a post-truth society, because there’s just too much bullshit out there for the average person to bother to sort through. You’d think after 20 years of this crap nerds would have some notion that technology has consequences, but nope, there’s a shiny thing we could build, so we’ve gotta build it.
Not to be picky, but Sora is here: https://geohack.toolforge.org/geohack.php?pagename=Sora,_Laz...
Welp, I can't even log in with my existing ChatGPT account because their servers are overloaded.
Wish they’d followed their previous MO of releasing stuff with no warning or buildup.
Results won’t match the hype.
If you're looking for video for casual personal projects or fill-ins for vlog posts, or something to make your PowerPoint look neat, this seems like a rad tool. It has a looong way to go before it's taking anyone's movie VFX job.
The mammoths are walking over some pre-existing footprints, but they don't leave any prints of their own. I guess I'm getting hung up on little things. For a prompt of a few words, it looks pretty nice!
Genuinely curious who is doing this for adult content?
Complaints about Sora's quality and prompt complexity are likely not as important to auteurs in that category, especially with the ability to load a custom character etc.
Account creation not available. Login to see more videos.
Classic OpenAI. I don't care, there are so many better alternatives to everything they do. Funny how quickly they have become irrelevant and lost their moat.
Not downplaying the amazing progress, but even the video showcases have some weird uncanny valley effects. The winged horse one in particular - the wings and legs morph, the wing on the left disappears and reappears through the tail.
This stuff is a little ways off, but still some amazing effects here. I think it will be a little bit before it is sufficient for production use in any real commercial situation. There's something unsettling about all of the videos generated here.
This seems pretty broken at the moment, I haven't actually managed to create a video, every prompt results in "There was an unexpected error running this prompt".
I wonder what it is about EU and UK law, in particular, that restricts its availability there. Their FAQs don't mention this.
If it's about training models on potentially personal information, the GDPR (EU and UK variants) kicks in, but then that hasn't restricted OpenAI's ability to deploy (Chat)GPT there. The same applies to broader copyright regulations around platforms needing to proactively prevent copyright violation, something GPT could also theoretically accomplish. Any (planned) EU-specific regulations don't apply to the UK, so I doubt it's those either.
The only thing that leaves, perhaps, is the generation of deepfakes, which both the UK and EU have laws about? But then why didn't that affect DALL-E? Anyone with a more detailed understanding of this space have any ideas?
How long until they fix the sign up issue? What an embarrassment. Why release something if you know it can't work properly? And why do we need to sign up when we already have an account with ChatGPT?
It was cool when they announced it but the novelty of generating a piece of AI video clipart is quickly fading, especially when it takes months or years to just get a demo in users' hands.
“The version of Sora we are deploying has many limitations. It often generates unrealistic physics and struggles with complex actions over long durations. Although Sora Turbo is much faster than the February preview, we’re still working to make the technology affordable for everyone.”
So they demo the full model and release the quantised and censored model.
Does anyone else find this kind of bait & switch distasteful?
"Sora is not available in The United Kingdom yet". Available elsewhere, from Albania to Zimbabwe. Any particular reason why?
I've found, using these and similar tools, that the number of prompts and iterations required to create my vision (the image or video in my mind) is very large, and often the tool is not able to create what I had originally wanted. A way to test this is to take a piece of footage or an image as the ground truth, and measure how much prompting and editing it takes to get the same or similar result starting from scratch. It is basically not possible with the current tech and finite amounts of time and iterations.