Hacker News

fdsf2, today at 1:30 AM

It baffles me that Gemini et al. don't have these standard video editing tools. Do the engineers seriously think prompting by text is the way people want videos to be generated? Nope. People want to customise. E.g., check out CapCut in the context of social media.

I've been trying to create a quick and dirty marketing promo via an LLM to visualise how a product will fit into people's lives. It is incredibly painful to 'hope and pray' that refining the text prompt will make slight adjustments come through.

The models are good enough if you are half-decent at prompting and have some patience. But given the amount invested, I would argue they are pretty disappointing. I've had to chunk the marketing promo into almost a frame-by-frame play to make it somewhat work.


Replies

suprstarrd, today at 1:44 AM

I don't like the idea of AI art, so take my words with a grain of salt, but my theory is that this input method exclusivity is intentional on their part, for exactly the reason you want the change. If you only let people making AI art communicate what they want through text or reference attachments (the latter of which they usually won't have), then they have to spend time figuring out how to put it into words. It IS painful to ask for those refinements, because any human would clearly understand them. In the end, those people get to say that they spent hours, days, or weeks refining "their prompt" to get a consistent and somewhat-okay-looking image; the engineers get to train their AI to better understand the context of what someone is saying; and all the while the company gets to further legitimize a false art form.