logankilpatrick • last Saturday at 11:21 PM

> 4. If you read about a new Gemini model, you might want to use it - but are you using @google/genai, @google/generative-ai (wow finally deprecated) or @google-ai/generativelanguage? Silly mistake, but when nano banana dropped it was highly confusing that image gen was available only through one of these.

Yeah, I hear you. Open to suggestions to make this clearer, but it is @google/genai going forward. Switching packages sucks.

> Gemini supports video! But that video first has to be uploaded to "Google GenAI Drive" which then splices it into 1 FPS images and feeds it to the LLM. No option to improve the FPS, so if you want anything properly done, you'll have to splice it yourself and upload it to generativelanguage.googleapis.com which is only accessible using their GenAI SDK. Don't ask which one, I'm still not sure.

We have some work ongoing (should launch in the next 3-4 weeks) which will let you reference files (video included) from links directly so you don't need to upload to the File API. We do also support custom FPS: https://ai.google.dev/gemini-api/docs/video-understanding#cu...
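For a sense of what the sampling rate changes, here is a minimal sketch of which timestamps get sampled at a given FPS. This is plain arithmetic and not a Gemini API call: `sample_timestamps` is a hypothetical helper, and it assumes uniform sampling that starts at t=0.

```python
import math


def sample_timestamps(duration_s: float, fps: float = 1.0) -> list[float]:
    """Return the timestamps (in seconds) of frames sampled uniformly at `fps`.

    Hypothetical helper for illustration only. Assumes sampling starts at
    t=0 and that no frame is taken at the very end of the clip.
    """
    if duration_s < 0 or fps <= 0:
        raise ValueError("duration_s must be >= 0 and fps must be > 0")
    frame_count = math.ceil(duration_s * fps)
    return [i / fps for i in range(frame_count)]


# A 3-second clip at the 1 FPS mentioned in the quote vs. a custom 4 FPS.
print(len(sample_timestamps(3, fps=1)))
print(len(sample_timestamps(3, fps=4)))
```

Under these assumptions a higher FPS multiplies the frame count linearly, which is also roughly how the token cost would be expected to scale.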

> 6. Nice, it works. Let's try using live video. Open the docs, you get it mentioned a bunch of times but 0 documentation on how to actually do it. Only suggestions for using 3rd party services. When you actually find it in the docs, it says "To see an example of how to use the Live API in a streaming audio and video format, run the "Live API - Get Started" file in the cookbooks repository". Oh well, time to read badly written python.

Just pinged the team, we will get a live video example added here: https://ai.google.dev/gemini-api/docs/live?example=mic-strea... It should be live Monday. Not sure why that isn't there, sorry for the miss!

> 7. How about we try generating a video - open up AI studio, see only Veo 2 available from the video models. But, open up "Build" section, and I can have Gemini 3 build me a video generation tool that will use Veo 3 via API by clicking on the example. But wait why can't we use Veo 3 in the AI studio with the same API key?

We are working on adding Veo 3.1 to the dropdown. I think it is being tested by QA right now, and I pinged the team to get an ETA. It should be rolling out ASAP, sorry for the confusing experience. Hoping this is fixed by Monday EOD!

> 8. Every Veo 3 extended video has absolutely garbled sound and there is nothing you can do about it, or maybe there is, but by this point I'm out of willpower to chase down edgy edge cases in their docs.

Checking on this, haven't used extend a lot but will see if there is something missing we can clarify.

On some of the later points, I don't have enough domain expertise to weigh in, but I will forward them to folks on the Android / Play side to see what we can do to streamline things!

Thank you for taking the time to write up this feedback : ) hoping we can make the product better based on this.

Replies

thecupisblue • last Monday at 10:26 AM

Didn't catch in the updates that the custom FPS was released, amazing. Seems like the limit is just 20MB, but can use custom splitting for larger ones.
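A minimal sketch of the "custom splitting" idea, assuming the 20MB figure above is the relevant cap (read here as 20 MiB) and that the video has a roughly constant bitrate. `plan_segments` is a hypothetical helper that only plans time ranges; the actual cutting would be done by a separate tool.

```python
import math

# Assumption: the 20MB limit mentioned above, interpreted as 20 MiB.
SIZE_LIMIT_BYTES = 20 * 1024 * 1024


def plan_segments(
    size_bytes: int,
    duration_s: float,
    limit_bytes: int = SIZE_LIMIT_BYTES,
) -> list[tuple[float, float]]:
    """Plan (start, end) time ranges so each segment should fit the limit.

    Hypothetical helper. Assumes roughly constant bitrate, so bytes are
    taken to be proportional to duration; real segments should be
    re-checked after cutting.
    """
    if size_bytes <= 0 or duration_s <= 0 or limit_bytes <= 0:
        raise ValueError("size, duration and limit must all be positive")
    count = math.ceil(size_bytes / limit_bytes)
    step = duration_s / count
    return [(i * step, min((i + 1) * step, duration_s)) for i in range(count)]


# A 50 MiB, 90-second clip needs three segments under these assumptions.
print(plan_segments(50 * 1024 * 1024, 90))
```

With variable-bitrate footage a segment can still come out over the limit, so leaving some headroom below the cap is probably wise.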

Trying to split all videos into frames was a PITA mostly due to weird inputs from different Android phones requiring handling all kinds of edge cases, then uploading each to Upload API with retry was also adding a lag + complexity, so doing it all in one go will save me both time and nerves (and tokens).
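The per-frame "upload with retry" step described above can be sketched roughly like this. It is a generic exponential-backoff wrapper, not anything from a Google SDK: `with_retry` and its parameters are hypothetical, and the upload call itself is left to the caller.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying with exponential backoff on failure.

    Hypothetical helper. It catches every Exception for brevity; real code
    should retry only errors known to be transient (timeouts, 5xx, etc.).
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")
```

Usage would look like `with_retry(lambda: upload_frame(path))`, where `upload_frame` stands in for whatever upload function you already have. Every retry adds its backoff delay to the total latency, which matches the lag described above.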

Thanks for listening and for all the great work you do. Since you came in, the experience has improved by an immeasurable amount.
