vessenes • last Wednesday at 7:54 PM

I'm not convinced it's end-to-end multimodal. If it isn't, there's a separate speech synthesis stage, and this would be part of that stage's output. You could test by having it sing or do some accents, or by having it talk back to you in an accent you give it.
