Google's Gemini Omni turns images, audio, and text into video — and that's just the start | TechCrunch
When Google launched Gemini three years ago, the goal was to build a multimodal large language model: a single neural network that was trained on text, image, audio, and video and could…