AI Multimodal

Process audio, images, videos, and documents, and generate images and videos, using Google Gemini's multimodal API.

Setup

Quick Start

Verify setup:

Analyze media:
- TIP: When you're asked to analyze an image, check if command is available, then use command. If command is not available, use command.

Generate content:

Stdin support: You can pipe files directly via stdin (auto-detects PNG/JPG/PDF/WAV/MP3).
-
- (traditional)

Models
- Image generation: (standard), (quality), (speed)
- Video generation: (8s clips with audio)
- Analysis: (recommended), (advanced)

Scripts
- : CLI orchestr…