Real-time, full-duplex multimodal AI on your device

A 9B-parameter omni-modal model that sees, listens, and speaks simultaneously. Features full-duplex streaming (no turn-taking lag) and proactive interaction. Outperforms GPT-4o on vision benchmarks. Runs locally via llama.cpp and Ollama.