New open models unlock deep video comprehension with novel features like video tracking and multi-image reasoning, accelerating the science of AI into a new generation of multimodal intelligence.