Moving beyond the traditional paradigms of "Thinking with Text" (e.g., Chain-of-Thought) and "Thinking with Images", we propose "Thinking with Video"—a new paradigm that unifies visual and textual ...
HoloScene leverages a comprehensive interactive scene-graph representation, encoding object geometry, appearance, and physical properties alongside hierarchical and inter-object relationships.