HeadlinesBriefing.com

DeepMind’s D4RT Offers Real‑Time 4D Scene Reconstruction

Google DeepMind Blog
DeepMind’s new D4RT model blends depth, motion, and camera tracking into a single Transformer, enabling real‑time 4D reconstruction from ordinary video.

Unlike legacy pipelines that stitch together separate depth, motion, and pose modules, D4RT answers a single query: where is a given pixel located in 3D space at a given time? According to the summary, this query‑driven design reduces inference to a few milliseconds on a single TPU.
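To make the single-query idea concrete, here is a minimal sketch of how depth lookup and point tracking could both be phrased as the same question. It is not DeepMind's code or API: `PointQuery`, `track_pixel`, `depth_at`, and `toy_decoder` are hypothetical names, and the toy decoder is a stand-in for the trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point3D = Tuple[float, float, float]


@dataclass(frozen=True)
class PointQuery:
    """Hypothetical query: where is pixel (u, v), seen at t_src, at time t_tgt?"""
    u: float    # pixel column in the source frame
    v: float    # pixel row in the source frame
    t_src: int  # frame index where the pixel is observed
    t_tgt: int  # frame index at which its 3D position is requested


# Any function that answers a PointQuery with a 3D point (assumed interface).
Decoder = Callable[[PointQuery], Point3D]


def track_pixel(decode: Decoder, u: float, v: float,
                t_src: int, num_frames: int) -> List[Point3D]:
    """Tracking: ask the same question once per target frame."""
    return [decode(PointQuery(u, v, t_src, t)) for t in range(num_frames)]


def depth_at(decode: Decoder, u: float, v: float, t: int) -> float:
    """Depth: query the pixel at its own frame and read off z."""
    return decode(PointQuery(u, v, t, t))[2]


def toy_decoder(q: PointQuery) -> Point3D:
    """Stand-in for the model: a point drifting along x at a fixed depth of 2.0."""
    return (q.u + 0.5 * (q.t_tgt - q.t_src), q.v, 2.0)


if __name__ == "__main__":
    print(track_pixel(toy_decoder, 10.0, 20.0, t_src=0, num_frames=3))
    print(depth_at(toy_decoder, 10.0, 20.0, t=0))
```

The point of the sketch is only that separate tasks collapse into different ways of filling in one query; the real model's inputs and outputs may differ.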

Benchmarks show D4RT running up to 300 times faster than the prior state of the art, processing a one‑minute video in about five seconds. Accuracy remains high: the model preserves the continuous movement of dynamic objects rather than duplicating them.
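As a quick sanity check on those figures, the arithmetic below works out what they imply. The function names are illustrative, and the baseline estimate assumes the 300‑fold speedup applies to the same one‑minute clip, which the source does not state.

```python
def realtime_factor(video_seconds: float, processing_seconds: float) -> float:
    """How many times faster than playback speed the footage is processed."""
    return video_seconds / processing_seconds


def implied_baseline_seconds(processing_seconds: float, speedup: float) -> float:
    """Processing time a method `speedup` times slower would need (assumption:
    the speedup is measured on the same clip)."""
    return processing_seconds * speedup


if __name__ == "__main__":
    # One minute of footage in five seconds.
    print(realtime_factor(60, 5))                  # 12.0
    # A 300-fold slower method on the same clip, in minutes.
    print(implied_baseline_seconds(5, 300) / 60)   # 25.0
```

In other words, five seconds per minute of video is about 12 times faster than playback, while a method 300 times slower would need roughly 25 minutes for the same clip.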

The practical gains extend across robotics, AR, and world‑model research. Robots could navigate crowded spaces with up‑to‑date spatial awareness, AR glasses could overlay graphics with low latency, and researchers move a step closer to AI systems that perceive physical reality.