HeadlinesBriefing.com

ShapeR: 3D Shape Generation from Casual Captures

Hacker News: Front Page

Meta Reality Labs Research introduces ShapeR, a system for generating metric 3D shapes from casual image sequences. It uses off-the-shelf inputs like SLAM points and text captions to condition a rectified flow transformer, producing object-centric meshes for full scene reconstruction. The approach emphasizes multimodal conditioning and robust training to handle real-world noise.
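For readers unfamiliar with rectified flow, the sketch below illustrates the general sampling idea only: start from Gaussian noise and integrate a conditioned velocity field from t = 0 to t = 1 with Euler steps. It is not ShapeR's implementation. The names `toy_velocity` and `sample_rectified_flow` are hypothetical, and the closed-form velocity stands in for the learned transformer under the assumption that the target is a single known point.

```python
import random


def toy_velocity(x, t, cond):
    """Hypothetical stand-in for a learned, conditioned velocity network.

    Assumes the data distribution is the single point `cond`, in which case
    the ideal rectified-flow velocity at time t is (cond - x) / (1 - t).
    """
    return [(c - xi) / (1.0 - t) for xi, c in zip(x, cond)]


def sample_rectified_flow(velocity_fn, cond, num_steps=8, seed=0):
    """Euler-integrate dx/dt = v(x, t, cond) from noise (t=0) toward data (t=1)."""
    rng = random.Random(seed)
    # Start from standard Gaussian noise with the same dimension as `cond`.
    x = [rng.gauss(0.0, 1.0) for _ in cond]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t never reaches 1, so (1 - t) stays positive
        v = velocity_fn(x, t, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy "conditioning": a 3-vector the flow should transport the noise onto.
target = [0.5, -1.0, 2.0]
result = sample_rectified_flow(toy_velocity, target)
```

In a real system the velocity function would be a trained network that consumes the conditioning signals (here, per the summary, SLAM points and text captions) rather than a closed-form expression.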

Unlike monolithic scene fusion, ShapeR reconstructs individual objects, allowing for interaction and manipulation. It addresses limitations of single-view methods like SAM 3D, which can lack metric accuracy. ShapeR leverages multi-view geometric constraints to achieve consistent layouts and scales without user input, even when trained solely on synthetic data.

The system generalizes to non-Aria data, reconstructing objects in ScanNet++ scenes and from monocular iPhone captures using tools like MapAnything. A new evaluation dataset with 178 objects across 7 scenes is released to test in-the-wild challenges. The authors suggest combining ShapeR's metric accuracy with SAM 3D's texture priors for future work.