
Generative Perception of Shape and Material from Differential Motion

Source: arXiv
Abstract

Perceiving the shape and material of an object from a single image is inherently ambiguous, especially when lighting is unknown and unconstrained. Despite this, humans can often disentangle shape and material, and when they are uncertain, they often move their head slightly or rotate the object to help resolve the ambiguities. Inspired by this behavior, we introduce a novel conditional denoising-diffusion model that generates samples of shape-and-material maps from a short video of an object undergoing differential motions. Our parameter-efficient architecture allows training directly in pixel-space, and it generates many disentangled attributes of an object simultaneously. Trained on a modest number of synthetic object-motion videos with supervision on shape and material, the model exhibits compelling emergent behavior: For static observations, it produces diverse, multimodal predictions of plausible shape-and-material maps that capture the inherent ambiguities; and when objects move, the distributions quickly converge to more accurate explanations. The model also produces high-quality shape-and-material estimates for less ambiguous, real-world objects. By moving beyond single-view to continuous motion observations, our work suggests a generative perception approach for improving visual reasoning in physically-embodied systems.
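The abstract page includes no code; as a rough illustration of the mechanism it describes (a pixel-space conditional denoising-diffusion model that predicts shape-and-material maps from a short object-motion video), the following is a minimal PyTorch sketch. All module names, channel counts, frame counts, and the toy denoiser are assumptions made for illustration, not the authors' actual architecture.

```python
# Minimal sketch (PyTorch) of a video-conditioned denoising-diffusion step.
# Hypothetical throughout: the tiny denoiser, the 7-channel map layout, and
# the 4-frame clip are illustrative stand-ins for the paper's architecture.
import torch
import torch.nn as nn

class TinyConditionalDenoiser(nn.Module):
    """Predicts the noise on a shape-and-material map, conditioned on video."""
    def __init__(self, map_channels=7, frames=4, hidden=64):
        super().__init__()
        # Condition encoder: fold the T observed RGB frames into channels.
        self.cond = nn.Conv2d(3 * frames, hidden, 3, padding=1)
        # Denoiser body: noisy map + condition features + timestep channel.
        self.body = nn.Sequential(
            nn.Conv2d(map_channels + hidden + 1, hidden, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, map_channels, 3, padding=1),
        )

    def forward(self, noisy_map, video, t):
        b, _, h, w = noisy_map.shape
        cond = self.cond(video.flatten(1, 2))          # (B, hidden, H, W)
        t_map = t.view(b, 1, 1, 1).expand(b, 1, h, w)  # broadcast timestep
        return self.body(torch.cat([noisy_map, cond, t_map], dim=1))

def training_step(model, maps, video, alphas_cumprod):
    """One DDPM-style step: noise the ground-truth map, regress the noise."""
    b = maps.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,))
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(maps)
    noisy = a.sqrt() * maps + (1 - a).sqrt() * noise
    pred = model(noisy, video, t.float() / len(alphas_cumprod))
    return nn.functional.mse_loss(pred, noise)

if __name__ == "__main__":
    model = TinyConditionalDenoiser()
    maps = torch.randn(2, 7, 64, 64)      # e.g. normals (3) + material (4)
    video = torch.randn(2, 4, 3, 64, 64)  # 4-frame differential-motion clip
    betas = torch.linspace(1e-4, 0.02, 1000)
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
    loss = training_step(model, maps, video, alphas_cumprod)
    loss.backward()
    print(float(loss))
```

At sampling time, such a model would be run in reverse from pure noise while holding the video condition fixed; with a static (single-frame) condition, repeated sampling would yield the diverse, multimodal shape-and-material explanations the abstract describes.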

Xinran Nicole Han, Ko Nishino, Todd Zickler

Subjects: Computing Technology; Computer Technology

Xinran Nicole Han, Ko Nishino, Todd Zickler. Generative Perception of Shape and Material from Differential Motion [EB/OL]. (2025-06-03) [2025-06-17]. https://arxiv.org/abs/2506.02473.
