MILD: Multi-Layer Diffusion Strategy for Complex and Precise Multi-IP Aware Human Erasing

Source: arXiv
Abstract

Recent years have witnessed the success of diffusion models in image customization tasks. Prior works have achieved notable progress on human-oriented erasing using explicit mask guidance and semantic-aware inpainting. However, they struggle in complex multi-IP scenarios involving human-human occlusions, human-object entanglements, and background interference. These difficulties stem mainly from two factors: 1) dataset limitations, as existing datasets rarely cover dense occlusions, camouflaged backgrounds, and diverse interactions; and 2) a lack of spatial decoupling, where foreground instances cannot be effectively disentangled, which limits clean background restoration. In this work, we introduce a high-quality multi-IP human erasing dataset with diverse pose variations and complex backgrounds. We then propose Multi-Layer Diffusion (MILD), a novel strategy that decomposes generation into semantically separated pathways for each instance and the background. To enhance human-centric understanding, we introduce Human Morphology Guidance, which integrates pose, parsing, and spatial relations. We further present Spatially-Modulated Attention to better guide attention flow. Extensive experiments show that MILD outperforms state-of-the-art methods on challenging human erasing benchmarks.

Zhiyuan Ma, Yue Ma, Kaiqi Liu, Yuhan Wang, Jianjun Li, Jinghan Yu

Computing technology, computer technology

Zhiyuan Ma, Yue Ma, Kaiqi Liu, Yuhan Wang, Jianjun Li, Jinghan Yu. MILD: Multi-Layer Diffusion Strategy for Complex and Precise Multi-IP Aware Human Erasing [EB/OL]. (2025-08-05) [2025-08-24]. https://arxiv.org/abs/2508.06543.
