High-Fidelity Image Inpainting with Multimodal Guided GAN Inversion

Source: arXiv
Abstract

Generative Adversarial Network (GAN) inversion has demonstrated excellent performance in image inpainting, which aims to restore lost or damaged image texture using the unmasked content. Previous GAN inversion-based methods usually use well-trained GAN models as effective priors to generate realistic content for missing holes. Despite their strong results, they ignore a hard constraint: the unmasked regions of the input and the output should be identical. This leaves a gap between GAN inversion and image inpainting and degrades performance. In addition, existing GAN inversion approaches often consider only a single modality of the input image, neglecting other auxiliary cues in images that could improve results. To address these problems, we propose a novel GAN inversion approach for image inpainting, dubbed MMInvertFill. MMInvertFill consists primarily of a multimodal guided encoder with pre-modulation and a GAN generator with an F&W+ latent space. Specifically, the multimodal encoder enhances multi-scale structures with additional semantic segmentation and edge texture modalities through a gated mask-aware attention module. A pre-modulation step then encodes these structures into style vectors. To mitigate conspicuous color discrepancy and semantic inconsistency, we introduce the F&W+ latent space to bridge the gap between GAN inversion and image inpainting. Furthermore, to reconstruct faithful and photorealistic images, we devise a simple yet effective Soft-update Mean Latent module that captures more diverse in-domain patterns, generating high-fidelity textures for large corruptions. In extensive experiments on six challenging datasets, MMInvertFill qualitatively and quantitatively outperforms other state-of-the-art methods and effectively supports the completion of out-of-domain images.
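
The hard constraint mentioned in the abstract (unmasked pixels must be the same in the input and the output) can be made concrete with a small sketch. The code below is only an illustration of the constraint itself, not the MMInvertFill method: the function names `composite` and `unmasked_gap`, the nested-list image representation, and the convention that a mask value of 1 marks a missing pixel are all assumptions made for this example. It measures how far a generated image drifts from the input on known pixels, and shows that pasting the known pixels back trivially drives that drift to zero.

```python
def composite(inp, gen, mask):
    """Keep known pixels from `inp`; take generated pixels only inside the hole.

    Assumed convention: mask value 1 = missing (hole) pixel, 0 = known pixel.
    Images are nested lists of floats with identical shapes.
    """
    return [
        [g if m else x for x, g, m in zip(row_in, row_gen, row_mask)]
        for row_in, row_gen, row_mask in zip(inp, gen, mask)
    ]


def unmasked_gap(inp, out, mask):
    """Largest absolute difference between `inp` and `out` on known pixels.

    A value of 0 means the hard constraint holds exactly.
    """
    diffs = [
        abs(x - o)
        for row_in, row_out, row_mask in zip(inp, out, mask)
        for x, o, m in zip(row_in, row_out, row_mask)
        if not m
    ]
    return max(diffs, default=0.0)


# Toy 2x2 grayscale example with a single missing pixel (top right).
inp = [[0.2, 0.4], [0.6, 0.8]]
gen = [[0.25, 0.5], [0.55, 0.9]]   # hypothetical generator output
mask = [[0, 1], [0, 0]]

print(unmasked_gap(inp, gen, mask) > 0)                          # raw output drifts on known pixels
print(unmasked_gap(inp, composite(inp, gen, mask), mask) == 0)   # pasted output satisfies the constraint
```

Pasting known pixels back enforces the constraint only after generation and does nothing to make the filled region agree in color or semantics with its surroundings, which is the kind of discrepancy the abstract says the F&W+ latent space is meant to mitigate.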

Libo Zhang, Yongsheng Yu, Jiali Yao, Heng Fan

Computing technology, computer technology

Libo Zhang, Yongsheng Yu, Jiali Yao, Heng Fan. High-Fidelity Image Inpainting with Multimodal Guided GAN Inversion [EB/OL]. (2025-04-17) [2025-05-12]. https://arxiv.org/abs/2504.12844.
