HF-VTON: High-Fidelity Virtual Try-On via Consistent Geometric and Semantic Alignment
Virtual try-on technology has become increasingly important in the fashion and retail industries, enabling the generation of high-fidelity garment images that adapt seamlessly to target human models. While existing methods have achieved notable progress, they still face significant challenges in maintaining consistency across different poses. Specifically, geometric distortions lead to a lack of spatial consistency, mismatches in garment structure and texture across poses result in semantic inconsistency, and the loss or distortion of fine-grained details diminishes visual fidelity. To address these challenges, we propose HF-VTON, a novel framework that ensures high-fidelity virtual try-on performance across diverse poses. HF-VTON consists of three key modules: (1) the Appearance-Preserving Warp Alignment Module (APWAM), which aligns garments to human poses, addressing geometric deformations and ensuring spatial consistency; (2) the Semantic Representation and Comprehension Module (SRCM), which captures fine-grained garment attributes and multi-pose data to enhance semantic representation, maintaining structural, textural, and pattern consistency; and (3) the Multimodal Prior-Guided Appearance Generation Module (MPAGM), which integrates multimodal features and prior knowledge from pre-trained models to optimize appearance generation, ensuring both semantic and geometric consistency. Additionally, to overcome data limitations in existing benchmarks, we introduce the SAMP-VTONS dataset, featuring multi-pose pairs and rich textual annotations for a more comprehensive evaluation. Experimental results demonstrate that HF-VTON outperforms state-of-the-art methods on both VITON-HD and SAMP-VTONS, excelling in visual fidelity, semantic consistency, and detail preservation.
Ming Meng, Qi Dong, Jiajie Li, Zhe Zhu, Xingyu Wang, Zhaoxin Fan, Wei Zhao, Wenjun Wu
Garment industry, footwear industry
Ming Meng, Qi Dong, Jiajie Li, Zhe Zhu, Xingyu Wang, Zhaoxin Fan, Wei Zhao, Wenjun Wu. HF-VTON: High-Fidelity Virtual Try-On via Consistent Geometric and Semantic Alignment [EB/OL]. (2025-05-26) [2025-06-15]. https://arxiv.org/abs/2505.19638.