Research on Human Motion Aesthetic Assessment and Enhancement Methods
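Yu Xiaohai (郁小海)¹
Author information
- 1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876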
Abstract
Motion capture and motion generation technologies can efficiently produce large volumes of human motion data and are widely used in digital human animation, film and animation production, and embodied intelligent interaction. Existing methods optimize mainly for geometric accuracy and physical plausibility and pay little attention to aesthetic attributes such as coordination, sense of power, and artistic expressiveness, so the resulting motions can still look stiff or lifeless. To address this problem, this paper establishes a technical framework running from aesthetic definition to guided optimization, covering three levels: data construction, evaluation modeling, and generation optimization. At the data level, a human-in-the-loop iterative method for constructing motion aesthetics data is proposed: a vision-language model (VLM) fine-tuned with LoRA serves as a pre-annotator, and uncertainty-based filtering combined with manual verification increases the scale and consistency of the dataset. At the evaluation level, a Siamese ranking network based on physics-aware feature encoding and multi-scale temporal fusion is designed for fine-grained quantitative assessment of motion aesthetics. At the optimization level, a multi-objective collaboratively guided framework for motion aesthetic optimization is constructed; it uses a pretrained motion diffusion model as the generative prior and injects constraints such as aesthetic score, geometric consistency, and physical contact into the diffusion sampling process in a unified way. Experiments show that the fine-tuned VLM reaches 79.56% pre-annotation accuracy with a Cohen's Kappa of 0.89 against human annotations; the lightweight evaluation model achieves 92.22% average pairwise accuracy on multi-dimensional ranking tasks; and the optimization framework outperforms traditional methods in aesthetic score and overall quality metrics, effectively improving motion fluency and aesthetic quality.
Keywords
motion aesthetics/human-in-the-loop data construction/vision-language model/motion optimization/diffusion model
Cite this article
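Yu Xiaohai (郁小海). Research on Human Motion Aesthetic Assessment and Enhancement Methods [EB/OL]. (2026-03-30)[2026-03-31]. http://www.paper.edu.cn/releasepaper/content/202603-293.
Subject classification
Computing technology, computer technology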
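Illustrative note on the reported metrics. The abstract quotes a Cohen's Kappa between the VLM pre-annotator and human annotators, and an average pairwise accuracy for the ranking model. The following is a minimal Python sketch of the standard definitions of these two metrics, given only to make the numbers easier to interpret. It is not the author's evaluation code: the function names, the argument layout, and the label encoding are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators.

    labels_a / labels_b are hypothetical per-item labels, for example the
    VLM pre-annotation and the human annotation of the same motion pair.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def pairwise_accuracy(scores_first, scores_second, first_preferred):
    """Fraction of pairs whose predicted score order matches the preference.

    scores_first / scores_second are hypothetical model scores for the two
    motions in each pair; first_preferred[i] is True when the reference
    annotation prefers the first motion of pair i.
    """
    if not first_preferred:
        raise ValueError("at least one pair is required")
    correct = sum(
        (s1 > s2) == pref
        for s1, s2, pref in zip(scores_first, scores_second, first_preferred)
    )
    return correct / len(first_preferred)
```

For example, `cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1])` has observed agreement 0.75 and chance agreement 0.5, giving a kappa of 0.5. How the paper averages pairwise accuracy across aesthetic dimensions is not stated in the abstract, so that step is left out of the sketch.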