ProDiF: Protecting Domain-Invariant Features to Secure Pre-Trained Models Against Extraction

Abstract

Pre-trained models are valuable intellectual property, capturing both domain-specific and domain-invariant features within their weight spaces. However, model extraction attacks threaten these assets by enabling unauthorized source-domain inference and facilitating cross-domain transfer via the exploitation of domain-invariant features. In this work, we introduce **ProDiF**, a novel framework that leverages targeted weight space manipulation to secure pre-trained models against extraction attacks. **ProDiF** quantifies the transferability of filters and perturbs the weights of critical filters in unsecured memory, while preserving actual critical weights in a Trusted Execution Environment (TEE) for authorized users. A bi-level optimization further ensures resilience against adaptive fine-tuning attacks. Experimental results show that **ProDiF** reduces source-domain accuracy to near-random levels and decreases cross-domain transferability by 74.65%, providing robust protection for pre-trained models. This work offers comprehensive protection for pre-trained DNN models and highlights the potential of weight space manipulation as a novel approach to model security.
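
The split between perturbed weights in unsecured memory and original weights held in the TEE can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the transferability metric or the perturbation, so the `scores` values below are made-up placeholders, Gaussian noise stands in for the optimized perturbation (the bi-level optimization is not modeled), and a plain dictionary named `vault` stands in for TEE storage. All function and variable names are hypothetical.

```python
import random


def protect_filters(filters, scores, k, noise_scale=1.0, seed=0):
    """Perturb the k highest-scoring filters; return (public_filters, vault).

    filters: list of filters, each a list of float weights.
    scores:  one placeholder "transferability" score per filter (assumed given).
    vault:   dict {filter index: original weights}, a stand-in for TEE storage.
    """
    rng = random.Random(seed)
    # Treat the k filters with the highest score as "critical".
    critical = sorted(range(len(filters)), key=lambda i: scores[i], reverse=True)[:k]
    # Keep untouched copies of the critical filters in the protected store.
    vault = {i: list(filters[i]) for i in critical}
    # The copy exposed in unsecured memory has its critical filters perturbed.
    public = [list(f) for f in filters]
    for i in critical:
        public[i] = [w + rng.gauss(0.0, noise_scale) for w in public[i]]
    return public, vault


def restore_filters(public, vault):
    """Authorized path: swap the protected originals back in."""
    restored = [list(f) for f in public]
    for i, original in vault.items():
        restored[i] = list(original)
    return restored


# Four toy "filters" with two weights each, plus made-up scores.
filters = [[0.1, 0.2], [0.5, -0.3], [0.0, 0.9], [-0.4, 0.4]]
scores = [0.2, 0.9, 0.1, 0.7]

public, vault = protect_filters(filters, scores, k=2)
restored = restore_filters(public, vault)
```

In this sketch an attacker who copies `public` obtains corrupted critical filters, while an authorized user who can read `vault` recovers the original weights exactly. Non-critical filters are left unchanged in both views.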

Authors: Tong Zhou, Shijin Duan, Gaowen Liu, Charles Fleming, Ramana Rao Kompella, Shaolei Ren, Xiaolin Xu

Subject classification: computing technology, computer technology

Recommended citation: Tong Zhou, Shijin Duan, Gaowen Liu, Charles Fleming, Ramana Rao Kompella, Shaolei Ren, Xiaolin Xu. ProDiF: Protecting Domain-Invariant Features to Secure Pre-Trained Models Against Extraction [EB/OL]. (2025-03-17) [2025-05-18]. https://arxiv.org/abs/2503.13224.
