FailLite: Failure-Resilient Model Serving for Resource-Constrained Edge Environments

Source: arXiv
Abstract

Model serving systems have become popular for deploying deep learning models for various latency-sensitive inference tasks. While traditional replication-based methods have been used for failure-resilient model serving in the cloud, such methods are often infeasible in edge environments due to significant resource constraints that preclude full replication. To address this problem, this paper presents FailLite, a failure-resilient model serving system that employs (i) heterogeneous replication, where failover models are smaller variants of the original model, (ii) an intelligent approach that uses warm replicas to ensure quick failover for critical applications while using cold replicas, and (iii) progressive failover to provide low mean time to recovery (MTTR) for the remaining applications. We implement a full prototype of our system and demonstrate its efficacy on an experimental edge testbed. Our results using 27 models show that FailLite can recover all failed applications with 175.5 ms MTTR and only a 0.6% reduction in accuracy.

Li Wu, Walid A. Hanafy, Tarek Abdelzaher, David Irwin, Jesse Milzman, Prashant Shenoy

Computing technology, computer technology

Li Wu, Walid A. Hanafy, Tarek Abdelzaher, David Irwin, Jesse Milzman, Prashant Shenoy. FailLite: Failure-Resilient Model Serving for Resource-Constrained Edge Environments[EB/OL]. (2025-04-22)[2025-05-21]. https://arxiv.org/abs/2504.15856.