Learning to Defer in Congested Systems: The AI-Human Interplay

Source: arXiv
Abstract

High-stakes applications rely on combining Artificial Intelligence (AI) and humans for responsive and reliable decision making. For example, content moderation in social media platforms often employs an AI-human pipeline to promptly remove policy violations without jeopardizing legitimate content. A typical heuristic estimates the risk of incoming content and uses fixed thresholds to decide whether to auto-delete the content (classification) and whether to send it for human review (admission). This approach can be inefficient as it disregards the uncertainty in AI's estimation, the time-varying element of content arrivals and human review capacity, and the selective sampling in the online dataset (humans only review content filtered by the AI). In this paper, we introduce a model to capture such an AI-human interplay. In this model, the AI observes contextual information for incoming jobs, makes classification and admission decisions, and schedules admitted jobs for human review. During these reviews, humans observe a job's true cost and may overturn an erroneous AI classification decision. These reviews also serve as new data to train the AI but are delayed due to congestion in the human review system. The objective is to minimize the costs of eventually misclassified jobs. We propose a near-optimal learning algorithm that carefully balances the classification loss from a selectively sampled dataset, the idiosyncratic loss of non-reviewed jobs, and the delay loss of having congestion in the human review system. To the best of our knowledge, this is the first result for online learning in contextual queueing systems. Moreover, numerical experiments based on online comment datasets show that our algorithm can substantially reduce the number of misclassifications compared to existing content moderation practice.
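The fixed-threshold heuristic that the abstract describes as existing practice can be sketched as follows. This is only an illustration, not the authors' algorithm: the names `Decision` and `fixed_threshold_policy` and the threshold values 0.9 and 0.5 are hypothetical, and the risk score is assumed to be an estimate in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    auto_delete: bool     # classification decision
    send_to_review: bool  # admission decision


def fixed_threshold_policy(
    risk: float,
    delete_threshold: float = 0.9,  # hypothetical value, not from the paper
    review_threshold: float = 0.5,  # hypothetical value, not from the paper
) -> Decision:
    """Static heuristic: compare an estimated risk score to two fixed thresholds."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must lie in [0, 1]")
    return Decision(
        auto_delete=risk >= delete_threshold,
        send_to_review=risk >= review_threshold,
    )
```

Because both thresholds are constants, this rule cannot react to the uncertainty of the risk estimate, to time-varying arrivals and review capacity, or to the review backlog, which are the inefficiencies the abstract points out.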

Thodoris Lykouris, Wentao Weng

Computing Technology, Computer Technology

Thodoris Lykouris, Wentao Weng. Learning to Defer in Congested Systems: The AI-Human Interplay [EB/OL]. (2025-08-12) [2025-08-24]. https://arxiv.org/abs/2402.12237.
