Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments
Contextual linear multi-armed bandits are a learning framework for making a sequence of decisions, e.g., advertising recommendations for a sequence of arriving users. Recent works have shown that clustering these users based on the similarity of their learned preferences can significantly accelerate learning. However, prior work has primarily focused on the online setting, which requires continually collecting user data and ignores the offline data widely available in many applications. To address this limitation, we study the offline clustering of bandits (Off-ClusBand) problem: how to use an offline dataset to learn cluster properties and improve decision-making across multiple users. The key challenge in Off-ClusBand arises from data insufficiency: unlike the online case, we have a fixed, limited dataset to work from and thus must determine whether we have enough data to confidently cluster users together. To address this challenge, we propose two algorithms: Off-C$^2$LUB, which we analytically show performs well for arbitrary amounts of user data, and Off-CLUB, which is prone to bias when data is limited but, given sufficient data, matches a theoretical lower bound that we derive for the offline clustered MAB problem. We experimentally validate these results on both real and synthetic datasets.
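To make the clustering idea concrete, below is a minimal Python sketch of confidence-based user clustering from a fixed offline dataset. It is an illustration under our own assumptions, not the paper's Off-C$^2$LUB or Off-CLUB algorithms: the function names (`ridge_estimate`, `cluster_users`), the confidence-radius formula, and all constants are hypothetical. The sketch estimates each user's preference vector by ridge regression and links two users whenever their estimates are closer than the sum of their confidence radii.

```python
import numpy as np

def ridge_estimate(X, y, lam=1.0):
    """Per-user ridge regression: theta_hat = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y)

def conf_radius(n, d, lam=1.0, delta=0.1):
    """Illustrative O(sqrt(d * log(n) / n)) confidence radius; the paper's
    exact bound and constants differ."""
    return np.sqrt(d * np.log((1 + n / lam) / delta) / max(n, 1))

def cluster_users(datasets, lam=1.0, delta=0.1):
    """Cluster users as connected components: link users i, j when
    ||theta_i - theta_j|| <= radius_i + radius_j."""
    thetas, radii = [], []
    for X, y in datasets:
        thetas.append(ridge_estimate(X, y, lam))
        radii.append(conf_radius(len(y), X.shape[1], lam, delta))
    n_users = len(datasets)
    parent = list(range(n_users))  # union-find over users

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n_users):
        for j in range(i + 1, n_users):
            if np.linalg.norm(thetas[i] - thetas[j]) <= radii[i] + radii[j]:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n_users)]

# Usage on synthetic data: two well-sampled users and one data-poor user.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -0.5])
datasets = []
for n in (50, 40, 5):
    X = rng.normal(size=(n, 2))
    y = X @ theta_true + 0.1 * rng.normal(size=n)
    datasets.append((X, y))
print(cluster_users(datasets))
```

Note how this toy test behaves when data is scarce: an under-sampled user has a wide confidence radius and gets linked to nearly everyone, which illustrates the bias the abstract attributes to naive clustering with limited data and motivates an algorithm, like Off-C$^2$LUB, designed to perform well for arbitrary amounts of user data.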
Jingyuan Liu, Zeyu Zhang, Xuchuang Wang, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili, Carlee Joe-Wong
Computing Technology; Computer Technology
Jingyuan Liu, Zeyu Zhang, Xuchuang Wang, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili, Carlee Joe-Wong. Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments [EB/OL]. (2025-05-25) [2025-06-12]. https://arxiv.org/abs/2505.19043.