
Competing Bandits in Decentralized Contextual Matching Markets


Source: arXiv
Abstract

Sequential learning in a multi-agent, resource-constrained matching market has received significant interest in the past few years. We study decentralized learning in two-sided matching markets where the demand side (aka players or agents) competes for the supply side (aka arms) with potentially time-varying preferences to obtain a stable match. Motivated by the linear contextual bandit framework, we assume that for each agent, an arm-mean may be represented by a linear function of a known feature vector and an unknown (agent-specific) parameter. Moreover, the preferences over arms depend on a latent environment in each round, which varies across rounds in a non-stationary manner. We propose learning algorithms that identify the latent environment and obtain stable matchings simultaneously. Our proposed algorithms achieve instance-dependent logarithmic regret that scales independently of the number of arms, making them applicable to large markets.
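
As a rough illustration of the reward model described in the abstract (the notation below is ours and not necessarily the paper's), the mean reward that agent i obtains from arm k in round t could take a form such as

% A minimal sketch of one plausible reading of the linear contextual reward model.
% All symbols (x_k, theta_i, e_t, \mathcal{E}) are illustrative assumptions, not taken from the paper.
\[
  \mu_{i,k}(t) \;=\; \big\langle x_{k},\, \theta_i^{(e_t)} \big\rangle,
  \qquad e_t \in \mathcal{E},
\]
% where x_k is the known feature vector of arm k, theta_i^{(e)} is the unknown
% agent-specific parameter under latent environment e, and e_t is the latent
% environment active in round t, varying across rounds in a non-stationary manner.

Under this reading, each agent must jointly identify the latent environment and estimate its own parameter while competing with other agents for arms, and the resulting logarithmic regret guarantee is independent of the number of arms.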

Soumya Basu, Abishek Sankararaman, Avishek Ghosh, Satush Parikh

Subject: Computing Technology; Computer Technology

Soumya Basu, Abishek Sankararaman, Avishek Ghosh, Satush Parikh. Competing Bandits in Decentralized Contextual Matching Markets [EB/OL]. (2025-06-19) [2025-06-30]. https://arxiv.org/abs/2411.11794.
