Near-Linear MIR Algorithms for Stochastically-Ordered Priors
Near-Linear MIR Algorithms for Stochastically-Ordered Priors
With the rise of online applications, recommender systems (RSs) often encounter constraints in balancing exploration and exploitation. Such constraints arise when exploration is carried out by agents whose utility must be taken into account when optimizing overall welfare. A recent work by Bahar et al. (2020) suggests that recommendations should be \emph{mechanism-informed individually rational} (MIR). Specifically, if agents have a default arm they would use, relying on the RS should yield each agent at least the reward of the default arm, conditioned on the information available to the RS. Under the MIR constraint, striking a balance between exploration and exploitation becomes a complex planning problem. To that end, Bahar et al. propose an approximately optimal yet inefficient planning algorithm that runs in $O(2^K K^2 H^2)$, where $K$ is the number of arms and $H$ is the size of the support of the reward distributions. In this paper, we make a significant improvement for a special yet practical case, removing both the dependence on $H$ and the exponential dependence on $K$. We assume a stochastic order of the rewards (e.g., Gaussian with unit variance, Bernoulli, etc.), and devise an asymptotically optimal algorithm with a runtime of $O(K \log K)$. Our technique is based on formulating a Goal Markov Decision Process (GMDP), establishing an optimal dynamic programming procedure, and then unveiling its crux -- fleshing out a simple index-based structure that facilitates efficient computation. Additionally, we present an incentive-compatible version of our algorithm.
Moshe Tennenholtz、Gal Bahar、Omer Ben-Porat、Kevin Leyton-Brown
计算技术、计算机技术
Moshe Tennenholtz,Gal Bahar,Omer Ben-Porat,Kevin Leyton-Brown.Near-Linear MIR Algorithms for Stochastically-Ordered Priors[EB/OL].(2025-07-06)[2025-07-16].https://arxiv.org/abs/2502.12766.点此复制
评论