
Subtrajectory Clustering and Coverage Maximization in Cubic Time, or Better

Source: arXiv
Abstract

Many application areas collect unstructured trajectory data. In subtrajectory clustering, the goal is to find patterns in this data by combining segmentation and clustering. We analyze two variants of this problem based on the well-known \textsc{SetCover} and \textsc{CoverageMaximization} problems. In both variants, the set system is induced by metric balls under the Fr\'echet distance centered at polygonal curves. Our algorithms focus on improving the running time of the update step of the generic greedy algorithm by means of a careful combination of sweeps through a candidate space. In the first variant, we are given a polygonal curve $P$ of complexity $n$, a distance threshold $\Delta$, and a complexity bound $\ell$, and the goal is to identify a minimum-size set of center curves $\mathcal{C}$ such that each center curve has complexity at most $\ell$ and every point $p$ on $P$ is covered. A point $p$ on $P$ is covered if it is part of a subtrajectory $\pi_p$ of $P$ such that there is a center $c\in\mathcal{C}$ whose Fr\'echet distance to $\pi_p$ is at most $\Delta$. We present an approximation algorithm for this problem with a running time of $O((n^2\ell + \sqrt{k_\Delta}n^{5/2})\log^2 n)$, where $k_\Delta$ is the size of an optimal solution. The algorithm gives a bicriteria approximation guarantee that relaxes the Fr\'echet distance threshold by a constant factor and the size of the solution by a factor of $O(\log n)$. The second variant asks for the maximum fraction of the input curve $P$ that can be covered using $k$ center curves, where $k\leq n$ is a parameter of the algorithm. Here, we show that our techniques lead to an algorithm with a running time of $O((k+\ell)n^2\log^2 n)$ and similar approximation guarantees. Note that in both algorithms $k,k_\Delta\in O(n)$, and hence the running time is cubic, or better if $k\ll n$.
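For orientation, the generic greedy algorithm that the abstract refers to can be sketched on an abstract, finite set system. This is a minimal illustration, not the authors' algorithm. It assumes the curve has already been discretized into integer element ids and that each candidate center comes with a precomputed set of elements it covers. The names `greedy_cover`, `candidates`, and `universe` are hypothetical. The inner loop is the naive "update step" that rescans every candidate, which is the step the paper's sweeps are meant to speed up.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple


def greedy_cover(
    candidates: Dict[Hashable, Set[int]],
    universe: Set[int],
    k: Optional[int] = None,
) -> Tuple[List[Hashable], Set[int]]:
    """Generic greedy rule on a finite set system (illustrative only).

    With k given, this is greedy coverage maximization: pick at most k sets.
    With k=None, this is greedy set cover: pick sets until the universe is
    covered or no candidate adds new coverage.
    """
    covered: Set[int] = set()
    chosen: List[Hashable] = []
    limit = len(candidates) if k is None else k

    while len(chosen) < limit and covered != universe:
        best, best_gain = None, 0
        # Naive update step: recompute every candidate's marginal gain.
        for name, elems in candidates.items():
            gain = len((elems & universe) - covered)
            if gain > best_gain:  # strict '>' keeps the first of any tie
                best, best_gain = name, gain
        if best is None:  # no candidate covers anything new
            break
        chosen.append(best)
        covered |= candidates[best] & universe

    return chosen, covered


if __name__ == "__main__":
    # Hypothetical toy instance: 7 discretized points, 4 candidate centers.
    toy = {"a": {0, 1, 2, 3}, "b": {3, 4, 5}, "c": {0, 1}, "d": {5, 6}}
    points = set(range(7))
    print(greedy_cover(toy, points, k=2))  # coverage maximization with k = 2
    print(greedy_cover(toy, points))       # set cover
```

In this toy setting, each round costs time proportional to the total size of all candidate sets, so the running time is dominated by repeated rescans. The paper's contribution, as stated in the abstract, lies in making this update step faster for the Fr\'echet-ball set system.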

Jacobus Conradi, Anne Driemel

Computing technology, computer technology

Jacobus Conradi, Anne Driemel. Subtrajectory Clustering and Coverage Maximization in Cubic Time, or Better [EB/OL]. (2025-04-24) [2025-07-16]. https://arxiv.org/abs/2504.17381.
