12月31日论文推荐(附下载地址)
论文名:
Optimal Distributed Submodular Optimization via Sketching
作者:
MohammadHossein Bateni (Google)
Hossein Esfandiari (Harvard University)
Vahab Mirrokni (Google)
推荐理由:
“Optimal Distributed Submodular Optimization via Sketching”是Google纽约团队的工作,这篇文章提出了一个针对Submodular优化的分布式算法。Submodular是数学、数据挖掘、优化等很多领域中的一个共性问题,早先几年在社交网络、尤其是影响力最大化传播中使用非常多,当然传统的数学问题就是Set Cover。
Submodular比较流行是因为它虽然是一个NP难问题,但能找到一个非常简单的贪婪算法,并且能够保证很好的最优效果的近似(大约54-66%)效果。这篇文章是提出一个分布式算法,算法保证了很好的空间复杂度、优化效果。下图给出了不同submodular问题下文章方法和传统方法在理论上的比较结果,这是一个非常有意思而且很Solid的结果。其中Dominating Set就是影响力最大化的基础问题。
摘要:
We present distributed algorithms for several classes of submodular optimization problems such as k-cover, set cover, facility location,and probabilistic coverage.The new algorithms enjoy almost optimal space complexity, optimal approximation guarantees, optimal communication complexity (and run in only four rounds of computation), addressing major shortcomings of prior work. We first present a distributed algorithm for k-cover using only O˜(n) space per machine, and then extend it to several submodular optimization problems, improving previous results for all the above problems—e.g., our algorithm for facility location problem improves the space of the best known algorithm (Lindgren et al. [26]).Our algorithms are implementable in various distributed frameworks such as MapReduce and RAM models. On the hardness side we demonstrate thee limitations of uniform sampling via an information theoretic argument.
Furthermore, we perform an extensive empirical study of our algorithms (implemented in MapReduce) on a variety of datasets. We observe that using sketches 30–600 times smaller than the input, one can solve the coverage maximization problem with quality very close to that of the state-of-the-art single-machine algorithm. Finally, we demonstrate an application of our algorithm in largescale feature selection.