arXiv:1908.00236

Distributed Data Summarization in Well-Connected Networks

1 August 2019
Hsin-Hao Su
H. Vu
Abstract

We study distributed algorithms for some fundamental problems in data summarization. Given a communication graph $G$ of $n$ nodes, each of which may hold a value initially, we focus on computing $\sum_{i=1}^{N} g(f_i)$, where $f_i$ is the number of occurrences of value $i$ and $g$ is some fixed function. This includes important statistics such as the number of distinct elements, frequency moments, and the empirical entropy of the data. In the CONGEST model, a simple adaptation from streaming lower bounds shows that it requires $\tilde{\Omega}(D + n)$ rounds, where $D$ is the diameter of the graph, to compute some of these statistics exactly. However, these lower bounds do not hold for graphs that are well-connected. We give an algorithm that computes $\sum_{i=1}^{N} g(f_i)$ exactly in $\tau_G \cdot 2^{O(\sqrt{\log n})}$ rounds, where $\tau_G$ is the mixing time of $G$. This also has applications in computing the top $k$ most frequent elements. We demonstrate that there is a high similarity between the GOSSIP model and the CONGEST model in well-connected graphs. In particular, we show that each round of the GOSSIP model can be simulated almost perfectly in $\tilde{O}(\tau_G)$ rounds of the CONGEST model. To this end, we develop a new algorithm for the GOSSIP model that $(1 \pm \epsilon)$-approximates the $p$-th frequency moment $F_p = \sum_{i=1}^{N} f_i^p$ in $\tilde{O}(\epsilon^{-2} n^{1-k/p})$ rounds, for $p > 2$, when the number of distinct elements $F_0$ is at most $O\left(n^{1/(k-1)}\right)$. This result can be translated back to the CONGEST model with a factor $\tilde{O}(\tau_G)$ blow-up in the number of rounds.
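
To make the target quantity concrete, the following is a minimal centralized sketch of the statistic $\sum_{i=1}^{N} g(f_i)$ for the three examples named in the abstract. It is not the distributed algorithm from the paper: it assumes all values are already collected in one list, and the function names (`summarize`, `distinct_elements`, `frequency_moment`, `empirical_entropy`, `top_k`) are hypothetical, chosen for illustration. It also assumes $g(0) = 0$, so values that never occur contribute nothing, and it treats the total count $m$ as known when defining $g$ for the entropy.

```python
import math
from collections import Counter


def summarize(values, g):
    """Return the sum of g(f_i) over all values i that occur at least once.

    f_i is the number of occurrences of value i. Values with f_i = 0 are
    skipped, which matches the full sum whenever g(0) = 0 (an assumption).
    """
    freqs = Counter(values)
    return sum(g(f) for f in freqs.values())


def distinct_elements(values):
    # F_0: g(f) = 1 for every value with f > 0.
    return summarize(values, lambda f: 1)


def frequency_moment(values, p):
    # F_p: g(f) = f ** p.
    return summarize(values, lambda f: f ** p)


def empirical_entropy(values):
    # g(f) = -(f / m) * log2(f / m), with m the total number of values.
    m = len(values)
    return summarize(values, lambda f: -(f / m) * math.log2(f / m))


def top_k(values, k):
    # The k most frequent values together with their counts.
    return Counter(values).most_common(k)


if __name__ == "__main__":
    data = [1, 1, 2, 3, 3, 3]  # illustrative values held by six nodes
    print(distinct_elements(data))
    print(frequency_moment(data, 2))
    print(empirical_entropy(data))
    print(top_k(data, 1))
```

Each statistic differs only in the choice of $g$, which is why a single routine for $\sum_{i=1}^{N} g(f_i)$ covers all of them.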
