List sampling for large graphs
- Authors
- Yousuf, Muhammad Irfan; Kim, Suhyun
- Issue Date
- 2018-03
- Publisher
- IOS PRESS
- Citation
- INTELLIGENT DATA ANALYSIS, v.22, no.2, pp.261 - 295
- Abstract
- Real-world graphs are massive and often prohibitively expensive to analyze. One solution is sampling: extracting a representative subgraph that faithfully reflects the original graph. Prior research has developed several sampling methods, but the samples they produce fail to match important properties of the original graph and preserve its topology poorly. We observed that existing methods do not explore the neighborhood of sampled nodes fairly and hence yield suboptimal samples. In this paper, we introduce a novel approach in which we keep a list of candidate nodes populated with all the neighbors of the nodes sampled so far. With this approach, we can balance the depth and breadth of graph exploration to produce better samples. We evaluate our approach on several real-world datasets and show that it surpasses existing state-of-the-art approaches in maintaining the properties of the original graph and retaining its structure. We also compute the Kolmogorov-Smirnov distance and the Jensen-Shannon distance for quantitative evaluation.
- Keywords
- Graph sampling; big graphs; social network analysis
- ISSN
- 1088-467X
- URI
- https://pubs.kist.re.kr/handle/201004/121630
- DOI
- 10.3233/IDA-163319
- Appears in Collections:
- KIST Article > 2018
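The candidate-list idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that the list holds all neighbors of the nodes sampled so far; drawing the next node uniformly at random from that list, restarting from a random unsampled node when the list is empty, and the names `list_sample` and `ks_distance` are assumptions made for this example. The second helper computes a plain two-sample Kolmogorov-Smirnov distance, which could be applied to, for instance, the degree sequences of the original and sampled graphs (this record does not state which properties the paper compares).

```python
import random
from bisect import bisect_right


def list_sample(adj, target_size, seed=None):
    """Sample nodes by drawing from a list of candidate neighbors.

    adj: dict mapping every node to an iterable of its neighbors.
    ASSUMPTION: the next node is drawn uniformly at random from the
    candidate list; the abstract does not specify the selection rule.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    target_size = min(target_size, len(nodes))
    sampled = set()
    candidates = []  # neighbors of sampled nodes, possibly with duplicates

    while len(sampled) < target_size:
        if candidates:
            # Swap-and-pop removes a random list entry in O(1).
            i = rng.randrange(len(candidates))
            candidates[i], candidates[-1] = candidates[-1], candidates[i]
            node = candidates.pop()
            if node in sampled:
                continue  # stale entry: node was sampled after being listed
        else:
            # ASSUMPTION: start (or restart) from a random unsampled node.
            node = rng.choice([n for n in nodes if n not in sampled])
        sampled.add(node)
        # Populate the list with all not-yet-sampled neighbors of the new node.
        candidates.extend(n for n in adj[node] if n not in sampled)
    return sampled


def ks_distance(xs, ys):
    """Largest gap between the empirical CDFs of two non-empty samples."""
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    return max(
        abs(bisect_right(xs, p) / len(xs) - bisect_right(ys, p) / len(ys))
        for p in points
    )
```

Because a node with many sampled neighbors appears in the list several times, this sketch favors such nodes; whether the published method weights candidates this way is not stated in the abstract.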