List sampling for large graphs

Authors
Yousuf, Muhammad Irfan; Kim, Suhyun
Issue Date
2018-03
Publisher
IOS PRESS
Citation
INTELLIGENT DATA ANALYSIS, v.22, no.2, pp.261 - 295
Abstract
Real-world graphs are massive and often prohibitively expensive to analyze. One solution is sampling: extracting a representative subgraph that faithfully represents the original graph. Prior research has developed several sampling methods, but the samples they produce fail to match important properties of the original graph and preserve its topology poorly. We observed that existing methods do not explore the neighborhood of sampled nodes fairly and hence yield suboptimal samples. In this paper, we introduce a novel approach in which we keep a list of candidate nodes that is populated with all the neighbors of the nodes sampled so far. With this approach, we can balance the depth and breadth of graph exploration to produce better samples. We evaluate the effectiveness of our approach on several real-world datasets and show that it surpasses existing state-of-the-art approaches in maintaining the properties of the original graph and retaining its structure. We also calculate the Kolmogorov-Smirnov distance and the Jensen-Shannon distance for quantitative evaluation of our approach.
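The candidate-list idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not say how the next node is chosen from the list, so uniform random selection is an assumption here, as are the function names `list_sample`, `induced_degrees`, and `ks_distance`, the adjacency-dict input format, and the restart rule used when the list runs empty. The Kolmogorov-Smirnov distance is computed in its standard form, as the largest gap between two empirical CDFs, applied here to degree sequences.

```python
import random
from bisect import bisect_right


def list_sample(adj, n_samples, seed=None):
    """Sample nodes using a candidate list of neighbors of sampled nodes.

    adj: dict mapping each node to an iterable of its neighbors.
    ASSUMPTION: the next node is drawn uniformly at random from the list;
    the abstract does not specify the selection rule.
    """
    if not adj:
        return set()
    rng = random.Random(seed)
    start = rng.choice(list(adj))
    sampled = {start}
    # The list may hold duplicates: a node adjacent to several sampled
    # nodes appears several times and is therefore more likely to be drawn.
    candidates = list(adj[start])
    while len(sampled) < n_samples:
        if candidates:
            # Swap a random entry to the end so removal is O(1).
            i = rng.randrange(len(candidates))
            candidates[i], candidates[-1] = candidates[-1], candidates[i]
            node = candidates.pop()
            if node in sampled:
                continue
        else:
            # ASSUMPTION: if the list is exhausted (disconnected graph),
            # restart from a random unsampled node.
            remaining = [v for v in adj if v not in sampled]
            if not remaining:
                break
            node = rng.choice(remaining)
        sampled.add(node)
        candidates.extend(v for v in adj[node] if v not in sampled)
    return sampled


def induced_degrees(adj, nodes):
    """Degree of each node in the subgraph induced by `nodes`."""
    return [sum(1 for v in adj[u] if v in nodes) for u in nodes]


def ks_distance(xs, ys):
    """Largest absolute gap between the empirical CDFs of xs and ys."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for p in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, p) / len(xs)
        fy = bisect_right(ys, p) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


# Toy graph (hypothetical data, for illustration only).
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
sample = list_sample(adj, 4, seed=1)
full_degrees = [len(neighbors) for neighbors in adj.values()]
print(ks_distance(full_degrees, induced_degrees(adj, sample)))
```

Because every neighbor of every sampled node stays in the list, a draw can either extend the frontier outward or return to the neighborhood of an earlier node, which is one way to read the abstract's balance between depth and breadth of exploration.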
Keywords
Graph sampling; big graphs; social network analysis
ISSN
1088-467X
URI
https://pubs.kist.re.kr/handle/201004/121630
DOI
10.3233/IDA-163319
Appears in Collections:
KIST Article > 2018
