New methods to generate massive synthetic networks
Lenwood Heath, Naren Ramakrishnan
Abstract
One of the biggest needs in network science research is access to large realistic datasets. As data analytics methods permeate a range of diverse disciplines (e.g., computational epidemiology, sustainability, social media analytics, biology, and transportation), network datasets that exhibit the characteristics encountered in each of these disciplines become paramount. The key technical issue is generating synthetic topologies with arbitrary, pre-specified degree distributions. Existing methods are limited in their ability to faithfully reproduce macro-level characteristics of networks while at the same time respecting particular degree distributions. We present a suite of three algorithms that exploit the principle of residual degree attenuation to generate synthetic topologies that adhere to macro-level real-world characteristics. By evaluating these algorithms on several real-world datasets, we demonstrate their ability to faithfully reproduce network characteristics such as node degree, clustering coefficient, hop length, and k-core structure distributions.
Publication Details
- Date of publication: May 23, 2017
- Journal: Cornell University
- Publication note: Malay Chakrabarti, Lenwood S. Heath, Naren Ramakrishnan: New methods to generate massive synthetic networks. CoRR abs/1705.08473 (2017)