Lenwood Heath, Naren Ramakrishnan
One of the biggest needs in network science research is access to large, realistic datasets. As data analytics methods permeate a diverse range of disciplines (e.g., computational epidemiology, sustainability, social media analytics, biology, and transportation), network datasets that exhibit the characteristics encountered in each of these disciplines become paramount. The key technical issue is generating synthetic topologies with pre-specified, arbitrary degree distributions. Existing methods are limited in their ability to faithfully reproduce macro-level characteristics of networks while at the same time respecting particular degree distributions. We present a suite of three algorithms that exploit the principle of residual degree attenuation to generate synthetic topologies that adhere to macro-level real-world characteristics. By evaluating these algorithms on several real-world datasets, we demonstrate their ability to faithfully reproduce network characteristics such as the distributions of node degree, clustering coefficient, hop length, and k-core structure.
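To make the core task concrete, the sketch below builds a simple undirected graph that realizes a given degree sequence exactly. It uses the classical Havel-Hakimi construction, which is a textbook baseline and is not one of the three algorithms presented in the paper. The function names `havel_hakimi_graph` and `degree_sequence` are hypothetical helpers introduced only for this illustration. A construction like this matches node degrees but makes no attempt to reproduce macro-level properties such as clustering or k-core structure.

```python
from collections import Counter


def havel_hakimi_graph(degrees):
    """Build a simple undirected graph in which node i has degree degrees[i].

    Returns a set of edges (u, v) with u < v. Raises ValueError if the
    sequence cannot be realized by a simple graph (is not graphical).
    Illustrative baseline only; not the method from the paper.
    """
    if any(d < 0 for d in degrees) or sum(degrees) % 2 != 0:
        raise ValueError("degree sequence is not graphical")

    residual = list(degrees)  # degree each node still has to receive
    edges = set()
    while True:
        # Order nodes by decreasing residual degree (stable for ties).
        order = sorted(range(len(residual)), key=lambda v: -residual[v])
        if not order or residual[order[0]] == 0:
            break  # every node has reached its target degree
        u = order[0]
        need = residual[u]
        # Connect u to the next `need` nodes with the largest residual degree.
        targets = order[1:need + 1]
        if len(targets) < need or any(residual[v] == 0 for v in targets):
            raise ValueError("degree sequence is not graphical")
        for v in targets:
            edges.add((min(u, v), max(u, v)))
            residual[v] -= 1
        residual[u] = 0
    return edges


def degree_sequence(n, edges):
    """Return the degree of each node 0..n-1 in the given edge set."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return [counts[i] for i in range(n)]


target = [3, 3, 2, 2, 1, 1]
graph = havel_hakimi_graph(target)
realized = degree_sequence(len(target), graph)
```

Here `realized` can be compared against `target` to confirm that the degree sequence was reproduced. In practice the target sequence would first be sampled from the desired degree distribution.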
- Date of publication: May 23, 2017
- Cornell University
- Publication note: Malay Chakrabarti, Lenwood S. Heath, Naren Ramakrishnan: New methods to generate massive synthetic networks. CoRR abs/1705.08473 (2017)