Danai Koutra, Vasileios Koutras, Christos Faloutsos

Abstract

If Alice has double the friends of Bob, will she also have double the phone calls (or wall postings, or tweets)? Our first contribution is the discovery that the relative frequencies obey a power law (sub-linear or super-linear) across a wide variety of settings: tasks in a phone-call network, such as the count of friends, count of phone calls, and total count of minutes; and tasks in a Twitter-like network, such as the count of tweets and count of followees. Our second contribution is a full, digitized 2-d distribution, which we call the Almond-DG model because of the shape of its iso-surfaces. The Almond-DG model matches all our empirical observations: super-linear relationships among variables, and (provably) log-logistic marginals. We illustrate our observations on two large, real network datasets, spanning ~2.2M and ~3.1M individuals with 5 features each. We show how to use our observations to spot clusters and outliers, such as telemarketers in our phone-call network.
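To make the power-law claim concrete: if calls ≈ c · friends^α, then doubling the number of friends multiplies the expected number of calls by 2^α, which is more than double when α > 1 and less than double when α < 1. The sketch below estimates α by ordinary least squares on log-log values. It is only an illustration: the function name `fit_power_law`, the synthetic data, and the chosen exponent of 1.2 are assumptions made for this example, and the abstract does not say that this is the fitting procedure used in the paper.

```python
import math
import random


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(c) + alpha * log(x).

    Returns (alpha, c). Hypothetical helper, not taken from the paper.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line in log-log space is the exponent alpha.
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    alpha = sxy / sxx
    c = math.exp(mean_y - alpha * mean_x)
    return alpha, c


# Synthetic data only (assumed values, not the paper's datasets):
# calls = 3 * friends^1.2, perturbed by multiplicative log-normal noise.
rng = random.Random(0)
true_alpha, true_c = 1.2, 3.0
friends = [rng.randint(1, 500) for _ in range(2000)]
calls = [true_c * f ** true_alpha * math.exp(rng.gauss(0, 0.3)) for f in friends]

alpha, c = fit_power_law(friends, calls)
doubling_factor = 2 ** alpha  # how much calls grow when friends double

print(f"estimated alpha = {alpha:.3f}, c = {c:.3f}")
print(f"doubling friends multiplies calls by about {doubling_factor:.2f}")
```

With a super-linear exponent such as the one assumed here, the answer to the opening question is "more than double"; a fitted α below 1 would mean "less than double".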

Publication Details

Date of publication: January 1, 2013
Conference: PAKDD
Publisher: Springer Science + Business Media
Page number(s): 201–212