DISTL: Distributed In-Memory Spatio-Temporal Event-based Storyline Categorization Platform in Social Media
Chang-Tien Lu
Abstract
Event analysis in social media is challenging because of the sheer volume of information generated daily. While current research has focused strongly on detecting events, there is no clear guidance on how the resulting storylines should be processed so that they make sense to a human analyst. In this paper, we present DISTL, an event processing platform that takes as input a set of storylines (each a sequence of entities and their relationships) and processes them as follows: (1) it uses different algorithms (LDA, SVM, information gain, rule sets) to identify events with different themes and allocates storylines to them; and (2) it combines the events with location and time to narrow them down to those that are meaningful in a specific scenario. The output comprises sets of events in different categories. DISTL uses in-memory distributed processing that scales to high data volumes and categorizes generated storylines in near real-time. It uses Big Data tools, such as Hadoop and Spark, which have been shown to be highly efficient in handling millions of tweets concurrently.
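The two processing steps described in the abstract can be illustrated with a minimal single-machine sketch. It is not the DISTL implementation: it uses only a keyword rule set as a stand-in for the LDA, SVM, and information-gain components, and plain Python in place of Spark. The `Storyline` fields, the `RULES` table, and the function names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Storyline:
    # Hypothetical record layout: entity names plus one location and timestamp.
    entities: tuple
    lat: float
    lon: float
    time: datetime


# Hypothetical rule set mapping a theme to its trigger keywords.
RULES = {
    "protest": {"protest", "march", "rally"},
    "crime": {"robbery", "shooting", "arrest"},
}


def categorize(storyline, rules=RULES):
    """Step 1: return every theme whose keywords overlap the storyline's entities."""
    words = {e.lower() for e in storyline.entities}
    return sorted(theme for theme, keywords in rules.items() if words & keywords)


def in_scope(storyline, bbox, start, end):
    """Step 2 predicate: inside the bounding box and the time window (inclusive)."""
    min_lat, min_lon, max_lat, max_lon = bbox
    return (min_lat <= storyline.lat <= max_lat
            and min_lon <= storyline.lon <= max_lon
            and start <= storyline.time <= end)


def process(storylines, bbox, start, end, rules=RULES):
    """Group storylines by theme, then keep only those in the scenario's scope."""
    themed = {}
    for s in storylines:
        for theme in categorize(s, rules):
            themed.setdefault(theme, []).append(s)

    events = {}
    for theme, members in themed.items():
        kept = [s for s in members if in_scope(s, bbox, start, end)]
        if kept:  # drop themes with no storyline left after filtering
            events[theme] = kept
    return events
```

Calling `process` with a list of storylines, a `(min_lat, min_lon, max_lat, max_lon)` bounding box, and a start and end `datetime` returns a dictionary from theme to the storylines that match both the theme and the scenario. In a distributed setting the same two stages would presumably map onto a per-record transformation followed by a filter, but that mapping is an assumption here rather than something the abstract specifies.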
Publication Details
Date of publication: April 25, 2016
Conference: GISTAM 2016