Aman Ahuja, Edward Fox, Chandan Reddy
The proliferation of Internet-enabled smartphones has ushered in an era where events are reported on social media websites such as Twitter and Facebook. However, the short-text nature of social media posts, combined with the large volume of noise present in such datasets, makes event detection challenging. This problem can be alleviated by using other sources of information, such as news articles, that employ a precise and factual vocabulary and are more descriptive in nature. In this paper, we propose Spatio-Temporal Event Detection (STED), a probabilistic model to discover events, their associated topics, time of occurrence, and geospatial distribution from multiple data sources, such as news and Twitter. The joint modeling of news and Twitter enables our model to distinguish events from other noisy topics present in Twitter data. Furthermore, the presence of geocoordinates and timestamps in tweets helps find the spatio-temporal distribution of the events. We evaluate our model on a large corpus of Twitter and news data, and our experimental results show that STED can effectively discover events and outperforms state-of-the-art techniques.
Aman Ahuja, Ashish Baghudana, Wei Lu, Edward A. Fox, Chandan K. Reddy: Spatio-Temporal Event Detection from Multiple Data Sources. PAKDD (1) 2019: 293-305
- Date of publication: March 22, 2019
- Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining
- Page number(s): 293-305