Computing flood probabilities using Twitter: application to the Houston urban area during Harvey

7 December 2020

Etienne Brangbour

P. Bruneau

Stéphane Marchand-Maillet

Abstract

In this paper, we investigate the conversion of a Twitter corpus into geo-referenced raster cells, each holding the probability that the corresponding geographical area is flooded. We describe a baseline approach that combines a density ratio function, aggregation using a spatio-temporal Gaussian kernel function, and TF-IDF textual features. The features are transformed into probabilities using a logistic regression model. The described method is evaluated on a corpus collected after the floods that followed Hurricane Harvey in the Houston urban area in August-September 2017. The baseline reaches an F1 score of 68%. We highlight research directions likely to improve these initial results.
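As a rough illustration of two of the steps named in the abstract, the sketch below aggregates per-tweet feature vectors around one raster cell with a separable spatio-temporal Gaussian kernel, then maps the aggregated features to a probability with a logistic function. It is not the authors' implementation: the function names, the bandwidths (`sigma_km`, `sigma_hours`), the two-dimensional feature vectors standing in for TF-IDF features, and the logistic weights are all assumptions chosen for illustration, and the density ratio step is omitted.

```python
import math

def gaussian_weight(dist_km, dt_hours, sigma_km=1.0, sigma_hours=6.0):
    """Separable spatio-temporal Gaussian kernel (bandwidths are illustrative)."""
    return math.exp(-0.5 * ((dist_km / sigma_km) ** 2 + (dt_hours / sigma_hours) ** 2))

def aggregate_features(cell, tweets, sigma_km=1.0, sigma_hours=6.0):
    """Kernel-weighted average of tweet feature vectors around one raster cell.

    cell   -- (x_km, y_km, t_hours) of the cell centre
    tweets -- list of (x_km, y_km, t_hours, features), features being a list of floats
    """
    cx, cy, ct = cell
    acc = [0.0] * len(tweets[0][3])
    total = 0.0
    for x, y, t, feats in tweets:
        w = gaussian_weight(math.hypot(x - cx, y - cy), abs(t - ct), sigma_km, sigma_hours)
        total += w
        for i, f in enumerate(feats):
            acc[i] += w * f
    if total == 0.0:
        # No tweet carries any weight for this cell: return the zero vector.
        return acc
    return [a / total for a in acc]

def flood_probability(features, weights, bias):
    """Logistic model: sigmoid of a linear combination of the features."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: one tweet at the cell centre, one 5 km and 12 hours away.
tweets = [(0.0, 0.0, 0.0, [1.0, 0.0]), (3.0, 4.0, 12.0, [0.0, 1.0])]
features = aggregate_features((0.0, 0.0, 0.0), tweets)
# Hypothetical logistic weights; in practice they would be fitted on labelled cells.
p = flood_probability(features, weights=[2.0, -2.0], bias=0.0)
```

With these toy inputs the nearby tweet dominates the aggregated features, so the cell's probability is driven almost entirely by the first feature.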
