Scalable Spatiotemporal Statistics for Global Environmental Phenomena

Basic data for this project

Type of project: Individual project
Duration: 01/03/2018 - 28/02/2021

Description

With today's volume of open Earth observation (EO) data, comprehensive geostatistical analyses of environmental phenomena can be brought to the global scale. However, the computational complexity of random field operations, model assumptions that are unrealistic at global scale (stationarity, separability, and isotropy), and data management issues that arise when data size exceeds local storage capacity currently limit the practical use of these data in applications and result in research that cannot be shared or reproduced. For example, using global elevation data on the order of a few terabytes as covariate information in modeling the spatial variation of precipitation requires fast methods for statistical inference and extensive data management effort. In this project, we aim to develop efficient methods for geostatistical inference on global environmental phenomena that (i) scale well computationally on shared-nothing architectures, (ii) account for non-stationary, non-separable, and anisotropic spatiotemporal dependencies, and (iii) can integrate multiple data sources, including remote sensing imagery and in-situ observations. To this end, we will build geostatistical models that combine spatial Markov random fields with temporal advection-diffusion processes and adapt novel algorithms to the highly scalable data management and analytics system SciDB. The developed methods will be demonstrated in two use cases: the creation of a global high-resolution precipitation dataset and the modeling of land use change with external independent variables. In addition to global modeling, the second use case will emphasize how the project results facilitate working with the latest high-resolution satellite-derived datasets. For this purpose, data from the Sentinel-2 and TanDEM-X missions will be used in a national-scale land use change analysis.
Expected key contributions include algorithms for efficient geostatistical inference in distributed computing environments, approaches for modeling non-stationarity and anisotropy at global scale, and open-source software tools for reproducible large-scale environmental data management and analysis.

Keywords: Geoinformatics; environmental phenomena