Running user-defined functions in R on Earth observation data in cloud back-ends

Ghosh, P., Lahn, F., Gebbert, S., Mohr, M., Pebesma, E.

Research article in edited proceedings (conference)

Abstract

Although Earth Observation (EO) data plays a central role in many applications of the geospatial sciences, the conventional workflow, which requires the data to be present on the local machine, has become a bottleneck for processing large volumes of EO data. Processing the data in cloud back-ends instead, by transferring the code to the data, largely avoids these bandwidth limitations. Furthermore, it allows for a client-server architecture in which client nodes using different programming languages submit processing requests to cloud back-ends, that is, servers with access to EO data. As part of the openEO application programming interface (API), which aims to mediate such workflows, infrastructure for running users’ scripts containing custom functions on EO data in cloud back-ends is under development. This paper focuses on an API for running such user-defined functions (UDFs) written in R and describes three strategies that have been developed with the help of the R package “stars”. The first is file-based, and its UDF service can be thought of as part of the back-end. The other two are implemented as RESTful web services in which the data is transferred between the back-end and the UDF service either as JSON arrays containing pixel values or as base64-encoded strings embedded in a JSON file. Containerization using Docker for the reproducibility and testing of the R package that implements these services, as well as future directions for its development, particularly to enhance scalability, are also discussed.

Publication details

Year of publication: 2018
Language of publication: English
Link to full text: http://geomundus.org/2018/docs/papers/Pramit.pdf