FDL Europe 2020 - Digital Twin Earth

PUBLISHED 7 JAN 2021

FDL Europe is a public-private partnership between the European Space Agency (ESA), the University of Oxford, Trillium Technologies and leaders in commercial AI, supported by Google Cloud, NVIDIA and Scan AI. FDL Europe works to apply AI technologies to space science, to push the frontiers of research and develop new tools to help solve some of the biggest challenges that humanity faces. These range from the effects of climate change to predicting space weather, and from improving disaster response to identifying meteorites that could hold the key to the history of our universe.

FDL Europe 2020 was an eight-week research sprint hosted by the University of Oxford, designed to promote rapid learning and research outcomes in a collaborative atmosphere by pairing machine learning expertise with AI technologies and space science. The interdisciplinary teams address tightly defined problems, and the format encourages rapid iteration and prototyping to create meaningful outputs for the space program and humanity.

Partner logos: Google Cloud, NVIDIA, Scan AI, Sense, Pasteur ISI, Airbus, Danish Meteorological Institute, D-Orbit, Catapult Satellite Applications, Trillium Europe, University of Oxford

Digital Twin Earth

Project Background

Extreme precipitation events, such as violent rain and hail storms, can devastate crop fields and disrupt harvests. These events can be forecast locally with sophisticated numerical weather models that rely on extensive ground and satellite observations. However, such approaches require access to compute and data resources that developing countries in need, particularly in South America and West Africa, cannot afford. The lack of advance planning for precipitation events impedes socioeconomic development and ultimately impacts the livelihoods of millions around the world. Given the increase in global precipitation and extreme precipitation events driven by climate change, the need for accurate precipitation forecasts is ever more pressing.

Digital Twin Earth Research

Weather forecasting systems have not fundamentally changed since they were first operationalised nearly 50 years ago. Current state-of-the-art operational weather forecasting systems rely on numerical models that integrate the physical atmospheric state in time based on a system of physical equations and parameterised subgrid processes. While global simulations typically run at grid sizes of 10 km, regional models can reach 1.5 km. For global simulations, skilled forecast lengths are usually limited to a maximum of 10 days, with a conjectured hard limit of 14 to 15 days. Forecasts are also dependent upon the field of interest, with large-scale temperature patterns having much longer predictability time than precipitation events.

Project Approach

The research team aimed to enable data-driven investigations of global precipitation intensity forecasting from satellite imagery, assessing the challenges involved in moving from physics-based models to a data-driven digital twin of the Earth. The team assembled a new dataset called RainBench from three publicly available sources. The first is simulated satellite data (SimSat) from the European Centre for Medium-Range Weather Forecasts (ECMWF), produced by a high-resolution weather-forecasting model. The second is the ECMWF Re-Analysis, 5th Edition (ERA5) dataset, which provides hourly estimates of a variety of atmospheric, land and oceanic variables, such as specific humidity, temperature and geopotential height at different pressure levels. The third is the Integrated Multi-satellitE Retrievals for GPM (IMERG) product from NASA, a global half-hourly precipitation estimate derived primarily from multiple polar-orbiting and geostationary satellites.

This combined RainBench dataset was then used to train a neural network based on Convolutional Long Short-Term Memory (ConvLSTM) layers, where a single model is trained conditioned on time and can forecast at different lead times, as illustrated below.

Precipitation Diagram
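One common way to condition a single model on lead time is to append the lead time to the input as an extra constant channel. The sketch below illustrates that idea only; it is an assumption rather than the team's implementation, and the function name `add_lead_time_channel`, its arguments and the 120-hour (five-day) normalisation are hypothetical.

```python
import numpy as np


def add_lead_time_channel(frames, lead_hours, max_lead_hours=120.0):
    """Append a constant channel that encodes the requested lead time.

    frames: array of shape (time, channels, lat, lon).
    lead_hours: forecast lead time in hours (hypothetical convention).
    max_lead_hours: normalisation constant, assumed here to be five days.
    """
    steps, _, height, width = frames.shape
    # Scale the lead time to [0, 1] so it is comparable to normalised inputs.
    value = lead_hours / max_lead_hours
    lead = np.full((steps, 1, height, width), value, dtype=frames.dtype)
    # The model then sees the lead time at every grid cell and time step.
    return np.concatenate([frames, lead], axis=1)


frames = np.zeros((4, 3, 32, 64), dtype=np.float32)
conditioned = add_lead_time_channel(frames, lead_hours=72)
print(conditioned.shape)
```

With this encoding, the same network weights can be queried for different lead times simply by changing the value of the extra channel.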

Following training, the team’s approach involved three steps (State Estimation, State Forecasting and Precipitation Estimation) that process the various data points to produce a five-day global precipitation forecast. Each step is self-contained and can be designed and trained separately, and a final fine-tuning process can be used to harmonise them for the task at hand. The process is summarised in the diagram below.

Observations Diagram
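The three self-contained steps can be thought of as functions that are composed into one forecast. The following is a minimal sketch under that assumption; the stage bodies are trivial placeholders (a channel mean, persistence and clipping) chosen for illustration, and none of the names come from the team's code.

```python
from typing import Callable

import numpy as np

Stage = Callable[[np.ndarray], np.ndarray]


def make_pipeline(estimate_state: Stage, forecast_state: Stage,
                  estimate_precip: Stage) -> Stage:
    """Chain the three stages; each one could be trained on its own."""
    def run(observations: np.ndarray) -> np.ndarray:
        state_now = estimate_state(observations)   # observations -> atmospheric state
        state_future = forecast_state(state_now)   # state now -> state at lead time
        return estimate_precip(state_future)       # future state -> precipitation
    return run


# Placeholder stages (hypothetical stand-ins for trained models).
def estimate_state(obs: np.ndarray) -> np.ndarray:
    return obs.mean(axis=0, keepdims=True)  # collapse channels into one field


def forecast_state(state: np.ndarray) -> np.ndarray:
    return state  # persistence: tomorrow looks like today


def estimate_precip(state: np.ndarray) -> np.ndarray:
    return np.maximum(state, 0.0)  # precipitation cannot be negative


pipeline = make_pipeline(estimate_state, forecast_state, estimate_precip)
obs = np.array([[[1.0, -2.0]], [[3.0, -4.0]]])  # (channels, lat, lon)
precip = pipeline(obs)
print(precip)
```

Because each stage only depends on the output of the previous one, any stage can be swapped for a better model without touching the others, which is what makes separate training followed by joint fine-tuning possible.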

To support efficient data-handling and experimentation on RainBench, the team also released PyRain, an out-of-the-box experimentation framework. PyRain introduces an efficient data-loading pipeline for complex sample access patterns that scales to the terabytes of spatial time-series data typically encountered in the climate and weather domain.
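To give a feel for what efficient sample access over large time series can involve, the sketch below reads a short window from a memory-mapped file without loading the whole array. This is a generic NumPy illustration and not PyRain's API; the helper `sample_window`, the file name and the array shape are all hypothetical.

```python
import os
import tempfile

import numpy as np


def sample_window(path, shape, start, length, dtype=np.float32):
    """Read `length` consecutive time steps from a memory-mapped array."""
    data = np.memmap(path, dtype=dtype, mode="r", shape=shape)
    # Only the requested slice is read from disk; copy it into memory.
    return np.array(data[start:start + length])


# Write a small (time, lat, lon) series to disk as a stand-in for real data.
shape = (100, 8, 16)
path = os.path.join(tempfile.mkdtemp(), "series.dat")
writer = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
writer[:] = np.arange(100, dtype=np.float32)[:, None, None]
writer.flush()
del writer

window = sample_window(path, shape, start=10, length=4)
print(window.shape)
```

The design choice illustrated here is that the cost of fetching a sample depends on the window size rather than on the size of the full dataset.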

Project Results

The three-stage approach was used on the RainBench dataset to forecast precipitation up to five days ahead. For this task, the team split the ERA5 and IMERG datasets into training, validation and test sets by time interval: 2010 to 2016 inclusive for training, 2017 and 2018 for validation, and 2019 for testing. They pre-processed the target values by accumulating precipitation over three-hour periods (the frequency at which the Atmospheric State is estimated in the previous step), with a view to fine-tuning the models learned in the three steps together. They then compared the ERA5 data with the results from their fully-connected neural network (FCNN), as illustrated below.

Project Stages Diagram
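The year-based split and the three-hour accumulation described above can be sketched as follows. The year ranges are taken from the text; the function names are hypothetical, and the example assumes the input holds per-interval precipitation amounts on a half-hourly grid (six steps per three-hour window), which would become three steps for hourly data.

```python
import numpy as np


def split_by_year(years):
    """Boolean masks for the train / validation / test periods."""
    years = np.asarray(years)
    return {
        "train": (years >= 2010) & (years <= 2016),
        "valid": (years >= 2017) & (years <= 2018),
        "test": years == 2019,
    }


def accumulate(precip, steps_per_window):
    """Sum consecutive time steps into fixed-length windows.

    Trailing steps that do not fill a whole window are dropped.
    """
    precip = np.asarray(precip)
    n = (precip.shape[0] // steps_per_window) * steps_per_window
    trimmed = precip[:n]
    windows = trimmed.reshape(-1, steps_per_window, *precip.shape[1:])
    return windows.sum(axis=1)


masks = split_by_year([2009, 2010, 2016, 2017, 2018, 2019])
half_hourly = np.ones(13)  # 13 half-hour steps of 1 unit each
three_hourly = accumulate(half_hourly, steps_per_window=6)
print(masks["train"], three_hourly)
```

Splitting by calendar year rather than at random keeps the test year fully unseen, which avoids leakage between neighbouring time steps.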

The models were compared on four classes of precipitation: none, drizzle, light and heavy. The team noticed that their FCNN was accurate in distinguishing between rainy and non-rainy cells, but it had a tendency to overestimate precipitation rates, whereas ERA5 tended towards underestimation. That said, the FCNN did outperform ERA5 on all precipitation classes with the exception of drizzle. For the five-day forecasting goal, their experiments produced significant improvements, as summarised below.

Five-day forecast | Before RainBench and PyRain | Now with RainBench and PyRain
State Estimation (from SimSat data) | No SimSat estimation | Estimate specific humidity from SimSat data
State Forecasting | Existing WeatherBench forecasts | Improved 3-day temperature forecasts and 3-day wet variable forecasts
Precipitation Estimation | ERA-model precipitation estimation | Neural network for precipitation estimation
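A comparison over the four precipitation classes mentioned above (none, drizzle, light, heavy) can be sketched by binning rates into classes and scoring each class separately. The thresholds below are hypothetical placeholders, since the text does not give the team's class boundaries, and per-class recall is just one possible metric.

```python
import numpy as np

CLASS_NAMES = ["none", "drizzle", "light", "heavy"]
# Hypothetical class boundaries; the actual values are not stated here.
THRESHOLDS = np.array([0.1, 1.0, 5.0])


def classify(rates):
    """Map precipitation rates to class indices 0..3."""
    return np.digitize(rates, THRESHOLDS)


def per_class_recall(truth, pred):
    """Fraction of cells of each true class that were predicted correctly."""
    recalls = {}
    for idx, name in enumerate(CLASS_NAMES):
        mask = truth == idx
        if mask.any():
            recalls[name] = float((pred[mask] == idx).mean())
        else:
            recalls[name] = float("nan")  # class absent from the truth
    return recalls


truth = classify(np.array([0.0, 0.5, 2.0, 8.0]))
pred = classify(np.array([0.0, 1.5, 2.0, 9.0]))  # second cell is overestimated
recalls = per_class_recall(truth, pred)
print(recalls)
```

In this toy example the overestimated cell is pushed from drizzle into light, which mirrors how a model can separate rainy from non-rainy cells well and still score poorly on the drizzle class.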

The team posit that the FCNN's performance could be further improved by (i) extending the input feature set, (ii) carefully tuning the class weights and proportions, and (iii) extending the training set by adding more years of data. You can learn more about this case study by visiting the FDL Europe 2020 results page, where a summary, poster and full technical memorandum can also be viewed and downloaded.

The Scan Partnership

Scan is a major supporter of FDL Europe. As an NVIDIA Elite Solution Provider, Scan contributes multiple DGX supercomputers to facilitate much of the machine learning and deep learning development and training required during the research sprint period.

Project Wins

Demonstration of improved precipitation classification using ML over numerical models, in the majority of cases

Demonstration of improved three-day precipitation and temperature forecasting over traditional methods

Time savings generated during the eight-week research sprint due to access to GPU-accelerated DGX systems

James Parr

Founder, FDL / CEO, Trillium Technologies

"FDL has established an impressive success rate for applied AI research output at an exceptional pace. Research outcomes are regularly accepted to respected journals, presented at scientific conferences and have been deployed on NASA and ESA initiatives - and in space."

Dan Parkinson

Director of Collaboration, Scan Computers

"We are proud to work with NVIDIA to support the FDL Europe research sprint with GPU-accelerated systems. It is a huge privilege to be associated with such ground-breaking research efforts in light of the challenges we all face when it comes to climate change and extreme weather events."

Speak to an expert

You’ve seen how Scan continues to help FDL Europe further its research into climate change and space. Contact our expert AI team to discuss your project requirements.

Phone: 01204 474210

Email: [email protected]
