As ecologists, we regularly need large meteorological or hydrological datasets to drive our research, whether extracting data for our study sites or modelling changes in species distributions under future climate scenarios. However, without guidance, it can be challenging to know what data are available, where to obtain them, how to access them, and how to process large datasets.
On 4th December 2023, we hosted a workshop at the University of Reading, with support from the Scenario Doctoral Training Partnership, to deliver such guidance to ecologists. The day began in the Brian Hoskins Building, home to the Department of Meteorology and named after the renowned meteorologist and climatologist Professor Sir Brian Hoskins.
Our first speaker was Wilson Chan, a final-year PhD student and hydroclimatologist at the UK Centre for Ecology and Hydrology (UKCEH). Wilson first introduced hydrological datasets including the National River Flow Archive, the National Hydrological Monitoring Programme, and the Global Runoff Data Centre. In addition to providing daily river flow records for a selected period, the National Hydrological Monitoring Programme produces monthly Hydrological Summaries, which I have found to be a useful resource when I need to summarise the conditions of a study period without having time to analyse the data myself. The National River Flow Archive also has its own R package, which makes it possible to process the data without downloading it from the website.
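For anyone wanting to try this, the package in question is presumably rnrfa on CRAN. The short sketch below shows the kind of workflow it supports, assuming its catalogue() and gdf() functions behave as documented; station 39001 (the Thames at Kingston) is used purely as an illustration.

```r
# A minimal sketch using the rnrfa package (assumed here to be the NRFA's
# R package). Station 39001 is used purely as an illustrative example.
# install.packages("rnrfa")
library(rnrfa)

# Browse the station catalogue (metadata for the NRFA gauging stations)
stations <- catalogue()
head(stations)

# Download gauged daily flow (m3/s) for one station as a time series
flow <- gdf(id = 39001)
plot(flow, main = "Gauged daily flow, NRFA station 39001", ylab = "Flow (m3/s)")
```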
Moving on to meteorological variables, we were introduced to the NetCDF file format, a file type commonly used in climate science to handle large gridded datasets, and to the trade-off between spatial resolution and file size. With the basics covered, data sources from the Met Office were discussed, namely HadUKP and HadUK-Grid. Whilst these datasets can be accessed from the Met Office website, we were later shown how to access the HadUK-Grid data directly using JASMIN. We also benefitted from an explanation of the methods and variables used to estimate potential evapotranspiration, data for which are available within the HadUK-Grid dataset.
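As a flavour of what working with NetCDF looks like in R, here is a minimal sketch using the ncdf4 package. The file name and the variable name are hypothetical, but they follow the pattern of HadUK-Grid monthly rainfall files downloaded from CEDA.

```r
# A minimal sketch of inspecting a NetCDF file in R with the ncdf4 package.
# The file name below is hypothetical; the variable name ("rainfall") is an
# assumption and may differ depending on the dataset you download.
library(ncdf4)

nc <- nc_open("rainfall_hadukgrid_uk_1km_mon_202001-202012.nc")

# Print a summary of the file and list the variables it contains
print(nc)
names(nc$var)

# Read the rainfall array (dimensions: x, y, time) and the time axis
rain <- ncvar_get(nc, "rainfall")
time <- ncvar_get(nc, "time")
dim(rain)

nc_close(nc)
```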
After learning how to access and understand datasets of past observations, we learned how climate scientists produce climate change projections using Global Climate Models (GCMs). First, we heard about the UK Climate Projections 2018 (UKCP18), based on the Met Office Hadley Centre GCM. Then we heard about the Coupled Model Intercomparison Project (CMIP), a collation of outputs from many climate models run by modelling centres around the world, and learned how to access data generated under different emissions pathways and when to use CMIP rather than UKCP18. Finally, we heard about products such as the eFLaG river flow projections and the UK Climate Risk Indicators Portal, which may prove useful to practising ecologists.
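For ecologists who simply need a future time series for a study site, the sketch below (using the terra package, with a hypothetical file name and coordinates) shows how a point series can be pulled out of a gridded projection file in the same way as an observational grid.

```r
# A hedged sketch of extracting a site-level time series from a gridded
# projection file with the terra package. The file name and coordinates are
# hypothetical and stand in for a UKCP18-style NetCDF downloaded from CEDA.
library(terra)

# Read all time steps of a (hypothetical) monthly temperature projection file
tas <- rast("tas_rcp85_ukcp18_regional_monthly.nc")

# A study site given as longitude/latitude, reprojected to the grid's CRS
site <- vect(data.frame(lon = -0.94, lat = 51.44),
             geom = c("lon", "lat"), crs = "EPSG:4326")
site <- project(site, crs(tas))

# Extract the projected monthly temperature series for that grid cell
tas_site <- extract(tas, site)
```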
Our second speaker was Gwyneth Matthews, also a final-year PhD student and a researcher at the European Centre for Medium-Range Weather Forecasts (ECMWF). ECMWF is the organisation responsible for implementing the climate change and atmosphere monitoring components of the EU's Copernicus programme, the Earth observation component of the EU's Space Programme. The European Flood Awareness System (EFAS) and the Global Flood Awareness System (GloFAS), hydrological products from the Copernicus Emergency Management Service (CEMS), were used as examples for downloading data from the Climate Data Store. CEMS is managed by the Joint Research Centre of the European Commission, with ECMWF contracted as the CEMS computational centre. As well as floods, CEMS monitors and forecasts droughts and fires.

Terms such as ensemble forecasts, reforecasts, reanalyses and projections were described in detail, before we took a look at the ERA5 reanalysis dataset, which provides a global, spatially and temporally consistent dataset of atmospheric conditions. ERA5 is considered a good proxy for observed data and may be useful for ecologists studying large areas with spatially inconsistent meteorological observations.
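One common route for scripted downloads from the Climate Data Store in R is the ecmwfr package. The sketch below is illustrative only: it assumes a CDS account whose API key has already been registered with wf_set_key(), and its request fields copy the style of the request generator on the CDS web form, which may change between CDS versions.

```r
# A hedged sketch of requesting ERA5 data from the Climate Data Store with the
# ecmwfr package. It assumes the CDS API key is already registered via
# ecmwfr::wf_set_key(); field names and values are illustrative and should be
# checked against the request shown by the CDS web form for your dataset.
library(ecmwfr)

request <- list(
  dataset_short_name = "reanalysis-era5-single-levels",
  product_type       = "reanalysis",
  variable           = "2m_temperature",
  year               = "2020",
  month              = "07",
  day                = "01",
  time               = "12:00",
  area               = c(52, -2, 51, -1),  # N, W, S, E bounding box
  format             = "netcdf",
  target             = "era5_t2m_20200701.nc"
)

# Submit the request and download the result to the working directory;
# "my_cds_user" is a placeholder for the account used with wf_set_key()
# (newer ecmwfr versions may not require it).
wf_request(request = request, user = "my_cds_user", transfer = TRUE, path = ".")
```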
ECMWF products are commonly stored in the GRIB format, which can be processed in R with the rNOMADS package. We were shown how to access data through the Meteorological Archival and Retrieval System (MARS) and the Climate Data Store, and the licensing and attribution requirements for the products were explained.
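As an alternative to rNOMADS, GRIB files can also be opened with the terra package, which reads them through GDAL's GRIB driver. The file name below is hypothetical, standing in for a field retrieved from MARS or the Climate Data Store.

```r
# A small sketch of reading a GRIB file in R with terra (via GDAL).
# The file name is hypothetical.
library(terra)

grib <- rast("era5_t2m_20200701.grib")

# Each GRIB message becomes a raster layer; inspect and plot the first one
nlyr(grib)
names(grib)
plot(grib[[1]])
```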
The final talk of the morning was delivered by Fatima Chami from the Science and Technology Facilities Council (STFC), who introduced JASMIN, a storage and computing resource that enables data-intensive research by NERC-funded researchers. JASMIN provides several services, including the scientific analysis servers and the LOTUS batch processing cluster, which were demonstrated in exercises in the afternoon. Before the afternoon session, however, we were treated to a tour of the Reading Atmospheric Observatory, the second-oldest atmospheric observatory in the country, where weather data have been recorded since 1901, first manually and later both manually and automatically.
Our knowledgeable guide, Dr Martin Airey, senior meteorological support scientist, led us around the site, focusing first on the manual instruments. The wet-bulb and dry-bulb thermometers, housed in a white box called a Stevenson screen to ensure accurate readings, are read every morning, including Christmas Day, as are thermometers installed at various soil depths. These thermometers are calibrated against a calibration thermometer, which is kept indoors at the department and is itself sent to a laboratory for calibration. The group also saw several types of rain gauge and sensors for measuring wind speed and sunlight.
After lunch we worked through more hands-on exercises, developed by Wilson Chan, Fatima Chami, and other members of the JASMIN team. Accessing JASMIN requires generating an SSH key pair, an unfamiliar concept to many attendees, and getting started proved to be a learning curve for me personally. Once logged in, however, running an R script on the scientific analysis server or on LOTUS was surprisingly straightforward. The exercises demonstrated how many useful datasets stored in the CEDA archive can be accessed directly through JASMIN, eliminating the need to download and process data on a local computer.
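To give a feel for that pattern, the sketch below shows the sort of thing the exercises involved: pointing R at a directory in the CEDA archive from a JASMIN analysis server and opening a file in place. The path shown is purely illustrative; the real locations are listed in the CEDA catalogue.

```r
# A hedged sketch of reading CEDA archive data in place on JASMIN, rather than
# downloading it. The directory below is illustrative only; check the CEDA
# catalogue for the real path to the dataset you need.
library(terra)

archive_dir <- "/badc/some-dataset/data/..."   # illustrative CEDA archive path

# List the NetCDF files held in the archive directory
files <- list.files(archive_dir, pattern = "\\.nc$", full.names = TRUE)

# Open one file directly from the archive, with no local copy required
r <- rast(files[1])
r
```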
We hope all delegates enjoyed the workshop and look forward to running similar training events in the future. If you have any specific workshop topic ideas, please do get in touch with me (Caitlin), PGR/EC representative: students@iale.uk