Development of analysis tools for recurring agricultural policy issues based on the example of drought monitoring
An Open Data Cube (ODC) was set up at the Federal Institute of Agricultural Economics and Mountain Farming (BAB) to manage and analyze the constantly growing amount of raster data efficiently. The unique feature of the BAB's ODC is that the technology, originally intended for satellite images, has been extended so that other data can also be indexed and loaded as time series into the multidimensional data cube. This makes it possible to intersect and evaluate a variety of raster and rasterized vector data (including ALS, INVEKOS and climate data) together with satellite images in one system. In addition to purely spatial analysis, the temporal dimension can be taken into account in calculations and evaluations, which enables high-performance analyses of time series.
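The idea of intersecting rasterized vector data with a raster time series can be illustrated with a minimal sketch. It is not the BAB implementation: plain Python lists stand in for the arrays a data cube would return, and the function name `zonal_mean_series`, the zone ids and the convention that 0 means "outside any zone" are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# One time slice of a raster, indexed [row][col]; None marks a no-data cell.
Grid = List[List[Optional[float]]]
# Rasterized polygons on the same grid: one zone id per cell (assumption: 0 = outside).
ZoneGrid = List[List[int]]


def zonal_mean_series(cube: List[Grid], zones: ZoneGrid) -> Dict[int, List[Optional[float]]]:
    """Return, for every zone id, the mean cell value at each time step."""
    zone_ids = sorted({z for row in zones for z in row if z != 0})
    series: Dict[int, List[Optional[float]]] = {z: [] for z in zone_ids}

    for grid in cube:  # iterate over the time dimension
        sums: Dict[int, float] = defaultdict(float)
        counts: Dict[int, int] = defaultdict(int)
        for grid_row, zone_row in zip(grid, zones):
            for value, zone in zip(grid_row, zone_row):
                if zone != 0 and value is not None:
                    sums[zone] += value
                    counts[zone] += 1
        for zone in zone_ids:
            # A zone with only no-data cells at this time step yields None.
            mean = sums[zone] / counts[zone] if counts[zone] else None
            series[zone].append(mean)

    return series
```

Calling `zonal_mean_series(cube, zones)` with a list of time slices and a zone grid of the same shape gives one time series per zone, which is the kind of combined spatial and temporal evaluation described above.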
The aim is to establish the ODC as an analysis tool at the BAB so that recurring data evaluations, which serve as a basis for decision-making on agricultural policy issues, can in future be carried out in a much more targeted and, above all, faster manner. The ODC should serve as a networked data center and be open for data integration via cloud object storage interfaces (e.g. S3). This is intended to replace part of the existing geodata infrastructure. This solution minimizes sources of error, makes the most up-to-date data immediately available to all users and is managed decentrally. The aim is for users to be able to carry out the necessary analyses independently via an internet browser on the underlying infrastructure. Existing analysis functions are to be expanded, and recurring evaluations of various issues are to be easy to update. As an example application, a monitoring of Austria's climatic development is being developed specifically for agricultural areas. In the further course of the project, the ODC is to be transferred from a pilot environment to a scalable, fail-safe production environment.
The JupyterHub platform around the ODC was migrated from a test environment to a Kubernetes cluster (a system for orchestrating container applications), which ensures high reliability and rapid scalability.
In 2023, the infrastructure was further optimized and extended with a Dask cluster. This makes it possible to parallelize processes and make the best possible use of the hardware resources of the Kubernetes cluster in order to increase performance. In addition, the login to the platform was replaced by single sign-on (SSO).
As a methodological use case, defined climate parameters (e.g. the climatic water balance, heat days and dry periods of 10 or more days) were calculated for each cadastral municipality in Austria for the climate normal periods 1961-1990 and 1991-2020. The aim was to show climate changes and to create a basis that allows these or similar recurring questions to be answered at short notice with current data if required.
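How such parameters can be derived from daily values is sketched below. This is an illustration under stated assumptions, not the project's actual method: the thresholds (a heat day as a daily maximum temperature of at least 30 °C, a dry day as less than 1 mm of precipitation), the function names and the use of potential evapotranspiration as the second term of the climatic water balance are assumptions not taken from the text. Only the minimum length of 10 days for a dry period comes from the description above.

```python
from typing import Sequence

HEAT_DAY_TMAX_C = 30.0    # assumption: heat day = daily maximum temperature >= 30 °C
DRY_DAY_PRECIP_MM = 1.0   # assumption: dry day = daily precipitation < 1 mm
MIN_DRY_SPELL_DAYS = 10   # from the text: dry periods of 10 or more days


def count_heat_days(tmax_c: Sequence[float]) -> int:
    """Count days whose maximum temperature reaches the heat-day threshold."""
    return sum(1 for t in tmax_c if t >= HEAT_DAY_TMAX_C)


def count_dry_periods(precip_mm: Sequence[float]) -> int:
    """Count runs of consecutive dry days that last at least MIN_DRY_SPELL_DAYS."""
    periods = 0
    run = 0
    for p in precip_mm:
        if p < DRY_DAY_PRECIP_MM:
            run += 1
            # Count each run once, at the moment it reaches the minimum length.
            if run == MIN_DRY_SPELL_DAYS:
                periods += 1
        else:
            run = 0
    return periods


def climatic_water_balance(precip_mm: Sequence[float], pet_mm: Sequence[float]) -> float:
    """Precipitation minus potential evapotranspiration, summed over the period (mm)."""
    if len(precip_mm) != len(pet_mm):
        raise ValueError("precipitation and evapotranspiration series differ in length")
    return sum(precip_mm) - sum(pet_mm)
```

Applied to the daily series of each cadastral municipality and aggregated separately for 1961-1990 and 1991-2020, functions of this kind would yield values that can be compared between the two periods.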
The ODC was presented and discussed at the GI-Salzburg 2023.
In 2024, the Kubernetes cluster in which the platform is embedded was greatly expanded. It now consists of 13 servers, three of which act as master nodes and ten as worker nodes. The three master nodes significantly increase the availability and reliability of the cluster. The ten worker nodes enable high scalability and efficient distribution of workloads, which improves the performance and responsiveness of the applications.
In the current project, an analysis platform was successfully set up and further developed using ODC and STAC technologies. The platform is user-friendly and flexible, and its embedding in a Kubernetes cluster makes it scalable and fail-safe. Because it can easily be linked with other tools such as Apache Airflow and Apache Superset, a workflow was developed as part of the project that covers the entire data pipeline, from data integration and data analysis to data visualisation.
The result is a robust infrastructure that enables large amounts of data from different data sources to be analysed together. In the application example, climatic data and drought indices were calculated and visualised and a flexible monitoring tool was created. The infrastructure was migrated to the production environment and is available to users internally for analyses. This represents an important milestone for future data-intensive analyses (big data processing) at the BAB. Overall, the project has shown that the continuous development and integration of advanced technologies are crucial for a powerful analysis and visualisation platform.
Meteorological drought indices can be used as indicators of climate conditions and water stress. Long-term monitoring can reveal development trends, serve as an early warning system and provide valuable information to strengthen the adaptability of agriculture to drought conditions in the long term. However, meteorological drought cannot be equated with agricultural drought. It is therefore not possible to draw direct conclusions about drought-related crop losses from meteorological indicators, as vulnerability to drought varies (e.g. depending on crop type, variety and development phase) and also depends on other factors (e.g. the water storage capacity of the soil, management measures and regional conditions) (Bachmair et al., 2018). This should be taken into account when planning future support measures for yield losses due to drought.
The available interpolated precipitation data is not sufficiently precise for detailed statements at farm or field level, as precipitation often occurs on a very small scale and with varying intensity. It is therefore only possible to create a rough overview and visualise large-scale trends.
In addition to meteorological indicators and indices, the actual condition of the vegetation based on satellite images can be a useful supplement. Satellite data provide valuable information on soil moisture, plant growth and plant condition (Trnka et al., 2020). In the preliminary project, tests were carried out with individual Sentinel-2 scenes. Large-scale data processing and analysis across Austria would have exceeded the time and content scope of this project. Systematic utilisation and analysis of satellite data at a national level could therefore be a useful further development.
Schedule
Project start: 01/2021
Project end: 12/2024