What if you have a perfect algorithm and want to apply it to thousands of satellite images? What if you want to automate your processing chain and convert it into an operational service? Or what if you are part of a team and want to share computing resources and also methods for processing, development, and validation? In these cases a processing system can help.

Our processing system for medium- to large-scale automated EO data processing

We use our processing system Calvalus for all these purposes. We process 130,000 images for a yearly time series of global burned area products. Every day, we automatically process new Sentinel-3 OLCI data into water quality products and add them to a continuously growing data cube for the Cyanoalert service and other water quality services. We use Calvalus for several projects at the same time, sharing parts of the software and using a common way to specify workflows and to run processing jobs, both systematically and on demand.

Data and processing management for projects, institutions, and teams

Other organizations use Calvalus for the same reasons. CalFin, operated by the FMI and SYKE in Finland, and ESTHub, part of the Estonian Collaborative Ground Segment, are two processing systems where scientists process datasets and develop new processing chains. Another example is CalLand in the land monitoring department of DLR, where Calvalus serves as the processing system for a working group.

Apache Hadoop clusters and cloud solutions for Earth Observation 

Technically, Calvalus is based on Apache Hadoop as its cluster scheduler and “batch system”, with several additional functions that adapt Hadoop to Earth Observation data processing.

We use Hadoop because it provides parallelization and scalability, reliability and automated failover, and map-reduce with sorting and partitioning, which suits spatio-temporal aggregation as well as validation and comparison. Calvalus is used both on dedicated clusters and in the cloud. For certain projects, Calvalus clusters are deployed on CreoDIAS and Sobloo DIAS to take advantage of the Sentinel data available on these platforms and of the option to scale the cluster by adding or removing virtual machines.
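To illustrate why map-reduce with sorting and partitioning fits temporal aggregation, here is a minimal, framework-free sketch in plain Python. It is not Calvalus or Hadoop code: the observation tuples, the cell identifiers, and the function names are all hypothetical, and the "shuffle" step only imitates the grouping that a map-reduce framework performs between the two phases.

```python
from collections import defaultdict

# Hypothetical observations: (date "YYYY-MM-DD", grid cell id, measured value).
observations = [
    ("2020-06-01", "cell_a", 1.0),
    ("2020-06-15", "cell_a", 3.0),
    ("2020-06-20", "cell_b", 4.0),
    ("2020-07-02", "cell_a", 5.0),
]


def map_phase(obs):
    """Emit one ((month, cell), value) pair per observation."""
    for date, cell, value in obs:
        yield (date[:7], cell), value


def shuffle(pairs):
    """Group values by key, imitating the framework's sort/partition step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Aggregate each (month, cell) group into a mean value."""
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


monthly_means = reduce_phase(shuffle(map_phase(observations)))
for (month, cell), mean in monthly_means.items():
    print(month, cell, mean)
```

Because every key is reduced independently, the map and reduce steps can be distributed over many machines, which is the property the aggregation relies on.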

Processing is as easy as submitting a request and getting the results 

In Calvalus, users submit “requests” to the processing system, which concurrently starts data processors that turn satellite images into geophysical data products and aggregate them into daily or monthly composites or data cubes. More complex processing workflows in Calvalus are nothing more than generators of such processing requests: a workflow automatically generates a set of processing requests, often a large one, with internal dependencies among them, and monitors their progress. The advantage: users can manually perform any step that the system performs automatically, e.g. to test something or to recover from a failure.
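The idea of a workflow as a generator of dependent requests can be sketched as follows. This is an illustration only, not the Calvalus workflow API: the request fields (`type`, `date`, `depends_on`), the request identifiers, and both function names are assumptions made for the example. It uses `graphlib` from the Python standard library (Python 3.9 or later) to derive an order that respects the dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def generate_requests(days):
    """Build one hypothetical per-day request plus a composite that needs them all."""
    requests = {}
    for day in days:
        requests[f"l2-{day}"] = {"type": "process", "date": day, "depends_on": []}
    requests["l3-composite"] = {
        "type": "aggregate",
        "depends_on": [f"l2-{day}" for day in days],
    }
    return requests


def execution_order(requests):
    """Return request ids ordered so that dependencies come first."""
    graph = {rid: set(req["depends_on"]) for rid, req in requests.items()}
    return list(TopologicalSorter(graph).static_order())


requests = generate_requests(["2020-06-01", "2020-06-02"])
order = execution_order(requests)
print(order)
```

Since each generated request is an ordinary request, any single one of them could also be submitted by hand, which is what makes manual testing and recovery possible.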

Integrate your own processors 

Calvalus comes with a request interface in JSON, YAML, and XML, and also supports OGC WPS and a web GUI for request submission. Workflows can be configured with Python scripts and JSON request templates. Data processors are integrated as (Unix) executables or Python programs (Miniconda, Anaconda); in particular, Calvalus supports SNAP GPT graphs and operators of the different Sentinel Toolboxes. Calvalus also supports Docker containers as wrappers around processors. Outputs can be in any format a processor generates, and Calvalus supports many standard EO data formats such as NetCDF, GeoTIFF, Zarr, and DIMAP through functions of SNAP.
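As a rough idea of how a Python script might fill a JSON request template, consider the sketch below. The template fields (`processor`, `input`, `start`, `stop`, `output_format`), the example path, and the processor name are hypothetical placeholders, not the actual Calvalus request schema, which should be taken from the Calvalus documentation.

```python
import json
from string import Template

# Hypothetical request template; the field names are illustrative only.
REQUEST_TEMPLATE = Template("""
{
  "processor": "$processor",
  "input": "$input_path",
  "start": "$start",
  "stop": "$stop",
  "output_format": "$output_format"
}
""")


def build_request(**params):
    """Fill the template and parse it, so malformed JSON fails early."""
    return json.loads(REQUEST_TEMPLATE.substitute(**params))


request = build_request(
    processor="my_processor",       # hypothetical processor name
    input_path="/data/olci/2020",   # hypothetical input location
    start="2020-06-01",
    stop="2020-06-30",
    output_format="NetCDF",
)
print(json.dumps(request, indent=2))
```

`Template.substitute` raises a `KeyError` when a placeholder is left unfilled, so an incomplete request is caught before it would be submitted.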

BC’S ACTIVITIES
  • Continuous development since 2011
  • Use by BC for many concurrent projects
  • Deployment at customer premises and on DIAS
  • Hosting of services on BC cluster
  • Integration of processors and workflows
  • Support
  • Training
CLIENTS/PARTNERS

SYKE+FMI, DLR, ESA, Estonian Land Board, HZG

 

INFORMATION | CONTACT

If you are interested in processing system solutions or if you need support in processing data, please contact the Calvalus team at info@brockmann-consult.de.