A Short Story about OGC WCS / WCPS

This notebook gives a quick overview of python access to datacubes via the OGC WCS and WCPS standards. The requests demonstrated access CODE-DE Datacube service enabled by the datacube reference implementation, rasdaman. The service end point used is https://ows.rasdaman.org/rasdaman/ows.

Data Overview

  • Sentinel 2
    • CRS: varying for each datacube depending on the UTM zone it covers
    • Datacube names: S2_${Level}_${utmCrsCode}_${band}_${resolution}
      • ${level} - L2A
      • ${utmCrsCode} - the EPSG code for datacube CRS: 32631, 32632, 32633,...
      • ${band} - the bands of sentinel 2 data (B01, B02, B03,...)
      • ${resolution} - 10m, 20m, 60m
      • Example: S2_L2A_32631_B04_60m

Web Coverage Service (WCS)

WCS is a standarized data access protocol for datacubes (coverages in OGC terminology). For the purpose of this notebook we use GET/KVP which nicely maps to URLs; further possible protocols include POST/XML, SOAP, and in future OAPI-Coverages.

WCS supports three core requests, all of which uniformly start with https://ows.rasdaman.org/rasdaman/ows?service=WCS&version=2.0.1&request= as you can see:

GetCoverage extraction from a coverage, commonly called subsetting, is subdivided into trimming and slicing. Trimming reduces the spatial and/or temporal extent, but leaves the number of dimensions unchanged - you can get a 3D sub-cube from a 3-D cube, a 2-D part of a 2-D map, etc. Slicing, on the other hand, reduces the dimensions - you can get a 1-D pixel history from a 3-D datacube, or a 2-D timeslice. By the way, the order in which you mention axes does not matter as you access them by name anyway.

Trimming / Slicing on 3D data cube to get a subset coverage

How to integrate WCS requests into a python script

If not done, first install the required python libraries, as usual with pip available from https://pypi.org/project/pip/.

$ pip install requests owslib numpy scipy matplotlib

Next, we build and send a WCS request which slices a Sentinel-2 satellite timeseriss datacube and returns the result as a JPEG image:

In [1]:
# Get a subset coverage by slicing on time axis, trimming on spatial axes, and encoding result in image/jpeg.

from IPython.display import Image
import requests

# Set base url which can be used in further code examples
service_endpoint = "https://ows.rasdaman.org/rasdaman/ows"
base_wcs_url = service_endpoint + "?service=WCS&version=2.0.1"

request = "&REQUEST=GetCoverage"
cov_id = "&COVERAGEID=S2_L2A_32631_TCI_60m"
subset_time = "&SUBSET=ansi(\"2021-04-09\")"
subset_e = "&SUBSET=E(669960,729960)"
subset_n = "&SUBSET=N(4990200,5015220)"
encode_format = "&FORMAT=image/jpeg"

response = requests.get(base_wcs_url + request + cov_id + subset_time + subset_e + subset_n + encode_format, verify=False)

# Display result directly
Image(data=response.content)
Out[1]:

...and that's it! It is the functionality of WCS Core only, though; much more is available with the WCS extensions, including band/variable extraction, reprojection, scaling, interpolation, data maintenance, etc. - best check out the EarthServer webinar series for details. In the next section we pick one particular WCS extension, WCS Processing, which offers the WCPS datacube analytics language.

OGC (Web Coverage Processing Service) WCPS

OGC WCPS is a high-level language for spatio-temporal extraction, analysis, filtering, and fusion of coverages. The return clause contains the processing and encoding directives. In the for part the coverages addressed are listed; they can be processed in sequence or in combination, enabling data fusion. An optional where clause can be used for server-side filtering - only the coverages matching will be processed and delivered; such results can be both coverages again, or scalars in case of aggregation. Both data and metadata of a coverage are amenable to retrieval. More on the WCPS language can be found in this tutorial and the EarthServer webinar series.

Technically, a WCPS request is sent by appending a query to the request:

https://code-de.rasdaman.com/rasdaman/ows?service=WCS&version=2.0.1&request=ProcessCoverages&query=...

The following query is equivalent to the previous WCS request which selects a 2D subset from a 3D coverage, and encodes the result as a PNG image; note that "ansi" stands for an ANSI-compliant time axis:

for $c in (S2_L2A_32631_TCI_60m)
return 
  encode(
      $c[ ansi( "2021-04-09" ),
          E( 669960:729960 ),
          N( 4990200:5015220 ) ], 
      "image/png" )

Let us inspect the query components step-by-step:

  • $c - the coverage variable, an alias for the coverages in the coverage list (in our case S2_L2A_32631_TCI_60m)

  • return - contains the processing expression, which here returns an encoded coverage

  • encode - a function which takes a coverage expression and encodes it in the desired format, here PNG

  • $c[ ... ] - this is a coverage subsetting expression where upper and lower limits are specified to restrict the extents along each axis ('trimming'). The time axis, on the other hand, is 'sliced' away. Hence, the result coverage will only retain axes E and N.

PS: Before sending such a query we must keep in mind that several characters need to be "escaped" in a GET request, i.e.: they must be replaced by their hexadecimal code. Manually this is tedious, but fortunately there are tools doing this of us, like https://meyerweb.com/eric/tools/dencoder/. The outcome looks less legible to us humans, but as queries are typically generated by JavaScript hits will not become visible to humans anyway:

https://ows.rasdaman.org/rasdaman/ows?service=WCS&version=2.0.1&request=ProcessCoverages&query=for%20%24c%20in%20(S2_L2A_32631_TCI_60m)%20return%20encode(%20%24c%5B%20ansi(%20%222021-04-09%22%20)%2C%20E(%20669960%3A729960%20)%2C%20%20N(%204990200%3A5015220%20)%20%5D%2C%20%20%22image%2Fpng%22%20)

Temporal Change Analysis with WCPS

Our next query performs time-series processing by calculating, in the Sentinel-2 datacube, the difference between two timeslices of the B1 band. The result is numerically adjusted so that it fits into the range of [0, 255] for encoding in JPEG.

In [4]:
query = '''
for $c in (S2_L2A_32631_B01_60m) 
return
  encode(
    ( 0.20 * ( 35.0 + ( 
                (float) $c[ ansi( "2021-04-09" ) ]  ) 
            ) 
    )[ E( 669960:729960 ), N( 4990200:5015220 )  ]
   , "image/jpeg")
'''

# Send a WCPS query for evaluation on the rasdaman server; as WCPS queries usually contain 
# special characters like '[', ']', '{', '}', it is necessary to send POST requests. 
response = requests.post(service_endpoint, data = {'query': query}, verify=False)

# Display result directly
#print response.content
Image(data=response.content)
Out[4]:

Band Math in WCPS

Any sort of pixel-level operation can be written in a simple way in WCPS. We inspect this by calculating the NDVI (Normalized Difference Vegetation Index) on our datacube. To this end, we first change the area of interest to some part of Bremen, Germany, which has both urban and agricultural areas; the cutout we scale to a horizontal size of 700.

In [5]:
query = '''
for $c in (S2_L2A_32631_TCI_60m)
return 
  encode(
      scale(
        $c[ ansi( "2021-04-09" ), E( 669960:729960 ), N( 4990200:5015220 )  ] * 1.3,
        { E:"CRS:1"(0:700) }
      ),"image/jpeg" )
'''

response = requests.post(service_endpoint, data = {'query': query}, verify=False)
Image(data=response.content)
Out[5]:

The NDVI now can be derived easily from the B8 (near-infrared) and B4 (red) bands of the 10m coverage. We provide the well-known formula and additionall threshold the result to show index values greater than 0.5 (higher vegetation) as white while everything else is black. Note that we need to be careful about data types and manually cast to float - the result is supposed to be integer whereas intermediate computation requires float. On the side, note the let statement which allows to conveniently define subexpressions; use it freely to make your query better understandable!

In [7]:
query = '''
for $c in (S2_L2A_32631_B08_10m),
    $d in (S2_L2A_32631_B04_10m)

let $cutOut := [ ansi( "2021-04-09" ), E( 670000:679000 ), N( 4990220:4993220 ) ]
return
  encode( (

      (((float) $c - $d) / ((float) $c + $d)) [ $cutOut ] > 0.5
 ) * 255

  , "image/jpeg")
'''

response = requests.post(service_endpoint, data = {'query': query}, verify=False)
Image(data=response.content)
Out[7]:

A false-color composite can be created from the corresponding bands in a straightforward manner by just specifying them in the desired order:

In [9]:
query = '''
for $c in (S2_L2A_32631_B08_10m),
    $d in (S2_L2A_32631_B04_10m),
    $e in (S2_L2A_32631_B03_10m)

let $cutOut := [ ansi( "2021-04-09" ), E( 670000:679000 ), N( 4990220:4993220 ) ]
    
return
  encode(

      {
         red:   $c[ $cutOut ] / 17.0;
         green: $d[ $cutOut ] / 17.0;
         blue:  $e[ $cutOut ] / 17.0
      }
     
    
  , "image/jpeg")
'''

response = requests.post(service_endpoint, data = {'query': query}, verify=False)
Image(data=response.content)
Out[9]:

What if the bands we want to combine come from coverages with different resolutions? We can scale the bands to a common resolution before the operations, e.g. below we combine B12 from a 20m coverage, and B8 / B3 from a higher resolution 10m coverage.

In [13]:
query = '''
for $c in (S2_L2A_32631_B12_20m), 
    $d in (S2_L2A_32631_B08_10m),
    $e in (S2_L2A_32631_B03_10m)

   
return
  encode(
    {
      red: scale( $c[ ansi( "2021-04-09" ), E( 670000:679000 ), N( 4990220:4993220 ) ],
                  { E:"CRS:1"(0:599), N:"CRS:1"(0:299) }) ;
    green: scale( $d[ ansi( "2021-04-09" ), E( 670000:679000 ), N( 4990220:4993220 ) ],
                  { E:"CRS:1"(0:599), N:"CRS:1"(0:299) }) ;
     blue: scale( $e[ ansi( "2021-04-09" ), E( 670000:679000 ), N( 4990220:4993220 ) ], 
                  { E:"CRS:1"(0:599), N:"CRS:1"(0:299) })
    } / 15.0
  , "image/jpeg")
'''

response = requests.post(service_endpoint, data = {'query': query}, verify=False)
Image(data=response.content)
Out[13]:

Timeseries Aggregation with WCPS

Datacube aggregation is straightforward: select the desired subset and apply an aggregation function such as min, max, avg, count, etc. For example, to calculate the average NDVI value (same spatio-temporal subset as before) the following query can be utilized:

In [14]:
query = '''
for $c in (S2_L2A_32631_B08_10m),
    $d in (S2_L2A_32631_B04_10m)
    
return
  avg(
    (((float) $c - $d) / ((float) $c + $d))
    [ ansi( "2021-04-09" ), E( 670000:730000 ), N( 4990220:5015220 ) ]
  )
'''
response = requests.post(service_endpoint, data = {'query': query}, verify=False)
print("Average NDVI value: " + response.content)
Average NDVI value: 0.5523334179587299

How about calculating this average for each time point between two dates? This can be done with a coverage constructor, which will iterate over all dates between the two given dates, resulting in a 1D array of average NDVI values; notice that the slicing on the time axis ansi is done with the "iterator" variable $pt. The 1D array is encoded as JSON in the end.

In [15]:
query = '''
for $c in (S2_L2A_32631_B08_10m),
    $d in (S2_L2A_32631_B04_10m)
    
return
  encode(
    coverage averageNdviValues
    over $pt t(imageCrsDomain($c[ansi("2021-04-08" : "2021-04-09" )], ansi)) 
    values
      avg(
        (((float) $c - $d) / ((float) $c + $d))
        [ ansi( $pt ), E( 670000:730000 ), N( 4990220:5015220 ) ]
      ),
    "json")
'''
response = requests.post(service_endpoint, data = {'query': query}, verify=False)
print("Average NDVI values: " + response.content)
Average NDVI values: [0.5523334179587299]