This tutorial explains how to use PyEPR to generate a file in GDAL Virtual Format (VRT) that can be used to access data with the powerful and popular GDAL library. GDAL already has support for ENVISAT products but this example is interesting for two reasons:
The complete code of the example is available in the examples/export_gdalvrt.py file. It is organized so that it can also be imported as a module in any program.
The export_gdalvrt module provides two functions:
- epr2gdal_band(): takes an epr.Band object and a VRT dataset as input and adds a GDAL band to the VRT dataset.
- epr2gdal(): takes a PyEPR epr.Product (or a filename) and the name of the output VRT file, and generates the VRT file, which contains one band for each epr.Band of the original epr.Product plus the associated metadata.
The epr2gdal() function first creates the VRT dataset
if isinstance(product, str):
    filename = product
    product = epr.open(filename)
ysize = product.get_scene_height()
xsize = product.get_scene_width()
if os.path.exists(vrt) and not overwrite_existing:
    raise ValueError('unable to create "{0}". Already exists'.format(vrt))
driver = gdal.GetDriverByName('VRT')
if driver is None:
    raise RuntimeError('unable to get driver "VRT"')
gdal_ds = driver.Create(vrt, xsize, ysize, 0)
if gdal_ds is None:
    raise RuntimeError('unable to create "{}" dataset'.format(vrt))
and then loops over all the epr.Bands of the PyEPR epr.Product, calling the epr2gdal_band() function on each of them:
for band in product.bands():
    epr2gdal_band(band, gdal_ds)
The export_gdalvrt module also provides an epr_to_gdal_type mapping between EPR and GDAL data type identifiers.
The core of the example is the part of the code in the epr2gdal_band() function that generates the GDAL VRTRawRasterBand. It is a description of a raster file that the GDAL library uses for low level data access. Of course the entire machinery works because data in epr.Bands and epr.Datasets of ENVISAT products are stored as contiguous rasters.
# file_path uses '/' separators; convert them to the native separator.
# os.sep is used here: os.pathsep is the separator of search path lists.
filename = os.sep.join(product.file_path.split('/'))  # denormalize
offset = dataset.get_dsd().ds_offset + field.get_offset()
line_offset = record.tot_size
pixel_offset = epr.get_data_type_size(field.get_type())
if band.sample_model == epr.E_SMOD_1OF2:
    pixel_offset *= 2
elif band.sample_model == epr.E_SMOD_2OF2:
    offset += pixel_offset
    pixel_offset *= 2
options = [
    'subClass=VRTRawRasterBand',
    'SourceFilename={}'.format(filename),
    'ImageOffset={}'.format(offset),
    'LineOffset={}'.format(line_offset),
    'PixelOffset={}'.format(pixel_offset),
    'ByteOrder=MSB',
]
gtype = epr_to_gdal_type[field.get_type()]
ret = gdal_ds.AddBand(gtype, options=options)
if ret != gdal.CE_None:
    raise RuntimeError(
        'unable to add VRTRawRasterBand to "{}"'.format(vrt))
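For orientation, the options passed to AddBand() correspond to child elements of a VRTRasterBand element in the generated .vrt file. The sketch below builds an equivalent XML fragment using only the standard library. It is not part of export_gdalvrt.py: the raw_band_xml name, the file name and all numeric values are made up for illustration, and the file actually written by GDAL may differ in attribute details.

```python
import xml.etree.ElementTree as ET


def raw_band_xml(xsize, ysize, filename, data_type,
                 image_offset, pixel_offset, line_offset):
    """Build a VRT document with a single VRTRawRasterBand (illustrative)."""
    root = ET.Element('VRTDataset',
                      rasterXSize=str(xsize), rasterYSize=str(ysize))
    band = ET.SubElement(root, 'VRTRasterBand', dataType=data_type,
                         band='1', subClass='VRTRawRasterBand')
    source = ET.SubElement(band, 'SourceFilename', relativeToVRT='0')
    source.text = filename
    ET.SubElement(band, 'ImageOffset').text = str(image_offset)
    ET.SubElement(band, 'PixelOffset').text = str(pixel_offset)
    ET.SubElement(band, 'LineOffset').text = str(line_offset)
    ET.SubElement(band, 'ByteOrder').text = 'MSB'
    return ET.tostring(root, encoding='unicode')


# Hypothetical values, not taken from a real product.
xml_text = raw_band_xml(1121, 149, 'product.N1', 'UInt16',
                        image_offset=12345, pixel_offset=2, line_offset=2255)
print(xml_text)
```

Opening the real .vrt file in a text editor is a quick way to check that the offsets computed by epr2gdal_band() are the expected ones.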
The fundamental part is the computation of the following three parameters:
ImageOffset:
    the offset in bytes from the beginning of the file to the first pixel of data. In the example it is computed using
    - the epr.DSD.ds_offset attribute, which represents the offset in bytes of the epr.Dataset from the beginning of the file, and
    - the epr.Field.get_offset() method, which returns the offset in bytes of the epr.Field containing the epr.Band data from the beginning of the epr.Record:
    offset = dataset.get_dsd().ds_offset + field.get_offset()
LineOffset:
    the offset in bytes from the beginning of one scanline of data to the beginning of the next. In the example it is set to the epr.Record size in bytes using the epr.Record.tot_size attribute:
    line_offset = record.tot_size
PixelOffset:
    the offset in bytes from the beginning of one pixel to the beginning of the next one on the same line. Usually it corresponds to the size in bytes of the elementary data type. It is set using the epr.Field.get_type() method and the epr.get_data_type_size() function:
    pixel_offset = epr.get_data_type_size(field.get_type())
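The way the three offsets combine can be checked on a small synthetic buffer. The following sketch is not taken from export_gdalvrt.py: the sample_position() helper and the toy file layout (a 4 byte file header, then records made of a 3 byte record header and three big-endian 16 bit samples) are assumptions chosen only to make the arithmetic visible.

```python
import struct


def sample_position(row, col, image_offset, line_offset, pixel_offset):
    """Byte position of sample (row, col) in a raw raster file."""
    return image_offset + row * line_offset + col * pixel_offset


# Synthetic "file": a 4-byte header followed by two records. Each record has
# a 3-byte record header and three big-endian uint16 samples.
header = b'HEAD'
record0 = b'\x00\x00\x00' + struct.pack('>3H', 10, 11, 12)
record1 = b'\x00\x00\x00' + struct.pack('>3H', 20, 21, 22)
raw = header + record0 + record1

image_offset = len(header) + 3   # dataset offset + field offset in the record
line_offset = len(record0)       # full record size, record header included
pixel_offset = 2                 # size in bytes of one uint16 sample

pos = sample_position(1, 2, image_offset, line_offset, pixel_offset)
(value,) = struct.unpack_from('>H', raw, pos)
print(pos, value)
```

Note that the line offset is the size of the whole record and not just of the image samples, which is why the example uses epr.Record.tot_size.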
The band size in lines and columns of the GDAL bands is fixed at GDAL dataset level when it is created:
gdal_ds = driver.Create(vrt, xsize, ysize, 0)
if gdal_ds is None:
    raise RuntimeError('unable to create "{}" dataset'.format(vrt))
Please note that for epr.Datasets storing complex values, such as the MDS1 epr.Dataset of ASAR IMS epr.Products, the real and imaginary parts of each pixel are interleaved. To represent the epr.Bands of the two components, the pixel offset has to be doubled, and an additional offset (the size of one sample) must be added to the ImageOffset of the epr.Band representing the imaginary part:
if band.sample_model == epr.E_SMOD_1OF2:
    pixel_offset *= 2
elif band.sample_model == epr.E_SMOD_2OF2:
    offset += pixel_offset
    pixel_offset *= 2
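The effect of the doubled pixel offset and of the one-sample shift can be reproduced on a synthetic line of interleaved samples. This is only an illustration: the read_component() helper and the sample values are made up, and 16 bit signed integer components are assumed.

```python
import struct

# Synthetic line of three complex samples stored as interleaved big-endian
# int16 pairs: I0, Q0, I1, Q1, I2, Q2.
line = struct.pack('>6h', 1, -1, 2, -2, 3, -3)

sample_size = 2                 # bytes per int16 component
pixel_offset = 2 * sample_size  # distance between consecutive I (or Q) values
n_pixels = 3


def read_component(buf, image_offset):
    """Read one component by stepping pixel_offset bytes between samples."""
    return [struct.unpack_from('>h', buf, image_offset + i * pixel_offset)[0]
            for i in range(n_pixels)]


real = read_component(line, 0)            # E_SMOD_1OF2: starts at the I sample
imag = read_component(line, sample_size)  # E_SMOD_2OF2: shifted by one sample
print(real, imag)
```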
Note
The PyEPR API does not support complex epr.Bands. epr.Datasets containing complex data, like the MDS1 epr.Dataset of ASAR IMS epr.Products, are associated with two distinct epr.Bands containing the real (I) and the imaginary (Q) component respectively.
GDAL, instead, supports complex data types, so it is possible to map a complex ENVISAT epr.Dataset onto a single GDAL band with a complex data type.
This case is not handled in the example.
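As a rough idea of what handling this case could look like, the sketch below prepares the option list for a single complex band. It is an untested assumption and not part of the example: complex_band_options() is a hypothetical helper, the numeric values are invented, and the resulting list would have to be passed to AddBand() together with a complex GDAL type (for instance gdal.GDT_CInt16 for 16 bit integer components).

```python
def complex_band_options(filename, image_offset, line_offset, component_size):
    """Options for one complex VRTRawRasterBand (hypothetical helper)."""
    return [
        'subClass=VRTRawRasterBand',
        'SourceFilename={}'.format(filename),
        # The band starts at the real component; no extra shift is needed.
        'ImageOffset={}'.format(image_offset),
        'LineOffset={}'.format(line_offset),
        # One complex pixel spans both the real and the imaginary component.
        'PixelOffset={}'.format(2 * component_size),
        'ByteOrder=MSB',
    ]


# Invented values, for illustration only.
options = complex_band_options('product.N1', 1000, 4000, component_size=2)
print(options)
```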
The epr2gdal_band() function also stores a small set of metadata for each epr.Band:
gdal_band = gdal_ds.GetRasterBand(gdal_ds.RasterCount)
gdal_band.SetDescription(band.description)
metadata = {
    'name': band.get_name(),
    'dataset_name': dataset.get_name(),
    'dataset_description': dataset.description,
    'lines_mirrored': str(band.lines_mirrored),
    'sample_model': epr.get_sample_model_name(band.sample_model),
    'scaling_factor': str(band.scaling_factor),
    'scaling_offset': str(band.scaling_offset),
    'scaling_method': epr.get_scaling_method_name(band.scaling_method),
    'spectr_band_index': str(band.spectr_band_index),
    'unit': band.unit if band.unit else '',
    'bm_expr': band.bm_expr if band.bm_expr else '',
}
gdal_band.SetMetadata(metadata)
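Since the VRT band exposes the raw samples, a consumer has to apply the stored scaling and mirroring information itself, as the consistency check at the end of the complete example does. The sketch below shows one possible way to do it starting from the metadata strings. The to_physical() helper and the metadata values are hypothetical, only linear scaling is considered, and it is assumed that str(band.lines_mirrored) produces 'True' or 'False'.

```python
def to_physical(raw_lines, metadata):
    """Apply mirroring and linear scaling described by band metadata strings."""
    factor = float(metadata['scaling_factor'])
    offset = float(metadata['scaling_offset'])
    mirrored = metadata['lines_mirrored'] == 'True'
    result = []
    for line in raw_lines:
        if mirrored:
            line = line[::-1]  # flip each line along the column axis
        result.append([sample * factor + offset for sample in line])
    return result


# Hypothetical metadata, shaped like the dictionary stored on the GDAL band.
metadata = {'scaling_factor': '0.5', 'scaling_offset': '1.0',
            'lines_mirrored': 'True'}
physical = to_physical([[0, 2, 4]], metadata)
print(physical)
```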
Metadata are also stored at GDAL dataset level by the epr2gdal() function:
metadata = {
    'id_string': product.id_string,
    'meris_iodd_version': str(product.meris_iodd_version),
    'dataset_names': ','.join(product.get_dataset_names()),
    'num_datasets': str(product.get_num_datasets()),
    'num_dsds': str(product.get_num_dsds()),
}
gdal_ds.SetMetadata(metadata)
The epr2gdal() function also stores the contents of the MPH and the SPH records as GDAL dataset metadata in custom domains:
mph = product.get_mph()
metadata = str(mph).replace(' = ', '=').split('\n')
gdal_ds.SetMetadata(metadata, 'MPH')
sph = product.get_sph()
metadata = str(sph).replace(' = ', '=').split('\n')
gdal_ds.SetMetadata(metadata, 'SPH')
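The string manipulation above turns the textual dump of a record into the list of NAME=VALUE strings that SetMetadata() accepts. The following sketch applies the same transformation to a made-up text, assuming that str(record) yields one "NAME = VALUE" entry per line; the field values shown are invented and real MPH and SPH records are much longer.

```python
# Made-up text in the "NAME = VALUE" layout that the example assumes for
# str(record); it is not the content of a real MPH.
record_text = 'PRODUCT = "EXAMPLE.N1"\nPROC_STAGE = N\nREF_DOC = "DOC-0001"'

metadata = record_text.replace(' = ', '=').split('\n')
print(metadata)

# The same list can be turned into a dictionary for lookups.
as_dict = dict(item.split('=', 1) for item in metadata)
print(as_dict['PROC_STAGE'])
```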
#!/usr/bin/env python3
import os
import epr
from osgeo import gdal
epr_to_gdal_type = {
    epr.E_TID_UNKNOWN: gdal.GDT_Unknown,
    epr.E_TID_UCHAR: gdal.GDT_Byte,
    epr.E_TID_CHAR: gdal.GDT_Byte,
    epr.E_TID_USHORT: gdal.GDT_UInt16,
    epr.E_TID_SHORT: gdal.GDT_Int16,
    epr.E_TID_UINT: gdal.GDT_UInt32,
    epr.E_TID_INT: gdal.GDT_Int32,
    epr.E_TID_FLOAT: gdal.GDT_Float32,
    epr.E_TID_DOUBLE: gdal.GDT_Float64,
    # epr.E_TID_STRING: gdal.GDT_Unknown,
    # epr.E_TID_SPARE: gdal.GDT_Unknown,
    # epr.E_TID_TIME: gdal.GDT_Unknown,
}
def epr2gdal_band(band, vrt):
    """Add a VRTRawRasterBand describing an epr.Band to a VRT dataset."""
    product = band.product
    dataset = band.dataset
    record = dataset.read_record(0)
    field = record.get_field_at(band._field_index - 1)
    ysize = product.get_scene_height()
    xsize = product.get_scene_width()
    # vrt can be an open GDAL dataset or the name of a (possibly new) VRT file
    if isinstance(vrt, gdal.Dataset):
        if (vrt.RasterYSize, vrt.RasterXSize) != (ysize, xsize):
            raise ValueError('dataset sizes do not match')
        gdal_ds = vrt
    elif os.path.exists(vrt):
        gdal_ds = gdal.Open(vrt, gdal.GA_Update)
        if gdal_ds is None:
            raise RuntimeError('unable to open "{}"'.format(vrt))
        driver = gdal_ds.GetDriver()
        if driver.ShortName != 'VRT':
            raise TypeError('unexpected GDAL driver ({}). '
                            'VRT driver expected'.format(driver.ShortName))
    else:
        driver = gdal.GetDriverByName('VRT')
        if driver is None:
            raise RuntimeError('unable to get driver "VRT"')
        gdal_ds = driver.Create(vrt, xsize, ysize, 0)
        if gdal_ds is None:
            raise RuntimeError('unable to create "{}" dataset'.format(vrt))
    # file_path uses '/' separators; convert them to the native separator.
    # os.sep is used here: os.pathsep is the separator of search path lists.
    filename = os.sep.join(product.file_path.split('/'))  # denormalize
    offset = dataset.get_dsd().ds_offset + field.get_offset()
    line_offset = record.tot_size
    pixel_offset = epr.get_data_type_size(field.get_type())
    # interleaved complex data: double the step, shift the imaginary part
    if band.sample_model == epr.E_SMOD_1OF2:
        pixel_offset *= 2
    elif band.sample_model == epr.E_SMOD_2OF2:
        offset += pixel_offset
        pixel_offset *= 2
    options = [
        'subClass=VRTRawRasterBand',
        'SourceFilename={}'.format(filename),
        'ImageOffset={}'.format(offset),
        'LineOffset={}'.format(line_offset),
        'PixelOffset={}'.format(pixel_offset),
        'ByteOrder=MSB',
    ]
    gtype = epr_to_gdal_type[field.get_type()]
    ret = gdal_ds.AddBand(gtype, options=options)
    if ret != gdal.CE_None:
        raise RuntimeError(
            'unable to add VRTRawRasterBand to "{}"'.format(vrt))
    gdal_band = gdal_ds.GetRasterBand(gdal_ds.RasterCount)
    gdal_band.SetDescription(band.description)
    metadata = {
        'name': band.get_name(),
        'dataset_name': dataset.get_name(),
        'dataset_description': dataset.description,
        'lines_mirrored': str(band.lines_mirrored),
        'sample_model': epr.get_sample_model_name(band.sample_model),
        'scaling_factor': str(band.scaling_factor),
        'scaling_offset': str(band.scaling_offset),
        'scaling_method': epr.get_scaling_method_name(band.scaling_method),
        'spectr_band_index': str(band.spectr_band_index),
        'unit': band.unit if band.unit else '',
        'bm_expr': band.bm_expr if band.bm_expr else '',
    }
    gdal_band.SetMetadata(metadata)
    return gdal_ds
def epr2gdal(product, vrt, overwrite_existing=False):
    if isinstance(product, str):
        filename = product
        product = epr.open(filename)
    ysize = product.get_scene_height()
    xsize = product.get_scene_width()
    if os.path.exists(vrt) and not overwrite_existing:
        raise ValueError('unable to create "{0}". Already exists'.format(vrt))
    driver = gdal.GetDriverByName('VRT')
    if driver is None:
        raise RuntimeError('unable to get driver "VRT"')
    gdal_ds = driver.Create(vrt, xsize, ysize, 0)
    if gdal_ds is None:
        raise RuntimeError('unable to create "{}" dataset'.format(vrt))
    metadata = {
        'id_string': product.id_string,
        'meris_iodd_version': str(product.meris_iodd_version),
        'dataset_names': ','.join(product.get_dataset_names()),
        'num_datasets': str(product.get_num_datasets()),
        'num_dsds': str(product.get_num_dsds()),
    }
    gdal_ds.SetMetadata(metadata)
    mph = product.get_mph()
    metadata = str(mph).replace(' = ', '=').split('\n')
    gdal_ds.SetMetadata(metadata, 'MPH')
    sph = product.get_sph()
    metadata = str(sph).replace(' = ', '=').split('\n')
    gdal_ds.SetMetadata(metadata, 'SPH')
    for band in product.bands():
        epr2gdal_band(band, gdal_ds)
    # @TODO: set geographic info
    return gdal_ds
if __name__ == '__main__':
    filename = 'MER_LRC_2PTGMV20000620_104318_00000104X000_00000_00000_0001.N1'
    vrtfilename = os.path.splitext(filename)[0] + '.vrt'
    gdal_ds = epr2gdal(filename, vrtfilename)
    with epr.open(filename) as product:
        band_index = product.get_band_names().index('water_vapour')
        band = product.get_band('water_vapour')
        eprdata = band.read_as_array()
        unit = band.unit
        lines_mirrored = band.lines_mirrored
        scaling_offset = band.scaling_offset
        scaling_factor = band.scaling_factor
    gdal_band = gdal_ds.GetRasterBand(band_index + 1)
    vrtdata = gdal_band.ReadAsArray()
    if lines_mirrored:
        vrtdata = vrtdata[:, ::-1]
    vrtdata = vrtdata * scaling_factor + scaling_offset
    print('Max absolute error:', abs(vrtdata - eprdata).max())
    # plot
    from matplotlib import pyplot as plt
    plt.figure()
    plt.subplot(2, 1, 1)
    plt.imshow(eprdata)
    plt.grid(True)
    cb = plt.colorbar()
    cb.set_label(unit)
    plt.title('EPR data')
    plt.subplot(2, 1, 2)
    plt.imshow(vrtdata)
    plt.grid(True)
    cb = plt.colorbar()
    cb.set_label(unit)
    plt.title('VRT data')
    plt.show()