Return to GeoComputation 99 Index
Stephen C. Taylor, Bernard Armour, William H. Hughes, Andrew Kult, Chris Nizman
Atlantis Scientific Inc., 20 Colonnade Rd, Suite 110, Nepean, ON K2E 7M6 Canada
There has recently been a surge of interest in interferometric value-added products based on SAR data acquired by RADARSAT and other spaceborne SAR sensors. These products include, primarily, digital elevation models (DEMs) and deformation maps. The high level of demand has put a strain on traditional interferometric SAR (InSAR) processing tools, which have been designed primarily for research purposes. We present a production, or operational, approach to InSAR processing that emphasizes maximal throughput and algorithmic robustness, as well as around-the-clock operation requiring relatively little supervision and no specialist operator skills. The basis for this approach is an application environment known as "Performance Emphasized Production Environment for Remote Sensing," or peppers™.
Peppers uses a distributed parallel computing, tile-based pipelined approach to processing remotely sensed image data products, which are characterized by their very large size (often more than 500 MB). It makes use of the inherently localized nature of most image processing algorithms to formulate the division of labour, both spatially across the image and temporally along the pipeline of transformations. This formulation can be used to split the load among all available processing units, either using multiple threads of execution on a multi-CPU computer, or using message passing (MPI) to coordinate multiple processes distributed across a networked cluster of computers.
Peppers represents an end-to-end approach to operational processing of remote sensing data. Its components manage every aspect of the production run, including such things as automatic update of intermediate products following a change, auditing of historical results, and the ability to trap exceptional conditions and redirect the approach to overcome these conditions. It is anticipated that peppers will form the basis for future operational remote sensing applications.
Airborne and spaceborne remote sensing imaging sensors are an essential tool for mapping ocean features, land use, environmental change, mineral exploration, emergency search and rescue, disaster management and, in general, characterizing our world on a local, regional, or global scale. As applications of remote sensing imagery become increasingly complex, high-speed algorithms and computers for digital image processing are becoming increasingly essential. This is especially true for remote sensing applications using data acquired by airborne or spaceborne synthetic aperture radar (SAR) systems. These systems may have multi-frequency sensors, acquire multiple polarization channels, and have interferometric capabilities (Way 1991, Raney 1991). They are typically high-resolution sensors having a wide dynamic range and generating massive volumes of multi-channel data. The algorithms used to process SAR data range from simple filters for speckle reduction to very complex processing algorithms such as in the case of SAR interferometry.
Processing of remote sensing data usually requires several processing steps. In most cases each step applies an image transformation (filtering, classifying, etc.) requiring operations on only a local region of imagery. These may be called local operators. In other cases, the entire image space must be operated on as a whole, such as in the case of transforming from one image domain to another (e.g. two-dimensional FFT). These may be called global operators. A third type of image transformation may operate locally in an image and travel in an unpredictable path over the whole image space. An example of this is a region growing algorithm where a single seed is used to start the processing. Another example is the class of cellular automaton algorithms (Codd 1968), which are characterized by operations where the value of one cell depends on the state of that same cell at a previous iteration as well as the state of its neighboring cells. A fourth class of algorithms, image reduction, is characterized by the application of a measurement across the whole image producing some type of non-image result, e.g., the mean of an image.
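The operator classes described above can be illustrated with a minimal Python sketch. It is not taken from peppers; the function names, the 3x3 window, and the threshold rule for region growing are assumptions chosen only to make the distinction concrete. A global operator such as a two-dimensional FFT is omitted for brevity.

```python
from collections import deque


def local_mean3x3(img, r, c):
    """Local operator: the output pixel depends only on a 3x3 neighbourhood."""
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]
    return sum(vals) / len(vals)


def image_mean(img):
    """Image reduction: the whole image goes in, one non-image value comes out."""
    count = sum(len(row) for row in img)
    return sum(sum(row) for row in img) / count


def grow_region(img, seed, threshold):
    """Region growing: start at one seed and spread over 4-connected pixels
    whose value is at least `threshold`. The path taken depends on the data."""
    rows, cols = len(img), len(img[0])
    region, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and img[nr][nc] >= threshold):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Made-up 3x4 test image.
image = [
    [1, 1, 9, 9],
    [1, 1, 9, 1],
    [1, 1, 1, 1],
]

corner = local_mean3x3(image, 0, 0)          # needs only 4 pixels near the corner
overall = image_mean(image)                  # needs every pixel
bright = grow_region(image, (0, 2), 5)       # visits pixels in a data-driven order
```

Only the first function can be evaluated tile by tile without reference to the rest of the image, which is why local operators suit a tiled pipeline so well.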
Interferometric SAR (InSAR) is a technique where two SAR images acquired with a nearly identical incidence angle are combined producing a phase interference image called an interferogram (Massonnet 1997, Zebker 1986, Dixon 1994). SAR images consist of both magnitude (brightness) and phase values. Often the phase information is thrown away; however, if it is retained, the SAR image is described as being complex (Elachi 1988). The phase in a complex SAR image is a coherent signal containing information about the distance between a resolution cell on the ground and the radar antenna, as well as information about the texture of terrain within a resolution cell. Using the phase information in the interferogram, it is possible to extract topographic height information, height change information, and fine scale temporal change measurements.
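As a rough illustration of how two complex images are combined, the sketch below uses the standard formulation in which each master pixel is multiplied by the complex conjugate of the corresponding slave pixel. It assumes the two images are already co-registered, uses made-up pixel values, and is not necessarily how the authors implement this step.

```python
import cmath


def interferogram(master, slave):
    """Multiply each master pixel by the complex conjugate of the slave pixel."""
    return [[m * s.conjugate() for m, s in zip(m_row, s_row)]
            for m_row, s_row in zip(master, slave)]


def phase_image(image):
    """Phase of each complex pixel in radians, wrapped to the range -pi to pi."""
    return [[cmath.phase(z) for z in row] for row in image]


# Two tiny complex images given as (magnitude, phase) pairs; values are made up.
master = [[cmath.rect(1.0, 0.5), cmath.rect(2.0, 1.0)]]
slave = [[cmath.rect(1.0, 0.2), cmath.rect(0.5, 1.5)]]

ifg = interferogram(master, slave)
phases = phase_image(ifg)   # per-pixel phase difference, master minus slave
```

The interferogram phase is the difference of the two input phases, so it is known only modulo 2*pi; recovering the absolute phase is the job of phase unwrapping.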
The operations required to produce a topographic height image, called a digital elevation model, from two complex SAR images involve local operations, global operations, and region growing operations. Typically, each complex SAR image is 6,000 by 20,000 pixels or 0.24 GB in size per scene. For large area mapping, as many as 30 scenes are acquired in one contiguous pass of the airborne or spaceborne SAR platform, and these are to be processed as one data set. Clearly, InSAR processing will benefit from parallel computing that takes advantage of multi-threading and distributed computing.
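The data volumes quoted above can be checked with a few lines of arithmetic. The figure of 2 bytes per complex pixel is not stated in the text; it is simply the value implied by 6,000 by 20,000 pixels occupying 0.24 GB, and is treated here as an assumption.

```python
ROWS, COLS = 6_000, 20_000
BYTES_PER_PIXEL = 2        # assumption: implied by the quoted 0.24 GB per image
SCENES_PER_PASS = 30

pixels_per_image = ROWS * COLS                       # 120 million pixels
image_bytes = pixels_per_image * BYTES_PER_PIXEL
image_gb = image_bytes / 1e9                         # decimal gigabytes
pass_gb = image_bytes * SCENES_PER_PASS / 1e9        # one image per scene

print(f"{image_gb:.2f} GB per image, {pass_gb:.1f} GB for {SCENES_PER_PASS} scenes")
```

Under this assumption a 30-scene pass amounts to several gigabytes for a single image of each scene, before any intermediate products are counted.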
In the following sections, we introduce a new software system suitable for hosting a range of image processing and signal processing algorithms that are typically used in applications involving routine processing of huge volumes of data. First, we describe the requirements for such a system. Next, in Section 3, we describe peppers™, our performance emphasized production environment for processing multi-dimensional data sets. This is followed in Section 4 by a description of InSAR processing in peppers. Finally, some conclusions are drawn in Section 5.
Our objective is to have a generic software infrastructure that insulates the application developer from the underlying parallel processing, image caching, data input/output, process control, and GUI. With this in mind, the following requirements are identified for a generic high performance image processing software framework:
To achieve the requirements listed in Section 2, a multithreading and distributed computing solution has been selected. Multithreading on a shared memory multiprocessor (SMP) computer typically does not deliver linear processing speed improvement when the number of CPUs exceeds a certain critical number. An alternative to this form of parallelism is distributed memory computing. Here, a cluster of networked computer platforms is used for a single processing run. Each node in the cluster is given a small task to perform. If the task can work mostly autonomously from other tasks and use only local memory and data, then there is a strong potential for linear speed improvement to be achieved for cluster sizes far exceeding the limitations of SMP architectures.
The peppers design takes advantage of multithreading on an SMP computer as well as allowing operation on a cluster of SMP computers. A tile-based, pipelined approach is used to distribute the processing load both spatially across the image data and temporally along a pipeline of image transformations. The parallelism may be visualized as a cluster of pipelines, each producing output tiles in the target image space.
Peppers consists of the following components:
These components are used to achieve an end-to-end operational processing system. The HKV provides a transportable database for parameters and data, as well as a format for the accumulation of results. The DRS acts in a similar way to a makefile, only the rules allow for the correct sequencing of operations. The RMI provides an interface for controlling a processing run, communicating messages between processes, and providing progress feedback. The PPE is the workhorse that organizes and executes the pipeline to generate the desired image target(s). These components are described in further detail in the following sections.
The HKV database is a multiple file format database. Most files in the database are pure ASCII text files. The advantages of using an ASCII text representation to store data are the portability of this format and the large number of tools available on most systems to view, edit, and compare such files. Some disadvantages of text files are that they can bloat data size, and can be slower to read and write than their binary equivalents. For this reason, the HKV format allows for files containing a large amount of data to be in binary format (notably image data files).
A database is constructed of a series of records, where each record associates a unique key with a value. The keys define the characteristics of a set of parameters. A hierarchy of parameters can be established, so that one or more parameters can be contained within or owned by a parent parameter.
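The record model described above can be sketched as follows. The actual HKV file syntax is not given in the text, so the dotted key notation, the class name, and the example keys are all hypothetical; the sketch shows only unique keys mapped to values, with parameters owned by a parent parameter.

```python
class HierarchicalStore:
    """Hypothetical in-memory stand-in for a hierarchical key-value database."""

    def __init__(self):
        self._records = {}     # one record per unique key

    def put(self, key, value):
        self._records[key] = value

    def get(self, key):
        return self._records[key]

    def children(self, parent):
        """Names of the parameters directly owned by `parent`."""
        prefix = parent + "."
        names = {key[len(prefix):].split(".", 1)[0]
                 for key in self._records if key.startswith(prefix)}
        return sorted(names)


# Example keys and values are invented for illustration.
db = HierarchicalStore()
db.put("master.file", "master.slc")
db.put("master.size.rows", 20000)
db.put("master.size.cols", 6000)
db.put("slave.file", "slave.slc")

owned_by_master = db.children("master")
owned_by_size = db.children("master.size")
```

Encoding the hierarchy in the key keeps each record a flat key-value pair, which fits the text-file storage described earlier.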
The Dependency Run Scheduler (DRS) is a stand-alone executable module that ensures all target dependencies are up to date, given a set of dependency rules. It determines automatically which processing steps should be run or re-run to propagate changes to other entities. The DRS accepts as input one or more targets in the HKV database to update, and constructs a processing chain that will guarantee those targets exist and are up to date. It uses timestamp information to determine which entities are out of date, and relies on its rule base to determine the dependencies between entities. Targets can be generated independently or within the pipeline processing engine. The DRS and its rule base essentially perform the function of the "make" utility and "makefile," respectively, familiar to software developers. The DRS has the added capability to group together targets that can be generated together within the PPE.
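The make-like behaviour described above can be sketched in a few lines. This is an illustration of timestamp-driven dependency resolution in general, not the DRS rule syntax or implementation; the rule table, entity names, and integer timestamps are assumptions, and the grouping of targets for the PPE is not modelled.

```python
def update(target, rules, stamps, log):
    """Bring `target` up to date and return its timestamp.

    rules:  target -> (list of input entities, name of the step that builds it)
    stamps: entity -> integer timestamp (missing means "does not exist yet")
    log:    list that collects the steps actually run, in order
    """
    if target not in rules:                 # source data: nothing to build
        return stamps[target]
    inputs, step = rules[target]
    newest_input = max(update(i, rules, stamps, log) for i in inputs)
    if target not in stamps or stamps[target] < newest_input:
        log.append(step)                    # a real scheduler would run the step
        stamps[target] = max(stamps.values()) + 1
    return stamps[target]


# Hypothetical rule base and timestamps.
rules = {
    "interferogram": (["master", "slave"], "stream1"),
    "unwrapped": (["interferogram"], "stream2"),
    "height": (["unwrapped"], "stream3"),
}
stamps = {"master": 1, "slave": 2, "interferogram": 3, "unwrapped": 4}

first_run = []
update("height", rules, stamps, first_run)    # only the missing target is built

stamps["slave"] = max(stamps.values()) + 1    # an input changes
second_run = []
update("height", rules, stamps, second_run)   # everything downstream is rebuilt
```

In the first call only the missing target is generated; after an input changes, every step between that input and the requested target is scheduled again, in dependency order.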
The DRS system consists of the following:
The Run Monitor Interface has two primary functions:
Since the peppers execution model allows it to run as a background process, the user is able to disconnect from the run and log off the system, allowing the process to continue to run. The user (or another user) can then reconnect to the run to monitor progress, view run results, or issue commands. Multiple users can be connected to the run, but only a single user will be able to issue commands that can affect the status of the run. As only one run may be processed by the peppers application at a time, the RMI facilitates access to shared resources (e.g., the cluster) via mutual exclusion locking and process queuing.
The peppers Pipeline Processing Engine (PPE) is a stand-alone executable module that creates and runs an image processing pipeline. The pipeline uses a data pull paradigm. The output target image is partitioned into image tiles. A request is generated for output tiles that causes a chain of tile requests propagating up the pipeline from the output of the pipeline back to the input. The responsibilities of the PPE include loading and initializing the operators, setting up data input/output, handling tile requests, trapping and handling errors, and shutting down the pipeline. The PPE is designed to handle much of the "dirty work" of dealing with memory caches, data reads and writes, multiprocessing, and so on.
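The data pull paradigm can be illustrated with a toy pipeline. The sketch below is not the PPE: the class names are hypothetical, the image is one-dimensional, and the operators are point-wise so that each output tile needs exactly one input tile. It shows only how a request for an output tile propagates back to the input, with each operator caching the tiles it has produced.

```python
class TiledSource:
    """Head of the pipeline: serves tiles cut from an in-memory 1-D image."""

    def __init__(self, pixels, tile_size):
        self.pixels = pixels
        self.tile_size = tile_size
        self.reads = []                     # order in which tiles were pulled

    def tile_count(self):
        return -(-len(self.pixels) // self.tile_size)   # ceiling division

    def request(self, index):
        self.reads.append(index)
        start = index * self.tile_size
        return self.pixels[start:start + self.tile_size]


class PointOperator:
    """Produces an output tile by pulling the matching tile from upstream."""

    def __init__(self, upstream, func):
        self.upstream = upstream
        self.func = func
        self.cache = {}                     # local tile cache

    def tile_count(self):
        return self.upstream.tile_count()

    def request(self, index):
        if index not in self.cache:
            tile = self.upstream.request(index)          # propagate the request
            self.cache[index] = [self.func(p) for p in tile]
        return self.cache[index]


source = TiledSource(list(range(10)), tile_size=4)
scaled = PointOperator(source, lambda p: 2 * p)
shifted = PointOperator(scaled, lambda p: p + 1)

# The output image is partitioned into tiles and each tile is requested in turn.
output = []
for index in range(shifted.tile_count()):
    output.extend(shifted.request(index))
```

A local operator with a neighbourhood larger than one pixel would request a slightly larger chunk of input than the tile it produces; that detail is left out here.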
The PPE is capable of operating in a distributed computing environment, including combinations of multi-CPU computers and computers in a networked cluster. Each processing unit works cooperatively to run the pipeline to conclusion; thus, a high degree of scalability is achievable.
Figure 1 shows a schematic representation of a processing pipeline. A series of operators are chained together in what can be a fairly complex node graph. An operator is responsible for manipulating a chunk of input pixels to produce a tile of output pixels. Between each operator are one or more images. These may be local tile caches or may represent the entire image stored on disk (in a tiled format).
Figure 1: The Processing Pipeline
After the pipeline is constructed, a tree of tile requests is built according to the data pull paradigm. The order in which tiles are processed and the assignment of the tile requests to ranks in the cluster determine the load balancing. Several request management strategies are implemented to accommodate the various types of processing pipelines.
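The text does not say which request management strategies are implemented, so the two shown below are purely illustrative assumptions. They demonstrate how the choice of tile order and rank assignment changes the load on each rank.

```python
def round_robin(tiles, n_ranks):
    """Deal tiles out one at a time, cycling through the ranks."""
    return {rank: tiles[rank::n_ranks] for rank in range(n_ranks)}


def contiguous_blocks(tiles, n_ranks):
    """Give each rank one run of neighbouring tiles."""
    size = -(-len(tiles) // n_ranks)        # ceiling division
    return {rank: tiles[rank * size:(rank + 1) * size]
            for rank in range(n_ranks)}


# A 2 x 3 grid of output tiles, identified by (row, column), shared by 4 ranks.
tiles = [(r, c) for r in range(2) for c in range(3)]

cyclic = round_robin(tiles, 4)
blocked = contiguous_blocks(tiles, 4)

for rank in range(4):
    print(rank, cyclic[rank], blocked[rank])
```

With six tiles and four ranks, the cyclic assignment gives every rank some work, while the blocked assignment leaves the last rank idle but keeps neighbouring tiles together, which may matter when adjacent tiles share input data.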
The Image Processing Utility (IPU) library is intended to provide an easy to use C++ interface to a variety of commonly encountered problems in image processing and remote sensing. These include signal processing algorithms for filtering, windowing and spectral analysis, temporal and spatial reference system representations and calculations, and some numerical analysis problems. Collection classes (arrays, hashes, etc.) are not provided by the IPU.
The PPE is designed to take advantage of all available processors on a single computer platform, as well as all available computer nodes in a networked cluster of computers. The software is being tested on Pentium II and Pentium III PCs (Linux), Sun Ultra (Solaris 2.6), and SGI Origin 2000 (IRIX 6.5) platforms.
Most of peppers is implemented in C++. The initial development platform is a Pentium II PC running Linux. MPI is used for communication between nodes in a cluster; however, CORBA is being considered as an alternative for communication between ranks in a run.
The peppers life cycle is centered around the concept of a production run. Each run targets a particular set of input data products for use in constructing one or more desired output products. The run can be thought of as a database of input parameters and intermediate and final results, stored in HKV format, usually as files within a single directory on disk. The end-user constructs a new run each time a new set of input data are to be processed. Runs can be moved offline, effectively preserving the state of the run so that it can be returned to at a later date. Once the final output products have been generated, the end-user can delete the run from the disk if desired.
From time to time, the end-user will encounter scenarios for which it would be useful to know the history of a particular run: what was run by whom in what sequence, and what were the results. To provide this type of information, peppers takes the approach of saving "snapshots" of the run status in a run history repository. In addition, each action taken by the end-user is logged in a run audit trail file.
In addition, applications may define a reporting function that can generate a nicely formatted report file for the current run status based on the current run files (or on the run status as of a given date using the run history repository).
Peppers can be approached from the level of the different components that comprise the core system (HKV, DRS, PPE, etc.). It can also be thought of in terms of the division between run-level and static system-level data files. Another approach is to see peppers as a series of stacked layers of execution, one built up on another as in Figure 2. Each layer extends the previous layer to provide a simpler and more complete interface to the peppers functionality.
Figure 2: Layered Architecture
The end-user or developer can operate at just about any of these layers. Most of the time, the topmost layer is sufficient so that the end-user need not know the particulars of underlying systems; however, it remains possible to go "under the hood" to do special operations or new development. For instance, it is possible to use the script layer directly to use the operators in novel ways. Operators could also be executed directly from the command line, bypassing the DRS decision making.
The InSAR processing chain consists of the major processing steps required for generating a digital elevation model (DEM) from two complex SAR images. An example of the images at each stage in the processing is given in Figure 3. This example shows the imagery for DEM production using SAR interferometry. It also is possible to perform differential InSAR processing (Dixon 1994). This processing is very similar to DEM production; however, there are several additional steps, each of which can be implemented in the pipeline. In the following descriptions, we concentrate solely on the DEM production case in the interests of simplicity.
Figure 3: Example of RADARSAT Repeat-Pass Spaceborne SAR Interferometry of Death Valley, California. (SAR images are © 1996 CSA. All other images are © 1999 Atlantis Scientific Inc.)
The processing steps for DEM production using InSAR presented in order are as follows (Armour 1997, Kooij 1996, Dixon 1994):
All of these algorithms require that the whole interferogram be calculated and available before phase unwrapping starts.
For implementation in peppers, the InSAR processing chain is divided into pipeline streams. A stream is a sequence of processing steps that can be combined into one uninterrupted peppers pipeline.
The processing steps that are completed before a peppers run are:
These two steps do not require parallelization to compute efficiently. Furthermore, the complete results of this processing are required before the rest of the InSAR processing chain can begin.
The remainder of the InSAR processing chain is partitioned into streams. Stream boundaries are determined according to the input requirements of each operator. If a complete intermediate result is required as an input to a pipeline operator, then this intermediate image must be fully assembled before the operator can begin processing. This situation is typical of a global operator and of some region-growing operators. Phase unwrapping algorithms may be a global operator, a region-growing operator or both; however, most of the operators in the InSAR processing chain fall into the "local operators" category and work very well in a tiled and pipelined processing engine.
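The rule for placing stream boundaries can be sketched as follows. The operator names and their flags are hypothetical, and the sketch captures only the input-requirement rule stated above; boundaries introduced for other reasons, such as interactive editing between steps, are not modelled.

```python
def split_into_streams(chain):
    """Split an ordered operator chain into streams.

    chain: list of (operator name, needs_complete_input) pairs. A new stream
    starts at every operator that needs its whole input image assembled first.
    """
    streams = []
    for name, needs_complete_input in chain:
        if needs_complete_input or not streams:
            streams.append([])
        streams[-1].append(name)
    return streams


# Hypothetical chain: only the unwrapping operator needs the complete image.
chain = [
    ("form_interferogram", False),
    ("filter_interferogram", False),
    ("unwrap_phase", True),
    ("phase_to_height", False),
]

streams = split_into_streams(chain)
for number, stream in enumerate(streams, start=1):
    print(f"stream {number}: {stream}")
```

Local operators that follow one another end up in the same stream, so their intermediate images never need to be fully assembled.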
Each stream is a pipeline. It may be executed as one processing run, or combined with other streams. The DRS ensures that only those operators required to generate a desired output target are built into the pipeline that is executed. Not all the operations organized by the DRS have to be pipelines executed by the PPE. Any program may be launched as a processing step. The PPE is just one type of executable that can be launched to generate the desired target. In this way, the DRS works much like a makefile. Typically only the images generated at the end of a stream are saved to disk. All other intermediate images are transient.
Stream 1: Interferogram Generation
Stream 2: Phase Unwrapping
At this point, the user may optionally use an interactive phase unwrapping editor to evaluate the phase unwrapping accuracy, and possibly make editing changes.
Stream 3: Height Image Generation
Next, the user would interactively use Ground Control Points (GCPs) to correct horizontal and vertical errors in the height image. Other InSAR image products can be produced after the height image has been corrected. These are generated in Stream 4.
Stream 4: Geocoded InSAR Product Generation
These four streams are specified in peppers using the DRS rule files. This creates the dependencies between operators. A user starts a run by requesting, for example, a geocoded height image. The DRS is run to identify what operators are required to generate this image target. It is determined that Stream 3 must run to generate a geocoded DEM, but this requires an input of an unwrapped phase image; therefore, Stream 2 must be run, which in turn requires that Stream 1 is run to generate the inputs to Stream 2. Once the required operations are identified, the processing job is handed to the PPE, which sets up the pipeline and begins generating image tiles in the target image according to the data pull model. The processing is complete when all the tiles have been generated in the target image, which in this example is the geocoded height image. Stream 4 (Geocoded InSAR Product Generation) is run to generate other types of interferometric products, such as geocoded coherence images, master and slave SAR images, and terrain slope images.
Currently, Atlantis Scientific uses EarthView™ InSAR (EVInSAR) to generate InSAR image products. This software is implemented as a single-threaded application. In addition, it is designed to retain all intermediate images on disk. For DEM production, as described in Section 4.2, EVInSAR requires approximately 10 hours to process the output height image when running on an SGI Origin 2000 with two 175 MHz MIPS-4 CPUs and 512 MB RAM. It is the goal of InSAR-Peppers™ to reduce this processing time by at least 50% by eliminating the need for intermediate file I/O. The processing speed can then be improved by a factor of N as a result of the multithreading in the PPE, where N is the number of CPUs. It is anticipated that linear improvement in the processing speed will be achieved up to perhaps N=8 CPUs. To obtain still greater parallelization, it is planned that the PPE will run on a cluster of 10 or more nodes to achieve a near-linear improvement in processing speed.
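The targets stated above can be turned into a simple projection. The sketch assumes exactly the two effects named in the text, a 50% saving from removing intermediate file I/O followed by ideal linear scaling with the number of CPUs; the function name is hypothetical and the numbers are design goals, not measurements.

```python
def projected_hours(baseline_hours, n_cpus, io_saving=0.5):
    """Projected run time: remove the I/O share, then divide by the CPU count."""
    return baseline_hours * (1.0 - io_saving) / n_cpus


BASELINE_HOURS = 10.0      # EVInSAR figure quoted in the text

for n_cpus in (1, 2, 4, 8):
    hours = projected_hours(BASELINE_HOURS, n_cpus)
    print(f"N={n_cpus}: {hours:.3f} h")
```

If both goals were met in full, the 10-hour run would take 5 hours on one CPU and well under an hour on eight.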
The peppers software infrastructure was conceived to solve a growing need for a high performance development environment in which to develop operational software for image processing algorithms. Atlantis Scientific has designed peppers for its own use in operational InSAR processing. Though peppers is first being used for a remote sensing application, it is clear that this technology can be applied to other problems requiring high-speed processing of large volumes of multi-dimensional data sets.
At the time of writing this paper, the peppers software exists as a prototype and the InSAR application is being implemented in the peppers environment. It is planned that the system will become operational in the 3rd quarter of 1999.
The development of peppers and InSAR-Peppers is supported in part by the RADARSAT User Development Program of the Canadian Space Agency.
Armour, B., J. Ehrismann, M. Kooij, P. Farris-Manning, and S. Sato, 1997. "New software for technology for repeat pass spaceborne SAR (InSAR) processing," Proceedings of the International Symposium Geomatics in the Era of RADARSAT, 1997, 2nd Edition.
Codd, E.F., 1968. Cellular Automata, Academic Press, New York, NY.
Costantini, M., 1998. "A novel phase unwrapping method based on network programming," IEEE Transactions on Geoscience and Remote Sensing, 36, pp. 813-821.
Dixon, T.H., Ed., 1994. SAR Interferometry and Surface Change Detection, Report of a Workshop Held in Boulder, Colorado, February 3-4, 1994. Available at http://southport.jpl.nasa.gov/scienceapps/dixon/index.html.
Elachi, C., 1988. Spaceborne Radar Remote Sensing: Applications and Techniques. IEEE Press, New York, NY.
Goldstein, R.M., H.A. Zebker, and C. Werner, 1988. "Satellite Radar Interferometry: Two-Dimensional Phase Unwrapping," Radio Science, 23, pp. 713-720.
van der Kooij, M.W.A., B. Armour, J. Ehrismann, H. Schwichow, and S. Sato, 1996. "A workstation for spaceborne Interferometric SAR data," The 26th International Symposium on Remote Sensing of Environment.
Massonnet, D., 1997. "Satellite radar interferometry," Scientific American, 276, No. 2, pp. 46-53.
Pritt, M.D., 1996. "Phase Unwrapping by Means of Multigrid Techniques for Interferometric SAR," IEEE Transactions on Geoscience and Remote Sensing. 34, pp. 728-738.
Raney, R.K., A.P. Luscombe, E.J. Langham, and S. Ahmed, 1991. "RADARSAT," Proceedings of the IEEE, 79, pp. 839-849.
Way, J. and E.A. Smith, 1991. "The evolution of synthetic aperture radar systems and their progression to the EOS SAR," IEEE Transactions on Geoscience and Remote Sensing, Vol. 29, No. 6, pp. 962-985.
Zebker, H.A. and R.M. Goldstein, 1986. "Topographic mapping from interferometric synthetic aperture radar observations," Journal of Geophysical Research, 91, B5, pp. 4,993-4,999.