Return to GeoComputation 99 Index

Failure prediction in automatically generated digital elevation models

M.J.Gooch and J.H.Chandler
Department of Civil and Building Engineering, Loughborough University, Loughborough, Leicestershire, LE11 3TU United Kingdom


Developments in digital photogrammetry have provided the ability to generate Digital Elevation Models (DEMs) automatically, and the technique is increasingly used by geoscientists (Chandler, 1999). Using overlapping imagery, dense grids of coordinates can be collected at high speeds (150 points per second) with a high level of accuracy. The trend towards PC based hardware, the widespread use of Geographical Information Systems and the forthcoming availability of high resolution satellite imagery over the internet at ever lower costs mean that the use of automated digital photogrammetry for elevation modelling is likely to become more widespread. Automation can reduce the need for an in-depth knowledge of the subject, thus rendering the technology an option for more users.

One criticism of the trend towards the automated black box approach is the common lack of quality control procedures within the software (Cooper, 1998), particularly procedures to identify areas of the DEM with low accuracy. The traditional method of accuracy assessment is through the use of check point data (data collected by independent means, with a higher level of accuracy, against which the DEM can be compared). However, check point data are rarely available and the user is recommended to check and edit the data manually using stereo viewing methods, a potentially lengthy process which can negate the obvious speed advantages brought about by automation.

Research carried out at Loughborough sought to optimise the accuracies of automatically generated DEMs and focused upon the ERDAS Imagine OrthoMAX digital photogrammetric system (Gooch et al, in press). This software uses an area correlation based algorithm over which the user has a certain amount of control through a set of strategy parameters (Gooch and Chandler, 1998). These control the acceptance and quality control requirements for the derived data, and early research work assessed the effect of altering these parameters on the resulting accuracy of the DEMs.

This work allowed a software data processing model to be developed that is capable of identifying areas where elevations are unreliable and to which the user should pay attention when editing and checking the data. The software model developed will be explained and described in detail in the presentation. Results from tests on different scales of imagery, different types of imagery and other software packages will also be presented to demonstrate the efficacy and, significantly, the generality of the technique with other digital photogrammetric software systems.

1. A brief introduction to Digital Photogrammetry and DEMs

The definition of photogrammetry most widely used is that of Slama (1980), who states that "Photogrammetry is the art, science, and technology of obtaining reliable information about physical objects and the environment through processes of recording, measuring, and interpreting photographic images and patterns of electromagnetic radiant energy and other phenomena." In essence, spatial information is obtained from pairs (stereopairs) of overlapping photographs (terrestrial, satellite or aerial) once the position and orientation of each image are known.

There is now a new era of photogrammetry known as digital photogrammetry (DP). With DP, processing is carried out using digital imagery (either scanned film based imagery or imagery captured using a digital camera) on a workstation or high performance PC. Many of the tasks in the photogrammetric workflow have now been successfully automated, which is seen by many (Heipke, 1999; Saleh, 1996) as the primary advantage of the system. The use of standard off-the-shelf components in the hardware (as opposed to the specialised, high precision optical and mechanical parts in the previous systems) has reduced the price of Digital Photogrammetric Systems (DPSs) and widened their user base and number of potential applications. The level of user training now required to produce accurate data is greatly reduced and DPSs offer a much greater degree of functionality.

One of the processes which has been successfully automated is the generation of DEMs. To do this, the software automatically finds common or conjugate points on the two images in the stereopair and, with a knowledge of the position and orientation of the two images, it is then possible to obtain 3D co-ordinates in the object space. The DEM is usually generated in a regular grid format, with the user able to define the grid spacing. The generation process can be carried out at very high speeds, with typical collection rates of 150 points per second. This allows an inexperienced user to achieve in ten minutes what previously took a trained and experienced photogrammetrist 6 or 8 hours on an analytical plotter.

Fundamental to the automatic generation of DEMs is the process of image matching which is the process of finding the conjugate points in the two images. It can be done in one (or a combination) of two ways - area or feature based. Area based matching, as the name suggests, matches small areas or patches in each digital image using correlation techniques. Feature based matching identifies objects such as the edges of buildings, roads etc which are visible in both images. However, the process of image matching is an ill-posed one, since both methods are subject to problem areas (Heipke, 1995). Area based techniques have difficulty in areas with low, repetitive texture such as man-made features or areas of sudden elevation change whilst feature based techniques suffer in monotonous areas with few features. When the software encounters such areas, it usually finds an incorrect match for the conjugate points which results in erroneous elevation values. An example of this sort of failure can be seen in figure 1.

Figure 1. Raster image of failed DEM

It can be seen from figure 1 that the DEM generation algorithm has failed in a large section of the DEM, denoted by the black area on the right hand side of the figure. This failure is probably due to a lack of image content, as there are no sudden elevation changes in the area. From examination of the raster image, it would appear that there are no other problem areas of data, but it is impossible to tell how accurate the information is without comparing it to some other measure of the surface.
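The sensitivity of area based matching to a lack of image content can be illustrated with a minimal sketch. It assumes a plain normalised cross-correlation coefficient between two grey-level patches; this is a generic textbook measure used here for illustration only, not the OrthoMAX algorithm, and the function name `ncc` and the patch values are hypothetical.

```python
import math

def ncc(patch_a, patch_b):
    """Normalised cross-correlation of two equal-sized grey-level patches.

    Illustrative only: a generic similarity measure, not the matching
    algorithm of any particular photogrammetric package.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A patch with no texture has zero variance, so no coefficient exists.
    return None if den == 0 else num / den

# Hypothetical 3 x 3 patches of grey levels.
textured = [[10, 50, 20], [80, 30, 60], [40, 90, 70]]
shifted = [[v + 5 for v in row] for row in textured]  # same pattern, brighter
flat = [[100, 100, 100] for _ in range(3)]            # featureless surface

print(ncc(textured, shifted))  # close to 1: a confident match
print(ncc(textured, flat))     # None: nothing to correlate against
```

A well textured patch correlates strongly with its counterpart, whereas a featureless patch gives the matcher nothing to discriminate between candidate positions, which is consistent with the kind of failure seen in figure 1.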

The usual method of assessing the accuracy of the DEM is either quantitative, by comparing it with checkpoints (points collected by an independent data source with a higher order of accuracy), or qualitative, by the user (Heipke, 1995). Qualitative testing involves the user interactively editing points that do not appear to lie on the surface (the DEM is overlaid on a stereomodel of the surface) through the use of stereo vision on a Digital Photogrammetric System (DPS). This can be a lengthy process and, as with previous technology, requires expertise and experience to carry out accurately and efficiently.

The collection of checkpoint data prolongs the time taken and expense of the ground survey process and is not always practicable. If checkpoint data are available, global measures of the DEM accuracy can be obtained by comparing the elevation of the checkpoint with the elevation of the DEM at the same planimetric position. The most widely used global measure of accuracy is the root mean square error (r.m.s.e) (Li, 1988) where:

$$\mathrm{r.m.s.e} = \sqrt{\frac{\sum dh^{2}}{n}}$$

dh = residual at each check point

n = number of check points
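As a small worked example of the r.m.s.e. definition above, the sketch below computes the residual dh at each check point and then the r.m.s.e. The heights are hypothetical values chosen purely for illustration.

```python
import math

def rmse(residuals):
    """Root mean square error of a list of check point residuals dh."""
    if not residuals:
        raise ValueError("at least one check point is required")
    return math.sqrt(sum(dh * dh for dh in residuals) / len(residuals))

# Hypothetical DEM heights and independently surveyed check heights (m)
# at the same planimetric positions.
dem_heights = [101.2, 98.7, 105.1, 110.4]
check_heights = [101.0, 99.0, 105.0, 110.0]

# dh = DEM elevation minus check point elevation
residuals = [d - c for d, c in zip(dem_heights, check_heights)]

print(round(rmse(residuals), 3))
```

Because the residuals are squared before averaging, a single large residual contributes far more to the r.m.s.e. than several small ones.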

A common criticism identifiable in the recent literature is the lack of quality control procedures in modern DP software. The technology allows an answer (i.e. a DEM) to be generated with ease but provides little help in assessing the quality of the output (Cooper, 1998). As Heipke (1999) states, "most of the ... algorithms have only little knowledge about when they work correctly and when they fail."

2. The strategy parameters used in DEM generation

Many DPS software manufacturers allow the users a degree of control over the automatic DEM generation process with a set of strategy parameters. These are user definable values that control the acceptance and quality control functions in the software. Parameter selection allows the user a degree of control over the algorithm such that the nature of the resulting DEM can be changed (i.e. different levels of interpolation). Zhang and Miller (1997) state that the parameters are functions of terrain type, signal power, flying height, x and y parallax, and image noise level. In theory, a correct set of parameters will provide an accurate DEM with only successfully correlated points included and unsuccessful points rejected from further DEM processing. An incorrect set may result in filtering successful points and the inclusion of badly correlated points (known as false fixes) or simply failure in finding correlated points (Gooch et al, in press).

The OrthoMAX software uses an area correlation based algorithm with 14 strategy parameters. A full list of the strategy parameters and their default values can be seen in table 1. From the descriptions it is not always obvious what the parameters mean, or what effect they are going to have on the resulting DEM. Smith and Smith (1996) state that the "parameters are written in a technical language and, even if the basic image matching technique is understood, it does not always help in determining the use of all the parameters as many are obviously software dependent." Other DPSs use different sets of parameters with other nomenclature. For example, the Match-T product has 28 parameters (Smith and Smith, 1996) whilst the Phodis TS software from Carl Zeiss uses just two. Loodts (1996) criticises the use of "uncontrollable 'magic' strategies" in automatic DTM algorithms but provides no details of how to specify and control the parameters.





Minimum Threshold
Noise Threshold
Maximum Parallax (x)
Minimum Template Size
Maximum Template Size
Minimum Precision
Rejection Factor
Skip Factor
Edge Factor
Start RRDS
y-Parallax Allowance
Post Processing

Table 1. Strategy parameters and their default values (adapted from ERDAS, 1994).

The correct choice of parameters should accept all of the successfully correlated points and filter out the unsuccessful ones. The wrong choice can have a detrimental effect on the accuracy (Gooch and Chandler, 1998). Figure 2 shows a DEM of the same area shown in figure 1 but generated using the default strategy parameters (the minimum threshold had been changed from 0.6 to 0.5 in the first example). It can be seen that the software has successfully correlated many more points and there is no large failed area as in figure 1. This simple example highlights how critical the selection of the correct set of parameters is.

Figure 2. Successfully correlated DEM

Few studies have been carried out on the strategy parameters used in the OrthoMAX software. Smith (1997) carried out an extensive study of the parameters on two sets of imagery, both of which were aerial. He isolated areas with different land-cover types on the imagery and then systematically varied the parameters, with the aim of optimising them with respect to accuracy. The results of the tests indicate that the manipulation of the parameters can have a significant effect on the result. For example, in one of the residential areas, changing the Minimum and Maximum Template Sizes from the default values of 7 and 9 pixels to 5 and 20 improved the mean error of the DEM from -0.218m to -0.081m. Similarly, changing the Maximum Parallax parameter from the default value of 5 to 10 in an area of open moorland improved the mean error from -0.061m to 0.006m. Overall, Smith reported that the software was well suited to smooth, textured surfaces and that areas with sudden elevation changes reduced the accuracy. He suggests that the conclusions from the testing could be applied to other data sets but makes no recommendations as to how they could be applied to close-range type applications.

Butler et al (1998) tested the OrthoMAX software on a set of imagery of a gravel riverbed with a flying height of just 2.2m. Because of the scale of the imagery, the authors had difficulty in obtaining sufficient check data of a higher order of accuracy. They therefore optimised the parameters with respect to the software's estimates of precision and the minimisation of the level of interpolation in the DEM. They found that an improvement in the precision (as estimated internally by the software) did not always lead to a more successful or accurate DEM although the exact effects could not be quantified because of the quality of the check data. This work highlighted the need for alternative methods for assessing data quality. As the number of applications of DP increases, particularly with close-range imagery, the availability of check data reduces and the cost of obtaining data of a higher order of accuracy is increased significantly.

A series of tests was carried out on the strategy parameters used in the ERDAS Imagine OrthoMAX software, using a variety of imagery with vastly different scales and image content from that used in previously reported work. The aim of the tests was to quantify the effect of varying the strategy parameters on the accuracy of the DEM. The data sets used in the study included:

A prerequisite of each data set was that check data were available, thus enabling the parameters to be optimised with respect to accuracy. Optimisation of the parameters had to be carried out using a "trial and error" approach, since it was found that two individually optimal parameter settings did not always combine in a positive manner. Hence the process was time consuming and not practicable in a production type environment.


Parameter set a   Parameter set b   Parameter set c   Parameter set d   Parameter set e   Parameter set f

Table 2. r.m.s.e. (m) results from five different areas generated with six different sets of strategy parameters

Table 2 shows the effect of changing the strategy parameters on five different DEMs. DEMs 1, 2, 3 and 4 are derived using the same set of imagery (1:13,000 photography) whilst DEM 5 was generated using a set of 1:6,000 imagery. These results are typical of all of the results encountered and demonstrate that the specification of the strategy parameters is critical and can have a significant effect on the accuracy of a DEM in certain areas (areas 2, 3, 4 and 5 in the example). It was not immediately obvious from the testing what caused the parameters to have large effects only on selected areas.

The results from the extensive testing showed that there was no evident link with landcover type as suggested by Smith (1997) and, more importantly, alterations to the strategy parameters only affected certain checkpoints. This result was discovered after examination of a close-range data set (1:70) of a simulated riverbed (the check data was collected using a physical profiling rod). The results showed that in three of the four test areas used, alterations to the strategy parameters had little or no effect on the resulting r.m.s.e of the DEM, whilst in one every parameter change had a significant effect. When a closer examination of individual residuals was made, it was found that all of the points with the largest residuals were located at just one end of one of the profiles, and that changes to the overall r.m.s.e were due to varied height estimates at these locations only.

This result was then confirmed using previous data sets for which large improvements in the r.m.s.e were achieved. By removing the checkpoints with the largest residuals from the r.m.s.e calculations, it was found that the r.m.s.e was subject to much less variation than before.
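The effect described above can be sketched with hypothetical residuals from two parameter sets that differ only at two problem checkpoints. The values and the helper name `rmse_without_largest` are illustrative assumptions, not data from the study.

```python
import math

def rmse(residuals):
    """Root mean square error of a list of residuals."""
    return math.sqrt(sum(dh * dh for dh in residuals) / len(residuals))

def rmse_without_largest(residuals, n_drop):
    """r.m.s.e. after discarding the n_drop residuals of largest magnitude."""
    if not 0 <= n_drop < len(residuals):
        raise ValueError("n_drop must leave at least one residual")
    kept = sorted(residuals, key=abs)[:len(residuals) - n_drop]
    return rmse(kept)

# Hypothetical residuals (m) at the same six checkpoints for two parameter
# sets; only the last two checkpoints respond to the parameter change.
set_a = [0.10, -0.20, 0.15, -0.10, 3.00, -2.50]
set_b = [0.10, -0.20, 0.15, -0.10, 1.00, -0.80]

print(round(rmse(set_a), 3), round(rmse(set_b), 3))
print(round(rmse_without_largest(set_a, 2), 3),
      round(rmse_without_largest(set_b, 2), 3))
```

With all checkpoints included the two parameter sets appear to differ greatly in accuracy; once the two largest residuals are excluded, the r.m.s.e. values are identical, which mirrors the behaviour reported for the real data sets.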

3. The Failure Warning Model

Using the observation that the strategy parameters only affect points with the highest residuals, a model was developed which could automatically identify such areas. Such a facility would assist users (in particular novice users) during the lengthy checking and editing phase of work. Instead of checking the entire DEM, the user could focus on the areas highlighted by the model. By simply subtracting two DEMs of the same area generated using different strategy parameter settings, a difference DEM is obtained which is likely to highlight areas where height estimates are unreliable. The areas where the parameters have no effect (i.e. the areas with the lowest residuals) will show up on the plot as having a value around 0m.


Figure 3. Difference image between two DEMs of the same area. Areas with a difference around 0m are coloured grey, whilst large differences are denoted by the white areas.
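The subtraction step can be sketched as follows, assuming both DEMs are held as regular grids (lists of rows) with identical spacing and extent. The grid values and the 1m tolerance are hypothetical; the paper does not state what magnitude of difference is treated as significant.

```python
def difference_dem(dem_a, dem_b):
    """Cell-by-cell height difference between two DEMs of the same grid."""
    if len(dem_a) != len(dem_b) or any(
            len(ra) != len(rb) for ra, rb in zip(dem_a, dem_b)):
        raise ValueError("DEMs must share the same grid dimensions")
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(dem_a, dem_b)]

def cells_exceeding(diff, tolerance):
    """Row/column indices where the absolute difference exceeds tolerance."""
    return [(r, c)
            for r, row in enumerate(diff)
            for c, value in enumerate(row)
            if abs(value) > tolerance]

# Hypothetical 3 x 3 DEMs (m) from two different strategy parameter sets.
dem_default = [[100.0, 100.5, 101.0],
               [100.0, 100.5, 101.0],
               [100.0, 100.5, 101.0]]
dem_alt = [[100.0, 100.5, 101.0],
           [100.0, 100.5, 103.5],
           [100.0, 98.0, 101.0]]

diff = difference_dem(dem_default, dem_alt)
flagged = cells_exceeding(diff, tolerance=1.0)  # assumed tolerance of 1 m
print(flagged)
```

Cells where the two runs agree have a difference of 0m and are ignored; only the cells whose height estimate moved with the parameter change are reported for checking.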

The model was developed in the Spatial Modeller tool in ERDAS Imagine. This is a visual modelling environment that allows for the generation and customisation of algorithms that can manipulate graphical data. The language allows for a wide variety of inputs, including raster, numeric and vector files, with outputs to a similar range of file types.

The algorithm behind the Failure Warning Model (FWM) is simple. It isolates the areas significantly different from 0m in the difference DEM and the areas where the algorithm has interpolated points on terrain with a sudden elevation change (also likely to contain significant residuals). This information is then overlaid on an orthophoto of the area, enabling the user to print a hardcopy output to assist when the DEM is checked and edited using the stereo viewing tools. The model does not alter the DEMs in any way; it merely highlights the areas which it suspects of having the highest residuals. Each point on the DEM is given one of three classifications in the output:

0 (black areas) signifies areas where the software has interpolated the elevation in areas where slope angle is varying rapidly.

Unclassified (orthophoto visible) signifies points where the evidence from the data suggests that the height estimate is as accurate as possible.

256 (white areas) signifies the points with the lowest accuracy. These are the points that are susceptible to changes in the strategy parameters.

Figure 4. Example output from the FWM

The inputs for the FWM are as follows:

DEM of the area generated with the default strategy parameters

DEM of the area generated with a different set of strategy parameters (4 parameters were changed in these tests)

An orthophoto of the area.
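The three-way classification can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Spatial Modeller implementation: the mask of interpolated cells is assumed to be available from the DEM software, "slope angle varying rapidly" is approximated by the largest height step to a 4-neighbour, the two tolerances are hypothetical, the difference test is assumed to take precedence when both conditions hold, and the orthophoto overlay is omitted.

```python
BLACK, WHITE, UNCLASSIFIED = 0, 256, None  # output codes used in the text

def local_relief(dem, r, c):
    """Largest absolute height step from cell (r, c) to its 4-neighbours."""
    steps = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(dem) and 0 <= cc < len(dem[0]):
            steps.append(abs(dem[rr][cc] - dem[r][c]))
    return max(steps) if steps else 0.0

def classify(dem_default, dem_alt, interpolated, diff_tol, relief_tol):
    """Assign each cell one of the three classes (assumed rule order)."""
    output = []
    for r, row in enumerate(dem_default):
        out_row = []
        for c, z in enumerate(row):
            if abs(z - dem_alt[r][c]) > diff_tol:
                # Height changed with the strategy parameters: least reliable.
                out_row.append(WHITE)
            elif interpolated[r][c] and local_relief(dem_default, r, c) > relief_tol:
                # Interpolated across a sudden elevation change.
                out_row.append(BLACK)
            else:
                out_row.append(UNCLASSIFIED)
        output.append(out_row)
    return output

# Hypothetical 3 x 3 inputs (heights in metres).
dem_default = [[100.0, 100.0, 100.0],
               [100.0, 100.0, 108.0],
               [100.0, 100.0, 100.0]]
dem_alt = [[100.0, 100.0, 100.0],
           [100.0, 100.0, 108.0],
           [104.0, 100.0, 100.0]]
interpolated = [[False, False, False],
                [False, False, True],
                [False, False, False]]

result = classify(dem_default, dem_alt, interpolated,
                  diff_tol=1.0, relief_tol=5.0)
for row in result:
    print(row)
```

The classification only reads the two DEMs and never modifies them, in keeping with the model's role as a guide for subsequent manual checking.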

The FWM has been tested on a large number of areas taken from different sets of imagery and has performed consistently well. In the analysis of the results, the output from the FWM was exported in an ASCII format and the point residuals for each of the three classifications were grouped together into a "zone". This enabled the r.m.s.e of each zone to be calculated. The purpose of this was to identify whether the points classified as acceptable had a lower r.m.s.e than the entire DEM and, conversely, whether the areas highlighted by the FWM had a higher r.m.s.e than the entire model. This would indicate that the FWM was highlighting the correct points (those with the largest residuals). For brevity, the results from the tests on six areas of one of the sets of imagery are presented in figure 5.

Figure 5. Accuracy results of FWM output
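The zone-based comparison described above can be sketched as follows. The checkpoint records are hypothetical (classification label, residual) pairs, and the labels "OK", "0" and "256" are simply stand-ins for the three FWM classes.

```python
import math
from collections import defaultdict

def rmse(residuals):
    """Root mean square error of a list of residuals."""
    return math.sqrt(sum(dh * dh for dh in residuals) / len(residuals))

def zone_rmse(points):
    """Group checkpoint residuals by FWM class and return r.m.s.e. per zone."""
    zones = defaultdict(list)
    for label, residual in points:
        zones[label].append(residual)
    return {label: rmse(values) for label, values in zones.items()}

# Hypothetical checkpoints: (FWM classification, residual in metres).
points = [("OK", 0.1), ("OK", -0.2), ("OK", 0.2),
          ("0", 0.6), ("0", -0.8),
          ("256", 2.0), ("256", -3.0)]

per_zone = zone_rmse(points)
overall = rmse([residual for _, residual in points])

for label in ("OK", "0", "256"):
    print(label, round(per_zone[label], 3))
print("all", round(overall, 3))
```

If the model is highlighting the correct points, the "OK" zone should have a lower r.m.s.e. than the whole DEM and the highlighted zones a higher one, which is the pattern these illustrative values were chosen to show.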

The results presented in figure 5 show that the FWM has worked well on the imagery used in the test. The points that are classified as OK by the model consistently have a lower r.m.s.e than the overall model, suggesting that many of the points with the larger residuals have been filtered out. In four of the six areas, the points classified as 0/black (points that the algorithm has interpolated over sudden elevation changes) had a higher r.m.s.e than the acceptable points. The same was true in all six areas for the points classified as 256/white (areas susceptible to changes in the strategy parameters). The difference between the points classified as 256 and the rest of the points can be seen to be significant.

The model was also tested on the Phodis TS DPS from Carl Zeiss Inc. The program uses the TopoSURF algorithm, which in turn uses digital images in a correlation procedure to generate a large number of elevation points, from which a grid-type DEM is then derived. The Phodis TS software has two user definable strategy parameters: the Terrain Type parameter (flat, hilly or mountainous) and the Smoothing Factor (low, medium or high). The output from Phodis is an ASCII file containing the Cartesian co-ordinates of each point in the grid followed by an estimate of the accuracy of each point. This estimate is made on a scale from 1 to 7, 1 being the most accurate. This served as a basis for a comparison between the two packages and to demonstrate the universality of the FWM approach.

Three areas of the 1:13,000 imagery were used in the testing of the FWM, two with a 5m grid spacing (areas 1 and 2) and the other with a 10m spacing (area 3). Area 1 covered mainly farmland with small forested areas, area 2 covered fields, trees, a steep slope and a large residential area, whilst area 3 covered rural and residential areas. For each area, a DEM was generated with every combination of the two variable strategy parameters. A DEM of each area was then also generated with the OrthoMAX DEM generation software (using identical grid spacings).

The output data from the Phodis software was imported into the Imagine environment for comparison with the ERDAS data and testing of the FWM. The FWM had to be modified for the Phodis output, since the software does not identify which points were interpolated on the DEM. Therefore only classifications of 256 (white) or acceptable could be used. The input for each run of the FWM was therefore a DEM of the area generated using the default strategy parameters, a DEM of the area generated with one or both of the parameters changed and an orthophoto of the area.

Phodis 1   Phodis 2   Phodis 7   FWM 256

Table 3. Phodis classification and FWM results for Phodis data

Table 3 shows the average results for the three areas used in the testing of the Phodis software. The results show that the FWM performed significantly better than the Phodis classification system. This is shown by the fact that the points classified as unacceptable (i.e. 256/white) have a significantly higher r.m.s.e than the points classified as 2 or 7 by the Phodis software. Also, the points classified as OK by the FWM have a lower r.m.s.e than the points classified as 1 by the Phodis software. Whilst the Phodis classification of 2 consistently had a higher r.m.s.e than the points classified as 1, the classification of 7 proved to be somewhat less robust.

4. Discussion

The work presented has highlighted the many disadvantages of parameter optimisation. Whilst significant improvements in the r.m.s.e were achieved in some areas, this was only after extensive testing and regeneration of the same model. The large number of strategy parameters used in OrthoMAX means that to test every parameter combination would be an enormous undertaking and clearly impracticable in normal circumstances.
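To give a sense of the scale involved, assume purely for illustration that each of the 14 OrthoMAX strategy parameters were tested at only k = 3 candidate values. The number of DEMs N that would need to be generated and assessed for a single area is then:

$$N = k^{14} = 3^{14} = 4\,782\,969$$

Even with just two candidate values per parameter the count is 2^14 = 16 384, so an exhaustive search is out of reach and a trial and error approach is the only practical form of optimisation.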

The data and results from the study prove the extreme variability of the software in certain areas, and the importance and criticality of the strategy parameters. Whilst the parameters have proved to have a significant effect in certain areas, the data has shown that manipulation of the parameters has little if any effect in areas well suited to the matching algorithm. Improvements in the accuracy of these areas are likely to be marginal through manipulation of the strategy parameters.

The tests in this study were all carried out on data with checkpoint information. If this data is not available, the optimisation of the parameters becomes extremely difficult, as every model needs to be checked manually. Butler et al (1998) demonstrated that improvements in strategies such as the matching success are not always accompanied by an improvement in the accuracy of the DEM.

The research has also highlighted the criticality of the location of the checkpoint data. The USGS suggest that check data should be "well distributed" and "representative of the terrain" with a "minimum of 28 points per DEM ... required to compute the RMSE, which is composed of a single test using 20 interior points and 8 edge points" (USGS, 1996). If all of the check points lie in areas well suited to the algorithm used, the result is likely to be good and not subject to much variation. Any accuracy figure derived from the data is not likely to reflect any problem areas of the DEM. Conversely, if one or more of the checkpoints lie in ill-suited areas such as near to tall buildings or on man-made monotonous surfaces, derived accuracy results are going to be poorer and will not reflect the accuracy of the successful sections of the DEM. Whilst this is an obvious observation to make, and is countered in part by the required use of 28 checkpoints as opposed to say 10, this research has questioned the suitability of using the r.m.s.e as a measure of DEM accuracy. Should the organisation generating the DEM choose the location of the checkpoints or should it be the client using the DEM? If the producer of the DEM has the choice, it would be easy to select 28 points from areas well suited to the algorithm. Should the checkpoints all lie in areas well suited to automated methods, areas prone to problems or a mixture of the two? Selective sampling measures such as r.m.s.e can never represent the true accuracy of dense DEMs such as those generated by digital photogrammetry.

The FWM approach outlined in this paper has provided an alternative to parameter optimisation. Instead of attempting the potentially lengthy process of optimisation, the strategy parameters are used to identify areas where the software is likely to fail and areas where the residuals are likely to be highest. This can be used as a guide when the DEM is checked, and is of particular value to the novice user. Time spent during the editing phase is reduced to a minimum, so increasing the profitability of a project. It also answers some of the criticisms raised by Cooper (1998) and Heipke (1999) by increasing the level of quality assessment procedures used in the software. As it is a software based assessment process, it could easily be incorporated into an existing DPS, thus providing another internal quality assessment procedure.

The approach has proved to work successfully in practice and has been tested on a wide range of photo-scales. It consistently highlights areas of DEMs with large residuals and results in unclassified areas having a lower r.m.s.e than that of the whole DEM. Significantly, the approach has been tested on the Phodis TS DEM generation software and proved to be successful, suggesting that the approach can be applied to other DPS. It is intended to test the system on at least one other DPS.

5. Conclusions

This paper has highlighted some of the problems and issues surrounding the optimisation of the strategy parameters used in the automated generation of DEMs. It has shown that the parameter specification can be significant and that a manual optimisation approach is lengthy and subject to a degree of variability. A system has been described which automatically identifies areas that are likely to contain suspicious residuals and results have been presented which prove the efficacy of the method. The model has proved successful on different software packages and both close-range and aerial imagery.


This work is funded by the EPSRC (grant no: 96304378). The authors gratefully acknowledge help from the following people and organisations:

Julie Shannon (Department of Geography, King's College, London) for the use of the 1:45,000 imagery (collected by J. Shannon as part of the MEDALUS III Project IV funded by EU contract ENV4-CT95-0118).

Clive Boardman at Photarc Surveys, Harrogate, UK for granting access to the Phodis TS software and Rachel Benson for her assistance and patience in the generation of the Phodis data.

Mladen Stojic at ERDAS for the provision of the 1:6,000 imagery.

The Organisation Européenne d'Etudes Photogrammétriques Expérimentales (OEEPE) for the use of one of their data sets.


Chandler, J. H., 1999. 'Effective application of automated digital photogrammetry for geomorphological research', Earth Surface Processes and Landforms. 24, pp51-63.

Cooper, M. A. R., 1998. Datums, Coordinates and Differences. Landform Monitoring, Modelling and Analysis. Edited by S. N. Lane, K. S. Richards and J. H. Chandler. John Wiley and Sons Ltd. pp21-35.

ERDAS, 1994. Imagine OrthoMAX Users Guide, Vision International.

Gooch, M. J. and Chandler, J. H., 1998. 'Optimization of strategy parameters used in automated Digital Elevation Model generation', ISPRS International Archives of Photogrammetry and Remote Sensing, 32(2) pp88-95.

Gooch, M. J., Stojic, M. J. and Chandler, J. H. (in press). 'Accuracy assessment of digital elevation models generated using the ERDAS Imagine OrthoMAX digital photogrammetric system', Photogrammetric Record.

Heipke, C., 1995. State-of-the-art of Digital Photogrammetric Workstations for topographic applications. Photogrammetric Engineering and Remote Sensing, 61(1), January 1995. pp49-56.

Heipke, C., 1999. Digital Photogrammetric Workstations. GIM International. 1 Vol 13. January 1999. p.81.

Loodts, J., 1996. Logistics and Integration of the System: The Eurosense Experiences. OEEPE Workshop on "Applications of Digital Photogrammetric Workstations" Lausanne 1996.

Saleh, R. A., 1996. Photogrammetry and the quest for digitisation. Photogrammetric Engineering and Remote Sensing. 62(6) June 1996. pp675-678.

Smith, D. G., 1997. Digital Photogrammetry for Elevation Modelling. PhD Thesis. University of Nottingham. 241 pages.

Smith, M. J. and Smith, D. G., 1996. Operational Experiences of Digital Photogrammetric Systems. International Archives of Photogrammetry and Remote Sensing. 31(B2) Vienna 1996. pp357-362.

USGS, 1996. Standards for Digital Elevation Models.

Zhang, B. and Miller, S., 1997. Adaptive automatic terrain extraction. Integrating photogrammetric techniques with scene analysis and machine vision III. SPIE 3072:27-36.