Return to GeoComputation 99 Index
J. Mark Ware and Christopher B. Jones
School of Computing, University of Glamorgan, CF37 1DL United Kingdom
David R. Miller
Macaulay Land Use Research Institute, Aberdeen, AB9 2QJ United Kingdom
In this paper we consider the problem of detecting land-cover change that occurs over a period of time. We achieve this by comparing pairs of vector-defined polygon coverages, C1 and C2. C1 represents land-cover for a particular region at the start of the time-period (t1) while C2 represents land-cover for the same region at the end of the time-period (t2). The standard technique for detecting change in this situation is to intersect C1 and C2 so as to produce a third coverage Ci. Each intersection-polygon in Ci is associated with two land-cover classification codes; one describes land-cover at time t1 while the other describes land-cover at time t2. If the classification codes differ, then the polygon is labeled as corresponding to an area that has undergone land-cover change; however, a problem with adopting this approach is that it assumes that the source coverages are free of error. This will not usually be the case; it is likely that C1 and C2 are subject to both classification error (i.e., some polygons are assigned to a wrong class) and positional error (i.e., some polygon boundaries are in the wrong location).
In the work presented here we consider locational error only. The problem we address concerns the fact that if C1 and C2 have been generated independently then, because of loacational error in both coverages, equivalent boundaries will not match exactly with regards to their geometry. The term equivalent boundaries are used here to describe a pair of boundaries that are intended to be representing the same real-world phenomena. The result is that in situations where boundaries are approximately equivalent, sliver polygons can be expected to occur; the difficulty we face is distinguishing between slivers and polygons that may represent genuine change. It should be noted that this and similar problems have been addressed widely in the literature. The interested reader is referred to Dougenik 1979, Chrisman 1987 and 1991, Zhang and Tulip 1990, Pullar 1991, Harvey and Vauglin 1996, and Edwards and Lowell 1996, for related work.
In an attempt to overcome the problem described, we propose a method that seeks to ensure that equivalent boundaries are represented by precisely the same geometry. This is achieved by aligning C1 and C2 prior to intersection. The method is based on the principle of conflation (Saalfield 1988) in which equivalent elements from the pair of coverages are identified and merged in a manner that, where appropriate, preserves the better quality representation. The alignment process makes use of a boundary matching procedure that takes locational error into consideration. Various metrics are evaluated to provide evidence for the equivalence of boundary pairs. Multivariate Bayesian methods (Johnson and Wichern, 1998) are used to exploit this evidence to provide a probabilistic measure of equivalence. In adopting a Bayesian approach it is assumed that training data can be acquired based on expert assertions of the existence of change, or no change, involving movements of boundaries. Having established equivalencies, matching boundary pairs B1 and B2 are replaced by an updated boundary, Bi, within their respective coverages. Bi is derived by calculating a weighted average of the two original features.
Coverage alignment involves the matching and merging of equivalent boundary pairs. Boundary matching is achieved using Bayesian multivariate classification, in which candidate matching pairs are classed as being actual matches or not. Matching boundary pairs can be thought of as belonging to a population P1 while non-matching pairs can be thought of as belonging to a population P2.
To begin, conditional probability density functions are calculated using geometric signatures obtained from a training set of manually classified boundary pairs. Three signatures are used, namely, length ratio L, Hausdorff distance D and average distance between boundary sections A. These signatures are derived for each boundary pair in a particular population Pj, and are subsequently used to determine a mean column vector MPj and covariance matrix XPj. Provided that the individual items of evidence are approximately normally distributed (or transformed to a normal distribution), the conditional probability density function for the items of evidence E supporting the hypothesis HPj is then represented by-
where E = (L, D, A)T, HPj is the hypothesis that a boundary pair belong to the population Pj, and n equals 3 (the number of items of evidence used).
When searching for matching boundaries we only consider boundary pairs that are likely to represent a match. These candidate matching boundary pairs are found using a basic feature matching procedure based on buffering. In essence, this procedure deems two boundaries (or parts of boundaries), B1 and B2, to be a candidate match if: (i) all of B1 lies within a pre-defined distance of B2; (ii) all of B2 lies within a pre-defined distance of B1; and (iii) the difference between B1 and B2 in terms of their orientation is less than some pre-defined angle. The values used for the distance and angle thresholds are determined by analysis of the manually classified training data.
Each candidate matching pair (B1,B2) is then assigned a posteriori probability of it being an actual match. This probability is calculated as-
where E represents the geometric signatures derived from (B1,B2), p(HP1) is the prior probability of a candidate boundary pair being a match, and p(E) is the prior probability of the evidence. The prior probabilities are estimated from all previously manually analysed data. Having been calculated, the a posteriori probabilities of all candidate matching boundary pairs are examined and compared against a pre-defined probability threshold value, and a list of matching boundary pairs is produced.
Each matching boundary pair is aligned using a boundary merging procedure. This procedure makes use of weighted interpolation, thus allowing for a range of results (i.e., B1 replaces B2 in C2, or B2 replaces B1 in C1, or B1 and B2 are both replaced by a weighted average). There are three steps involved in the alignment process. The first step involves adding vertices to B1 and B2 such that each boundary has the same number of vertices and these vertices appear at proportionally equivalent distances along the boundaries. Next, the aligned boundary Bi is found by calculating a weighted average of corresponding vertices in B1 and B2. Finally, the sharp breaks that occur between the merged boundaries and adjacent unmerged boundaries are spanned by inserting smooth connecting boundaries. More details may be found in Ware and Jones (1998).
The coverage alignment procedure has been implemented as a set of C functions. These functions have been made callable from the Avenue scripting language of the ArcView GIS. The functions can all be called from a modified version of the default ArcView user interface. The procedure has been tested using data supplied by the Macaulay Land Use Research Institute (MLURI), who also provided expert advice during the training stage. Three data sets, representing land-cover in the Glen Feshie region of the Cairngorms (Scotland) in 1946, 1964 and 1988, have been used. Figures 1 and 2 show the 1946 and 1988 data.
Figure 1. 1946 Data Set (5km x 6km).
Figure 2. 1988 Data Set (5km x 6km).
In experiments, 84% of the selected boundary matches with a posteriori probability exceeding 90% were found to correspond to expert assertions of boundary equivalencies. While this represents a reasonable result, it is hoped that larger training sets and the use of additional signatures will lead to greater success. Figure 3 shows part of the 1946/1988 change map using the standard intersection technique, while Figure 4 shows the corresponding change map produced subsequent to coverage alignment.
Figure 3. Part of 46/88 change map without coverage alignment (2km x 2km. Shaded polygons correspond to regions that have undergone change.
Figure 4. The same part of 46/88 change map, this time with coverage alignment. Notice that sliver polygons have been removed.
This paper has proposed a new approach to evaluating and improving automated environmental change detection when comparing vector-defined land-cover coverages. In particular we have considered the problem of locational error. A method for aligning coverages prior to intersection has been described. Our particular aim was to ensure that boundaries in separate coverages that are intended to be representing the same real-world phenomena are represented by the same geometry. Preliminary results and training data from land cover maps of Scotland found a reasonably close match between boundary matches found by our procedure and those found manually by experts.
There is much scope for future research. This might include the evaluation of alternative geometric metrics. There also is the opportunity to develop more versatile methods to combine conventional quantitative multivariate statistics with qualitative sources of evidence. Also, it is important to point out that even if the locational errors of map intersections could be reliably addressed, the result of change detection based on the comparison of the respective source-interpreted land cover classes is still subject to potentially large error, which is a function of the accuracy of the individual interpretations. The authors are currently carrying out work aimed at addressing this issue and hope to report findings in the near future.
Acknowledgment is due to the Scottish Office Agriculture, Environment and Fisheries Department for use of the historical and 1988 land cover data. Thanks also to John Bell for his assistance in interpreting aerial photographs. JMW was supported by NERC Grant GR3/10569.
Chrisman N.R., 1987. 'The accuracy of map overlays: A reassessment'. Landscape and Urban Planning 14, pp. 427-439.
Chrisman N.R., 1991. 'A diagnostic test for error in categorical maps'. Proc Auto-Carto 10, pp. 330-348.
Dougenik J.A., 1979. 'WHIRLPOOL: A geometric processor for polygon coverage data'. Proc Auto-Carto 4, pp. 304-311.
Edwards G. and K.E. Lowell, 1996. 'Modelling uncertainty in photointerpreted boundaries'. Photogrammetric Engineering and Remote Sensing 62(4), pp. 337-391.
Harvey F. and F. Vauglin, 1996. 'Geometric match processing: Applying multiple tolerances'. Proc 7th International Symposium on Spatial Data Handling, pp. 4A, pp. 13-29.
Johnson A.J. and D.W. Wichern, 1998. Applied Multivariate Statistical Analysis, Prentice Hall.
Pullar D., 1991. 'Spatial overlay with inexact numerical data'. Proc Auto-Carto 10, pp. 313-329.
Saalfield A., 1988. 'Conflation: automated map compilation'. International Journal of Geographical Information Systems 2(3), pp. 217-228.
Ware J.M. and C.B. Jones, 1998. 'Matching and Aligning Features in Overlaid Coverages'. Proc 6th ACM International Symposium on Advances in Geographical Information Systems (ACM-GIS'98), pp. 28-33.
Zhang G. and J. Tulip, 1990. 'An algorithm for the avoidance of sliver polygons and clusters of points in spatial overlay'. Proc 4th International Symposium on Spatial Data Handling 1, pp. 141-150.