Return to GeoComputation 99 Index

Assessing uncertainty in fuzzy land cover maps obtained by remote sensing

Peter M. Atkinson
University of Southampton, Department of Geography, Highfield, Southampton SO17 1BJ United Kingdom


Assessing the accuracy of fuzzy estimates of land cover obtained by remote sensing is not straightforward. The two dimensions of accuracy, bias and precision, are reflected differently with different statistics; however, no one standard statistic is adequate on its own as a measure of accuracy, especially where the distribution of proportions can have large or small variances and may even be bimodal. The beta distribution provides a good model with which to simulate realizations of land cover proportions. The beta distribution was used to simulate six different univariate distributions of actual land cover proportions in the form of images of 20 by 20 pixels. A Gaussian random noise term was then added to the values to simulate estimates of the proportions. Various statistics were computed both for the entire image (global) and for a moving window or kernel (local). The latter approach produced images of statistics that were averaged to obtain kernel-based equivalents of the global statistics. The results were promising in that the kernel-based approach appeared to penalize bimodality; however, no single statistic can be regarded as satisfactory on its own.

1. Introduction

Maximum likelihood classification is perhaps the most commonly applied method of classification in remote sensing. It amounts to assigning a pixel to the class for which it has the highest a posteriori probability of membership. The a posteriori probability of a pixel z(xi) belonging to class j, L(j| z(xi)), may be obtained from Bayes' theorem using Equation 1:

$$L(j \mid z(x_i)) = \frac{P_j \, p(z(x_i) \mid j)}{\sum_{k=1}^{c} P_k \, p(z(x_i) \mid k)} \qquad (1)$$

where Pj is the a priori probability of membership of class j, p(z(xi)|j) is the probability density for pixel z(xi) at location xi as a member of class j and c is the number of classes (Thomas et al., 1987; Foody et al., 1992).
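The rule referred to as Equation 1 (prior times class-conditional density, normalised over the c classes) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the studies cited: the single waveband, the Gaussian class-conditional densities, the function names, and the numerical priors, means and standard deviations are all assumptions made for the example.

```python
import math

def gaussian_pdf(z, mean, sd):
    """Univariate Gaussian density, assumed here as the class-conditional model."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def posterior(z, priors, means, sds):
    """Posterior probability of each class for pixel value z (Bayes' theorem)."""
    joint = [P * gaussian_pdf(z, m, s) for P, m, s in zip(priors, means, sds)]
    total = sum(joint)  # denominator: sum over all c classes
    return [v / total for v in joint]

# Hypothetical single-band example with three classes.
priors = [0.2, 0.4, 0.4]
means = [30.0, 60.0, 90.0]
sds = [10.0, 10.0, 10.0]

post = posterior(55.0, priors, means, sds)
# A hard classifier keeps only the class with the largest posterior.
hard_class = max(range(len(post)), key=post.__getitem__)
print([round(p, 3) for p in post], hard_class)
```

The hard allocation keeps only the index of the largest posterior and discards the rest of the vector, which is the information loss that motivates the fuzzy approach discussed next.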

A problem in remote sensing, and particularly with imagery of moderate and coarse spatial resolution, is that the land cover may vary more frequently spatially than the sampling interval between pixels in the imagery. This means that many of the pixels may represent a mixture of land cover classes (referred to as mixed pixels). Where the imagery comprises many mixed pixels, it is not sensible to assign a single pixel to a single class. For example, if a pixel represents 20% heathland, 40% woodland, and 40% cereals, it would be counter-intuitive to assign the pixel to, say, woodland. In these circumstances, a traditional hard spectral classifier (for example, maximum likelihood given by Equation 1) is inappropriate and should be replaced by a fuzzy classifier where the objective is to assign one pixel to many classes in varying proportions that sum to unity.

Fuzzy classification is now widely accepted in remote sensing as a solution to the problem of mixed pixels. There are many techniques available for estimating sub-pixel class proportions in this way, including artificial neural networks (ANN) (Benediktsson et al., 1990; Aleksander and Morton, 1991; Bischof et al., 1992; Kanellopoulos et al., 1992; Benediktsson et al., 1993; Civco, 1993; Paola and Schowengerdt, 1995), mixture modelling (Adams et al., 1985; Gillespie, 1992; Settle and Drake, 1993; Foody and Cox, 1994) and fuzzy c-means classification (Bezdek et al., 1984; Foody and Cox, 1994). Atkinson et al. (1997) describe these three techniques together and so the description will not be repeated here.

A large amount of effort is currently being focused on mapping land cover over the entire globe using National Oceanic and Atmospheric Administration (NOAA) Advanced Very High Resolution Radiometer (AVHRR) imagery (for example, Ehrlich et al., 1994; Shimabukuro et al., 1997). NOAA AVHRR imagery is well suited to global-scale remote sensing because of the large area coverage provided synoptically (Townshend et al., 1991). Several approaches have been taken, but the most favoured involves the combination of multi-temporal NOAA AVHRR imagery to estimate hard land cover classes. A problem with NOAA AVHRR imagery is that the large pixels (each of 1.1 km by 1.1 km, Local Area Coverage) are likely to represent more than one land cover, that is, they are likely to be mixed; therefore, for NOAA AVHRR imagery, fuzzy classification is often appropriate.

In a previous study, the three techniques for fuzzy classification mentioned above were evaluated and compared for the classification of land cover in the New Forest area of Hampshire, England (Atkinson and Embashi, 1996; Atkinson et al., 1997). NOAA AVHRR images for single dates were used (with the aim of land cover monitoring in mind) although the approach could be extended readily to multi-date imagery. A ‘local’ study was undertaken in which the four classes to be estimated (Woodland, Heathland, Agricultural land, and Other) were chosen in relation to the dominant land covers in the New Forest.

The accuracies of the three techniques were evaluated by comparing the estimated fuzzy proportions to known land cover proportions. The ‘known’ proportions were themselves estimated from co-registered and classified fine spatial resolution (20 m by 20 m) Système Pour l’Observation de la Terre (SPOT) High Resolution Visible (HRV) imagery (Atkinson et al., 1997). Bivariate distributions between the known and estimated proportions were plotted per class (Figure 1). Further, correlation coefficients, mean errors and root mean square (RMS) errors were estimated per class. The study revealed that the ANN was consistently more accurate than mixture modelling or fuzzy c-means classification (Table 1).

Figure 1. Bivariate distributions for four land cover classes in the New Forest.

Table 1a. Mean errors for estimating the sub-pixel proportions of Woodland, Heathland, Agricultural land, and Other.

[Table values not recovered from the source. Surviving labels: Agricultural land; Mix. Mod. (mixture modelling).]

Table 1b. RMS errors for estimating the sub-pixel proportions of Woodland, Heathland, Agricultural land, and Other.

[Table values not recovered from the source. Surviving labels: Agricultural land; Mix. Mod. (mixture modelling).]

Accurately estimating four classes of local interest is no guarantee of ability to contribute to a global land cover classification; therefore, a ‘global’ study was undertaken involving a larger part of the New Forest and, more importantly, eight land cover classes selected from the DISCover global land cover classification system (Table 2) (Justice et al., 1995). The DISCover classification system was developed by the IGBP-DIS Land Cover Working Group and involves seventeen distinct land cover classes intended to span the entire range of global land covers.

Table 2. DISCover classification system.

Land Cover Class

Evergreen Needleleaf Forest
Evergreen Broadleaf Forest
Deciduous Needleleaf Forest
Deciduous Broadleaf Forest
Mixed Forest
Closed Shrubland
Open Shrubland
Woody Savanna
Savanna
Grassland
Permanent Wetlands
Cropland
Cropland/Natural Vegetation Mosaic
Urban and Built-up
Snow and Ice
Barren or Sparsely Vegetated
Water

This time, only the ANN was used to estimate the eight classes: the fuzzy c-means classifier was already inaccurate for four classes, and the mixture model could not be used to estimate more classes than there were wavebands. The results (Figure 2, Table 3) showed that an ANN (and a single NOAA AVHRR image) could be used to estimate the proportional land cover for eight classes, thereby providing data with which to populate a fuzzy global land cover map.

Figure 2. Bivariate distributions for eight land-cover classes in the New Forest.

Table 3a. Mean errors for estimating the sub-pixel proportions of Evergreen needleleaf forest, Deciduous broadleaf forest, Open shrubland, Permanent wetlands, Croplands, Urban and built-up, Barren or sparsely vegetated and Water.

[Table values not recovered from the source. Surviving labels: leaf forest; Decid. broadleaf forest; Open shrubland; Urban and built up; Barren or sparsely vegetated.]

Table 3b. RMS errors for estimating the sub-pixel proportions of Evergreen needleleaf forest, Deciduous broadleaf forest, Open shrubland, Permanent wetlands, Croplands, Urban and built-up, Barren or sparsely vegetated and Water.

[Table values not recovered from the source. Surviving labels: leaf forest; Decid. broadleaf forest; Open shrubland; Urban and built up; Barren or sparsely vegetated.]

The studies highlighted several inadequacies of the statistics used to represent fuzzy classification accuracy. The most obvious of these was that it was not possible to compare directly the RMS errors for four classes to those for eight classes because the a priori probabilities for each case were different (E(p| c=4)=0.25 and E(p| c=8)=0.125, where c is the number of classes). It turns out that no single statistic or function in current use adequately describes the accuracy of fuzzy classification quantitatively.

In the next section, a method based on a local moving window or kernel is proposed that may facilitate the direct comparison of remotely sensed fuzzy classifications, including the four and eight class fuzzy classifications of the New Forest study. In sections 3 and 4, the method is evaluated using simulated imagery. First, the various problems are presented and the method is described.  

2. Problems with fuzzy accuracy assessment and proposed solutions

Each standard statistic for assessing the accuracy of a fuzzy classification is inadequate in one way or another. In this section, these shortcomings are explored and some potential solutions to the problems are proposed.

2.1 RMS error and correlation coefficient

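In this section, several statistics are described that are used commonly to measure the level of agreement between a set of known proportions y and a set of estimated proportions z. Perhaps the two simplest measures of agreement between y and z are the mean error per class (taking the error as estimated minus known proportion):

$$ME_j = \frac{1}{n} \sum_{i=1}^{n} (z_{ij} - y_{ij}) \qquad (2)$$

where j is the class and n is the total number of pixels, and the root mean square (RMS) error per class:

$$RMSE_j = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (z_{ij} - y_{ij})^2} \qquad (3)$$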

This leads to overall mean and RMS errors defined as:




The mean error informs about bias and the RMS error informs about accuracy (bias and precision). An obvious problem with the RMS error is that it confounds two separate information dimensions into a single value. A second problem with the RMS error is that it is not standardized by any measure of variance; thus, if the variance in the variable to be predicted is large, the RMS error is likely to be large also, and vice versa, making comparisons between RMS errors problematic, even for different classes of the same data set. Correspondingly, the RMS error can be small when there is no correlation between the target and estimated proportions. The solution is to standardize.
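The way the RMS error confounds bias and precision can be made explicit: the squared RMS error equals the squared mean error plus the variance of the errors about their mean. The sketch below checks this identity numerically. The sign convention (estimated minus known), the Beta(2, 2) sample of known proportions, the noise standard deviation of 0.1 and the division by three used to induce bias are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random

def mean_error(y, z):
    """Mean of (estimated - known); the sign convention is an assumption."""
    return sum(zi - yi for yi, zi in zip(y, z)) / len(y)

def rms_error(y, z):
    """Root mean square of (estimated - known)."""
    return math.sqrt(sum((zi - yi) ** 2 for yi, zi in zip(y, z)) / len(y))

def error_variance(y, z):
    """Population variance of the errors about their mean (the precision part)."""
    me = mean_error(y, z)
    return sum((zi - yi - me) ** 2 for yi, zi in zip(y, z)) / len(y)

rng = random.Random(0)
y = [rng.betavariate(2.0, 2.0) for _ in range(400)]   # known proportions
z_noisy = [yi + rng.gauss(0.0, 0.1) for yi in y]      # unbiased, imprecise
z_biased = [yi / 3.0 for yi in y]                     # biased estimates

for label, z in (("noisy", z_noisy), ("biased", z_biased)):
    me, rmse, var_e = mean_error(y, z), rms_error(y, z), error_variance(y, z)
    # The last two printed values agree: RMSE^2 = ME^2 + Var(error).
    print(f"{label}: ME={me:.3f} RMSE={rmse:.3f} "
          f"RMSE^2={rmse ** 2:.4f} ME^2+Var={me ** 2 + var_e:.4f}")
```

Reporting the mean error alongside the RMS error therefore allows the two components to be separated after the fact.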

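The correlation coefficient is an alternative measure of the amount of association between a target and estimated set of proportions:

$$r_j = \frac{c_{y_j z_j}}{s_{y_j} \, s_{z_j}} \qquad (6)$$

where cyj.zj is the covariance between y and z for class j defined as:

$$c_{y_j z_j} = \frac{1}{n-1} \sum_{i=1}^{n} (y_{ij} - \bar{y}_j)(z_{ij} - \bar{z}_j) \qquad (7)$$

and syj and szj are the standard deviations of y and z for class j, with class means $\bar{y}_j$ and $\bar{z}_j$, defined as:

$$s_{y_j} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (y_{ij} - \bar{y}_j)^2} \qquad (8)$$

$$s_{z_j} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (z_{ij} - \bar{z}_j)^2} \qquad (9)$$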

A problem with r is that it can be large (close to 1), indicating close association, where the RMS error is large, that is, where the error of prediction is large. More specifically, r can be large where the estimates are biased, with zj being some fraction of yj; thus, r informs about precision, and it is standardized by the two variances, but it is not sensitive to bias. Further, if the variances in the two distributions are small, the estimates can be accurate with r=0.
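The insensitivity of r to bias is easy to demonstrate. In the sketch below, every estimate is exactly one third of its target, so the association is perfectly linear while the prediction error is large. The Beta(2, 2) sample and the function names are assumptions made for the example; the (n - 1) denominators follow the definitions of the covariance and standard deviations given in the text.

```python
import math
import random

def pearson_r(y, z):
    """Correlation coefficient with (n - 1) denominators throughout."""
    n = len(y)
    y_bar, z_bar = sum(y) / n, sum(z) / n
    cov = sum((a - y_bar) * (b - z_bar) for a, b in zip(y, z)) / (n - 1)
    s_y = math.sqrt(sum((a - y_bar) ** 2 for a in y) / (n - 1))
    s_z = math.sqrt(sum((b - z_bar) ** 2 for b in z) / (n - 1))
    return cov / (s_y * s_z)

def rms_error(y, z):
    """Root mean square of (estimated - known)."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(y, z)) / len(y))

rng = random.Random(1)
y = [rng.betavariate(2.0, 2.0) for _ in range(400)]  # known proportions
z = [a / 3.0 for a in y]                             # each estimate is a third of its target

r = pearson_r(y, z)      # perfect linear association
rmse = rms_error(y, z)   # yet the estimates are far from the targets
print(f"r={r:.3f} RMSE={rmse:.3f}")
```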

One further issue in using r is that if confidence intervals are a concern, a probability density function (p.d.f.) is required, and the Gaussian model is by far the most commonly chosen. For data representing classes of land cover, with proportions constrained to the interval (0,1), the Gaussian p.d.f. is hardly appropriate. As described below, the beta distribution provides a good model for the known range of cases.

2.2 Binomial and Beta distributions

Where remote sensing has been used to estimate a hard land cover class j for a given pixel i, the resulting value uij can be considered as a draw from a Bernoulli probability function (p.f.). That is, U~Bernoulli(p):

$$f(u) = p^{u} (1-p)^{1-u}, \quad u \in \{0, 1\} \qquad (10)$$

where p now represents probability. An image of hard classified pixels uij (i=1,…,n; j=1,…,c) could be modelled with the binomial distribution. That is, U~Bin(n, p):

$$f(u) = \binom{n}{u} p^{u} (1-p)^{n-u}, \quad u = 0, 1, \ldots, n \qquad (11)$$

where p represents probability, u is the number of draws allocated to the particular class and n is the number of trials (or pixels). This model might be useful where the a priori probability of selecting a given class is known. The binomial distribution is a discrete distribution and is not a natural choice for modelling land cover proportions that vary continuously on the interval (0,1). A suitable alternative is the beta distribution (Collins and Woodcock, 1999).

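Where remote sensing has been used to estimate the proportional cover of a given land cover class j for a given pixel i, the resulting value uij can be considered as a draw from a beta probability function (p.f.). That is, U~Beta(α, β):

$$f(u) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \, u^{\alpha - 1} (1 - u)^{\beta - 1}, \quad 0 < u < 1 \qquad (12)$$

where α and β are shape parameters, and Γ represents the gamma function:

$$\Gamma(\alpha) = \int_{0}^{\infty} t^{\alpha - 1} e^{-t} \, dt \qquad (13)$$

The two shape parameters α and β are related to the mean μ and variance σ² of the distribution by:

$$\mu = \frac{\alpha}{\alpha + \beta} \qquad (14)$$

$$\sigma^2 = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)} \qquad (15)$$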

Examples of the beta distribution for various parameter values are given in Figure 3a-f.

Figure 3. Beta distributions for various parameter values.
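Since the simulations later in the paper are specified by a mean and a variance rather than by shape parameters, the standard moment relations for the beta distribution need to be inverted. Writing ν = μ(1 - μ)/σ² - 1, the shape parameters are α = μν and β = (1 - μ)ν, which requires σ² < μ(1 - μ). The sketch below applies this inversion; the function name is hypothetical, and the seed and sample size are arbitrary choices.

```python
import random

def beta_shape_from_moments(mean, var):
    """Return (alpha, beta) for a beta distribution with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("variance must be positive and below mean * (1 - mean)")
    nu = mean * (1.0 - mean) / var - 1.0  # nu equals alpha + beta
    return mean * nu, (1.0 - mean) * nu

# Moderate variance: both shape parameters exceed 1 (single central mode).
a_fuzzy, b_fuzzy = beta_shape_from_moments(0.5, 0.05)
# Variance close to the upper limit of 0.25: both parameters fall below 1 (U-shaped).
a_hard, b_hard = beta_shape_from_moments(0.5, 0.225)

# Check the inversion by sampling with the standard-library beta generator.
rng = random.Random(3)
sample = [rng.betavariate(a_fuzzy, b_fuzzy) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
sample_var = sum((u - sample_mean) ** 2 for u in sample) / (len(sample) - 1)
print(a_fuzzy, b_fuzzy, a_hard, b_hard, round(sample_mean, 3), round(sample_var, 3))
```

Shape parameters below 1 place most of the probability mass near 0 and 1, which is the bimodal behaviour discussed in the text.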

Collins and Woodcock (1999) used the beta distribution to represent the fraction of snow cover within pixels and demonstrated its ability to represent the relation between spatial resolution and variance. The beta distribution with large variance is bimodal, with two peaks close to the extremes of 0 and 1. Indeed, at the extreme case of point data, the distribution is Bernoulli; therefore, the beta distribution is appropriate for bimodal distributions. Further, as the spatial resolution coarsens, one expects the variance to decrease and the distribution to become increasingly unimodal. At the extreme case of a single pixel representing the entire image, the distribution f(u) is a degenerate probability mass function:

$$f(u) = \begin{cases} 1 & \text{if } u = p \\ 0 & \text{otherwise} \end{cases} \qquad (16)$$

with a mean of p and a variance of 0.

The beta distribution provides a good model for proportional land cover data. It can represent various degrees of fuzziness, from extremely hard distributions (such as are common for water classes) to extremely fuzzy distributions (as are common for land covers with a high spatial frequency); however, the beta distribution does not in itself solve the problems of information dimension and standardization raised above. Rather, in this paper, the beta distribution is used to generate realistic distributions with which to test some common statistics.

2.3 Standardized statistics

The correlation coefficient is a standardized measure of the association between the variables y and z. In that sense, it provides useful information relating to accuracy; however, the r statistic is not sensitive to bias as explained above.

One possible alternative to r is to standardize the RMS error by the RMS error obtained when estimating all proportions by the mean of the target distribution (that is, dividing by s2yj, the variance of the target distribution). The RMS error standardized in this way leads to smaller-than-average standardized RMS errors (SRMSEj) when the target distribution is consistently under-estimated and larger-than-average SRMSEj when it is over-estimated; therefore, the standardized RMS error, SRMSEj for a given class j is given by:


or equivalently,


Note that Equations 17 and 18 are not strictly equivalent since the denominator in Equations 8 and 9 is (n-1), not n; however, for remotely sensed images, the difference between n and (n-1) is likely to be small. The standardized RMS error SRMSEj is sensitive to bias in exactly the way that r is not (as implied above, r is a measure of association, not accuracy). Another possibility for standardization is to divide by the variance of the beta distribution representing the target variable. 
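One reading of the standardization described in this section can be sketched as follows. It assumes that the divisor is the RMS error obtained when every pixel is assigned the mean of the target distribution, so that the factors of n cancel and a value of 1 marks the benchmark of estimating by the target mean. If the divisor were instead the variance, the numbers would differ; the function name and the toy data are hypothetical.

```python
import math

def srmse(y, z):
    """RMS error of z divided by the RMS error of predicting mean(y) everywhere."""
    y_bar = sum(y) / len(y)
    sst = sum((a - y_bar) ** 2 for a in y)  # squared error of the mean predictor
    if sst == 0.0:
        raise ValueError("target proportions have zero variance")
    sse = sum((b - a) ** 2 for a, b in zip(y, z))
    return math.sqrt(sse / sst)

y = [0.1, 0.3, 0.5, 0.7, 0.9]              # toy target proportions
benchmark = srmse(y, [sum(y) / len(y)] * len(y))  # estimate every pixel by the target mean
perfect = srmse(y, y)                      # exact estimates
biased = srmse(y, [v / 3.0 for v in y])    # estimates one third of the target
print(round(benchmark, 3), perfect, round(biased, 3))
```

Under this reading, values below 1 indicate estimates that improve on the target mean, and biased estimates are penalized with values above 1 even though their correlation with the target is perfect.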

3. Method

It is quite common for one or two classes in a remotely sensed image to be rather ‘hard’, with some proportions being close to 1 and many being close to 0, with very few in-between. A good example is provided by the water classes where pixels tend to be either allocated completely to water (e.g., over an ocean) or not at all to water (e.g., over land), with mixed pixels arising at the water-land interface only. The scatterplot between the known land cover proportions yj and the estimated proportions zj can look like that in Figure 2h, with two concentrations of points close to the extremes of (0,0) and (1,1). As discussed above, the beta distribution provides a good model for proportions and such bimodal distributions can arise from a beta p.f.; however, standardizing by the variance of the continuous proportions, or the variance of the beta distribution, tends to reduce the SRMSEj to very small values even when there is no correlation whatsoever within the two clusters close to (0,0) and (1,1). Similarly, the r value for such a distribution tends to be very large (close to 1). We appear to have come to an impasse.

A possible solution to the above problem is to view the image as comprising a non-stationary beta p.f. Locally, for example, over oceans, the probability pj of pixels containing water is very large. Over land, the probability pj of pixels containing water is locally very small. Thus, the distribution of proportions yj (or zj) can be seen as being drawn from a set of beta distributions that are valid locally. The solution is to compute a standardized measure of accuracy within a kernel S of size u by v pixels.
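The kernel idea can be sketched as a moving window that computes a standardized RMS error at each position and then averages the results. Several details below are assumptions rather than the method used in the paper: only windows lying wholly inside the image are evaluated, the standardization divides by the RMS deviation of the known proportions within the window, the noise standard deviation is 0.1, and the test image is a hypothetical hard class built from two beta distributions.

```python
import math
import random

def srmse(y, z):
    """RMS error divided by the RMS deviation of y about its mean; None if y is constant."""
    y_bar = sum(y) / len(y)
    sst = sum((a - y_bar) ** 2 for a in y)
    if sst == 0.0:
        return None
    sse = sum((b - a) ** 2 for a, b in zip(y, z))
    return math.sqrt(sse / sst)

def flatten(img):
    return [val for row in img for val in row]

def kernel_srmse(y_img, z_img, u, v):
    """SRMSE for every u-by-v window lying wholly inside the image."""
    rows, cols = len(y_img), len(y_img[0])
    values = []
    for r0 in range(rows - u + 1):
        for c0 in range(cols - v + 1):
            y = [y_img[r][c] for r in range(r0, r0 + u) for c in range(c0, c0 + v)]
            z = [z_img[r][c] for r in range(r0, r0 + u) for c in range(c0, c0 + v)]
            s = srmse(y, z)
            if s is not None:
                values.append(s)
    return values

rng = random.Random(7)
# Hypothetical 20 by 20 hard class: upper half near 0.9, lower half near 0.1.
# Beta(3.15, 0.35) has mean 0.9 and variance 0.02; Beta(0.35, 3.15) has mean 0.1.
y_img = [[rng.betavariate(3.15, 0.35) if r < 10 else rng.betavariate(0.35, 3.15)
          for _ in range(20)] for r in range(20)]
z_img = [[val + rng.gauss(0.0, 0.1) for val in row] for row in y_img]

global_srmse = srmse(flatten(y_img), flatten(z_img))
local = kernel_srmse(y_img, z_img, 7, 7)
mean_local_srmse = sum(local) / len(local)
print(round(global_srmse, 2), round(mean_local_srmse, 2))
```

Because the large between-cluster variance no longer enters the divisor for windows that sit inside one half of the image, the averaged local value is larger than the global one, which is the penalty on bimodality that the kernel is intended to provide.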

The problem then is to define a suitable kernel size (especially given that S is likely to be fixed for all j), and this will be a matter of trial and error, although consideration should be given to the form of the variogram. Since the choice of u and v is to some extent arbitrary, an alternative is to use an inverse distance weighting with a large kernel. The inverse distance weighting δij is given by:


where i represents the pixel to be assessed and j represents a neighbour. This should facilitate the comparison of results obtained by different investigators using different data sets.
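The exact form of the inverse distance weighting is not recoverable from this copy of the paper, so the following is only a sketch of how such a weighting might be applied. It assumes weights proportional to 1 / (1 + d)^k, where d is the Euclidean distance in pixels from the pixel being assessed and k is a hypothetical power parameter; the added 1 is an assumption that avoids division by zero at the centre. The function names are also hypothetical.

```python
import math

def idw_kernel(half_width, power=1.0):
    """Normalised inverse-distance weights for a square kernel.

    Assumed form: weight proportional to 1 / (1 + d) ** power, with d the
    Euclidean distance (in pixels) from the centre pixel.
    """
    size = 2 * half_width + 1
    raw = [[1.0 / (1.0 + math.hypot(r - half_width, c - half_width)) ** power
            for c in range(size)] for r in range(size)]
    total = sum(sum(row) for row in raw)
    return [[w / total for w in row] for row in raw]

def weighted_rms_error(y_win, z_win, weights):
    """Weighted RMS error over one kernel position (all arguments the same shape)."""
    acc = 0.0
    for y_row, z_row, w_row in zip(y_win, z_win, weights):
        for y, z, w in zip(y_row, z_row, w_row):
            acc += w * (z - y) ** 2
    return math.sqrt(acc)

weights = idw_kernel(3)  # 7 by 7 kernel
# Toy window: every estimate exceeds its target by 0.1.
y_win = [[0.5] * 7 for _ in range(7)]
z_win = [[0.6] * 7 for _ in range(7)]
local_rmse = weighted_rms_error(y_win, z_win, weights)
print(round(weights[3][3], 4), round(weights[0][0], 4), round(local_rmse, 3))
```

With weights that decay smoothly, a single large kernel can replace the choice of u and v, at the cost of choosing the decay parameter instead.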

4. Analysis

In section 2.2 above, the beta distribution was proposed as the most appropriate p.f. for the proportions estimated in a fuzzy land cover map obtained by remote sensing. In this section, the beta distribution is used to simulate a variety of bivariate distributions of land cover proportions and their estimates. Various statistics are then computed for each simulated data set and the results compared.

4.1 Simulation

A single beta p.f. was used to simulate an image of 20 by 20 proportions zij with μ = 0.5 and σ² = 0.05. Then, Gaussian random noise with μ = 0 and σ² = 0.1 was added to the proportions to provide an image of estimates. The bivariate distribution between these two variables and their corresponding images are shown in Figure 4a-c. This represents a common case in remote sensing where the proportions are relatively ‘fuzzy’. Second, a single beta p.f. was used to simulate a second image of proportions, this time with μ = 0.5 and σ² = 0.225. Again, Gaussian random noise was added to provide the estimates. The bivariate distribution and the two images are shown in Figure 4d-f; thus, the beta p.f. can be used to simulate closely the cases observed in practice, including, importantly, the bimodal case.
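The two stationary simulations described above can be sketched with the standard-library beta generator. The noise variance of 0.1 is taken literally from the text (an RMS error near 0.1, as reported in Section 5, would instead correspond to a noise standard deviation of 0.1), the estimates are not clipped to the interval (0,1), and the seed and function names are assumptions of this sketch.

```python
import math
import random

def beta_shape_from_moments(mean, var):
    """Shape parameters (alpha, beta) matching a given mean and variance."""
    nu = mean * (1.0 - mean) / var - 1.0
    if nu <= 0.0:
        raise ValueError("variance must be below mean * (1 - mean)")
    return mean * nu, (1.0 - mean) * nu

def simulate_proportions(rows, cols, mean, var, rng):
    """Image of independent beta-distributed proportions (no spatial correlation)."""
    a, b = beta_shape_from_moments(mean, var)
    return [[rng.betavariate(a, b) for _ in range(cols)] for _ in range(rows)]

def add_gaussian_noise(img, var, rng):
    """Add zero-mean Gaussian noise of the given variance; values are not clipped."""
    sd = math.sqrt(var)
    return [[val + rng.gauss(0.0, sd) for val in row] for row in img]

def extreme_fraction(img, tol=0.1):
    """Share of pixels within tol of 0 or 1, a simple indicator of hardness."""
    flat = [val for row in img for val in row]
    return sum(1 for val in flat if val < tol or val > 1.0 - tol) / len(flat)

rng = random.Random(42)
fuzzy_true = simulate_proportions(20, 20, 0.5, 0.05, rng)   # relatively fuzzy case
fuzzy_est = add_gaussian_noise(fuzzy_true, 0.1, rng)
hard_true = simulate_proportions(20, 20, 0.5, 0.225, rng)   # bimodal case
hard_est = add_gaussian_noise(hard_true, 0.1, rng)
print(extreme_fraction(fuzzy_true), extreme_fraction(hard_true))
```

The share of pixels near 0 or 1 is small for the first image and large for the second, reflecting the unimodal and U-shaped forms of the two beta distributions.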

Figure 4. Four simulated bivariate distributions and the associated images of observed and estimated proportions. The method used to draw realizations is explained in the text.

A problem with the image of proportions represented in Figure 4e is that the spatial distribution is extremely unlikely in practice because of the lack of spatial correlation. To illustrate the ability of the beta p.f. to generate reasonable looking images of proportions when the data are bimodal, two sets of simulations were combined, one for the upper half of the image with μ = 0.9 and σ² = 0.02 and one for the lower half of the image with μ = 0.1 and σ² = 0.02. The bivariate distribution is shown in Figure 4g. The resulting spatial distribution (Figure 4h) is similar to that which would be expected for a typically hard class, for example, a water class; thus, a non-stationary beta p.f. can be used to generate realistic images of land cover proportions.

4.2 Assessing accuracy

In this section, the various statistics available for accuracy assessment are applied to a range of simulated data sets and their utility is evaluated.

Four univariate distributions were simulated using beta distributions with means and variances as shown in Table 4. To each distribution, a Gaussian noise term with μ = 0 and σ² = 0.1 was added as before to produce the estimates. The first distribution (Figure 5a) was a repeat of that in Figure 4g. The second and third distributions were obtained using the same beta model parameters as the first, but this time the estimated proportions (Figure 5b) and the actual proportions (Figure 5c) were each divided by three to introduce bias. The fourth distribution (Figure 5d) was obtained with μ = 0.1 and σ² = 0.01 to simulate the common case where all pixels contain only a small proportion of the class in question. Examples of such land covers with high spatial frequency are linear features such as rivers and rare occurrences such as salt-marsh.

Table 4. Parameters of simulated beta distributions

[Table values not recovered from the source. Surviving column headings: μ (beta); σ² (beta).]

Figure 5. Simulated bivariate distribution, associated observed image, and kernel-based statistics displayed as images.

One further distribution was obtained using a three-part beta simulation with the first nine rows of the image simulated with μ = 0.95 and σ² = 0.01, the tenth row simulated with μ = 0.5 and σ² = 0.02, and the final ten rows simulated with μ = 0.05 and σ² = 0.02 (Figure 6a). These choices were made to simulate a boundary of mixed pixels as would be common at the water-land interface discussed above.
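A non-stationary image of this kind can be sketched by drawing each band of rows from its own beta distribution. The band specification below follows the three-part description above; the image width of 20 pixels matches the earlier simulations, while the function names, the seed and the row-mean summary are assumptions added for illustration.

```python
import random

def beta_shape_from_moments(mean, var):
    """Shape parameters (alpha, beta) matching a given mean and variance."""
    nu = mean * (1.0 - mean) / var - 1.0
    if nu <= 0.0:
        raise ValueError("variance must be below mean * (1 - mean)")
    return mean * nu, (1.0 - mean) * nu

def simulate_banded_image(bands, cols, rng):
    """Stack horizontal bands, each given as (number of rows, mean, variance)."""
    img = []
    for n_rows, mean, var in bands:
        a, b = beta_shape_from_moments(mean, var)
        for _ in range(n_rows):
            img.append([rng.betavariate(a, b) for _ in range(cols)])
    return img

# Nine 'water' rows, one row of mixed boundary pixels, ten 'land' rows.
bands = [(9, 0.95, 0.01), (1, 0.5, 0.02), (10, 0.05, 0.02)]
rng = random.Random(11)
img = simulate_banded_image(bands, 20, rng)

row_means = [sum(row) / len(row) for row in img]
top_mean = sum(row_means[:9]) / 9
boundary_mean = row_means[9]
bottom_mean = sum(row_means[10:]) / 10
print(round(top_mean, 2), round(boundary_mean, 2), round(bottom_mean, 2))
```

The same function reproduces the two-half simulation of Section 4.1 when called with two bands of ten rows each.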

The r statistic, the mean error, the RMS error and the standardized RMS error were computed for each simulated data set. The results are given in Table 5. In addition, a kernel of 7 by 7 pixels was applied to the images and the statistics calculated again within the moving window. This led to images of statistics, some of which are shown for the fifth (water-land) simulated data set in Figure 6b-f. These values were averaged to estimate kernel-based versions of the standard ones and these are given in Table 5.

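Table 5. Statistics describing accuracy in the five simulated bivariate distributions

[Table values not recovered from the source. Surviving column headings: Figure 5a; Figure 5b; Figure 5c; Figure 5d; Figure 6a. Surviving row labels: Mean error; RMS error; SRMS error; S r; S Mean error; S RMS error; S SRMS error.]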

Figure 6. Simulated bivariate distributions representing four common scenarios for remotely sensed estimates of land cover proportions. The method used to draw the realizations is explained in the text.

5. Results

The r statistic for each example is large (≈ 0.9) with the exception of that for the fourth example, which is near zero. This illustrates that (i) r does not represent bias (see Figures 5b and c) and (ii) r penalizes a bivariate distribution with two small variances and zero association (Figure 5d). The latter point represents an important shortcoming of the r statistic since such distributions are in many cases desirable.

The mean error is as expected: zero except for the two distributions with induced bias. The mean error provides simple and useful information about bias. It is dependent on the variance in the two variables: the smaller the variances, the smaller the mean error tends to be.

The RMS error is approximately 0.1 except for the two distributions with bias for which the RMS error is approximately 0.37. This illustrates that (1) the RMS error does not distinguish between a bivariate distribution with large r (see Figure 5a) and one with zero r (see Figure 5d) and (2) the RMS error is a measure of accuracy, affected by both precision and bias.

The standardized RMS error varies greatly between distributions. Take the expected value of one as a benchmark. The first distribution with SRMSEj of 0.88 indicates slightly better estimation than can be achieved with the mean of the target distribution. The biased estimates (Figure 5b and c) have much larger values and this is desirable. The fourth distribution (Figure 5d) also has a large SRMSEj value. The desirability of this is debatable; however, it does reflect the lack of association between the two variables, which the RMS error does not. Finally, the SRMSEj for the bimodal distribution is less than half that for the benchmark. This highlights the problem of standardizing the RMS error (or any other statistic) with the variance of a bimodal distribution. The value is deflated and does not represent the level of association (r of near zero) within the bimodal clusters near to (0,0) and (1,1).

The results for the kernel-based approach are different in some important ways. First, the mean r is similar to that before, except that now the r value for the fourth distribution (Figure 5d) is larger, and that for the bimodal distribution is smaller. The decrease in the r value for the bimodal distribution is exactly the result desired: the correlation is less within the clusters than for the distribution as a whole, and this is reflected by a smaller r for the kernel-based approach.

The average mean and RMS errors for the image are similar to the standard mean and RMS errors obtained without the kernel, as expected. The average SRMSEj can be evaluated as before, using the value of 1 as a benchmark. The first distribution has a mean SRMSEj of 0.42 indicating accurate prediction. As before, the two biased distributions have large values, as desired. The value for the fourth distribution with small variances (Figure 5d) is somewhere between these two; however, the value for the bimodal distribution is now larger than that for the first distribution, and this is exactly as desired: the bimodal distribution should be penalized because the level of association within each cluster is close to zero.

6. Discussion

In the method presented in this paper the beta distribution is assumed to vary locally across the image. The assumption implicit in this model is that the parameters of the p.f. vary smoothly from place to place so that within a moving window the u by v pixels can reasonably be considered as drawn from the same distribution; however, in many cases there exist abrupt boundaries where the probability is likely to change equally abruptly. For example, for water classes, there tends to be an abrupt water-land interface at which pixels tend to be mixed. Either side of the boundary, the pixels tend to be hard, that is, either all water or all land. In these circumstances, the local kernel approach will not resolve the bimodal problem when the kernel straddles the boundary. One option would be to use an adaptive kernel size, which reduces when variance is large. As an alternative, one might consider methods that operate on the distribution of values in feature space (as opposed to geographical space). For example, it may be possible to segment the bivariate distribution into clusters and perform an accuracy assessment on each separately, or simply to smooth the distribution locally as with generalized additive models.

The requirement for a non-stationary beta distribution could be investigated by examining the covariance function or equivalently the variogram for the image of known or estimated proportions, yj (or zj). The experimental semivariance is defined as half the average squared difference between values separated by a given lag h, where h is a vector in both distance and direction; thus, the experimental variogram γ(h) can be obtained from α = 1, 2, …, P(h) pairs of observations {z(xα), z(xα + h)} at locations {xα, xα + h} separated by a fixed lag h:

$$\gamma(h) = \frac{1}{2P(h)} \sum_{\alpha=1}^{P(h)} \left[ z(x_\alpha) - z(x_\alpha + h) \right]^2 \qquad (20)$$

Where the semivariance increases with lag, there is spatial correlation in the proportions. Where the semivariance continues to increase to a lag that is a large proportion of the one-dimensional size of the image (say one third), a non-stationary approach may be required. If the semivariance is large at all lags, there is no need for a kernel S, although to facilitate comparison, the kernel should be applied universally.
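The experimental semivariance for a single lag vector can be computed directly from its definition as half the average squared difference between all pairs of pixels separated by that lag. The sketch below does this for a lag given in whole pixels as (row offset, column offset); the function name is hypothetical, and the test image, a deterministic ramp down the rows, is an assumption standing in for a real image of proportions.

```python
def semivariance(img, d_row, d_col):
    """Half the average squared difference between pixels separated by (d_row, d_col)."""
    rows, cols = len(img), len(img[0])
    total, pairs = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                total += (img[r][c] - img[r2][c2]) ** 2
                pairs += 1
    if pairs == 0:
        raise ValueError("lag is larger than the image")
    return total / (2.0 * pairs)

# Hypothetical 20 by 20 image whose values increase steadily from top to bottom.
img = [[r / 19.0 for _ in range(20)] for r in range(20)]

down_rows = [semivariance(img, lag, 0) for lag in range(1, 8)]     # keeps increasing
along_cols = [semivariance(img, 0, lag) for lag in range(1, 8)]    # zero at every lag
print([round(g, 4) for g in down_rows], along_cols)
```

A semivariance that keeps increasing up to lags of about one third of the image dimension, as it does down the rows of this ramp, is the pattern that the text associates with the need for a non-stationary approach.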

7. Conclusion

In summary, initial results suggest that the RMS error, standardized by some measure of the variances in the bivariate distribution, provides an interesting alternative to the correlation coefficient. Further, it would appear that the kernel-based approach presented above does have something to offer, particularly where the distribution of proportions to be estimated is bimodal. The choice of statistic to represent accuracy, or more likely the combination of statistics, will depend ultimately on what the user perceives to be important accuracy criteria.

Further research will be undertaken to evaluate the performance of a broader range of statistics, for a broader set of cases and the approaches will be applied to the real imagery discussed at the beginning of this paper.  


Acknowledgements

This paper builds on collaborative research undertaken with Dr. Mohamed Embashi. The author would like to thank Dr. Embashi for providing the New Forest data.


References

Adams, J.B., M.O. Smith, and P.E. Johnson, 1985. Spectral mixture modelling: a new analysis of rock and soil types at the Viking Lander 1 site. Journal of Geophysical Research, 91, pp. 8,098-8,112.

Aleksander, I. and H. Morton, 1991. An Introduction to Neural Computing (London: Chapman and Hall).

Atkinson, P.M., M.E.J. Cutler, and H. Lewis, 1997. Mapping sub-pixel proportional land cover with AVHRR imagery, International Journal of Remote Sensing, 18, pp. 917-935.

Atkinson, P.M. and M.R. Embashi, 1996. The effect of number of training data on the accuracy of mapping proportional land cover with a neural network, RSS96 Remote Sensing Science and Industry, Remote Sensing Society, Nottingham, pp. 586-591.

Benediktsson, J.A., P.H. Swain, and O.K. Ersoy, 1990. Neural network approaches versus statistical methods in classification of multisource remote sensing data. IEEE Transactions on Geoscience and Remote Sensing, 28, pp. 540-552.

Benediktsson, J.A., P.H. Swain, and O.K. Ersoy, 1993. Conjugate-gradient neural networks in classification of multisource and very high dimensional remote sensing data. International Journal of Remote Sensing, 14, pp. 2,883-2,903.

Bezdek, J.C., R. Ehrlich, and W. Full, 1984. FCM: The fuzzy c-means clustering algorithm. Computers and Geosciences, 10, pp. 191-203.

Bischof, H., W. Schneider, and A.J. Pinz, 1992. Multi-spectral classification of Landsat images using a neural network. IEEE Transactions on Geoscience and Remote Sensing, 30, pp. 482-490.

Civco, D.L., 1993. Artificial neural networks for land cover classification and mapping. International Journal of Geographical Information Systems, 7, pp. 173-186.

Collins, J. and C.E. Woodcock, 1999. Modelling the distribution of cover fraction of a geophysical field. in (Atkinson, P.M. and Tate, N.J., Eds.), Advances in Remote Sensing and GIS Analysis (Wiley: Chichester) [in press].

Ehrlich, D., J.E. Estes, and A. Singh, 1994. Applications of NOAA/AVHRR 1 km data for environmental monitoring. International Journal of Remote Sensing, 15, pp. 145-161.

Foody, G.M. and D.P. Cox, 1994. Sub-pixel land cover composition estimation using a linear mixture model and fuzzy membership functions. International Journal of Remote Sensing, 15, pp. 619-631.

Foody, G.M., N.A. Campbell, N.M. Trodd, and T.F. Wood, 1992. Derivation and applications of probabilistic measures of class membership from the maximum likelihood classification. Photogrammetric Engineering and Remote Sensing, 58, pp. 1,335-1,341.

Gillespie, A.R., 1992. Spectral mixture analysis of multi-spectral thermal infrared images. Remote Sensing of Environment, 42, pp. 137-145.

Justice, C.O., G.B. Bailey, M.E. Maiden, S.I. Rasool, D.E. Strebel, and J.D. Tarpley, 1995. Recent data and information system initiatives for remotely sensed measurements of the land surface. Remote Sensing of Environment, 51, pp. 235-244.

Kanellopoulos, I., A. Varfis, G.G. Wilkinson, and J. Megier, 1992. Land cover discrimination in SPOT HRV imagery using an artificial neural network: a 20 class experiment. International Journal of Remote Sensing, 13, pp. 917-924.

Paola, J.D. and R.D. Schowengerdt, 1995. Review article: A review and analysis of backpropagation neural networks for classification of remotely sensed multispectral imagery. International Journal of Remote Sensing, 16, pp. 3,033-3,058.

Settle, J.J. and N.A. Drake, 1993. Linear mixing and the estimation of ground cover proportions. International Journal of Remote Sensing, 14, pp. 1,159-1,177.

Shimabukuro, Y.E., V.C. Carvalho, and B.F.T. Rudhorff, 1997. NOAA-AVHRR data processing for the mapping of vegetation cover, International Journal of Remote Sensing, 18, pp. 671-677.

Thomas, I., Benning, V. and Ching, N.P., 1987. Classification of Remotely Sensed Images (Adam Hilger: Bristol).

Townshend, J.R.G., C.O. Justice, W. Li, C. Gurney, and J. Mcmanus, 1991. Global land cover classification by remote sensing: capabilities and future possibilities. Remote Sensing of Environment, 35, pp. 243-255.