Automatic Neural Network Configuration for the Classification of Remotely-Sensed Imagery

Mark Gahegan, Gordon German and Geoff West
Department of GIS, School of Computing, Curtin University, PO Box U 1987, Perth 6001, Western Australia

Abstract
Neural networks are now established computational tools used for search minimisation and data classification. They have been successfully applied to geographic and remotely-sensed data in a number of different studies (e.g. Bischof et al., 1992; Kamata & Kawaguchi, 1993). A barrier to their general acceptance and use by all but 'experts' is the difficulty of configuring the network initially. This has led to their rejection by some researchers. However, neural networks offer some highly desirable features for landuse classification problems, since they are able to take in a variety of data types, recorded on different statistical scales, and combine them. Unlike many other types of classifier, the neural network achieves this without reliance on a fixed parametric model of data distribution, adapting instead to the distribution inherent within the training data. As such, neural networks should offer advantages of increased accuracy.

This paper describes the architectural problems of applying neural networks to landuse classification exercises in geography and details some of the latest developments from an ongoing research project aimed at overcoming these problems (German & Gahegan, 1996). A comprehensive strategy for the configuration of neural networks is presented, whereby the network is automatically constructed by a process involving initial analysis of the training data and of the application. By careful study of the functioning of each part of the network, it is possible to select the architecture and the initial weights on the node connections so that the constructed network is 'right first time'. In addition, further adaptations can be made to control the behaviour of hyperplanes during training to optimise the network functioning from the perspective of landuse classification. The entire configuration process is encapsulated by a single application which may be treated by the user as a 'black box', allowing the network to be applied in much the same way as a maximum likelihood classifier, with no further effort being required of the user.
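As an illustration of how initial weights might be derived from the training data rather than chosen at random, the sketch below places a single hyperplane midway between the means of two training classes. This is only one plausible reading of "initial analysis of the training data": the abstract does not specify the procedure used, and the function names (class_mean, bisecting_hyperplane, side) and the sample values are hypothetical.

```python
def class_mean(samples):
    """Return the per-feature mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def bisecting_hyperplane(class_a, class_b):
    """Return (weights, bias) for the hyperplane that perpendicularly bisects
    the line joining the two class means (an assumed initialisation rule)."""
    mean_a = class_mean(class_a)
    mean_b = class_mean(class_b)
    # The normal vector points from class A towards class B.
    weights = [b - a for a, b in zip(mean_a, mean_b)]
    # The hyperplane passes through the midpoint of the two means.
    midpoint = [(a + b) / 2 for a, b in zip(mean_a, mean_b)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def side(weights, bias, x):
    """Signed activation: negative on the class A side, positive on the class B side."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


# Hypothetical two-band training samples for two landuse classes.
class_a = [[0.0, 0.0], [2.0, 0.0]]
class_b = [[5.0, 4.0], [7.0, 4.0]]
weights, bias = bisecting_hyperplane(class_a, class_b)
```

A node initialised in this way already separates the two class means before any training takes place, so training has only to refine the position of the hyperplane.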

Specifically, this paper addresses the issue of restricting the movement of the hyperplanes according to measures of class separability and overlap in the search space. A particular problem with neural networks is that of ensuring generality, insofar as the trained network can be applied to the full dataset with a high (and known) degree of accuracy. Existing approaches tend to concentrate on characterising each individual training class as accurately as possible by encouraging the 'tight' fitting of hyperplanes. If unchecked, this leads to over-training of the network, giving excellent results when classifying the original training data, but much poorer results when generalising to the entire scene (or validation set). It is difficult to know (without a comprehensive testing and validation exercise) when over-training might have occurred. An alternative approach, shown here, is to concentrate instead on the pairwise separation of training classes by modifying the movement of hyperplanes according to measures of within-class variance.
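To make the idea of a pairwise separability measure concrete, the sketch below scores every pair of training classes by the distance between their means divided by the sum of their within-class spreads. The particular ratio is an assumption chosen for illustration, since the abstract does not state which measure is used; the class names and sample values are likewise hypothetical.

```python
from itertools import combinations
from math import sqrt


def mean_vector(samples):
    """Per-feature mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def spread(samples, mean):
    """Within-class spread: root mean squared distance from the class mean."""
    n = len(samples)
    total = sum(sum((x - m) ** 2 for x, m in zip(s, mean)) for s in samples)
    return sqrt(total / n)


def pairwise_separability(classes):
    """Map each class pair to (distance between means) / (sum of spreads).

    Values well above 1 suggest a cleanly separable pair; values below 1
    suggest overlap. The ratio itself is an assumed, illustrative measure.
    """
    stats = {}
    for name, samples in classes.items():
        mean = mean_vector(samples)
        stats[name] = (mean, spread(samples, mean))
    scores = {}
    for a, b in combinations(sorted(classes), 2):
        (mean_a, spread_a), (mean_b, spread_b) = stats[a], stats[b]
        distance = sqrt(sum((p - q) ** 2 for p, q in zip(mean_a, mean_b)))
        total_spread = spread_a + spread_b
        scores[(a, b)] = distance / total_spread if total_spread > 0 else float("inf")
    return scores


# Hypothetical two-band training samples for three classes.
training = {
    "forest": [[0.0, 0.0], [2.0, 0.0]],
    "scrub": [[1.0, 0.0], [3.0, 0.0]],
    "water": [[10.0, 0.0], [12.0, 0.0]],
}
scores = pairwise_separability(training)
```

With these samples the forest and scrub classes score below 1, indicating overlap, whereas either class paired with water scores well above 1. Scores of this kind could then be used to decide how freely the hyperplane between a given pair is allowed to move during training.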

A comprehensive set of results is presented, detailing the application of this approach to a problem of floristic classification on a publicly available dataset. The results show that the architecture is stable in that network convergence is assured, and reliable in that classification accuracies often exceed those of other techniques applied to the same dataset.

References
Bischof, H., Schneider, W. and Pinz, A. J. 1992. "Multispectral classification of Landsat images using neural networks", IEEE Trans. Geoscience and Remote Sensing, 30 (3), 482-490.

German, G. and Gahegan, M. 1996. "Neural network architectures for the classification of temporal image sequences". To appear in: Computers and Geosciences (special edition on Neural Networks).

Kamata, S. and Kawaguchi, E. 1993. "A neural network classifier for multi-temporal Landsat images using spatial and spectral information", Proc. IEEE 1993 International Joint Conference on Neural Networks, 3, 2199-2202.