Human Systems Modelling: Results of the HPC Initiative at Leeds

Ian Turton and Stan Openshaw
Centre for Computational Geography, School of Geography, University of Leeds, Leeds LS2 9JT

The paper will show some examples of the innovative work carried out at the Centre for Computational Geography under the EPSRC's high performance computing initiative (HPCI) using the resources of the Cray T3D supercomputer at Edinburgh. The project has three main aims: to port a selection of serial and vector parallel codes on to the Cray T3D, to use the codes to carry out new science that had previously been considered computationally infeasible, and to educate other geographers as to what is possible in the rapidly developing fields of parallel computing and supercomputing.

The first example that will be demonstrated is the parallelisation of large spatial interaction models. Two large interaction data sets were created from the 1991 Census of Population: the Special Migration Statistics (SMS) and the Special Workplace Statistics (SWS). Both of these data sets are provided at ward level, giving 10,764² (roughly 116 million) possible interactions. To model these data sets using conventional serial spatial interaction models would take several weeks of computer time. However, by making use of parallel computing it is possible to evaluate several thousand models per hour. This allows geographers to take advantage of the new fine resolution of these data sets. The increased power of parallel computing also allows the computation of new variants of models, such as disaggregations of the model by destination or origin zone.
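
As a rough illustration of why this workload parallelises well, the sketch below calibrates a toy origin-constrained spatial interaction model by evaluating one model per candidate parameter value. The model form, the sum-of-squares error measure and all function names are assumptions made for illustration; the abstract does not state which model variants or goodness-of-fit statistics were used on the T3D.

```python
import math

def origin_constrained_flows(origins, attractions, costs, beta):
    """Predict flows T[i][j] = A_i * O_i * W_j * exp(-beta * c_ij).

    A_i is the balancing factor that makes each row sum to O_i.
    This model form is an assumption, not taken from the paper.
    """
    flows = []
    for o_i, cost_row in zip(origins, costs):
        weights = [w_j * math.exp(-beta * c_ij)
                   for w_j, c_ij in zip(attractions, cost_row)]
        total = sum(weights)  # equals 1 / A_i
        flows.append([o_i * w / total for w in weights])
    return flows

def sum_squared_error(observed, predicted):
    """Sum of squared differences over every origin-destination pair."""
    return sum((o - p) ** 2
               for obs_row, pred_row in zip(observed, predicted)
               for o, p in zip(obs_row, pred_row))

def calibrate(origins, attractions, costs, observed, betas):
    """Evaluate one model per candidate beta and return the best-fitting beta.

    Each evaluation is independent of the others, so on a parallel
    machine the candidates (or the origin rows) could be shared out
    across processors.
    """
    scored = [
        (sum_squared_error(
            observed,
            origin_constrained_flows(origins, attractions, costs, b)), b)
        for b in betas
    ]
    return min(scored)[1]
```

With 10,764 origins and destinations, each call to `origin_constrained_flows` touches every cell of the interaction matrix, which is why the number of model evaluations per hour becomes the limiting factor.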

Once it becomes possible to calibrate spatial interaction models on this scale, it becomes practicable to solve large spatial optimisation problems. In a retail or hospital planning context it is often of interest to determine which set of N out of M locations will yield the maximum profit or the best value of some other performance indicator. The quality of the results for this type of problem depends heavily on how many candidate models can be evaluated in a fixed time. It will be shown how, with some careful work, it was possible to speed up the model by a factor of 2.8 million and obtain results at least twice as good as previously.
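
The N-out-of-M problem can be sketched on a toy scale as follows. The exhaustive search and the coverage-based performance indicator are illustrative assumptions only; the abstract does not say which search method or objective was used, and exhaustive enumeration is feasible only for very small M.

```python
import itertools
import math

def search_space_size(m, n):
    """Number of distinct ways of choosing n sites out of m candidates."""
    return math.comb(m, n)

def covered_demand(sites, demand, costs, max_cost):
    """Toy performance indicator (hypothetical): total demand of zones
    that lie within max_cost of at least one chosen site.
    costs[zone][site] is the travel cost from a zone to a site.
    """
    return sum(d for zone, d in enumerate(demand)
               if any(costs[zone][s] <= max_cost for s in sites))

def best_subset(candidates, n, score):
    """Score every n-subset of candidates and return the best one."""
    return max(itertools.combinations(candidates, n), key=score)
```

A possible use, with made-up data:

```python
demand = [10, 20, 30]
costs = [[1, 9, 9], [9, 1, 9], [9, 9, 1]]
best = best_subset(range(3), 2,
                   lambda sites: covered_demand(sites, demand, costs, 2))
```

Because `search_space_size` grows combinatorially with M, realistic problems rely on heuristic search, and the number of candidate sets that can be scored in the available time largely determines the quality of the answer.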

The third example of the value of being able to compute an existing geographical model many times faster is the sensitivity analysis of a European Union population model. In the past it has been considered enough to run a model once for the whole of the EU, especially if the modelling is being carried out at a regional level within the member states. With the application of supercomputing power to the problem, it is now possible to run the model thousands of times with perturbed inputs, so that the sensitivity of the model can be estimated. This gives new insights into the modelling problem and allows better comparisons of the various scenarios commonly used by European population modellers.
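
The idea of repeated runs with perturbed inputs can be sketched as a small Monte Carlo experiment. The one-region projection model, the Gaussian relative perturbation and the parameter names below are all assumptions for illustration; they are far simpler than the EU population model the paper refers to.

```python
import random
import statistics

def project_population(pop, birth_rate, death_rate, net_migration, years):
    """Toy one-region projection: natural change plus constant net migration."""
    for _ in range(years):
        pop = pop * (1.0 + birth_rate - death_rate) + net_migration
    return pop

def sensitivity(base_inputs, years, runs, rel_sd, seed=0):
    """Run the model `runs` times with every input perturbed independently.

    Each input is multiplied by (1 + e), with e drawn from a normal
    distribution of mean 0 and standard deviation rel_sd.
    Returns the mean and standard deviation of the projected population.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    results = []
    for _ in range(runs):
        perturbed = {name: value * (1.0 + rng.gauss(0.0, rel_sd))
                     for name, value in base_inputs.items()}
        results.append(project_population(years=years, **perturbed))
    return statistics.mean(results), statistics.stdev(results)
```

The runs are independent of one another, so they can be distributed across processors with no communication other than collecting the results.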

Zone redesign has always been considered a hard problem by geographers. But with the geographical information systems (GIS) revolution providing new high quality digital boundary information, combined with parallel computing power, it becomes possible to design new zoning systems limited only by the data available for small zones and the user's imagination. The paper will show examples of re-zoning wards in metropolitan districts to maximise government deprivation area status (and grant payments), as well as offering a new approach to visualising spatial data.

The power of neural networks has been demonstrated in the past for geographers interested in classifying small areas. However, limited computing power has prevented their use becoming widespread for larger problems. The paper will show the results of using a parallel self-organising map to classify all the enumeration districts from the 1991 Census of Population (145,716 zones) using 80 variables for each zone. The extra power that parallel computing gives also allows error estimates and zone size to be taken into account, offering a classification of unparalleled quality.

Finally, the paper will look at work the consortium has carried out to investigate the possibilities of applying modern artificial intelligence methods to geography. The example presented is the use of genetic programming to attempt to develop new spatial interaction models. This is a very compute-intensive problem, requiring hundreds of thousands of possible models to be evaluated before a better model is found. Even with the advantages of genetic programming to direct the search towards promising areas of study, it would be impossible to carry out this sort of work without the aid of massively parallel computing. Some illustrative examples of the benefits of this approach are offered.

The paper as a whole gives an overview of the work of the Leeds HPCI consortium over the last two years. It is hoped that it offers a useful introduction to the application of parallel computing for other geographers who may have been put off by worries about complexity, or by fears that the gains were too small to be worth the effort involved in the transition, or who simply failed to realise what is now possible.