In-Round-Out: the Organisation of Information in Population Projections

Philip Rees
School of Geography, University of Leeds, Leeds LS2 9JT, United Kingdom

The standard approach to subnational projection
Projections for subnational populations are routinely carried out by both national and international organisations. The outputs are the populations of regions classified by age and sex for years in the future. The methodology for handling such projections has been under development for some three decades. At its heart is a multiregional model, the essentials of which were proposed by Rogers in 1968 and developed by him and others in the 1970s (Rogers, 1975, 1995; Rees & Wilson, 1977; Stone, 1976).

Difficulties faced by the standard approach
However, there have been difficulties in implementing the multiregional model when the number of regions is high. These difficulties include:

  1. lack of detailed inter-regional migration data
  2. lack of detailed regional fertility, mortality and international migration data
  3. difficulties in developing scenarios for future migration parameters
  4. the multiregional model yields too many variables that need to be forecast

For example, in the United Kingdom there are no time series of data for any of the inputs to a multiregional population projection for the new unitary authorities (UAs) which are being created in 1996-98 as the new basis of local government. While the boundaries of the new UAs are known for Scotland and Wales (Jackson & Lewis, 1996), the final decisions on some UAs in England have still to be taken.

Improvements needed
So what features need to be added to the standard multiregional model so that its key advantage (that it captures the mutual interaction between regional populations) can be exploited?

The model needs to be embodied in a system of information processing with the following features:

  1. Raw data base for the recent past
  2. Routines for geographical conversion of the raw data
  3. Routines for estimation of model variables from raw inputs
  4. Routines for derivation of key diagnostic indicators
  5. Statistical analysis of the key effects in the estimated database
  6. Redesign of the multiregional model that builds on the statistical analysis
  7. Development of scenarios for input variables
  8. Routines for the reporting and display of model outputs
  9. Routines for constraining outputs, either top-down or bottom-up

These features have been developed in two recent streams of work. The first is a set of population modelling and information systems developed for local authority clients by GMAP Ltd (Rees 1993, 1994). In these systems the basic structure of raw, estimate, scenario and output databases was constructed and linked by estimation, scenario development, execution and screen display routines. The second stream is the development of projections for European regions by the European Commission, Eurostat and various Netherlands research institutes (Cruijsen, 1991; Haverkate & van Haselen, 1990; NEI, 1993, 1994a, b, c, d). This work recognised the need to develop partitioned, hierarchical models, to use reverse standardisation techniques for estimating missing data, and to use in-depth analysis of time series and expert judgement as a basis for scenario development. Recent work (van Imhoff et al., 1995) has developed a framework for generating and assessing alternative model structures.

The UKPOP projection model for the new UAs
The paper will describe a system for projecting the populations of the new unitary authorities of the United Kingdom. To carry out that task a raw database is assembled at a variety of spatial scales (district and ward) that contribute to the new areas. Data from the last census in 1991 or from 1990 and 1991 calendar years is used to construct estimates of the detailed input variables, while data for districts for 1992-95 is used to update the input variables.

Raw data base for the recent past

  1. Population data by age and sex for a mixture of districts, wards and postal sectors is aggregated using a look-up table to the new unitary authorities. At the same time underenumeration factors are applied and the data adjusted to a mid-year basis.
  2. Ward level births and deaths data are aggregated in the same way.
  3. Internal migration data cannot be treated in this way, however. Inter-ward migrant flows from Set 1 of the Special Migration Statistics (SMS) are aggregated to an inter-UA basis using look-up tables.
  4. Immigration can be estimated from ward data in SMS, Set 1 but emigration data cannot. Use is made of the International Passenger Survey (IPS) information about emigration.

Routines for geographical conversion of the raw data
Efficient routines are needed to effect the geographical conversions outlined above. These involve transparent look-up tables.
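
The look-up conversion described above can be sketched as follows. This is a minimal illustration, not the UKPOP routine itself: the ward and UA codes, the function names and the decision to drop moves within a single UA are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical look-up table: ward code -> unitary authority (UA) code.
WARD_TO_UA = {"W001": "UA_A", "W002": "UA_A", "W003": "UA_B"}


def aggregate_to_ua(ward_counts, lookup):
    """Sum ward-level counts keyed by (ward, age_group, sex) to UA level."""
    ua_counts = defaultdict(float)
    for (ward, age_group, sex), count in ward_counts.items():
        ua = lookup[ward]  # a KeyError exposes wards missing from the table
        ua_counts[(ua, age_group, sex)] += count
    return dict(ua_counts)


def aggregate_flows_to_ua(ward_flows, lookup):
    """Sum inter-ward migrant flows to inter-UA flows.

    Assumption: moves between wards of the same UA are dropped, because
    they do not change the UA's population.
    """
    ua_flows = defaultdict(float)
    for (origin, destination), migrants in ward_flows.items():
        o, d = lookup[origin], lookup[destination]
        if o != d:
            ua_flows[(o, d)] += migrants
    return dict(ua_flows)


# Illustrative (invented) ward data.
ward_pop = {
    ("W001", "0-4", "F"): 120,
    ("W002", "0-4", "F"): 80,
    ("W003", "0-4", "F"): 50,
}
ward_flows = {("W001", "W002"): 10, ("W001", "W003"): 4, ("W002", "W003"): 6}

ua_pop = aggregate_to_ua(ward_pop, WARD_TO_UA)
ua_flows = aggregate_flows_to_ua(ward_flows, WARD_TO_UA)
```

Keeping the look-up table as plain data, separate from the aggregation code, is what makes the conversion transparent: a boundary change only requires editing the table.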

Routines for estimation of model variables from raw inputs
These routines are mainly needed for adding age-sex disaggregation at the finest spatial scale where such data are absent or unreliable. The technique of reverse standardisation is used.
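
One common reading of reverse standardisation is sketched below, on the assumption that only a local total of events is known: a standard age schedule of rates is applied to the local population to obtain expected events, and the schedule is then scaled by the ratio of observed to expected events. The function name and the figures are illustrative, and the estimation actually used in UKPOP may differ in detail.

```python
def reverse_standardise(local_pop_by_age, standard_rates_by_age, observed_events):
    """Estimate local age-specific rates from a total count of events.

    Returns (estimated_rates, ratio), where ratio = observed / expected.
    """
    expected = sum(
        local_pop_by_age[age] * standard_rates_by_age[age]
        for age in local_pop_by_age
    )
    if expected <= 0:
        raise ValueError("expected events must be positive")
    ratio = observed_events / expected
    estimated = {
        age: ratio * standard_rates_by_age[age] for age in local_pop_by_age
    }
    return estimated, ratio


# Invented example: two age groups, a standard schedule, 180 observed events.
local_pop = {"15-29": 1000, "30-44": 1000}
standard_rates = {"15-29": 0.10, "30-44": 0.05}
local_rates, ratio = reverse_standardise(local_pop, standard_rates, 180)

# By construction, the estimated rates reproduce the observed total.
implied_events = sum(local_pop[a] * local_rates[a] for a in local_pop)
```

The estimated schedule keeps the age profile of the standard and takes only its level from the local data, which is why the method suits areas where age detail is absent or unreliable.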

Routines for derivation of key diagnostic indicators
Efficient routines are needed for generating fertility indices (total fertility rate, general fertility rate, standardised fertility indices), mortality indices (mean and median life expectancies, standardised mortality indices), and migration indices (gross migraproduction rates, standardised migration indices), because each index may have to be computed for as many as 204 UAs (32 in Scotland, 22 in Wales, 149 in England (provisional) and 1 assumed in Northern Ireland).
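
Two of the fertility indices can be computed as in the sketch below, using the conventional definitions: the total fertility rate as the sum of age-specific fertility rates multiplied by the width of the age interval, and the general fertility rate as births per 1,000 women of childbearing age. The input figures, the dictionary layout and the five-year age groups are assumptions made for the example.

```python
# UA counts as given in the text: Scotland, Wales, England (provisional), N. Ireland.
N_UAS = 32 + 22 + 149 + 1


def total_fertility_rate(asfr_by_age_group, interval_years=5):
    """TFR: sum of age-specific fertility rates times the age interval width."""
    return interval_years * sum(asfr_by_age_group)


def general_fertility_rate(births, women_of_childbearing_age):
    """GFR: births per 1,000 women of childbearing age."""
    if women_of_childbearing_age <= 0:
        raise ValueError("population at risk must be positive")
    return 1000.0 * births / women_of_childbearing_age


# Invented inputs for two UAs; rates are per woman per year for
# seven five-year age groups (15-19 to 45-49).
ua_inputs = {
    "UA_A": {"asfr": [0.03, 0.09, 0.12, 0.08, 0.03, 0.006, 0.0],
             "births": 1300, "women": 20000},
    "UA_B": {"asfr": [0.02, 0.07, 0.11, 0.09, 0.04, 0.008, 0.001],
             "births": 900, "women": 15000},
}

# The same routine is applied to every area in turn.
indicators = {
    ua: {
        "tfr": total_fertility_rate(d["asfr"]),
        "gfr": general_fertility_rate(d["births"], d["women"]),
    }
    for ua, d in ua_inputs.items()
}
```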

Statistical analysis of the key effects in the estimated database
The key question that statistical analysis helps to answer is: what reduced-form model can be used to estimate the full array of migration probabilities used in a multiregional projection model? The dimensions of the model can be represented by origins (O), destinations (D), ages (A), sexes (S) and time (T).

A saturated model would use all array probabilities:

Model = ODAST

The simplest model would assume independence of each of the dimensions:

Model = O + D + A + S + T

The saturated model contains far too many variables to be forecast. The independence model ignores vital dependencies between dimensions (e.g. people migrate to nearby places). A compromise between these extremes is needed that is a successful trade-off between number of variables to be used and loss of information and which links easily to reliable raw data. An example is:

Model = OT + DT + AS + OD

in which AS and OD effects are kept constant over time in the projection but origin effects and destination effects are varied (through time series analysis) over time.
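
The scale of the trade-off can be illustrated by a crude count of the terms in each of the three model structures above. The sketch assumes 204 origins and destinations (the UA count used in this paper), and, purely for illustration, 18 age groups, 2 sexes and 10 time periods. The counts ignore identifiability constraints and the exclusion of within-area cells, so they indicate orders of magnitude only.

```python
def count_terms(model, sizes):
    """Count the cells implied by a model written as '+'-separated terms.

    Each term is a string of dimension letters, e.g. 'OT' stands for an
    origin-by-time effect with sizes['O'] * sizes['T'] cells.
    """
    total = 0
    for term in model.split("+"):
        cells = 1
        for dim in term.strip():
            cells *= sizes[dim]
        total += cells
    return total


# O and D follow the 204 UAs in the text; A, S and T are assumed values.
sizes = {"O": 204, "D": 204, "A": 18, "S": 2, "T": 10}

saturated = count_terms("ODAST", sizes)
independence = count_terms("O + D + A + S + T", sizes)
compromise = count_terms("OT + DT + AS + OD", sizes)
```

Under these assumptions the saturated model has roughly fifteen million cells, the independence model a few hundred, and the compromise model a few tens of thousands, most of them in the time-constant OD term.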

Redesign of the multiregional model that builds on the statistical analysis
The model uses an extension of the hierarchical design used by Rees (1996) for NUTS 1 regions in the European Union. In that ECPOP model two levels were recognised in the hierarchy: member states of the European Union and regions within member states. In the UKPOP model the two levels are home countries (England, Wales, Scotland and Northern Ireland) and UAs within home countries. This design recognises the country division of national projections (carried out by the Government Actuary and the Office for National Statistics) and the fact that the subnational projections are carried out by separate bodies (ONS for England, Welsh Office for Wales, General Register Office for Scotland and no projections for Northern Ireland).

Development of scenarios for input variables
The UKPOP model provides flexible means of designing scenarios using either the detailed input variables or summary indicators. Experience has shown that full flexibility is needed for projections that have to be adjusted to reflect minor local factors, but that generality is needed for carrying out broad-brush scenario exploration.

Routines for the reporting and display of model outputs
Because the results of subnational projections are so voluminous, methods are needed for displaying the information in an efficient way under user control. The UKPOP model uses user-selected parameters to control output, but the task of interfacing to visualisation software is left to the future.

Routines for constraining outputs, either top-down or bottom-up
A final requirement of a projection model for many regions is that the outputs can be adjusted consistently to external constraints. In a hierarchical model, the national population may be projected independently of the subnational populations. It is then possible to adjust the subnational results to the national ones. The UKPOP model includes an option for constraining model outputs to external projections (such as the GAD/ONS projections).
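
A top-down adjustment of this kind can be sketched as a simple proportional scaling, applied separately to each age-sex group. Proportional scaling is an assumption made for the example; the text does not specify which adjustment method the UKPOP option uses, and the area names and figures are invented.

```python
def constrain_top_down(regional_values, national_total):
    """Scale regional values proportionally so that they sum to a national total."""
    unconstrained_total = sum(regional_values.values())
    if unconstrained_total <= 0:
        raise ValueError("regional values must sum to a positive number")
    factor = national_total / unconstrained_total
    return {region: value * factor for region, value in regional_values.items()}


# Invented example for one age-sex group: the subnational projections sum
# to 400, while the independent national projection gives 420.
projected = {"UA_A": 100.0, "UA_B": 300.0}
constrained = constrain_top_down(projected, 420.0)
```

A bottom-up constraint would run the other way: the national figure is replaced by the sum of the subnational results.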

Conclusions
The flow of information into, round and out of population projection software must be planned with care. The UKPOP model, implemented for a new set of geographic areas, illustrates the kind of strategies needed to achieve a reasonable compromise between the unreasonable information demands of a conventional multiregional model when over 200 regions are being used and complete parsimony which neglects important migration behaviours. The UKPOP model provides, for the first time, an integrated projection of subnational populations for the whole of the UK using the same model and assumptions.

References
Cruijsen, H. 1991. "Fertility in the European Community: main trends, recent projections and two future paths". In Eurostat, Background papers on fertility, mortality and international migration under two long term population scenarios for the European Community. Luxembourg: Statistical Office of the European Communities.

Haverkate, R. and Van Haselen, H. 1990. Demographic evolution through time in European regions (Demeter 2015). Report to the Commission of the European Communities, Directorate-General for Regional Policy. Rotterdam: Netherlands Economic Institute.

Jackson, G. and Lewis, C. 1996. "Local government reorganisation in Scotland and Wales", Population Trends, 83, Spring, 43-51.

NEI. 1993. Regional population and labour force scenarios for the EEA. Interim Report: Population. Rotterdam: Netherlands Economic Institute.

NEI. 1994a. Regional population and labour force scenarios for the European Union. Part I: Two long term population scenarios. Rotterdam: Netherlands Economic Institute, Department of Regional and Urban Development; Erasmus University, Department of Public Health; Netherlands Interdisciplinary Demographic Institute (NIDI), June 1994.

NEI. 1994b. Regional population and labour force scenarios for the European Union. Part II: Two long term labour force scenarios. Rotterdam: Netherlands Economic Institute, Department of Regional and Urban Development; Erasmus University, Economic Geography Institute, June 1994.

NEI. 1994c. Regional population and labour force scenarios for the European Union. Part III: Results population scenarios. Rotterdam: Netherlands Economic Institute, Department of Regional and Urban Development; Erasmus University, Department of Public Health, March 1994.

NEI. 1994d. Regional population and labour force scenarios for the European Union. Part IV: Results labour force scenarios. Rotterdam: Netherlands Economic Institute, Department of Regional and Urban Development; Erasmus University, Economic Geography Institute, March 1994.

Rees, P. 1994. "Estimating and projecting the populations of urban communities". Environment and Planning A, 26, 1671-97.

Rees, P. 1994. "The projection of small area populations: a case study in Swansea, Wales". Chapter 11 in Hooimeijer P., Van der Knaap G.A., Van Weesep J. and Woods R. (eds.) Population dynamics in Europe: current issues in population geography. Netherlands Geographical Studies 173, Royal Netherlands Geographical Society and Department of Geography, University of Utrecht.

Rees, P. and Wilson, A. 1977. Spatial population analysis. London: Edward Arnold.

Rogers, A. 1975. An introduction to multiregional mathematical demography. New York: Wiley.

Rogers, A. 1995. Multiregional demography. Chichester: Wiley.

Stone, R. 1976. Towards a system of social and demographic statistics. United Nations, New York.

Van Imhoff, E., van der Gaag, N., van Wissen, L. and Rees, P. 1995. "Model selection for internal migration at the NUTS-2 level". Paper submitted to the International Journal of Population Geography.