First Experiences in Implementing a Spatial Metadata Repository

Sophie Cockcroft
Department of Information Science, University of Otago, PO Box 56, Dunedin

Integrated software engineering environments (ISEE) for traditional non-spatial information systems are well developed, incorporating Database Management Systems (DBMS) and Computer Aided Software Engineering (CASE) tools. The core component of the ISEE is the repository. It brings all the other components together and provides a common area to which all tools can link. In this fashion it also provides a central point for control. No such facility exists for the management of spatial data. Whilst many Spatial Information Systems (SIS) incorporate a data dictionary, it is typically a passive catalogue of features, attributes and descriptions. The work described here is concerned with two facets of the repository in particular: first, support for physical database objects, through data about data or ‘metadata’, during system development; and second, database integrity control during production. This paper demonstrates the usefulness of a repository in spatial database development by showing how it can be used to incorporate spatial integrity constraints in database schemas.

The concept of metadata has evolved through several disciplines. At its simplest level, metadata is the additional information necessary for data to be useful. A more insightful explanation was provided by Henderson, who classified metadata into dictionary metadata, describing characteristics, relationships and uses, and directory metadata, describing where the data is and how it can be accessed. Both types of metadata have received much attention from the GIS community recently (Medyckyj-Scott et al., 1996). The emphasis, however, tends to be on directory metadata. It is dictionary metadata which is of interest here.

The inability of current spatial information systems to enforce integrity constraints poses a serious threat to the quality of data entered into such systems. This inability, and suggestions for addressing it, are recurring themes in the database and GIS literature (Marble, 1990; Worboys, 1994; Günther & Lamberts, 1994; Hadzilacos & Tryfona, 1992; Hadzilacos & Tryfona, 1996; Medeiros & Pires, 1994). There are a number of ways of classifying integrity constraints. This paper discusses two alternative, but not mutually exclusive, classifications: static/transitional/dynamic integrity constraints, and constraints implemented by means of topological/semantic/entity-referential/user rules.

In SIS the traditional approach to database management has been to allow the application layer to supplement the set of capabilities offered by the underlying system architectures. In this way the operational needs of spatial data handling are satisfied, including integrity constraint checking. Various researchers have suggested the repository as a means of removing the burden of managing GIS capabilities from the application layer. Topics addressed include the management of integrity constraints (Cockcroft, 1996) and the management of GIS operations within the repository (Stefanakis & Sellis, 1996).

The ultimate aim of this work is the improvement of data quality through the imposition of integrity rules on data entry. This will be done by incorporating constraints in database schemas. Triggering operations are an important component of any database strategy. With triggering operations, the responsibility for data integrity lies within the scope of the DBMS rather than application programs. The rules from which triggers are derived are stored in the repository. This paper concentrates on constraints based on the sixteen cases for binary topological relationships of Egenhofer & Franzosa (1991), that is, static topological constraints. As a first step to incorporating them in schemas as described above, a repository incorporating an integrity constraint management tool is suggested. A case study is described here which demonstrates how spatial data quality can be improved through the imposition of integrity constraints on data entry, and how these constraints can be managed by means of a repository.
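The idea of holding topological rules in a repository and checking them at data entry can be illustrated with a minimal Python sketch. It is not the implementation used in the case study. It assumes regions are approximated as sets of grid cells, takes a cell to be a boundary cell when one of its four neighbours lies outside the region, and encodes each of the sixteen cases as four emptiness tests on boundary and interior intersections. The names `ConstraintRepository`, `four_intersection`, `rect` and the feature classes "building" and "lake" are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Cell = Tuple[int, int]
Region = FrozenSet[Cell]
# (boundary/boundary, interior/interior, boundary/interior, interior/boundary)
# True means the intersection is non-empty.
Pattern = Tuple[bool, bool, bool, bool]

DISJOINT: Pattern = (False, False, False, False)
OVERLAP: Pattern = (True, True, True, True)


def rect(x0: int, y0: int, x1: int, y1: int) -> Region:
    """Cells of the half-open rectangle [x0, x1) x [y0, y1)."""
    return frozenset((x, y) for x in range(x0, x1) for y in range(y0, y1))


def boundary(region: Region) -> Region:
    """Cells of the region that have a 4-neighbour outside it (an assumption)."""
    return frozenset(
        (x, y)
        for (x, y) in region
        if any(n not in region
               for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)))
    )


def four_intersection(a: Region, b: Region) -> Pattern:
    """Return which boundary/interior intersections of a and b are non-empty."""
    bnd_a, bnd_b = boundary(a), boundary(b)
    int_a, int_b = a - bnd_a, b - bnd_b
    return (bool(bnd_a & bnd_b), bool(int_a & int_b),
            bool(bnd_a & int_b), bool(int_a & bnd_b))


class ConstraintRepository:
    """Stores forbidden relationship patterns per pair of feature classes."""

    def __init__(self) -> None:
        self._rules: Dict[Tuple[str, str], Set[Pattern]] = {}

    def add_rule(self, class_a: str, class_b: str,
                 forbidden: Set[Pattern]) -> None:
        self._rules[(class_a, class_b)] = set(forbidden)

    def violations(self, new_class: str, new_region: Region,
                   existing: List[Tuple[str, Region]]) -> List[str]:
        """List the classes of existing features the new feature conflicts with."""
        found = []
        for cls, region in existing:
            forbidden = self._rules.get((new_class, cls), set())
            if four_intersection(new_region, region) in forbidden:
                found.append(cls)
        return found


# Usage: a building must not overlap a lake.
repo = ConstraintRepository()
repo.add_rule("building", "lake", {OVERLAP})
lake = rect(0, 0, 5, 5)
stored = [("lake", lake)]
bad = repo.violations("building", rect(2, 2, 8, 8), stored)
ok = repo.violations("building", rect(10, 10, 15, 15), stored)
```

In this sketch the rule lives with the repository rather than the data entry application, so any tool that inserts a feature would consult the same rule set. A database trigger derived from the stored rule would play the role of `violations`.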

Cockcroft, S. K. S. 1996. "Towards the automatic transition from conceptual to logical spatial data models". In R. Pascoe (Ed.), Proceedings: The 1996 New Zealand Conference on Geographical Information Systems and Spatial Information Research, Otago University, New Zealand, July 9th-11th: AURISA/SIRC. (In Press).

Egenhofer, M. J., & Franzosa, R. D. 1991. "Point-set Topological Relations", International Journal of Geographical Information Systems, 5(2), 161-174.

Günther, O., & Lamberts, J. 1994. "Object-oriented Techniques for the Management of Geographic and Environmental Data", The Computer Journal, 37(1), 16-25.

Hadzilacos, T., & Tryfona, N. 1992. "A model for expressing topological integrity constraints in geographic databases". In A. U. Frank, I. Campari, & U. Formentini (Eds.), Proceedings: Theories and Methods of Spatio-Temporal Reasoning in Geographic Space. International Conference GIS - From Space to Territory, Pisa, Italy: 252-68.

Hadzilacos, T., & Tryfona, N. 1996. "Logical data modelling for geographic applications", International Journal of Geographical Information Systems, 10(2), 179-200.

Marble, D. F. 1990. "The extended data dictionary: A critical element in building viable spatial databases". In Proceedings: 11th annual ESRI user conference.

Medeiros, C. B., & Pires, F. 1994. "Databases for GIS", ACM SIGMOD Record, 23(1), 107-115.

Medyckyj-Scott, D., Cuthbertson, M., & Newman, I. 1996. "Discovering environmental data: metadatabases, network information, resource tools and the GENIE system", International Journal of Geographical Information Systems, 10(1), 65-84.

Stefanakis, E., & Sellis, T. 1996. "A DBMS repository for the Application Domain of Geographic Information Systems". In Proceedings: 7th International Symposium on Spatial Data Handling (SDH '96), Delft, The Netherlands (In Press).

Worboys, M. F. 1994. "Object-Oriented Approaches to Geo-Referenced Information", International Journal of Geographical Information Systems, 8(4), 385-399.