Abstract
A wide range of spatial operators, such as 'near', 'inside', 'outside', 'upstream' and 'downstream', have been developed for querying spatially referenced data. This paper discusses equivalent work at the Institute of Hydrology on developing search mechanisms for querying the time dimension in a four-dimensional GIS. The paper opens by identifying the properties of environmental time-series data and concludes that all data, including spatial data, have the potential to change over time. A generic data model capable of recording the history of environmental features as they move through space and time is described. The description concentrates particularly on those aspects of the model concerning time, for example, how it records that some attribute values relate to an instant in time, while others relate to an hour, a day or some indeterminate period terminated by a future, as yet unknown, event.
The main part of the paper outlines the search mechanisms and how they exploit the information in the database. Existing relational query languages can easily articulate simple time-based queries such as "within a date range". However, asking the apparently simple question "find the sites where sediment levels exceeded x when flow exceeded y" can require a highly complex query. This is the case when the measurements of sediment and flow are not simultaneous, for instance when the sediment values are instantaneous observations while the flow values are mean values, each relating to a day. The objective of the Institute's work is to allow such queries to be expressed naturally and to leave the system to resolve the complexity.
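The difficulty can be illustrated with a minimal Python sketch. It is not the Institute's search mechanism; the record layout, the site names, the sample values and the function names covers and sites_exceeding are all assumptions made for illustration. Each record carries the start of the period it relates to and the length of that period, with a zero length marking an instantaneous observation, and an instantaneous sediment reading is matched to the daily mean flow whose day contains it.

```python
from datetime import datetime, timedelta

# Hypothetical records: (site, start, duration, value).
# A zero duration marks an instantaneous observation.
SEDIMENT = [
    ("A", datetime(1996, 5, 14, 9, 30), timedelta(0), 120.0),
    ("A", datetime(1996, 5, 15, 10, 0), timedelta(0), 40.0),
    ("B", datetime(1996, 5, 14, 11, 0), timedelta(0), 150.0),
]
FLOW = [
    ("A", datetime(1996, 5, 14), timedelta(days=1), 35.0),  # daily mean
    ("A", datetime(1996, 5, 15), timedelta(days=1), 12.0),
    ("B", datetime(1996, 5, 14), timedelta(days=1), 8.0),
]


def covers(start, duration, instant):
    """True if `instant` lies in [start, start + duration).

    A record with zero duration only covers its own instant.
    """
    if duration == timedelta(0):
        return instant == start
    return start <= instant < start + duration


def sites_exceeding(sediment, flow, x, y):
    """Sites with a sediment reading above x taken during a period
    whose mean flow was above y (assumed interpretation of the query)."""
    hits = set()
    for s_site, s_time, _, s_value in sediment:
        if s_value <= x:
            continue
        for f_site, f_start, f_duration, f_value in flow:
            if (f_site == s_site and f_value > y
                    and covers(f_start, f_duration, s_time)):
                hits.add(s_site)
    return sorted(hits)


print(sites_exceeding(SEDIMENT, FLOW, x=100.0, y=20.0))
```

Even this small case needs an explicit rule for what "when" means between an instant and a day; a system that stores the period each value relates to can apply such a rule on the user's behalf.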
In order to express a query, the user must be able to visualise how the data are stored. Ideally, a simple generic structure is required that can house all foreseeable data. Without such a structure, or data model, it is hard to construct a logical and consistent search engine for the selective extraction of data.
A generic data model has been developed that is capable of storing both spatial and non-spatial time-variant data. At the user level, the data model records the world in terms of features. A feature is any object whose description over time and space the user wishes to record. The types of feature recorded are decided and defined by the user, and the definitions are stored in the system's data dictionary. Examples of features might include roads, rivers or river flow gauging stations. Features, and the events observed at them, are described in terms of attribute values. Attributes are also defined by the user, examples being feature name, grid reference or the concentration of a chemical.
The cube illustrated above provides the user with a simple mental image of their data. Each cell contains the value of an attribute describing a feature at some moment in time, for example, the rate of flow in the Thames at Teddington on 14/05/96. Each axis is infinite, and time may be recorded at any level of accuracy, from years through months, weeks, days, hours and minutes to seconds.
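One way to picture a single cell of the cube is sketched below in Python. This is an illustration under stated assumptions, not the storage scheme of the system described: the class name Cell, its fields and the flow value are hypothetical. The sketch records the period each value relates to, so that an instant, a day and a period with no known end can all be held in the same structure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One cell of the cube: an attribute value for a feature over a period."""
    feature: str
    attribute: str
    start: datetime
    end: Optional[datetime]  # equal to start: instant; None: end not yet known
    value: float

    def valid_at(self, t: datetime) -> bool:
        """True if this value relates to time t."""
        if self.end is None:          # terminated by a future, unknown event
            return t >= self.start
        if self.end == self.start:    # instantaneous observation
            return t == self.start
        return self.start <= t < self.end


# Daily mean flow for 14/05/96; the value 50.0 is made up for illustration.
daily_flow = Cell("Thames at Teddington", "flow",
                  datetime(1996, 5, 14), datetime(1996, 5, 15), 50.0)
print(daily_flow.valid_at(datetime(1996, 5, 14, 12, 0)))
```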
The cube data model provides an environment for expressing queries. Extended SQL-like languages for querying spatial data have been described elsewhere, and such a facility is envisaged for the cube. This paper, however, will concentrate on a development that has been found extremely valuable from a user's point of view. As well as extracting columns of data, it is also useful to be able to create lists of objects as the output of a query. To accomplish this, the concept of WHAT, WHERE and WHEN lists has been developed. A WHERELIST contains a list of features of interest. A WHATLIST contains a list of attributes of interest. A WHENLIST contains a subset of the infinite time axis. Combined lists such as WHEREWHEN are also possible. Lists identify a number of cells in the cube and provide a very convenient way of driving applications that report on or analyse the contents of the cube. The paper will discuss their creation and use.
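How lists might pick out cells can be sketched as follows. The Python below is a hypothetical illustration rather than the system's interface: the dictionary layout, the function names select and wherelist_from, the feature "Kingston" and all values are assumptions. A WHERELIST and a WHATLIST are modelled as sets of names, and a WHENLIST as a list of half-open time intervals.

```python
from datetime import datetime

# Hypothetical cube contents: (feature, attribute, time) -> value.
CUBE = {
    ("Teddington", "flow", datetime(1996, 5, 14)): 50.0,
    ("Teddington", "sediment", datetime(1996, 5, 14)): 120.0,
    ("Kingston", "flow", datetime(1996, 5, 14)): 45.0,
    ("Kingston", "flow", datetime(1996, 6, 1)): 30.0,
}


def select(cube, wherelist=None, whatlist=None, whenlist=None):
    """Return the cells identified by the given lists.

    A list left as None places no restriction on its axis.
    `whenlist` is a list of (start, end) pairs, end exclusive.
    """
    cells = {}
    for (feature, attribute, t), value in cube.items():
        if wherelist is not None and feature not in wherelist:
            continue
        if whatlist is not None and attribute not in whatlist:
            continue
        if whenlist is not None and not any(a <= t < b for a, b in whenlist):
            continue
        cells[(feature, attribute, t)] = value
    return cells


def wherelist_from(cells):
    """Build a WHERELIST (sorted feature names) from a set of cells."""
    return sorted({feature for feature, _, _ in cells})


may_1996 = [(datetime(1996, 5, 1), datetime(1996, 6, 1))]
flow_cells = select(CUBE, whatlist={"flow"}, whenlist=may_1996)
print(wherelist_from(flow_cells))
```

In this sketch the output of one query becomes a list that can restrict the next, which is the sense in which lists can drive applications that report on or analyse the cube.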
The paper concludes with a review of unresolved problems.