Towards an ontology of fields
Karen K. Kemp
National Center for Geographic Information and Analysis
University of California Santa Barbara, Santa Barbara CA USA, 93106-4060
Email: kemp@ncgia.ucsb.edu
Andrej Vckovsky
Netcetera AG, CH-8040 Zürich, Switzerland
Email: vckovski@netcetera.ch
Abstract
While philosophers define ontology as "a branch of metaphysics concerned
with the nature and relations of being", within the knowledge representation
and reasoning community, a more tractable definition exists. There, an
ontology is "a specification of a conceptualization" or a definition of
the vocabulary used to represent knowledge. An ontology describes the concepts
and relationships that exist within a specific domain and describes all
that can be represented about that domain. An ontology of fields which
explicitly characterizes spatially continuous phenomena in order that they
can be consistently modeled and completely described within spatial databases
is needed. Such an ontology must be based on a formal definition of fields.
We argue that the classical definition of a field as a function on a domain
which is a subset of space-time is accurate, explicit and expressive, and
provides access to the full set of mathematical tools for the characterization
of fields. Thus, we conclude there is no need for more ontology.
1. Introduction
Interoperability and object orientation are today's prominent trends in
information technology. Within GIS, Open GIS (Buehler and McKee 1996) envisions
shared reusable components and well specified interface definitions which
will allow spatial data in any of a number of common formats to be used
and reused with a full range of spatial analysis and modeling components
supplied by various software vendors and private individuals. But such
complete interoperability does not work in a world of unique proprietary
implementations. Interoperability depends on a set of well defined conceptualizations
of reality and their representation in the digital domain. For successful
data integration and data exchange, these conceptualizations must be understood
and accepted generally by the community of users.
In the AI field in which knowledge representation and sharing are central
concerns, the development of formal ontologies which express shared assumptions
and real world models is key to their software development efforts. Gruber
suggests that "For AI systems, what 'exists' is that which can be represented"
(Gruber 1993, p1). This seems to be equally true for all systems which
model the real world. In order to model the world, we need to be able to
represent it.
Considerable attention has been devoted to understanding how people
conceptualize the real world as discrete objects. Cognitive psychologists
and others have developed theories for how we discretize and categorize
the world (see for example Rosch and Lloyd 1978, Couclelis 1992 and Mark
1997) while AI researchers have built any number of ontologies describing
objects and concepts (see for example the library of ontologies being constructed
using the Ontology Editor at the Stanford Knowledge Systems Laboratory
at http://www-ksl-svc.stanford.edu
or the World Factbook at http://www.odci.gov/cia/publications/factbook/index.html).
However, phenomena which scientists conceive of as spatially continuous,
such as temperature and soil moisture, have received almost no attention
in this regard.
Given that this is an aspect of geographic information science which
apparently needs focused attention, the US National Center for Geographic
Information and Analysis (NCGIA) identified "The Ontology of Fields" as
one of the nine topics to be included in its current set of specialist
meetings to be held between 1997 and 1999 under Project Varenius (see http://www.ncgia.org/varenius).
This specialist meeting was held in May 1998 in Bar Harbor Maine. Discussion
was lively and well tempered by the presence of a number of philosophers,
including one of the meeting's leaders, Prof. Barry Smith from the University
at Buffalo, in addition to various geographic information scientists, computer
scientists and others. This paper represents the authors' own consolidation
of those discussions and should not be taken as a consensus opinion. It
is hoped that it may provide a foundation for further discussion and refinement.
2. Definitions
2.1 What is a field?
At the meeting, the philosophers were deeply concerned with the question
"Do fields really exist or are they simply an intellectual construct?"
Certainly, fields are a fundamental concept in science. Gravity, air pressure
and elevation are all understood and modeled in a continuous context. However,
the representation of phenomena as fields is often either a conscious decision
(such as using density of vegetation as a variable) or an imposition of
technology (such as the use of rasters as a tractable solution to modeling
the flow of water over the land surface). Here we acknowledge this philosophical
question but for modeling purposes we assume their true existence.
One of the problems with formalizing the concept of fields is that the
term has many definitions and meanings. At the Varenius meeting, the definition
of fields varied according to one's discipline. One statistician suggested
that fields may be all of the following: spatial response surfaces; emergent
properties of collections of objects; the spatial distribution from which
a variable or feature is extracted; and, the fabric in which objects are
defined. Putting aside all the non-physical uses of "field" (including
a field of wheat, the field of geography, working in the field), a clear
formal definition for computational purposes is still lacking.
While most geographic information scientists will generally accept the
distinction between "entities" in the real world and "objects" as representations
of them, no similar dichotomy has wide acceptance for fields. Though there
are many forms of digital representation of fields--including rasters,
pointgrids, TINs, and contour lines--there is no widely used single term
to express the concept of a digital representation of a field. This lack
of a term for a generic, formal representation has hindered the development
of a commonly understood conceptualization of fields and their characteristics
so that they can be expressed formally for computational purposes.
The usual notion of fields in geographic information science has been
adopted from physics. In the 19th century, fields were originally
conceived within the context of "force fields". The term force field is
used to describe a phenomenon which is associated with a (conservative)
force caused by some distribution of matter, charge, etc. that is experienced
by an imaginary unit particle within a spatio-temporal area. Typical force
fields are a gravitational field or a magnetic field. Attributing every
spatial location with a force can be modeled mathematically by a function
or a mapping between the spatial (and temporal) location and the associated
force vector. The mathematical representation of a force field as a vector-valued
function of space and time thus motivated the use of the word "field" to
describe any phenomenon which can be represented mathematically by a function
of space and time, such as for example, temperature distributions, potentials
and densities. Therefore, we use the term "field" for the subsequent discussion
for any phenomenon that can be mathematically described by a function of
space and time.
Mathematically, the definition of a function consists of:
- a domain, D (the "independent variables"),
- a range, R (also value domain or co-domain, the "dependent variables"),
  and
- a rule that associates every element of domain D with exactly one element
  of range R (various elements of D can be associated with the same element
  of R, of course).
Both D and R are any sets. However, the use of the term field, especially
in geographic information science, most often implies that the domain D
is a subset of space-time. Often, the term "continuous field" is used to
emphasize that the domain is continuous, i.e., that the domain is a compact
and connected set (a "continuum"). This frequently generates confusion
because in mathematics the notion of a continuous function does not only
describe a quality of the domain D but also a quality of the rule, i.e.,
a continuous function has no sudden jumps.
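The difference between a continuous domain and a continuous rule can be illustrated with a minimal Python sketch. The elevation profile, the position of the cliff and the function name are hypothetical; the only point is that a connected domain does not by itself make the rule continuous.

```python
def elevation(x: float) -> float:
    """Hypothetical elevation profile on the connected interval [0, 10].

    The domain is a continuum, but the rule jumps at x = 4 (a cliff),
    so the function is not continuous in the mathematical sense.
    """
    if not 0.0 <= x <= 10.0:
        raise ValueError("x lies outside the domain [0, 10]")
    return 100.0 if x < 4.0 else 250.0
```

In the loose usage described above this would still be called a "continuous field", because every location of the interval carries a value.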
A second source of confusion when defining a field as something that
can be modeled by a function is the frequent misconception that the function's
rule associating elements from the domain (e.g., earth surface) with elements
of the range (e.g., temperature measurements) needs to be "deterministic"
or somehow directly computable (i.e., "a function of ..."). The rule can
be defined as an analytical expression of the domain values, but it could
also be defined, for example, as an explicit association of every individual
element of the domain D with a corresponding element of the range R. In
most cases in a computational context, however, this mapping rule must
be defined by an estimation from a small, discrete set of samples.
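The three ways of stating a rule mentioned above can be sketched in Python as follows. All names, sample values and the choice of linear interpolation are assumptions made for illustration; the text does not prescribe any particular estimation method.

```python
import math

# Hypothetical 1-D domain: positions along a transect, in metres.

def analytic_rule(x: float) -> float:
    """Rule given as an analytical expression of the domain value."""
    return 15.0 + 5.0 * math.sin(x / 100.0)

# Rule given as an explicit association of each domain element with a value.
# This is only possible when the domain is a finite set.
explicit_rule = {0.0: 15.0, 100.0: 19.2, 200.0: 19.5}

# Rule estimated from a small, discrete set of (location, value) samples.
samples = [(0.0, 15.0), (100.0, 19.2), (200.0, 19.5)]

def estimated_rule(x: float) -> float:
    """Estimate the value at x by linear interpolation between samples."""
    pts = sorted(samples)
    if not pts[0][0] <= x <= pts[-1][0]:
        raise ValueError("x lies outside the sampled part of the domain")
    for (x0, v0), (x1, v1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return v0 + t * (v1 - v0)
    raise ValueError("at least two samples are needed to interpolate")
```

All three satisfy the definition of a function: each element of the respective domain is associated with exactly one value. Only the first is "a function of x" in the narrow analytical sense.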
2.2 Fields and objects
In some domains or in some analytical procedures, conceptualizations switch
between objects and fields as needed. Does this imply that many things
can be both objects and fields? Can everything be both an object and a
field? In geographic information science it has been often argued that
the field and object view show a certain duality (Worboys and Deen, 1991;
Couclelis 1992; Kemp 1997). Objects are primarily identified by their non-spatial
and temporal characteristics and then attributed with their spatial (and
temporal) extension. The measurement and representation of fields usually
first identify the spatial and temporal component (the element of the domain)
and then associate the (non-spatial) field value. The conceptual transition
from fields to objects occurs when it is necessary to extract specific
characteristics of a field such as extrema (e.g., valleys and ridges from
an elevation model). Object-to-field transitions occur when proximity
fields or density fields are modeled from a discrete set of objects.
The transition from an object view to a field view or vice versa can
be established, at least theoretically, by a characteristic function (object
to field) or inversion of the function (field to object). However, this
is rarely feasible in a computational context. Applications that use both
object and field views simultaneously are still an exception. Most such
applications use an object view to define the domain of a field, e.g.,
water temperature along rivers.
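A minimal Python sketch of these transitions, under stated assumptions: the object is a hypothetical circular lake, the characteristic function and the proximity field are the object-to-field direction, and the field-to-object direction is approximated on a finite grid of sample locations rather than by a true inversion.

```python
import math

# Hypothetical object: a circular lake given as (centre x, centre y, radius).
LAKE = (5.0, 5.0, 2.0)

def characteristic(x, y, obj=LAKE):
    """Object to field: 1 where the location belongs to the object, else 0."""
    cx, cy, r = obj
    return 1 if math.hypot(x - cx, y - cy) <= r else 0

def proximity(x, y, obj=LAKE):
    """Object to field: distance from the location to the object (0 inside)."""
    cx, cy, r = obj
    return max(0.0, math.hypot(x - cx, y - cy) - r)

def extract_object(field, xs, ys):
    """Field to object on a sampled grid: the locations where field == 1."""
    return {(x, y) for x in xs for y in ys if field(x, y) == 1}
```

For example, `extract_object(characteristic, range(11), range(11))` recovers only the integer grid locations inside the lake, which hints at why the inversion is rarely exact in a computational setting.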
2.3 What is ontology?
While philosophers define ontology as a branch of metaphysics concerned
with the nature and relations of being, within the knowledge representation
and reasoning community (generally known as AI), a more tractable definition
exists. Philosophers believe there is only one ontology. They seek to isolate
artifacts of theories and eliminate them from the ontology. An ontology
in the philosophical sense is not based on theories; it cannot be mined
for the truth; it is the truth. On the other hand, in the AI community,
an ontology is "a specification of a conceptualization" or a definition
of the vocabulary used to represent knowledge. An ontology describes the
concepts and relationships that exist within a specific domain and describes
all that can be represented about that domain. Ontologies provide a means
by which characteristics of a specific representation can be assumed and
behavior predefined. Computation thus moves closer to perception, away
from data structures. Multiple user views can be accommodated by providing
translations between different ontologies.
In the context of geographic information systems, an ontology is roughly
synonymous with what is sometimes called a formal system for specification
(e.g., Frank and Kuhn 1995) or an essential model (Buehler and McKee 1996).
The basic objective is to provide an explicit, unambiguous specification
of spatial information which can serve as a kind of lingua franca for data
exchange and data integration (i.e., interoperability). The specification
provided by the ontology remains on a conceptual level and can be used,
for example, as the basis for several different concrete implementations. The use of
a common specification provides the necessary basis for comparability and
information exchange and is a central objective in any interoperability
initiative.
2.4 Geographic Ontologies
Some progress has been made in laying a foundation for the ontology of
geographic objects, where these are defined as:
spatial objects on or near the surface of the earth. Furthermore, they
are objects of a certain minimal scale (roughly: of a scale such that they
cannot be surveyed unaided within a single perceptual act)... Geographic
objects do not merely have constituent object-parts, they also have boundaries,
which contribute as much to their ontological make-up as do the constituents
that they comprehend in their interiors. Geographic objects are prototypically
connected or contiguous, but they are sometimes scattered or separated.
(Smith and Mark 1998, p. 310)
The specialist meeting for NCGIA's Research Initiative 21 which examines
"Formal Models of Common-sense Geographic Worlds" (Naïve Geography)
resulted in a preliminary draft of a "geographic ontology" (Egenhofer,
Mark, and Hornsby 1997). Following are some relevant portions of
that ontology:
IV. space (always empty)
    A. inside a room
    B. inside the grand canyon
V. place
    A. [complete enclosure]
    B. territory
    C. home
    D. neighborhood
    E. region
VI. topological features
    A. surface
        1. lake surface
        2. top soil
    B. interior
        1. underneath lake surface
    C. edge
        1. frontier
        2. barrier
        3. dam
        4. cliff
        5. shoreline
    D. side/end
and later
XII. properties of geographical features
    A. metric properties
        1. absolute
            a) width
            b) breadth
            c) distance
        2. relative
    B. non metric properties
        1. density
        2. color
        3. ....
XIII. location
It is clear that fields do not fit easily into this kind of ontology. Their
characteristics and properties need a very different conceptualization.
3. Developing an ontology of fields
Given the mathematical definition above, when exploring the properties
of fields it is useful to pursue the same epistemological path used by
many natural sciences. The principle is very simple and often used unconsciously--by
establishing a correspondence between the phenomenon A under consideration
(a field) and a mathematical object B (a function), many useful tools of
mathematics for characterizing and analyzing B can be used to characterize
A. Thus, if a field corresponds to a ("is a") function, and that function
has a certain property, then the field has that property, too. For example,
if the domain of a field is a domain of a function, all the mathematical
properties of that set can be used for its characterization. In that sense,
by assuming all rules, vocabulary and specifications of mathematics, the
definition of a field as something that can be represented as a function
automatically provides a very powerful, exact and expressive set of tools
for the characterization of a field. However, the richness provided by
that correspondence may, in fact, make such an ontology useless, i.e., it
does not provide a small set of concepts and vocabulary which is exhaustively
explicit about everything one needs to know about fields.
Whether or not we can use mathematical tools as a foundation for an
ontology of fields, defining a field as something that can be modeled or
represented by a function provides a very useful means of structuring an
ontology. Such an ontology would consider:
- Properties of the domain
- Properties of the range
- Properties of the association rule
- Properties of the field as a whole
The following discussion of field properties highlights some of the characteristics
which are useful in specific applications using fields and also suggests
areas of further research. Certainly, these characteristics provide
a basis for the computational implementation of a field ontology.
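As one possible reading of this four-part structure, the following Python sketch records a field description as a small set of record types. Every class name, attribute name and attribute value here is a hypothetical placeholder chosen for illustration; it is not a proposed standard schema.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class DomainProperties:
    spatial_dims: int    # number of spatial dimensions (0 to 3)
    temporal_dims: int   # 0 for a static field, 1 if time is included
    linear: bool         # whether the domain maps linearly onto a unit cube
    discrete: bool       # discrete set of locations vs. a continuum

    @property
    def dimension(self) -> int:
        return self.spatial_dims + self.temporal_dims

@dataclass(frozen=True)
class RangeProperties:
    scale: str           # scale of measurement, e.g. "nominal" or "ratio"
    dimension: int       # dimension of the value space, 1 for a scalar
    derived: bool        # derived from objects or other fields?

@dataclass(frozen=True)
class FieldDescription:
    domain: DomainProperties
    value_range: RangeProperties
    rule: Callable[[Tuple[float, ...]], object]  # the association rule
    rule_kind: str       # e.g. "empirical" or "analytic"

    def value_at(self, location: Tuple[float, ...]) -> object:
        """Apply the rule after checking the location's dimension."""
        if len(location) != self.domain.dimension:
            raise ValueError("location does not match the domain dimension")
        return self.rule(location)
```

Properties of the field as a whole, such as stationarity or periodicity, are not stored here because they follow from the rule and would be computed from it.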
3.1 Properties of the domain
The domain of a field is defined as a subset of space-time. Therefore,
a useful characterization is based on a characterization of that subset.
- Dimension: The dimension of the subset can be 0 (trivial case, a
  point), 1 (e.g., a line), 2 (e.g., a plane), 3 (e.g., a cube or a plane in
  time) or 4 (e.g., a cube in time).
- Mixture of spatial and temporal dimensions: A domain can be spatial
  only (i.e., a static field), temporal only (i.e., a time series) or have
  both spatial and temporal extension. The distinction is important since
  a mixture of temporal and spatial dimensions is not isotropic, i.e., the
  combined domain is not a subset of homogeneous Euclidean space because
  things such as the distance between two points (at different times) in
  that subset cannot be easily defined.
- Linear vs. non-linear: We call a domain linear if there is a linear
  mapping between the domain and a unit cube of the corresponding dimension
  (i.e., a straight line, a time interval, a rectangular area, a hypercube).
  Non-linear domains are all other domains such as curved lines, circles,
  spheres, bubbles. All domains with holes are non-linear. This distinction
  is important because most current implementations deal only with linear
  domains. Non-linear domains are most often extended to the smallest linear
  subset that contains the domain (a minimum bounding box), associating every
  element within that extended subset but not within the true domain to a
  special value of the range (a kind of null value). Support of non-linear
  domains would be especially useful for applications that model continuous
  phenomena along networks, e.g., river or road networks.
- Bounded vs. open domains: The domains can be classified into open
  and bounded domains depending on the existence of a boundary. However,
  that distinction is not very useful since practically speaking every domain
  can be regarded as bounded, even if the boundary is "ad infinitum". Note
  that an interaction between a temporal dimension and a bounded domain may
  result in a moving boundary.
- Discrete vs. continuous: Continuous fields have been defined above
  as fields with a compact and connected domain. A discrete field is by analogy
  a field with a discrete set as domain, e.g., a discrete set of points in
  space and time. Actually such domains usually define the measurement of
  most fields. The samples of a field can therefore be regarded as a (discrete)
  field themselves, i.e., a kind of a "metafield".
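The bounding-box treatment of non-linear domains described above can be sketched in Python. The disc-shaped domain, the temperature-like rule and the use of `None` as the null value are assumptions made for this example only.

```python
import math

def in_domain(x: float, y: float) -> bool:
    """Hypothetical non-linear domain: the unit disc centred on the origin."""
    return x * x + y * y <= 1.0

def rule(x: float, y: float) -> float:
    """Hypothetical rule: a value that decreases away from the centre."""
    return 20.0 - 5.0 * math.hypot(x, y)

def rasterize(n: int):
    """Sample the field on an n x n grid over the bounding box [-1, 1]^2.

    Grid locations inside the box but outside the true domain are
    associated with None, which plays the role of the null value.
    """
    step = 2.0 / (n - 1)
    grid = []
    for i in range(n):
        y = -1.0 + i * step
        row = []
        for j in range(n):
            x = -1.0 + j * step
            row.append(rule(x, y) if in_domain(x, y) else None)
        grid.append(row)
    return grid
```

The resulting grid is itself a discrete field in the sense of the last item above: its domain is the finite set of grid locations, and the null entries mark where the linear extension exceeds the true domain.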
3.2 Properties of the range
In contrast to the definition of the domain (a subset of space-time), there
are no restrictions on the definition of the range. The range can be any
set. The range can be a subset of the real numbers, a few tomatoes, geographic
objects, words, or any other imaginable set. Again, there are a few fundamental
characteristics for ranges, including:
- Natural versus derived: Are the values measured directly in the
  real world, e.g., air temperature or terrestrial radiation, or are they
  derived from objects or other fields, e.g., density or error fields? The
  distinction is fuzzy because many "natural" fields are actually "derived"
  fields in the sense that the measured values are only a representation
  of the "truth" and sometimes even represent an "artificial" construct.
  For example, temperature is strictly defined as the mean kinetic energy
  of the molecules, atoms or ions of which a body is composed and thus must
  always be measured indirectly.
- Scale of measurement: For sampled or measured fields, the concepts
  of measurement theory can be used to characterize different types of values
  as ratio, nominal, ordinal, etc.
- Dimension: If the range is a subset of a vector space, it could
  be useful to use the dimension of the subset as a criterion. This is sometimes
  called the co-dimension. However, care must be taken not to confuse
  the range's dimension with the dimension of the domain.
Further characterization will be very discipline and application specific
and perhaps beyond the needs of ontological engineering.
3.3 Properties of the association rule
The association rule is fundamental for understanding how inferred values
should be derived from a discrete representation. There are a number of
different types of rules:
- Empirical (estimated from measurements) vs. theoretical
- Analytic function of the domain
- Stochastic (i.e., noise)
- Composed as / superposition from a set of functions (e.g., spectral components).
Perhaps the most important property of the association rule is the question
of change which ties the temporal dimension of the domain to the rule.
Related to change are the characteristics of persistence, identity
and movement which are generally associated with objects. Do they
have a counterpart in fields?
3.4 Properties of the field as a whole
The properties of the field as a whole are characteristics of the association
rule. Most disciplines have their specific ideas on what characteristics
are important discriminators and they have developed tools and methods
for corresponding analysis. Typical cases include:
- Stationarity: Are the field values constant along any dimension,
  e.g., temporal, some spatial dimension?
- Symmetry: Are there any symmetries in the field values, e.g., rotation
  symmetry, symmetries along some axes, and so on?
- Periodicity: Are the field values periodic along some dimensions,
  e.g., waves in space and time?
- Trends: Does the field show any trends, i.e., is the field sufficiently
  similar to a simple derived field such as a linear field?
- Features: Are there any special subsets of the domain, such as minima
  or maxima? Most object-extraction methods rely on the identification of
  such special features of the association rule.
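Some of these whole-field properties can be tested on a sampled one-dimensional field, as in the following Python sketch. The function names, the exact-equality test for periodicity and the least-squares fit for the trend are assumptions chosen for brevity; individual disciplines use their own, more robust methods.

```python
def local_extrema(values):
    """Indices of strict interior local minima and maxima of the samples."""
    minima, maxima = [], []
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] < values[i + 1]:
            minima.append(i)
        elif values[i] > values[i - 1] and values[i] > values[i + 1]:
            maxima.append(i)
    return minima, maxima

def linear_trend(xs, values):
    """Least-squares slope and intercept of values against locations xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_v = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxv = sum((x - mean_x) * (v - mean_v) for x, v in zip(xs, values))
    slope = sxv / sxx
    return slope, mean_v - slope * mean_x

def is_periodic(values, period):
    """True if the samples repeat exactly every `period` steps."""
    if not 0 < period < len(values):
        raise ValueError("period must be between 1 and len(values) - 1")
    return all(values[i] == values[i + period]
               for i in range(len(values) - period))
```

The indices returned by `local_extrema` are candidate "features" in the sense of the last item: they identify locations in the domain from which point objects such as peaks or pits could be extracted.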
4. Conclusion
Granted, there may still be a great many unexplored philosophical questions
about the ontology of fields (e.g., Do fields exist only when they are measured?
Are there features in fields? What is an atmospheric front--is it a real
object or just a conceptualization of an object within a real field?). However,
we argue that the definition of a field as a function on a domain which
is a subset of space-time is sufficiently specific for our computational
purposes. The mathematical description of a field as a function is accurate,
explicit and expressive. There is no need for more ontology (after all,
it has all of mathematics as its formal system!). In fact, this rich mathematical
foundation may imply that it is not possible to define a single, complete,
"true" ontology of fields.
As further demonstration of its richness, this formal definition suggests
several interesting research questions, including:
- How can we handle fields where their domain is defined by the extension
  of an object (e.g., a field whose domain is the spatial component
  of an object)?
- How can we handle fields with non-linear domains (e.g., a field defined
  on a curved river)?
- How can we handle fields with domains that have both spatial and temporal
  dimension, and even worse: where that domain is not only mixed space/time
  but also non-linear (e.g., water temperature of a river that changes its
  basin over time, or the profile of elevation/sea surface on a moving coast)?
- How can we handle fields where the reference to the domain is not given
  as geographic coordinates but as entities such as pressure or temperature
  as used in the atmospheric sciences?
- Is there a set of generic object-extraction methods (e.g., extrema, ridges,
  valleys, saddle-points) or is this discipline- and application-specific?
- How do measurements relate to the formal definition of a field, e.g., what
  is the correspondence between a discrete set of measured values and a function?
By acknowledging this mathematical definition of fields, it is possible
that a number of the fundamental challenges still facing geographic information
scientists as they attempt to formalize representations of the real world
may be solved.
Acknowledgements
This paper was inspired by discussions at "The Ontology of Fields"
specialist meeting, a research initiative under the Varenius Project of
the National Center for Geographic Information and Analysis (NCGIA). The Varenius
Project is supported by a grant from the U.S. National Science Foundation
(SBR-9600465). We thank the other participants of the specialist meeting
for their lively discussions and look forward to their comments, criticisms
and, we hope, support for the ideas presented here.
References
Buehler, Kurt, and Lance McKee. 1996. The OpenGIS Guide, OGIS TC
Document 96-001, OpenGIS Consortium, Inc., Wayland, Massachusetts.
Couclelis, Helen. 1992. People manipulate objects (but cultivate fields):
beyond the raster-vector debate in GIS. In A. U. Frank, I. Campari and
U. Formentini, eds., Theories and Methods of Spatio-Temporal Reasoning
in Geographic Space, Springer-Verlag, pp 65-77.
Egenhofer, Max J., David M. Mark, and Kathleen Hornsby. 1997. Formal
Models of Commonsense Geographic Worlds, Report on the Specialist Meeting
of Research Initiative 21. Technical Report 97-2, National Center for
Geographic Information and Analysis, University of California Santa Barbara,
USA.
Frank, Andrew U. and Werner Kuhn. 1995. Specifying OpenGIS with functional
languages. In Max J. Egenhofer and John R. Herring, eds., Advances in
Spatial Databases, Lecture Notes in Computer Science 951, Springer
Verlag, Berlin, pp. 184-195.
Gruber, Thomas R. 1993. Toward Principles for the Design of Ontologies
Used for Knowledge Sharing, Technical Report 93-04, Knowledge Systems
Laboratory. Palo Alto, CA: Stanford University.
Kemp, Karen K. 1997. Fields as a framework for integrating GIS and environmental
process models. Part one: Representing spatial continuity. Transactions
in GIS 1(3):219-234 and 1(4):335.
Mark, David M. 1997. Cognitive perspectives on spatial and spatio-temporal
reasoning. In Craglia, M., and Couclelis, H., Geographic Information
Research Bridging the Atlantic, London: Taylor and Francis, pp. 308-319.
Rosch, Eleanor, and Barbara Bloom Lloyd. 1978. Cognition and categorization.
Hillsdale, N.J.: L. Erlbaum Associates, distributed by Halsted Press, New
York.
Smith, Barry, and David M. Mark. 1998. Ontology and geographic kinds.
Proceedings of International Symposium on Spatial Data Handling (SDH'98),
12-15 July 1998, at Vancouver BC, pp. 308-318.
Worboys, M.F., and S.M. Deen, 1991. Semantic heterogeneity in distributed
geographical databases, SIGMOD Record, 20(4).