Compass Interactive BV,
Eekholt 54, 1112 XH Diemen, The Netherlands.
The most basic of the generalization operators is selection: which map objects should be selected for display on a particular map at a particular scale ? This question has been addressed for point features, where it is called settlement selection ,. In the case of subdivisions, the dissolution operator can be used for elimination (the inverse of selection), for instance with the GAP-tree .
The selection of line features has been considered in the context of road network maps and drainage networks. We are aware of two papers that select subsets of the roads to perform generalization ,. Both consider whether algorithms in graph theory can be used. For example, a road network map connecting a set of cities can be generalized by only selecting the roads in the minimum spanning tree. This idea guarantees the topological property of connectedness, but there are serious limitations to the graph approach. Firstly, there is no guarantee that avoiding coalescence and imperceptibility--two of the objectives of generalization--is realized. These geometric aspects that are not present in a graph. Secondly, semantic aspects of road network generalization such as avoiding large detours are not taken into account explicitly.
Another paper  discusses the detection and collapse of groups of junctions to a single junction. This is another important aspect of road network generalization and complementary to road selection.
In this paper we address road network generalization by selection more rigorously. We will first define what the input to road network generalization should look like. Then we list the basic geometric, topological, and semantic requirements  of the output, the generalized road network. Such a list is necessary to design an approach that performs the task at hand well. Based on the list, we outline a three-step approach to road network generalization. Most of the ideas in the paper have been implemented and applied to a data set.
From the input, two important concepts can be introduced. They are sets whose elements depend on the currently selected network.
The list above contains one hard requirement and two `soft' requirements. The first requirement will always be guaranteed by the algorithm, no matter what input road network. Given this decision, we cannot guarantee the second requirement, so it is soft. It may be the case that only way to avoid a long detour is to include a road that is very close to another road. The other choice would be to avoid long detours always, but then one has to accept that coalescence may occur. We have chosen to avoid coalescence over avoiding detours. The third requirement is soft in the sense that in the output network, some less important roads may be chosen while some important roads are not. This is desirable as we noted before, since less important roads may be relatively important in the neighborhood.
Figure 1: Circles around ends of roads, and distances between roads.
The algorithm to detect if two roads are in conflict is used often in whole generalization procedure. Therefore, it is important that this detection be done very efficiently. We treat this test first before giving the whole generalization procedure.
Suppose that two roads r1 and r2 have m1 and m2 nodes defining them, respectively. A brute force test for minimum distance would take O(m1·m2) time. By employing techniques from computational geometry , one can improve this considerably. We compute the Voronoi diagram of all nodes and arcs  from the roads r1 and r2. A Voronoi region of a feature from r1 adjacent to a Voronoi region of a feature from r2 gives a pair of features that may give the minimum distance between r1 and r2. By testing all pairs of neighboring Voronoi regions and their features we can find the overall minimum.
Figure 2: Voronoi diagram of the nodes and arcs of two roads.
Such an algorithm to find the minimum distance takes only O((m1+m2)log(m1+m2)) time; it takes O((m1+m2)log(m1+m2)) time to compute the Voronoi diagram and it takes O(m1+m2) time to test adjacent regions. One can also use a heuristic to be even faster in most cases. Compute the bounding box B1 of r1, the bounding box B2 of r2, and test if these are at least distance d apart. Such a test requires only O(m1+m2) time in total, or only O(1) time if the bounding boxes have already been computed. But it sometimes fails to decide whether roads are too close, and we need a more time consuming algorithm in such cases.
The method we have designed for road network generalization has three stages. In the first, an initial selection of all roads is made based exclusively on the importance of the roads and the total number of conflicts. The second step is to test all pairs of roads for conflicts, and delete the least important of the two. The third step consists of adding roads, possibly of lesser importance, to avoid long detours. Care must be taken that no new conflicts are introduced here. We discuss the three stages in more detail.
When testing whether a road r is in conflict with some other road, we start with the roads that bound the same face as r. As soon as we find a conflict we can stop the test for r and mark it as being in conflict. Because the test for conflicts excludes the parts of roads near the ends, it may be the case that r has no conflicts with roads bounding the same face, but has conflicts with roads that bound other faces. By cleverly choosing the sequence by which to test roads we can usually decide after a few conflict tests whether r has conflicts or not. A logical sequence is a breadth-first sequence starting from the road r. By using bounding boxes, it is possible to decide when it is safe to conclude that r is not in conflict.
An upper bound on the running time of the first step is O(knlogn), where k is the number of roads and n is the total number of nodes of the network. One can expect to do the first step much faster with the suggestions for efficiency just given.
Figure 3: When the vertical road is eliminated, the two other roads become one.
Elimination of conflicts should be done incrementally. When a conflict between two roads is resolved by elimination of one road, other conflicts with the eliminated road are resolved too, of course. But new conflicts may arise when two roads are adjoined, since the former junction, now normal vertex, no longer has a conflict-free region around it. This is the reason that we cannot precompute all conflicting pairs and simply remove one road of each.
We'll treat the roads in order of increasing importance. If the road under consideration is in conflict with some road of the same or greater importance, we eliminate it. If this makes a new, merged road to appear, we have to treat it too, even if both subroads have already been tested. Again, bounding box tests and a suitable order by which to test other roads will help considerably for efficiency.
The worst case running time of the second step is O(knlogn), and again we expect to do much better in practice.
Note that detours are defined in terms of shortest paths in the input network, and detours are avoided only for the set P of important nodes. So we start by computing all shortest paths between pairs of nodes in Q in the input network. Using Dijkstra's algorithm  this takes O(mnlogn) time in total, where m is the number of important points and n is the total number of nodes and arcs in the network.
Most difficult in this step is how to generate roads that can be tested for inclusion in the road network. We do this as follows. For each important node p we compute its shortest path in the currently selected network to all other important nodes. If each path is at most c times the length of the corresponding shortest path in the input network, we don't have any detours from p and we are happy with the connections from p. If there is a serious detour to another important node p', we'll try to add roads to create a shorter connection. We take the shortest path between p and p' in the input network and consider which parts are not present in the currently selected network. We test each of these for conflicts with the selected roads, and add them if there are no conflicts. If we achieved to add all parts, the shortest path between p and p' is present in the selected network. If only some parts could be added, we may still have reduced the length of the shortest path between p and p' and possibly, avoided the detour.
For the analysis, we consider the running time for each important node that is tested. We compute the lengths of the shortest paths to every other important node in O(nlogn) time with Dijkstra's algorithm. For each of the at most m important nodes to which a serious detour exists, we may need to consider determining and adding at most k roads, where k is the total number of roads in the input network. For these roads we must test if they are not in conflict with any other road. This test takes O(nlogn) time for each road piece, leading to a worst-case running time of O(km2nlogn). But one can expect a running time somewhere between O(mnlogn) time and O(m2nlogn) time typically, if heuristics with bounding boxes and an appropriate order to test for conflicts are used.
If the method above does not avoid a detour sufficiently, it is certainly possible to generate other short paths and try to insert these. As noted before, however, it is not possible to obtain a guarantee on the detour as long as one insists on avoiding coalescence.
We can redefine conflicts for short roads as follows. Two roads are in conflict if both have length at most 2d and they are incident to some small face with area at most, say, 2d2.
The method described above will resolve the conflict by removing one of the conflicting short roads. This causes the small face to be merged with another face.
To achieve that the preselection is smaller, can adapt the first step by computing the smallest value i such that at least some percentage of roads are not in conflict, and at least some percentage of the roads is not too short. This may cause i to be greater than before (described in the first step). The motivation for this solution is that a cluster of short roads generally contains many that are less important.
The ideas outlined in this paper have been implemented partly and can be run via the World Wide Web. The Java applet can be found at http://www.jarno.demon.nl/thesis.htm. The implementation is a prototype that hasn't been optimized in running time. Therefore it is rather slow. Not all solutions suggested in this paper are included. It does allow tuning of several parameters, however, and gives an indication of possible outputs of the generalization method.
Figure 4: Data set used by the Java applet.
In this study of road network generalization, we have only chosen the selection operator as means of generalization. There are several other operators that can be useful: line simplification, line smoothing, merge lines, collapse a road (and merge the incident intersections), and local displacement of lines. This is a subject of further research. Another issue is a more extended study of dealing with short roads, especially the combination of junction collapse with our method.
 K.E. Brassel and R. Weibel. A review and conceptual framework of automated map generalization. Int. J. of GIS, 2:229-244, 1988.
 B.P. Buttenfield and R.B. McMaster, editors. Map Generalization: Making Rules for Knowledge Representation. Longman, London, 1991.
 T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, MA, 1990.
 Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin, 1997. http://www.cs.ruu.nl/geobook/
 G.E. Langran and T.K. Poiker. Integration of name selection and name placement. In Proc. 2nd Int. Symp. on Spatial Data Handling, pages 50-64, 1986.
 W.A. Mackaness and M.K. Beard. Use of graph theory to support map generalization. Cartography and Geographic Information Systems, 20:210-221, 1993.
 G.A. Mackechnie and W. Mackaness. Detection and simplification of road junctions in automated map generalization. In ACSM/ASPRS Annual Convention & Exposition, Technical Papers Volume 1: Surveying & Cartography, pages 72-82, Bethesda, MD, 1997. ASPRS/ACSM.
 R.B. McMaster and K. Stuart Shea. Generalization in Digital Cartography. AAG, Washington, D.C., 1992.
 J.-C. Müller, J.-P. Lagrange, and R. Weibel, editors. GIS and Generalization - Methodology and Practice, volume I of GISDATA. Taylor & Francis, London, 1995.
 F. P. Preparata and M. I. Shamos. Computational Geometry: An Introduction. Springer-Verlag, New York, NY, 1985.
 K.S. Shea and R.B. McMaster. Cartographic generalization in a digital environment: when and how to generalize. In Proc. Auto-Carto 9, pages 56-67, 1989.
 R.C. Thomson and D.E. Richardson. A graph theory approach to road network generalization. In Proc. 17th Int. Cartographic Conference, pages 1871-1880, 1995.
 F. Töpfer and W. Pillewizer. The principles of selection. The Cartographic Journal, 3:10-16, 1966.
 M. van Kreveld, R. van Oostrum, and J. Snoeyink. Efficient settlement selection for interactive display. In Proc. Auto-Carto 13: ACSM/ASPRS Annual Convention Technical Papers, pages 287-296, 1997. http://www.cs.ruu.nl/people/rene/settlement/index.html
 P. van Oosterom. The GAP-tree, an approach to `on-the-fly' map generalization of an area partitioning. In J.-C. Müller, J.-P. Lagrange, and R. Weibel, editors, GIS and Generalization - Methodology and Practice, number 1 in GISDATA. Taylor & Francis, London, 1995.
 R. Weibel. A typology of constraints to line simplification. In M.J. Kraak and M. Molenaar, editors, Proc. 7th Int. Symp. on Spatial Data Handling, pages 9A.1-9A.14, 1996.
 C. K. Yap. An O(n logn) algorithm for the Voronoi diagram of a set of simple curve segments. Discrete Comput. Geom., 2:365-393, 1987.