Evaluating Automatic Road Detection across a Large Aerial Imagery Collection

The automated extraction of roads from aerial imagery is valuable for tasks including mapping, surveillance and change detection. Unfortunately, there are no public databases or standard protocols for evaluating these techniques. Many techniques are further hindered by a reliance on manual initialisation, making large-scale application impractical. In this paper, we present a public database and evaluation protocol for road extraction algorithms, and propose an improved automatic seed finding technique, based on a combination of geometric and colour features, to initialise road extraction.


I. INTRODUCTION
As the amount of aerial data being captured continues to increase, it is important to develop techniques to process this data automatically. Roads and road networks are key features in aerial imagery of built-up areas, and their automatic extraction can be valuable for tasks such as mapping, change detection and surveillance. At present, there is no standard evaluation methodology or database for evaluating road extraction techniques across a wide variety of regions, with most algorithms demonstrated on a small set (typically fewer than a dozen) of images. This fragmentation of aerial imagery datasets makes it difficult to develop, benchmark and compare algorithms. Furthermore, many existing techniques require manual initialisation, making such large-scale evaluations impractical.
This article addresses both of these limitations by: 1) introducing a public evaluation database consisting of 300 locations at 3 different resolutions, covering many different terrain types (suburban, rural, forest, river and ocean) and different weather and illumination conditions; and 2) proposing an improved road network seed detection algorithm for the automated extraction of road networks.

II. EVALUATION DATABASE AND PROTOCOL

A. Aerial imagery collection
To allow our database to be easily distributed to other researchers, we chose to collect our aerial imagery from the Australian aerial imagery company NearMap under their Community License, which allows for distribution under the Creative Commons Attribution Share Alike (CC-BY-SA) license. High resolution aerial imagery was downloaded from NearMap's servers for 300 randomly chosen locations within the greater South-East Brisbane region, bounded by the region shown in Fig. 1. By choosing these locations randomly across a wide area, the database provides a wide variety of urban, suburban, rural and bush areas, including a significant subset of tiles that do not contain any roads at all.
Each of the 300 regions chosen for the database is approximately 541 m × 541 m, the area corresponding to a single zoom 16 tile available from NearMap's aerial imagery servers. By choosing a single zoom 16 tile for each location, higher resolution imagery can easily be obtained for the same location by collecting the zoom 17 or 18 tiles that cover it, as illustrated in Fig. 2. Having three distinct zoom levels available allows road detection algorithms to be evaluated at multiple scales in the same location.
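Under the standard quadtree ("slippy map") tiling scheme commonly used by aerial imagery tile servers, each tile at zoom z is covered by exactly four tiles at zoom z + 1. The sketch below, which assumes this scheme applies to the tiles described here (the function name is our own), enumerates the higher-zoom tile indices that cover a given tile:

```python
def child_tiles(x, y, zoom, target_zoom):
    """Enumerate the (x, y) tile indices at target_zoom that cover the
    tile (x, y) at the given zoom, under standard quadtree map tiling."""
    if target_zoom < zoom:
        raise ValueError("target_zoom must be >= zoom")
    f = 2 ** (target_zoom - zoom)  # each zoom step splits every tile in four
    return [(x * f + dx, y * f + dy)
            for dy in range(f) for dx in range(f)]
```

For a zoom 16 tile this yields 4 zoom 17 tiles and 16 zoom 18 tiles, matching the three-scale structure of the database.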
NearMap imagery is also available at multiple dates for many regions in the coverage area, and while we plan to include multiple dates for each location in future datasets, at this stage we have simply chosen the latest date available for each location. However, as the entire database region was not flown in a single flight, the database as it exists at present does contain some variation due to differences in time-of-day and weather between the separate flights.

B. Reference road network
While limited road detection evaluations can be performed by inspection, large scale evaluation of road detection algorithms requires a reference road network against which the extracted road network can be compared. For our reference road network we chose to use CC-BY-SA licensed street data provided by the OpenStreetMap project. A local extract of the database region was downloaded at the time of constructing the database (May 2011) and kept locally to ensure that the reference data would not change. The CC-BY-SA license of the OpenStreetMap project allows the reference road network to be easily shared with other researchers alongside the similarly licensed NearMap imagery.
Because the OpenStreetMap database contains features that are not presently of interest to our road detection algorithms (such as footpaths, parks, buildings, etc.), we filtered the local extract to contain only objects that were likely to be paved roads suitable for automotive use. In particular, objects tagged as a highway with the values motorway(_link), trunk(_link), primary(_link), secondary(_link), tertiary, residential, unclassified, living_street, service or pedestrian were retained in a local road-only reference dataset. This road-only dataset was then converted to a 1-pixel wide skeleton image matched with each location and scale in the evaluation database, for later use in performance metric calculation.
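As a minimal illustration of this filtering step (the helper name and tag handling are our own; a real extract would be processed with an OSM toolchain), the retained highway values can be checked against a way's tag dictionary:

```python
# The highway tag values retained in the road-only reference dataset.
ROAD_HIGHWAY_VALUES = {
    "motorway", "motorway_link", "trunk", "trunk_link",
    "primary", "primary_link", "secondary", "secondary_link",
    "tertiary", "residential", "unclassified", "living_street",
    "service", "pedestrian",
}

def is_reference_road(tags):
    """Return True if an OSM way's tag dict marks it as a likely
    paved, drivable road under the filter described above."""
    return tags.get("highway") in ROAD_HIGHWAY_VALUES
```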

C. Performance metrics calculation
In order to allow for a comprehensive evaluation of road detection algorithms on our aerial imagery database, we have chosen to use the completeness, correctness and quality measures first proposed by Harvey [1], and defined as follows.
The completeness (Cp) of a road network is the percentage of the reference road network that is successfully extracted by the road detection algorithm, and can be defined as

Cp = Lmr / Lr × 100%,

where Lmr is the length of the matched reference, and Lr is the length of the reference (for a given image). If there is no reference road network, the completeness is assumed to be 100%.
The correctness (Cr) of a road network is the percentage of the extracted road network that is matched by the reference network, and can be defined as

Cr = Lme / Le × 100%,

where Lme is the length of the matched extraction, and Le is the length of the extracted road network (for a given image). If there is no extracted road network, the correctness is assumed to be 100%.

Finally, the quality (Q) of a road network measures the contribution of the matched roads to the entire extracted and reference road network (where 100% implies that the entire network is matched), and can be defined as

Q = Lme / (Le + Lr - Lmr) × 100%.

As with the correctness and completeness, if there is no reference or extracted road network in a given image, the quality is assumed to be 100%.

While the total length of an extracted (Le) and/or reference (Lr) road network can easily be calculated by counting the pixels in a skeleton network image, the calculation of the matched lengths Lme and Lmr requires matching road segments to be identified first. For our evaluations, the 'buffer method' [2], illustrated in Fig. 3, was used to match the two road networks. By using a dilation buffer around the reference or extracted road network, and intersecting it with its counterpart, the matched and unmatched portions of the network can easily be calculated at a pixel level for use in metric calculation. Dilation was performed using a line structuring element, arranged perpendicular to the road being dilated, with a length of 5, 10 or 20 pixels at zoom levels 16, 17 and 18 respectively.
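A simplified sketch of these buffer-method metrics, assuming skeleton networks represented as sets of (row, col) pixels and approximating the perpendicular line element with a square (Chebyshev) dilation, might look like the following. The quality denominator uses the standard matched-extraction-over-unmatched-total form, which is an assumption consistent with the definitions above:

```python
def dilate(pixels, radius):
    """Square dilation buffer; the paper uses a perpendicular line
    element, which this simplified sketch approximates."""
    return {(r + dr, c + dc) for (r, c) in pixels
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)}

def road_metrics(extracted, reference, radius):
    """Completeness, correctness and quality (fractions in [0, 1])."""
    if not reference and not extracted:
        return 1.0, 1.0, 1.0
    l_mr = len(reference & dilate(extracted, radius))  # matched reference
    l_me = len(extracted & dilate(reference, radius))  # matched extraction
    cp = l_mr / len(reference) if reference else 1.0
    cr = l_me / len(extracted) if extracted else 1.0
    # quality: matched extraction over extraction plus unmatched reference
    q = l_me / (len(extracted) + len(reference) - l_mr)
    return cp, cr, q
```

For example, extracting only half of a straight 10-pixel reference road yields a completeness of 0.6 with a 1-pixel buffer (the buffer credits one extra reference pixel), a correctness of 1.0, and a quality between the two.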

III. ROAD DETECTION

A. Existing approaches
Gruen and Li [3] suggest that a typical road extraction procedure can be divided into three stages: image pre-processing, road finding and road following. Pre-processing typically consists of steps such as colour conversion [5], normalisation [4] or sharpening [3].
Road finding is performed through seed detection. A seed is a point in the image that has a high likelihood of lying on a road, and these points are the starting points for growing the road network. The majority of existing systems require manual seeding [6], [3], [1], [7], which is time consuming and prevents the system from being fully automatic. Automatic seed detection techniques are proposed in [5], [8], [9]. Christophe et al. [5] and Laptev et al. [8] use line detection to locate the road edge and road body respectively. Hu et al. [9] assess the rectangularity of regions surrounding potential seed points using a 'spoke wheel' operator. These techniques [5], [8], [9] are, however, all prone to error when encountering obstacles on the road such as overhanging trees, shadows or vehicles.
Road following typically seeks to extract one or more features that can be detected continuously as the road network is grown. Baumgartner and Hinz [6] rely on colour, using a colour profile obtained during manual initialisation to predict and grow the road network. Laptev et al. [8] use a ribbon snake to combine intensity and texture to follow the road, while Christophe et al. [5] rely on gradient information. Hu et al. [9] grow the road network through repeated application of the spoke wheel operator (see Fig. 4) to obtain a road footprint. The spoke wheel is iteratively applied at the peaks of previously extracted footprints, until the final extracted footprint yields no more children.

B. Proposed road detection algorithm
In this paper we propose a modified version of the algorithm presented by Hu et al. [9]. The proposed approach first performs automatic seed detection, followed by road extraction. The extracted road network can then be converted to a vector format for comparison with the ground truth. Like [9], the proposed approach relies on the detection of footprints to extract the road network. The footprint of a pixel describes the geometric characteristics of the local area, such as its rectangularity and orientation, and is determined using a spoke wheel operator [9]. As shown in Fig. 4, a road footprint is extracted by creating a spoke wheel, W, with N spokes of radius M, centred at point P, denoted W(P, N, M). For the system presented here, N = 64 and an on-ground spoke length of 10 metres were chosen.
The intersection of the spoke wheel with the edge of the road network, Ci, is determined for each spoke i as the first point x, moving outward from the centre, that meets the requirement

|I(x) - I(P)| > k · σ(W(P, N, M)),

where I(x) is the intensity of pixel x, σ(W(P, N, M)) is the standard deviation of the intensity of all pixels on wheel W, and k is used to tune the threshold. A larger value of k allows more flexibility in ignoring obstacles in the road network, but introduces the risk of including off-road areas. For the system presented in this paper, k was determined empirically to be 0.5 on a small subset of evaluation images. If no intersection is found for a particular spoke, then Ci is set to the end of the spoke. The final footprint is formed by joining all individual Ci points, as shown in Fig. 4(b).
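A rough sketch of this spoke intersection test follows. It assumes a threshold of the form |I(x) - I(P)| > k · σ over the wheel pixels (a reconstruction, not necessarily the authors' exact formulation) and a simple rounded-ray rasterisation:

```python
import math

def spoke_cut(image, p, n_spokes, radius, k):
    """For each of n_spokes rays from centre p, return the first pixel
    whose intensity differs from the centre by more than k * sigma,
    where sigma is the intensity std. dev. over all wheel pixels.
    If no such pixel exists, the spoke end is returned. Sketch only."""
    pr, pc = p
    rays = []
    for i in range(n_spokes):
        theta = 2 * math.pi * i / n_spokes
        ray = [(round(pr + t * math.sin(theta)),
                round(pc + t * math.cos(theta)))
               for t in range(1, radius + 1)]
        rays.append(ray)
    # estimate sigma from every pixel on the wheel
    wheel = [image[r][c] for ray in rays for (r, c) in ray]
    mean = sum(wheel) / len(wheel)
    sigma = (sum((v - mean) ** 2 for v in wheel) / len(wheel)) ** 0.5
    centre = image[pr][pc]
    cuts = []
    for ray in rays:
        cut = ray[-1]  # default: end of spoke if no edge is found
        for (r, c) in ray:
            if abs(image[r][c] - centre) > k * sigma:
                cut = (r, c)
                break
        cuts.append(cut)
    return cuts
```

On a synthetic image with a bright horizontal road band, horizontal spokes run to their full length while vertical spokes are cut at the road edge, giving the rectangular footprint shape described above.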
The proposed approach performs automatic seed detection to initialise the road extraction process. Ideally, seed points should be guaranteed to lie on a road, so that the road network can be grown from them. Conversely, road segments that do not contain a seed point may not be detected; therefore each road segment should also contain at least one seed point.
We extend the seed detection process of [9], which was based purely on a footprint rectangularity measure, by incorporating additional saturation and network expansion constraints. Firstly, random points within the saturation channel of the HSI aerial image are chosen as possible seed locations. These points are filtered through three processes: a saturation threshold, a footprint rectangularity test, and a network expansion test. The saturation threshold removes seeds whose saturation values indicate they are likely to be off the road (see Fig. 5(b)). Similarly to [9], the rectangularity test uses the minimal oriented bounding box [10] to eliminate footprints with a non-rectangular shape. An example of the remaining seeds after the saturation and rectangularity tests is shown in Fig. 5(c).
The remaining seed points are tested for network expansion through a potential test and a mean stretch distance (MSD) test. The potential test measures how many generations a seed footprint can grow a road network; all seeds with a potential of less than 4 are discarded. Once a seed has shown that it can grow for at least 4 generations, the MSD is calculated as the mean distance of each end point from the original seed, and any seed with an on-ground MSD of less than 20 metres is rejected. An example of the remaining seeds after the network expansion tests is shown in Fig. 5(d).
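The two expansion checks could be sketched as follows, where m_per_px (an assumed ground resolution) and the helper name are illustrative, and the caller is assumed to have already grown the candidate mini-network to obtain its generation count and end points:

```python
def passes_expansion_tests(seed, end_points, generations,
                           min_generations=4, min_msd_m=20.0,
                           m_per_px=0.15):
    """Apply the two network-expansion checks described above: the seed
    must grow for at least min_generations, and the mean distance of the
    grown end points from the seed (MSD), converted to metres via the
    assumed m_per_px resolution, must reach min_msd_m."""
    if generations < min_generations:
        return False
    sr, sc = seed
    msd_px = sum(((r - sr) ** 2 + (c - sc) ** 2) ** 0.5
                 for (r, c) in end_points) / len(end_points)
    return msd_px * m_per_px >= min_msd_m
```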
The final set of seed points is then used to grow the road network, using the footprint to propagate the network. Following [9], the peaks of the footprint indicate the directions of the road, so the algorithm finds the footprint peaks and uses these points as the centres of the next footprints. This process continues until there are no peaks in the footprint, or the next footprint overlaps an existing footprint. For example, Fig. 4(c) shows the points v1, v2 and v3 as the peak vertices of the footprint.
Once the road network has finished growing, a minimal pruning process is conducted by removing all footprints that have a mean saturation value below 50. This approach helped to remove many false detections, such as those within buildings, but more sophisticated pruning techniques will be investigated in future research.
The remaining road footprints are then combined to form a binary image indicating the presence or absence of the road network, as shown in Fig. 5(e). A sequence of morphological opening and closing operations was then used to remove unwanted spikes and re-connect broken and disconnected segments, followed by a skeletonisation operation to obtain a rough 1-pixel wide road network, as shown in Fig. 5(f). Finally, this network was simplified using the Douglas-Peucker algorithm [11], and stored for later comparison with the reference network.
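The Douglas-Peucker simplification used in this final step can be sketched as below (a minimal recursive version operating on a single polyline; the full system would apply it per segment of the vectorised network):

```python
def douglas_peucker(points, epsilon):
    """Simplify a polyline of (x, y) points: recursively keep the point
    farthest from the chord between the endpoints while that distance
    exceeds epsilon, otherwise collapse the run to its endpoints."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    # find the interior point with the largest perpendicular distance
    best_i, best_d = 0, -1.0
    for i, (x, y) in enumerate(points[1:-1], start=1):
        d = abs(dy * (x - x1) - dx * (y - y1)) / norm
        if d > best_d:
            best_i, best_d = i, d
    if best_d <= epsilon:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:best_i + 1], epsilon)
    right = douglas_peucker(points[best_i:], epsilon)
    return left[:-1] + right  # drop the shared split point once
```

A collinear run collapses to its two endpoints, while a genuine corner (farther than epsilon from the chord) is retained.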

IV. EVALUATION RESULTS AND DISCUSSION
Evaluating the proposed system against the evaluation protocol outlined in Section II produced extracted street networks at three different zoom levels for each of the 300 evaluation locations.
The completeness (Cp), correctness (Cr) and quality (Q) for each location and zoom-level were then calculated, and evaluation images for each location were created showing a colour-coded road network over the aerial imagery indicating which sections correspond to matched extraction, false extraction and missed reference.
A summary of the quality metric across every location in the database at zoom 18 is shown in Fig. 6. Looking at the locations with Q > 20% (orange and green), it is clear that the proposed algorithm works best in the urban areas of the map, with few red locations (Q < 20%) in these areas. While there are some green locations (Q > 60%) outside of the urban areas, these typically do not contain any reference or extracted roads (and therefore receive 100% quality).
A summary of the road detection performance at each zoom level over the entire database is shown in Table I, and example evaluation images of two urban regions at each of the zoom levels are shown in Fig. 7. These results show that, as would be expected, better performance is obtained as the image resolution increases. It can be clearly seen in Fig. 9 that the proposed road detection algorithm falsely detects similar linear features as roads in (a) and (b), and has problems detecting roads under tree cover in (c) and (d). Fig. 9(e) is of particular interest, as it shows roads that are actually present on the ground but are not reflected in the ground truth; similar comments can be made about the car park lanes in (f), although the large amount of false detection on the buildings remains a considerable problem there.

V. CONCLUSION AND FUTURE WORK
In this article, we have presented a database and evaluation methodology for evaluating road extraction from aerial imagery, using publicly available aerial imagery and road network data. We have used this database to evaluate our proposed road extraction system, based on that of Hu et al. [9]. We introduced additional constraints in the automatic selection of seeds (colour, and additional geometric constraints) to improve the quality of the seeds, and thus the overall quality of the extracted road network. Future work will focus on using the proposed database to apply machine learning methods to the task of road extraction, to automatically learn model parameters and improve road detection performance.
Researchers interested in obtaining a copy of this aerial imagery collection to compare performance should contact the final author at the email address provided.

Figure 1. Aerial imagery was collected at 300 randomly selected locations (shown as purple squares) across the greater south-east Brisbane region. [Image CC-BY-SA OpenStreetMap Contributors and QUT]

Figure 3. Calculation of the length of matched extraction network (Lme) and the length of matched reference network (Lmr) is performed by dilating the counterpart network and taking the intersection.

Figure 4. An example road footprint is found by (a) laying a spoke operator over the road network, (b) taking the intersection of the spokes with the road edges, and (c) finding the maximum-distance points for continuing the road network.

Figure 5. An overview of the proposed road detection algorithm on an example location. [Images CC-BY-SA NearMap and QUT]

Figure 6. An overview of the quality (Q) scores of the proposed system across all locations at zoom 18. Labels indicate example images used in the remainder of this paper, in order of appearance. [Key: red: Q < 20%; orange: 20% ≤ Q ≤ 60%; green: Q > 60%. Image CC-BY-SA OpenStreetMap Contributors and QUT]

Figure 7. Road detection performance at locations A and B at zoom levels 16, 17 and 18. [Key: matched extraction, false extraction, missed reference. All images CC-BY-SA NearMap and QUT]

Table I. Average system performance over the proposed database for three different zoom levels.