Bing Maps is releasing mined roads around the world. We have detected 54.2M km of roads worldwide. The roads are mined from Bing Maps imagery, including imagery from Maxar and Airbus. The data is freely available for download and use under the Open Data Commons Open Database License (ODbL). We plan to open-source both the NN model and the geometry generation code in the first half of 2025.
Region | Length (thousand km) | File size (MB) |
---|---|---|
Australia and Oceania | 2314.7 | 383 |
Caribbean | 243.7 | 76 |
Central America | 1538.3 | 427 |
Central Asia | 1204 | 309 |
Eastern Africa | 1668.8 | 360 |
Eastern Asia | 153.9 | 48 |
Eastern Europe | 4601.4 | 1382 |
Middle Africa | 513.8 | 112 |
Northern Africa | 1387.2 | 388 |
Northern America | 12990.6 | 3865 |
Northern Europe | 2380 | 985 |
South America | 5694.7 | 1245 |
Southeastern Asia | 2777 | 680 |
Southern Africa | 1217.9 | 241 |
Southern Asia | 5676.3 | 1467 |
Southern Europe | 2727.7 | 972 |
Western Africa | 1130.3 | 306 |
Western Asia | 2444.4 | 756 |
Western Europe | 3560.5 | 1410 |
World | 54225.2 | 16564 |
Each file contains all roads from one geographical region. Each row in a file holds an Alpha-3 code and the GeoJSON of a road, separated by a TAB (\t). The Alpha-3 code identifies the region where the road approximately lies. Each GeoJSON also contains the property "WidthMeters", the approximate width of the road in meters.
The world is divided into subregions, based on the United Nations geoscheme, for better usability.
Alpha-3 codes are taken from the IBAN and Wikipedia pages. Also refer to the AlphaCodeToRegionName.tsv file (some smaller regions or disputed areas might have ambiguous codes).
GeoJSON is a format for encoding a variety of geographic data structures. For in-depth documentation and tutorials, refer to the GeoJson Blog.
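As a rough illustration of the row format described above, here is a minimal Python sketch that splits one row into its Alpha-3 code and GeoJSON and reads the "WidthMeters" property. The sample row is hand-written for illustration and is not real data. The exact GeoJSON layout (a `Feature` with a `LineString` geometry and a `properties` member) is an assumption, so check it against an actual file before relying on it. The function name `parse_road_row` is hypothetical.

```python
import json


def parse_road_row(line):
    """Split one TSV row into (alpha3_code, geojson_dict)."""
    # Split on the first TAB only, so the GeoJSON text stays in one piece.
    code, geojson_text = line.rstrip("\n").split("\t", 1)
    return code, json.loads(geojson_text)


# Hypothetical sample row (assumed layout, not taken from the dataset).
sample_row = (
    'USA\t{"type": "Feature", '
    '"geometry": {"type": "LineString", '
    '"coordinates": [[-122.1, 47.6], [-122.2, 47.7]]}, '
    '"properties": {"WidthMeters": 6.0}}'
)

code, road = parse_road_row(sample_row)
# .get() returns None instead of raising if the property is missing.
width = road["properties"].get("WidthMeters")
num_points = len(road["geometry"]["coordinates"])
print(code, width, num_points)
```

For a full region file, the same function can be applied line by line while iterating over the open file, which avoids loading the whole file into memory.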
The road extraction is done in two major stages:
- Semantic Segmentation – Recognizing road pixels in the aerial image using a Convolutional Neural Network (CNN).
- Geometry Generation – A series of algorithms and processes that transform the output of semantic segmentation into roads in geometry format:
  - Image postprocessing
  - Thinning
  - Connectivity improvement
  - Graph construction
  - Finalizing road shapes and network quality
  - Stitching road GeoJSONs between neighboring images where needed
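To give a feel for the graph construction step, the following is a toy sketch in plain Python, not the code used in this project. It assumes thinning has already produced a one-pixel-wide skeleton, represented here as a set of `(row, col)` pixels. Pixels with one neighbor are treated as endpoints, pixels with three or more as junctions, and edges are traced along the pixels in between. All names (`neighbors`, `classify_nodes`, `trace_edges`) are hypothetical.

```python
def neighbors(pixel, skeleton):
    """Return the 8-connected neighbors of `pixel` that belong to `skeleton`."""
    r, c = pixel
    return [
        (r + dr, c + dc)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0) and (r + dr, c + dc) in skeleton
    ]


def classify_nodes(skeleton):
    """Endpoints have exactly one neighbor; junctions have three or more."""
    endpoints = sorted(p for p in skeleton if len(neighbors(p, skeleton)) == 1)
    junctions = sorted(p for p in skeleton if len(neighbors(p, skeleton)) >= 3)
    return endpoints, junctions


def trace_edges(skeleton):
    """Connect nodes by walking along the two-neighbor pixels between them."""
    endpoints, junctions = classify_nodes(skeleton)
    nodes = set(endpoints) | set(junctions)
    edges = set()
    for start in nodes:
        for first in neighbors(start, skeleton):
            prev, cur = start, first
            # Every non-node pixel reached here has exactly two neighbors,
            # so there is exactly one way to continue.
            while cur not in nodes:
                nxt = [n for n in neighbors(cur, skeleton) if n != prev]
                prev, cur = cur, nxt[0]
            edges.add(tuple(sorted((start, cur))))
    return sorted(edges)


# Toy Y-shaped skeleton: three arms meeting at pixel (2, 2).
skeleton = {(0, 0), (1, 1), (2, 2), (1, 3), (0, 4), (3, 2), (4, 2)}
endpoints, junctions = classify_nodes(skeleton)
edges = trace_edges(skeleton)
print(endpoints, junctions, edges)
```

Real skeletons are messier than this toy case. With 8-connectivity, for example, a T-shaped crossing yields a small cluster of junction pixels that would need to be merged into a single node.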
Our network is based on U-Net and ResNet, following these papers: [U-Net](https://arxiv.org/abs/1505.04597), [ResNet](https://arxiv.org/pdf/1512.03385.pdf), [Res U-Net](https://arxiv.org/pdf/1711.10684.pdf). The model was trained on 512x512 images. It is fully convolutional, which allows images of any size divisible by 64 to be processed by the model (constrained by GPU memory; 1088x1088 in our case). The training set consists of 20000 labeled images. The majority of the satellite images cover diverse areas all around the world. To achieve good representation, we have enriched the set with samples from various areas covering mountains, glaciers, forests, deserts, beaches, coasts, etc. Images in the set are 1088x1088 pixels in size with 100 cm/pixel resolution. The training is done with the Keras toolkit.
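The divisibility constraint can be checked with a few lines of arithmetic. The sketch below computes the smallest size divisible by 64 that fits a given image, and how much padding that would take. Padding is only one possible way to satisfy the constraint and is an assumption here; the source does not say how inputs of other sizes are handled. The function names are hypothetical.

```python
def padded_size(size, multiple=64):
    """Smallest value >= size that is divisible by `multiple`."""
    # Ceiling division written with integer arithmetic only.
    return -(-size // multiple) * multiple


def padding_needed(height, width, multiple=64):
    """Extra rows and columns required to reach a valid input size."""
    return (
        padded_size(height, multiple) - height,
        padded_size(width, multiple) - width,
    )


# 512 and 1088 are already divisible by 64, so they need no padding.
print(padded_size(512), padded_size(1088))
# A 1000x1088 image would need 24 extra rows and no extra columns.
print(padding_needed(1000, 1088))
```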
We measure intermediate-stage metrics to track the performance of our models. The pixel metric measures the performance of the Convolutional Neural Network, and the APLS (Average Path Length Similarity) metric measures overall connectivity after the geometry generation stage.
Metric | Precision | Recall |
---|---|---|
Pixel | 85.24% | 82.81% |
APLS | 87.53% | 79.33% |
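For readers unfamiliar with the pixel metric, here is a minimal sketch of pixel-level precision and recall on binary road masks, using the standard definitions. Whether the evaluation behind the numbers above applies any tolerance or buffering around road pixels is not stated, so this plain per-pixel comparison is an assumption. The masks are toy data, and APLS is not covered.

```python
def pixel_precision_recall(predicted, truth):
    """Compare two binary masks (lists of rows of 0/1) pixel by pixel."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # road predicted, road present
            elif p and not t:
                fp += 1  # road predicted, no road present
            elif t and not p:
                fn += 1  # road present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy masks: a vertical road in the middle column.
truth = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
predicted = [[1, 1, 0], [0, 1, 0], [0, 1, 0], [0, 0, 0]]

precision, recall = pixel_precision_recall(predicted, truth)
print(precision, recall)
```

In this toy case three road pixels are found, one is missed and one is a false alarm, so both precision and recall come out at 3/4.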
The vintage of the roads depends on the vintage of the underlying imagery. Because Bing Imagery is a composite of multiple sources, it is difficult to know the exact dates for individual pieces of data. However, the data is up to date with the freshest available imagery from Microsoft Maps.
The result of the pipeline, after going through conflation, cutting, filtering and quality control, reached 95% precision and was pushed into Microsoft Maps production.
Microsoft has a continued interest in supporting a thriving OpenStreetMap (OSM) ecosystem.
We will open-source both the NN model and the geometry generation code in 2025.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.
Privacy information can be found at https://privacy.microsoft.com/en-us/
Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.