Georeferencer: Crowdsourced Georeferencing for Map Library Collections
Georeferencing of historical maps offers a number of important advantages for libraries: improved retrieval and user interfaces, better understanding of maps, and comparison/overlay with other maps and spatial data. Until recently, georeferencing has involved various relatively time-consuming and costly processes using conventional geographic information system software, and has been infrequently employed by map libraries. The Georeferencer application is a collaborative online project allowing crowdsourced georeferencing of map images. It builds upon a number of related technologies that use existing zoomable images from library web servers. Following a brief review of other approaches and georeferencing software, we describe Georeferencer through its five separate implementations to date: the Moravian Library (Brno), the Nationaal Archief (The Hague), the National Library of Scotland (Edinburgh), the British Library (London), and the Institut Cartografic de Catalunya (Barcelona). The key success factors behind crowdsourcing georeferencing are presented. We then describe future developments and improvements to the Georeferencer technology.
Georeferencing of historical map images involves assigning spatial information so that they align with real world geography. The essential process typically consists of adding control points to an historic map that have a real-world location; once there are sufficient control points, the historic map can be transformed so that it correctly aligns with geographic reality. The choice of coordinate system, the type of transformation method, and the method of resampling pixels in the image will all affect the end result.
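As a minimal sketch of the control-point process described above, the following fits a first-order (affine) transformation from pixel coordinates to real-world coordinates by least squares. It is an illustration only, not the code used by any application discussed in this paper; the function names `fit_affine` and `apply_affine` are hypothetical, and resampling of the image pixels is not covered.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve3(m: List[List[float]], v: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or degenerate")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [p - factor * q for p, q in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(pixels: Sequence[Point], world: Sequence[Point]):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f.

    pixels: control points on the scanned map (column, row)
    world:  the matching real-world coordinates (e.g. longitude, latitude)
    """
    if len(pixels) != len(world) or len(pixels) < 3:
        raise ValueError("need at least three matched control points")
    rows = [(x, y, 1.0) for x, y in pixels]
    # Normal equations: (A^T A) p = A^T b, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * w[0] for r, w in zip(rows, world)) for i in range(3)]
    aty = [sum(r[i] * w[1] for r, w in zip(rows, world)) for i in range(3)]
    return solve3(ata, atx), solve3(ata, aty)


def apply_affine(coeffs, pt: Point) -> Point:
    """Transform one pixel position into real-world coordinates."""
    (a, b, c), (d, e, f) = coeffs
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)
```

With more than three control points the fit averages out small placement errors, which is one reason additional points generally improve the result.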
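Georeferencing historic maps results in a number of important advantages for libraries: improved retrieval and user interfaces, better understanding of maps, and comparison and overlay with other maps and spatial data.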
Georeferencing has always involved resources of time and/or money. Whilst the simplest cropping and transcription of four corner points of a map in a standard projection can take less than a minute, it is often necessary to georeference hundreds or even thousands of maps in a series in this fashion to create a seamless mosaic. For very accurate warping of a map without a standard projection and accurate geometry, potentially hundreds of control points are needed for one map, and the work can take many hours. Map libraries have often found it difficult to fund even the scanning and online delivery of maps, so georeferencing has been a further cost in delivering maps online.
Other approaches and software
Most georeferencing work today is undertaken on desktop computers using a number of standard Geographic Information System programs, including commercial software such as ArcGIS, MapInfo, and GlobalMapper, and open-source software such as QGIS, GRASS, GDAL, MapRectifier or MapWinGIS Image Georeferencer. Bespoke applications have sometimes been developed to provide more efficient georeferencing workflows, a good example being the QUAD-G program developed at the University of Wisconsin-Madison. QUAD-G uses the open-source FWTools software to allow automatic cropping and georeferencing of United States Geological Survey topographic maps.
Georeferencing in an online environment has had a shorter history with fewer applications. Metacarta's MapRectifier was one of the earliest web-based georeferencing applications, developed from 2006, allowing maps to be uploaded to their server, georeferenced, and the resulting image made available as a Web Map Service. MapRectifier inspired the MapWarper software, developed by Tim Waters in 2009 and designed for use with OpenStreetMap mapping. Whilst both MapRectifier and MapWarper were open-source applications, both relied upon map images being uploaded to a server, where their owners had no ongoing control or rights over them.
Georeferencing applications for libraries have specifically addressed this issue, so that libraries could control their own map images, metadata and georeferenced outputs. In 2010-2011, MapWarper was further developed to form New York Public Library's Map Warper (Fig. 1) and Harvard University's WorldMap WARP. These implementations of MapWarper share a set of standard functions with the Georeferencer application described in this paper: registration to control access, assigning of control points, rectification, visualisation of georeferenced maps, viewing map metadata, and export of georeferenced imagery in various standard formats including KML, WMS, WMTS and GeoTIFF. Other features, such as cropping neat lines to mask map sheet margins, WMS delivery, and presentation as an overlay in Google Maps using the open-source OpenLayers software, are available in some but not all implementations of MapWarper and Georeferencer. Most recently, a vector tracing tool was added to NYPL's Map Rectifier amongst the functions for crowdsourcing, allowing buildings and other features to be captured as vector layers with individual attributes.
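To make the WMS export option more concrete, the sketch below assembles a standard WMS 1.1.1 GetMap request URL for a georeferenced map layer. The request parameters are those defined by the WMS specification; the endpoint and layer name used in the usage note are placeholders, not addresses of any service described in this paper.

```python
from typing import Sequence
from urllib.parse import urlencode


def wms_getmap_url(
    base_url: str,
    layer: str,
    bbox: Sequence[float],
    width: int,
    height: int,
    srs: str = "EPSG:4326",
    image_format: str = "image/png",
) -> str:
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (min_x, min_y, max_x, max_y) in the units of `srs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
        "TRANSPARENT": "TRUE",  # lets the map be overlaid on a backdrop
    }
    return base_url + "?" + urlencode(params)
```

For example, `wms_getmap_url("https://example.org/wms", "hypothetical_sheet", (-3.5, 55.8, -3.0, 56.1), 1024, 614)` would request a 1024 by 614 pixel image of the given extent from a hypothetical server.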
Figure 1: New York Public Library's Map Warper interface, showing rectified map overlaid on an OpenStreetMap backdrop
Another relevant and successful project is Geospatial.org's eHarta project, organised and run by volunteers, and focusing on historical series maps of Romania (Crăciunescu, et al., 2011). Work began in 2010 and in its initial phase, some 1,800 map sheets of a single map series were processed in less than two days. A series of tools were developed using completely open-source software, including ExtJS, GeoExt, Zoomify, PHP, MySQL, GeoServer and OpenLayers. The tools are accompanied by detailed online instructions for recording map sheet-level metadata, selecting maps, cropping along map neat lines (Fig. 2), assigning four control points to the corners of the map, and automatically georeferencing the images based on shapefiles of the relevant map series. Within a few days, the images are processed into GeoServer and then made available as seamless mosaics on the Geospatial.org website. The project employed a number of important features for successful crowdsourcing: a friendly web interface, good feedback channels and a mailing list for volunteers, public recognition of individuals' efforts, as well as open data licences for the final georeferenced images. The project won the "Better Data Award" at the "Open Data Challenge", awarded at the European Commission's Digital Agenda Assembly on 16-17 June 2011.
Figure 2: Placing a control point in the eHarta interface.
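The four-corner approach used for series maps can be sketched as follows. Assuming an image that has already been cropped to the neat line, is north-up, and whose sheet extent is known (for example from a sheet index), the six parameters of an ESRI-style world file follow directly from the image size and the extent. This is an illustrative simplification and not the eHarta tooling itself; the function name is hypothetical.

```python
from typing import List, Tuple

Extent = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def world_file_params(width_px: int, height_px: int, extent: Extent) -> List[float]:
    """Return world-file values in file order: A, D, B, E, C, F.

    A, E: pixel size in x and y (E is negative because rows run downwards)
    D, B: rotation terms, zero for a north-up image
    C, F: map coordinates of the centre of the upper-left pixel
    """
    if width_px <= 0 or height_px <= 0:
        raise ValueError("image dimensions must be positive")
    min_x, min_y, max_x, max_y = extent
    a = (max_x - min_x) / width_px
    e = -(max_y - min_y) / height_px
    c = min_x + a / 2.0
    f = max_y + e / 2.0
    return [a, 0.0, 0.0, e, c, f]


def world_file_text(params: List[float]) -> str:
    """Format the six values one per line, as a world file expects."""
    return "\n".join(f"{value:.10f}" for value in params) + "\n"
```

Because only the sheet extent and image size are needed, this case takes very little time per sheet, which is consistent with the rapid processing of large map series described above.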
The Georeferencer application is a collaborative online project, developed by Petr Přidal and the Moravian Library in Brno as part of the OldMapsOnline and TEMAP projects, allowing free, crowdsourced collaborative online georeferencing of map images from a number of libraries. Georeferencer is an online service based upon the open-source GDAL, Proj4 and MapServer applications, and it has allowed a cheaper, open, and collaborative way of georeferencing maps compared to in-house library georeferencing. It shares several of the advantages of MapWarper and the New York Public Library's Map Rectifier, but unlike MapWarper, it does not rely upon images being uploaded to a website, instead making use of zoomable images on existing library web servers. It is therefore easier to apply as there is no new local software installation and maintenance required.
For the user, registration, functionality and design have all improved over time and we describe here the present state of the technology in 2012. In 2011, the use of Google, Facebook, and Twitter accounts was added for authentication, and by upgrading to the latest version of OpenLayers in 2012, basic support was also added for mobile devices. The main georeferencing window presents the historic map to be georeferenced in the left-hand window and out-of-copyright georeferenced maps in the right-hand window (Fig. 3).
Figure 3: Showing the Georeferencer's Georeference window with the historic map to be georeferenced on the left,
The default right-hand window mapping is OpenStreetMap (with map tiles from Cloudmade and MapQuest) but other layers can be selected, including Ordnance Survey OpenData and the NLS Historical Maps API mapping for the United Kingdom, as well as Google maps, satellite and terrain layers globally. The use of out-of-copyright or open mapping avoids any issues of proprietary rights in the control points by third parties or of licensing restrictions such as for the Google GeoCoding API. For locating the modern map, the gazetteer originally used GeoNames available reliably via EDINA Unlock, but more recently the OpenStreetMap Nominatim gazetteer has been used. These gazetteers allow free, modern placename queries without licensing restrictions. Once georeferenced, the maps can be visualised using the Google Earth browser plugin as a georeferenced overlay with a transparency slider (Fig. 4).
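A gazetteer lookup of the kind described above can be sketched against the public OpenStreetMap Nominatim search endpoint. The sketch only builds the request URL and parses a JSON response, leaving the HTTP call itself to the reader; it is not Georeferencer's own code, and the function names are hypothetical. Nominatim returns `lat` and `lon` as strings, hence the conversion.

```python
import json
from typing import Optional, Tuple
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(placename: str, limit: int = 1) -> str:
    """Build a Nominatim free-text search URL that returns JSON."""
    query = {"q": placename, "format": "json", "limit": limit}
    return NOMINATIM_SEARCH + "?" + urlencode(query)


def first_hit_lonlat(response_text: str) -> Optional[Tuple[float, float]]:
    """Return (longitude, latitude) of the first result, or None if empty."""
    hits = json.loads(response_text)
    if not hits:
        return None
    return (float(hits[0]["lon"]), float(hits[0]["lat"]))
```

The returned position would then be used to centre the modern reference map near the area shown on the historic map, so that the user can begin placing control points.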
Figure 4: The Georeferencer's Visualize window, showing the georeferenced map overlaid on a Google Earth backdrop, with adjustable transparency.
It is also possible to analyse the geometric accuracy of the georeferenced map using the online version of the MapAnalyst software. The control points are used to construct distortion grids, vectors of displacement, accuracy circles, and isolines of local scale and rotation. These visualizations can help to identify wrongly assigned control points. As a by-product, MapAnalyst also computes the historical map's scale, its rotation and statistical indicators relating to the georeferencing transformation.
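The kind of by-product statistics mentioned above can be illustrated with a four-parameter (Helmert, or similarity) fit between the control points on the old map and their modern positions. This is a simplified sketch and not MapAnalyst's implementation: it assumes both coordinate sets are planar with the same axis orientation (image rows flipped beforehand if necessary), and it reports a scale factor between the two coordinate systems rather than a cartographic scale such as 1:50,000.

```python
import math
from typing import NamedTuple, Sequence, Tuple

Point = Tuple[float, float]


class HelmertFit(NamedTuple):
    scale: float         # ratio of target units to source units
    rotation_deg: float  # counter-clockwise rotation, in degrees
    tx: float            # translation in x
    ty: float            # translation in y
    rmse: float          # root-mean-square residual, in target units


def fit_helmert(src: Sequence[Point], dst: Sequence[Point]) -> HelmertFit:
    """Least-squares fit of X = a*x - b*y + tx, Y = b*x + a*y + ty."""
    n = len(src)
    if n != len(dst) or n < 2:
        raise ValueError("need at least two matched control points")
    mx = sum(p[0] for p in src) / n
    my = sum(p[1] for p in src) / n
    mX = sum(p[0] for p in dst) / n
    mY = sum(p[1] for p in dst) / n
    denom = num_a = num_b = 0.0
    for (x, y), (X, Y) in zip(src, dst):
        dx, dy, dX, dY = x - mx, y - my, X - mX, Y - mY
        denom += dx * dx + dy * dy
        num_a += dx * dX + dy * dY
        num_b += dx * dY - dy * dX
    if denom == 0.0:
        raise ValueError("source control points coincide")
    a, b = num_a / denom, num_b / denom
    tx = mX - (a * mx - b * my)
    ty = mY - (b * mx + a * my)
    # Residuals show how far each point lies from the fitted position.
    sq = 0.0
    for (x, y), (X, Y) in zip(src, dst):
        ex = a * x - b * y + tx - X
        ey = b * x + a * y + ty - Y
        sq += ex * ex + ey * ey
    return HelmertFit(math.hypot(a, b), math.degrees(math.atan2(b, a)),
                      tx, ty, math.sqrt(sq / n))
```

A control point with an unusually large residual relative to the overall root-mean-square error is a candidate for having been wrongly assigned.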
Georeferencer captures a history of all editing operations and tracks each modification to individual users, similar to how Wikipedia works with encyclopaedic text. It is therefore possible to preview, use, or restore a status of editing at any given point in time; revert malicious operations; or generate multiple versions of the rectified maps applying different coordinate transformations (affine, second-order polynomial or Thin Plate Spline), which can be accessed directly via WMS/WMTS or as a GeoTIFF file.
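The wiki-style editing history described above can be modelled, in a much simplified form, as an append-only list of revisions in which a revert is itself recorded as a new revision. The class and field names below are hypothetical and do not reflect Georeferencer's internal data model.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple

# (pixel x, pixel y, longitude, latitude)
ControlPoint = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Revision:
    number: int
    user: str
    points: Tuple[ControlPoint, ...]
    comment: str = ""


@dataclass
class MapHistory:
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, user: str, points: Iterable[ControlPoint],
               comment: str = "") -> Revision:
        """Record a new state of the control points, attributed to a user."""
        rev = Revision(len(self.revisions) + 1, user, tuple(points), comment)
        self.revisions.append(rev)
        return rev

    def at(self, number: int) -> Tuple[ControlPoint, ...]:
        """Return the control points as they stood at a given revision."""
        if not 1 <= number <= len(self.revisions):
            raise IndexError(f"no revision {number}")
        return self.revisions[number - 1].points

    def current(self) -> Tuple[ControlPoint, ...]:
        return self.revisions[-1].points if self.revisions else ()

    def revert_to(self, number: int, user: str) -> Revision:
        """Restore an earlier state without deleting the later history."""
        return self.commit(user, self.at(number),
                           f"revert to revision {number}")
```

Because nothing is ever deleted, any earlier state can be previewed or rectified again with a different transformation, and a malicious edit remains visible in the log after it has been undone.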
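Case studies
The Georeferencer development started in 2008 (Přidal & Zabicka, 2008) and the system has thus far been deployed by five holding institutions: the Moravian Library (Brno), the Nationaal Archief (The Hague), the National Library of Scotland (NLS, Edinburgh), the British Library (BL, London), and the Institut Cartografic de Catalunya (ICC, Barcelona).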
As a result of the lessons learned from the three initial pilots, and from other documented crowdsourcing projects and analyses, the Georeferencer software was substantially upgraded in 2011. When this upgraded version of Georeferencer was deployed at NLS in 2012, uptake was greater, and within four months a further 200 maps had been georeferenced.
The resulting crowdsourced spatial metadata is produced in a format ready to be integrated into library catalogs. Metadata from some of the participating institutions is now included in the Old Maps Online search engine, which demonstrates the real value of enriching map images with spatial coordinate metadata. Old Maps Online is a gateway to historical maps, held by numerous cartographic collections, allowing searching of online maps via both textual (using online gazetteers) and graphic (map) geographical interfaces, and can further be narrowed by date. The search results provide a direct link to the map image on the website of the host institution (Southall & Přidal, 2012).
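The value of spatial coordinate metadata for retrieval can be illustrated with a small sketch: once each map has a bounding box derived from its georeferenced corners, a search by area and date reduces to a rectangle-intersection test. This is not the Old Maps Online implementation; the record structure is hypothetical, and maps crossing the 180° meridian are ignored for simplicity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)


@dataclass(frozen=True)
class MapRecord:
    title: str
    year: int
    bbox: BBox


def bbox_from_corners(corners: Sequence[Tuple[float, float]]) -> BBox:
    """Bounding box of the georeferenced map corners, given as (lon, lat)."""
    lons = [c[0] for c in corners]
    lats = [c[1] for c in corners]
    return (min(lons), min(lats), max(lons), max(lats))


def intersects(a: BBox, b: BBox) -> bool:
    """True if two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records: Iterable[MapRecord], window: BBox,
           year_from: int, year_to: int) -> List[MapRecord]:
    """Maps whose extent meets the search window within a date range."""
    return [r for r in records
            if intersects(r.bbox, window) and year_from <= r.year <= year_to]
```

A catalogue record enriched in this way can be found from a map-based search interface even when its title contains no placename that a text search would match.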
Results of crowdsourcing
Amongst wider crowdsourcing efforts, georeferencing may be considered a "contributory" project, whereby professionals design a project which the public is then asked to contribute to, consisting of "correction and transcription" tasks (Oomen & Aroyo, 2011). In general terms, participants create metadata (spatial coordinates) to accompany the output of the digitisation process (scanned maps) using the tools provided. While online georeferencing projects have been undertaken using internal volunteer groups explicitly organized for the task, participation is most usually gathered by a public open call, where any interested party can contribute; the tools have therefore been designed for a general Internet audience. Crowdsourcing geospatial metadata for historic maps is being accomplished by several other projects in addition to Georeferencer, most prominently New York Public Library's Map Rectifier (Waters, 2010) and Geospatial.org's eHarta project, both discussed above.
The most significant developments in the 2011 upgrade to Georeferencer 3.0 were related to the public crowdsourcing features, and were based on a combination of the lessons learned from the previous pilots and recommendations from the wider field of crowdsourcing (Holley, 2010). These enhancements of the user experience consisted of an improved design and professional instructional video; simpler registration and user logging; and statistics and visualisation tools supporting competition in crowdsourcing. The latter include widgets illustrating overall project progress and competitive rankings of individual contributors' input, as well as personal recognition of each contributor associated with each map completed (Fig. 5).
Figure 5: The Institut Cartografic de Catalunya's Georeferencer widgets,
The varying successes of early implementations of Georeferencer revealed the importance of feedback and user input, which was evaluated and actively addressed in the next release and later implementations. The NLS experience highlighted some of the areas in which better results could have been achieved. There were problems with the registration and login, with password confirmation details being slow to arrive or diverted into spam folders. Better promotion by NLS, particularly through social media channels, or by liaising with a core group of interested participants as at the Nationaal Archief, could have created a larger core of interested, engaged users.
Both the BL and ICC, the only full implementations of the upgraded Georeferencer to date, met with tremendous success and were completed in remarkably short amounts of time. This may to a degree be attributed to the map content and the engaging georeferencing process, but because these features were present in previous pilots, the outstanding uptake can also be said to be due to the new crowdsourcing enhancements built into the application, use of social media for publicity, and "prizes" as motivational factors, all features identified as means to increase participation and productivity (Holley, 2010). The number of participants was relatively low (BL had 90 volunteers and ICC 88), with the majority of contributors georeferencing 1-2 maps, and a minority completing over half of the work. The results were sometimes surprisingly revelatory about the mindset and approach of the volunteers completing the work. For instance, while the system required a minimum input of three control points per map, in the case of one BL map, the user entered 352 control points, pointing to engagement with the map content and activity rather than a desire to gain "points"; nearly 40% of the completed maps contained at least ten control points.
Both institutions invited the top five participants to visit the library for an exclusive look at collections and behind-the-scenes operations. These 'super' volunteers were not generally map library users, or even those with a particular interest in maps, but rather enthusiastic and motivated individuals with general interests and technological literacy with web tools. This suggested that online georeferencing, rather than serving existing map users and enthusiasts, can instead be a powerful way to introduce historic maps, as well as the institutions, to new audiences.
Future developments
The Georeferencer software has undergone significant development in the last two years, adding additional functionality and crowdsourcing tools described above, and a number of further enhancements are planned. Some of these are useful functions from other georeferencing applications, whilst others take advantage of significant progress in web-mapping technologies:
(See the Georeferencer Roadmap page for further details.)
Methodological, presentation and selection issues in georeferencing
Whilst there are good reasons for georeferencing historical maps and projects such as Georeferencer, there are relevant methodological issues to be borne in mind. The georeferencing process emphasises the geometric accuracy of historic maps and therefore downplays other aspects of their content: their symbolism, placenames, the way that they represent features, their purpose and usage, their various ideological, political and cultural contexts, and other meanings. In the course of transforming an original map, georeferencing changes lines and shapes, the distances between objects, the map's aesthetics and its value as a cultural object. As such, it can misuse or misinterpret historic maps, visualising them in ways that their makers and users would probably never have intended, and lead to false ideas and assumptions about the maps themselves. If applied in a simplistic way, georeferencing can therefore reinforce a modern fallacy that the main purpose of historic maps is representing the real world in a geometrically accurate way, that accurate maps are "better", and that the history of mapping is simply a history of improvements in accuracy over time.
In practice, these issues can be taken into account in georeferencing projects such as Georeferencer, partly by always clearly presenting the original, unwarped and uncropped map image as the original artefact, as well as any georeferenced version. They can also guide the selection of maps for georeferencing (excluding those early maps where georeferencing would not assist understanding) and the way that the georeferencing project and its purpose are described, as well as the presentation of any results. Through an awareness of the arguments against georeferencing, map librarians can approach georeferencing projects in a more sensitive way to take account of these criticisms. The georeferencing of the Roy Military Survey map of Scotland (Fleet & Kowal, 2007), or the Linguistic Geographies' Gough Map website, are recent examples of how maps can still be presented in their original form whilst also employing geospatial technologies.
The Georeferencer software has great potential, and allows libraries and archives to exploit many of the advantages of georeferencing their historic map collections, thereby increasing accessibility and discoverability at very low cost. The five implementations of this shared service have contributed to ongoing development of the application since its first release in 2010, and the collective experience has resulted in improvements to functionality and, most particularly, to crowdsourcing. Future plans for Georeferencer, allowing more flexibility and applying a more diverse set of technologies, promise further improvements.
While the advantages that online georeferencing offers to cartographic collections are considerable, these very specialised benefits may be eclipsed by the broader ability to expose and share collections with the public in a new and much more engaging way than was previously possible. As web users become accustomed to manipulating mainstream online map tools, making historic materials available in the same forum for examination and integration will evoke their relevance, and an appreciation for the landscape and past portrayals of it. Just as essential as exposing collections and making them immediately available to the public, however, is the role of georeferencing as an investment in future access to collections, through the gathering of essential geographic metadata.
We are very grateful to Noelia Ramos, Institut Cartografic de Catalunya, for helpful information on their Georeferencer project and results. This paper was supported by the Programme of Applied Research and Development of the National and Cultural Identity (NAKI) from the Ministry of Culture of the Czech Republic (Project No. DF11P01OVV003 TEMAP).
Georeferencer.org beta version: http://www.georeferencer.org/
De Boer, A. (2010). Processing old maps and drawings to create virtual historic landscapes. e-Perimetron, 5(2), 49-57.
Crăciunescu, V. et al. (2011). Project eHarta: a collaborative initiative to digitally preserve and freely share old cartographic documents in Romania. e-Perimetron, 6(4), 261-269.
Davie, M.F. & Frumin, M. (2007). Late 18th century Russian Navy maps and the first 3D visualization of the walled city of Beirut. e-Perimetron, 2(2), 52-65.
Fleet, C. (2011). Historical maps in ScotlandsPlaces: new collaborative geographic retrieval and presentation options for the National Library of Scotland's maps. e-Perimetron, 6(4), 230-243.
Fleet, C. & Kowal, K. (2007). Roy Military Survey map of Scotland (1747-1755): mosaicing, geo-referencing, and web delivery. e-Perimetron, 2(4), 194-208.
Gaspar, J.A. (2012). Blunders, errors and entanglements: scrutinizing the Cantino planisphere with a cartometric eye. Imago Mundi, 64, 181-200.
Heere, E. (2011). The accuracy of the maps of Zeeland; accuracy measurement as part of the cartobibliography. e-Perimetron, 6(3), 187-199.
Koussoulakou, A. et al. (2011). On the Generalkarte coverage of the northern part of Greece and its interactions with the relevant subsequent Greek map series. e-Perimetron, 6(1), 46-56.
Kowal, K.C. & Přidal, P. (2012). Online georeferencing for libraries: the British Library implementation of Georeferencer for spatial metadata enhancement and public engagement. Journal of Map & Geography Libraries: Advances in Geospatial Information, Collections & Archives, 8(3), 276-289.
Oomen, J. & Aroyo, L. (2011). Crowdsourcing in the cultural heritage domain: opportunities and challenges. C&T '11 Proceedings of the 5th International Conference on Communities and Technologies, 138-149.
Přidal, P. & Zabicka, P. (2008). Tiles as an approach to on-line publishing of scanned old maps, vedute and other historical documents. e-Perimetron, 3(1), 10-21.
Southall, H. & Přidal, P. (2012). Old Maps Online: Enabling global access to historical mapping. e-Perimetron, 7(2), 73-81.
Waters, T. (2010). Putting a Map Library on the Net: Crowdsourcing Georectification and Digitization of Historical Maps. AGI GeoCommunity '10: Opportunities in a Changing World.
About the Author