[1] The places mentioned in digitized text collections constitute a rich data source for researchers in many disciplines.
To map a set of place names or toponyms that occur in a document to their corresponding latitude/longitude coordinates, a polygon, or any other spatial footprint, a disambiguation step is necessary.
If some of those text documents are geotagged (e.g. because they are micro-blog posts with latitude and longitude automatically added), they can be used to infer the varying geographical specificity of arbitrary terms, e.g. "cable car" or "high tide" [3].
Nonetheless, a resolution technique can still disambiguate a metonymic reference as long as it is identified as a toponym in the recognition phase.
Supervised methods typically cast the problem as a learning task in which contextual and non-contextual features are first extracted for each candidate location, and a classifier is then trained on a labelled dataset.
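The supervised setup described above can be illustrated with a minimal sketch. It is not taken from any published system: the toy gazetteer, its population figures and keywords, the two features, and the function names `features`, `train`, and `resolve` are all assumptions made for illustration. One contextual feature (word overlap with the surrounding text) and one non-contextual feature (scaled log population) are extracted per candidate, and a small logistic-regression classifier is fitted on labelled mentions.

```python
import math

# Toy gazetteer (illustrative values only, not real gazetteer records).
GAZETTEER = {
    "London": [
        {"id": "London, UK", "population": 8_800_000,
         "keywords": {"thames", "england", "parliament"}},
        {"id": "London, Ontario", "population": 400_000,
         "keywords": {"ontario", "canada"}},
    ],
    "Paris": [
        {"id": "Paris, France", "population": 2_100_000,
         "keywords": {"france", "seine", "louvre"}},
        {"id": "Paris, Texas", "population": 25_000,
         "keywords": {"texas", "county"}},
    ],
}


def features(candidate, context):
    """Return one contextual and one non-contextual feature for a candidate."""
    overlap = len(candidate["keywords"] & context)      # contextual
    size = math.log10(candidate["population"]) / 10.0   # non-contextual
    return [float(overlap), size]


def score(weights, bias, x):
    """Linear score of a feature vector under the current model."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def train(labelled, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    rows = []
    for name, context, gold_id in labelled:
        for cand in GAZETTEER[name]:
            label = 1.0 if cand["id"] == gold_id else 0.0
            rows.append((features(cand, context), label))
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in rows:
            p = 1.0 / (1.0 + math.exp(-score(weights, bias, x)))
            weights = [w - lr * (p - y) * v for w, v in zip(weights, x)]
            bias -= lr * (p - y)
    return weights, bias


def resolve(name, context, model):
    """Return the id of the highest-scoring candidate for a toponym."""
    weights, bias = model
    best = max(GAZETTEER[name],
               key=lambda c: score(weights, bias, features(c, context)))
    return best["id"]


# Labelled mentions: (toponym, context words, correct gazetteer entry).
LABELLED = [
    ("London", {"thames", "parliament"}, "London, UK"),
    ("London", {"ontario", "canada", "university"}, "London, Ontario"),
]
model = train(LABELLED)
print(resolve("Paris", {"texas", "county"}, model))
```

The model is trained only on mentions of "London" and then applied to "Paris", which shows that the classifier learns weights over features rather than memorising individual place names. A realistic system would use far richer features and a much larger labelled corpus.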
The Context-Hierarchy Fusion [6] model estimates the geographic scope of documents and leverages the connections between nearby place names as evidence to resolve toponyms.
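The idea of using nearby place names as mutual evidence can be illustrated with a much simpler heuristic than the model cited above. The sketch below is not the Context-Hierarchy Fusion model; it is a basic spatial-proximity heuristic that picks, for all toponyms in a document at once, the combination of candidates with the smallest summed pairwise distance. The toy gazetteer, its approximate coordinates, and the function names are assumptions made for illustration.

```python
import math
from itertools import product

# Toy gazetteer with approximate (latitude, longitude) pairs in degrees.
CANDIDATES = {
    "Paris": {
        "Paris, France": (48.8566, 2.3522),
        "Paris, Texas": (33.6609, -95.5555),
    },
    "Dallas": {"Dallas, Texas": (32.7767, -96.7970)},
    "Lyon": {"Lyon, France": (45.7640, 4.8357)},
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def resolve_jointly(toponyms):
    """Choose one candidate per toponym so the summed pairwise distance is minimal."""
    options = [list(CANDIDATES[t].items()) for t in toponyms]

    def spread(combo):
        return sum(haversine_km(p[1], q[1])
                   for i, p in enumerate(combo)
                   for q in combo[i + 1:])

    best = min(product(*options), key=spread)
    return {t: label for t, (label, _) in zip(toponyms, best)}


print(resolve_jointly(["Paris", "Dallas"]))
print(resolve_jointly(["Paris", "Lyon"]))
```

Enumerating every combination grows exponentially with the number of ambiguous toponyms, so this exhaustive search is only practical for short documents; it is meant to show the principle, not an efficient algorithm.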
One can also geoparse location references from other forms of media, for example, audio content in which a speaker mentions a place.
Geocoding analyzes unambiguous structured location references, such as postal addresses and rigorously formatted numerical coordinates.
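Because such references follow a fixed format, they can be handled with deterministic parsing rather than disambiguation. As a minimal sketch, the function below converts a latitude/longitude pair written in degrees, minutes, and seconds into decimal degrees. The accepted notation and the names `parse_dms_pair` and `_to_decimal` are assumptions made for illustration, not a standard API.

```python
import re

# One component such as 48°51′24″N; ASCII ' and " are accepted as well.
_DMS = re.compile(
    r'''(\d{1,3})°\s*(\d{1,2})[′']\s*(\d{1,2}(?:\.\d+)?)[″"]\s*([NSEW])'''
)


def _to_decimal(deg, minutes, seconds, hemisphere):
    """Convert one degrees/minutes/seconds component to signed decimal degrees."""
    if int(minutes) >= 60 or float(seconds) >= 60:
        raise ValueError("minutes and seconds must be below 60")
    value = int(deg) + int(minutes) / 60 + float(seconds) / 3600
    limit = 90 if hemisphere in "NS" else 180
    if value > limit:
        raise ValueError(f"value {value} exceeds {limit} degrees")
    return -value if hemisphere in "SW" else value


def parse_dms_pair(text):
    """Parse 'lat lon' in DMS notation and return (latitude, longitude)."""
    parts = _DMS.findall(text)
    if len(parts) != 2 or parts[0][3] not in "NS" or parts[1][3] not in "EW":
        raise ValueError(f"not a latitude/longitude pair: {text!r}")
    return tuple(_to_decimal(*p) for p in parts)


lat, lon = parse_dms_pair("48°51′24″N 2°21′08″E")
print(round(lat, 6), round(lon, 6))
```

Input that does not match the expected pattern raises `ValueError` instead of producing a guess, which reflects the difference between this kind of structured parsing and the disambiguation needed for free text.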
Geoparsing handles ambiguous references in unstructured discourse, such as "Al Hamra," which is the name of several places, including towns in both Syria and Yemen.