




[Documentation](https://jairomelo.com/Georesolver/)
[Issues](https://github.com/jairomelo/Georesolver/issues)
# GeoResolver
GeoResolver is a lightweight Python library for resolving place names into geographic coordinates and related metadata using multiple gazetteer services, including [GeoNames](https://www.geonames.org/), [WHG](https://whgazetteer.org/), [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page), and [TGN](https://www.getty.edu/research/tools/vocabularies/tgn/).
The library provides a unified interface and standardized response format across sources, making it easier to disambiguate, enrich, or geocode place names—especially in datasets, archival collections, and manually curated records.
> **GeoResolver is particularly useful for historical geocoding and legacy datasets**, where place names may be inconsistent, ambiguous, or obsolete. It is not intended to replace tools like Geopy for general-purpose geocoding. Instead, GeoResolver complements them by offering a targeted approach based on authoritative gazetteers, tailored for academic, historical, and archival contexts.
## How it works
The logic behind GeoResolver is straightforward:
Given a place name as input, the library queries one or more gazetteers in sequence, searching for the closest match using a fuzzy matching algorithm. If a sufficiently good match is found, it returns the coordinates of the place. If not, it moves on to the next gazetteer, continuing until a match is found or all gazetteers have been queried.
If no gazetteer yields a match, the library returns `None`.
A fuzziness threshold can be configured to control how strict the match should be. The default threshold is 90, meaning the library only accepts matches that are at least 90% similar to the input. Lowering the threshold allows more lenient matches; raising it makes the match stricter.
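The cascade described above can be sketched in a few lines. This is an illustration only, not the library's implementation: it uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure, and `resolve_in_sequence`, `first_gazetteer`, and `second_gazetteer` are hypothetical names invented for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score between two names (case-insensitive)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_in_sequence(name, gazetteers, threshold=90):
    """Query each gazetteer in order; return the first candidate that clears the threshold."""
    for source, lookup in gazetteers:
        candidates = lookup(name)  # list of (label, latitude, longitude) tuples
        if not candidates:
            continue
        best = max(candidates, key=lambda c: similarity(name, c[0]))
        score = similarity(name, best[0])
        if score >= threshold:
            return {
                "source": source,
                "label": best[0],
                "latitude": best[1],
                "longitude": best[2],
                "confidence": round(score, 1),
            }
    return None  # no gazetteer produced a sufficiently similar match


# Hypothetical in-memory lookups standing in for real gazetteer services
def first_gazetteer(name):
    return [("Londres", 51.5, -0.12)]


def second_gazetteer(name):
    return [("London", 51.50853, -0.12574)]


match = resolve_in_sequence(
    "London", [("first", first_gazetteer), ("second", second_gazetteer)]
)
print(match)
```

In this sketch the first lookup only offers "Londres", which falls below the threshold of 90, so the search moves on and the second lookup supplies the match.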
You can also enable a flexible threshold that applies a lower cutoff to short place names only. This lets short names such as 'Rome' or 'Lima' match without lowering the threshold for longer names.
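Short names need this because a single differing letter costs a four-letter name far more similarity than it costs a long one. A minimal sketch of the idea follows. The parameter names `threshold`, `flexible_threshold`, and `flexible_threshold_value` mirror the `PlaceResolver` options, while `effective_threshold` and the `short_name_length` cutoff are assumptions made for illustration; the library's actual rule for what counts as "short" may differ.

```python
def effective_threshold(name, threshold=90, flexible_threshold=False,
                        flexible_threshold_value=70, short_name_length=5):
    """Pick the threshold to apply to a given place name.

    short_name_length is a hypothetical cutoff used only in this sketch.
    """
    if flexible_threshold and len(name.strip()) <= short_name_length:
        return flexible_threshold_value
    return threshold


print(effective_threshold("Rome", flexible_threshold=True))
print(effective_threshold("Buenos Aires", flexible_threshold=True))
```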
To improve precision, you can filter by country code and place type to disambiguate or narrow down results.
Some services allow specifying place types using localized terms, which can be useful when working with multilingual datasets.
GeoResolver includes a basic mapping of common place types in `data/mappings/places_map.json`. You can also pass a custom mapping to the `PlaceResolver` class to support additional types or override defaults. This is useful for adapting the resolution logic to domain-specific vocabularies or legacy data.
## How to use
To use GeoResolver, install the library via `pip`. It’s recommended to use a virtual environment to avoid conflicts with other packages:
```bash
pip install georesolver
```
### GeoNames configuration
To use the GeoNames service, you must create a free account at [GeoNames](https://www.geonames.org/login) and obtain a username. This username is required to make API requests.
> **Warning**: You can use the username `demo` for testing, but its quota is very limited and is quickly exhausted, especially by batch requests.
You can provide your username in one of two ways:
**Environment variable**
Create a `.env` file in your project directory:
```
GEONAMES_USERNAME=your_geonames_username
```
**Pass it explicitly**
```python
from georesolver import GeoNamesQuery
geonames_query = GeoNamesQuery(geonames_username="your_geonames_username")
```
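If you would rather read the variable yourself and pass it explicitly, a small helper can fail early with a clear message when the username is missing. This is a sketch; `geonames_username_from_env` is a hypothetical helper, not part of GeoResolver.

```python
import os


def geonames_username_from_env(env=None):
    """Return GEONAMES_USERNAME from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    username = env.get("GEONAMES_USERNAME", "").strip()
    if not username:
        raise RuntimeError(
            "GEONAMES_USERNAME is not set; create a GeoNames account and export it."
        )
    return username
```

The returned value can then be passed as `GeoNamesQuery(geonames_username=...)`. Note that Python does not read a `.env` file into `os.environ` on its own, so this helper only sees the variable if something has already loaded it (for example, `load_dotenv()` from python-dotenv) or if it was exported in the shell.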
### Basic Example Usage
The most straightforward way to use the library is through the `PlaceResolver` class. By default, `PlaceResolver` queries all available services — *GeoNames*, *WHG*, *TGN*, and *Wikidata* — in that order.
To resolve a place name, call the `.resolve()` method with the name and (optionally) a country code and place type. If no filters are specified, the first sufficiently similar match across all services is returned.
```python
from georesolver import PlaceResolver
# Initialize the resolver (uses all services by default)
resolver = PlaceResolver()
# Resolve a place name
result = resolver.resolve("London", country_code="GB", place_type="inhabited places")
if result:
    print(f"Coordinates: {result['latitude']}, {result['longitude']}")
    print(f"Source: {result['source']}")
    print(f"Confidence: {result['confidence']}")
else:
    print("No match found")
```
Sample output:
```text
Coordinates: 51.50853, -0.12574
Source: WHG
Confidence: 100.0
```
### Enhanced Return Format
Starting with v0.2.0, the `resolve()` method returns a structured dictionary with comprehensive metadata:
```python
{
    "place": "London",
    "standardize_label": "London",
    "language": "en",
    "latitude": 51.50853,
    "longitude": -0.12574,
    "source": "GeoNames",
    "id": 2643743,
    "uri": "http://sws.geonames.org/2643743/",
    "country_code": "GB",
    "part_of": "",
    "part_of_uri": "",
    "confidence": 95.5,
    "threshold": 90,
    "match_type": "exact"
}
```
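Because the result is a plain dictionary, it is easy to reshape for other tools. As one example, the sketch below converts a result into a GeoJSON `Feature`. The helper `to_geojson_feature` is hypothetical, and the sample dictionary is a shortened copy of the structure shown above.

```python
import json


def to_geojson_feature(result):
    """Convert a resolve()-style result dict into a GeoJSON Feature."""
    properties = {
        key: value
        for key, value in result.items()
        if key not in ("latitude", "longitude")
    }
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON orders coordinates as [longitude, latitude]
            "coordinates": [result["longitude"], result["latitude"]],
        },
        "properties": properties,
    }


sample = {
    "place": "London",
    "latitude": 51.50853,
    "longitude": -0.12574,
    "source": "GeoNames",
    "uri": "http://sws.geonames.org/2643743/",
}
feature = to_geojson_feature(sample)
print(json.dumps(feature, indent=2))
```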
### Customizing Services
You can control which services `PlaceResolver` uses and configure them individually. For example:
```python
from georesolver import PlaceResolver, GeoNamesQuery, TGNQuery
geonames = GeoNamesQuery(geonames_username="your_geonames_username")
tgn = TGNQuery()
resolver = PlaceResolver(
    services=[geonames, tgn],
    threshold=80,
    flexible_threshold=True,  # Use flexible threshold for short place names
    flexible_threshold_value=70,  # Lower threshold for short names
    lang="es",  # Spanish language support
    verbose=True
)
```
This gives you more control over the resolution logic, including match strictness (`threshold`), flexible thresholding for short place names, language preferences, and logging verbosity (`verbose=True`).
### Batch Resolution
GeoResolver supports batch resolution from a `pandas.DataFrame`, making it easy to process large datasets.
You can use the `resolve_batch` method to apply place name resolution to each row of a DataFrame. This method supports optional columns for country code and place type, and can return results in different formats.
```python
import pandas as pd
from georesolver import PlaceResolver, GeoNamesQuery
# Sample data
df = pd.DataFrame({
    "place_name": ["London", "Madrid", "Rome"],
    "country_code": ["GB", "ES", "IT"],
    "place_type": ["city", "city", "city"]
})
# Initialize the resolver
resolver = PlaceResolver(services=[GeoNamesQuery(geonames_username="your_username")], verbose=True)
# Resolve in batch, return structured results as a DataFrame
result_df = resolver.resolve_batch(
    df,
    place_column="place_name",
    country_column="country_code",
    place_type_column="place_type",
    show_progress=False
)
print(result_df.columns.tolist())
# Output: ['place', 'standardize_label', 'language', 'latitude', 'longitude', 'source', 'place_id', 'place_uri', 'country_code', 'part_of', 'part_of_uri', 'confidence', 'threshold', 'match_type']
```
This returns a new DataFrame with columns for all resolved place attributes including coordinates, source information, and confidence scores.
#### Return options
The `resolve_batch` method returns a `pandas.DataFrame` by default. Pass `return_df=False` to get a list of dictionaries instead, which is convenient for JSON serialization.
```python
# Return results as a list of dictionaries
results = resolver.resolve_batch(
    df,
    place_column="place_name",
    country_column="country_code",
    place_type_column="place_type",
    return_df=False,  # Return list of dictionaries
    show_progress=False
)
print(results[:2])
```
Example output:
```python
[
    {
        "place": "London",
        "standardize_label": "London",
        "language": "en",
        "latitude": 51.50853,
        "longitude": -0.12574,
        "source": "GeoNames",
        "id": 2643743,
        "uri": "http://sws.geonames.org/2643743/",
        "country_code": "GB",
        "part_of": "",
        "part_of_uri": "",
        "confidence": 100.0,
        "threshold": 90,
        "match_type": "exact"
    },
    {
        "place": "Madrid",
        "standardize_label": "Madrid",
        "language": "en",
        "latitude": 40.4165,
        "longitude": -3.70256,
        "source": "GeoNames",
        "id": 3117735,
        "uri": "http://sws.geonames.org/3117735/",
        "country_code": "ES",
        "part_of": "",
        "part_of_uri": "",
        "confidence": 100.0,
        "threshold": 90,
        "match_type": "exact"
    }
]
```
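A list of dictionaries like this can be written out as JSON Lines with the standard library. The sketch below assumes that unresolved rows may appear as `None` or as empty dictionaries and skips them; check how your version represents misses before relying on that. `results_to_json_lines` is a hypothetical helper.

```python
import json


def results_to_json_lines(results):
    """Serialize resolved entries to JSON Lines, skipping empty or None entries."""
    return "\n".join(
        json.dumps(entry, ensure_ascii=False) for entry in results if entry
    )


# Shortened sample data in the shape shown above, plus one unresolved row
sample_results = [
    {"place": "London", "latitude": 51.50853, "longitude": -0.12574},
    None,
    {"place": "Madrid", "latitude": 40.4165, "longitude": -3.70256},
]
output = results_to_json_lines(sample_results)
print(output)
```

`ensure_ascii=False` keeps accented place names readable in the output file instead of escaping them.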
## Custom Place Type Mapping
Different gazetteers use different terms to classify place types (e.g., "populated place", "settlement", "city", "pueblo"). To unify these differences, GeoResolver uses a configurable place type mapping that standardizes input values before querying services.
By default, GeoResolver uses a built-in mapping stored at `data/mappings/places_map.json`. This file maps normalized place types (like "city") to the equivalent terms used by each service.
Example mapping entry:
```json
"city": {
    "geonames": "PPL",
    "wikidata": "Q515",
    "tgn": "cities",
    "whg": "p"
},
```
You can provide your own mapping by passing a JSON file path to `PlaceResolver`:
```python
from georesolver import PlaceResolver, GeoNamesQuery

resolver = PlaceResolver(
    services=[GeoNamesQuery(geonames_username="your_username")],
    places_map_json="path/to/your_custom_mapping.json"
)
```
This is useful when working with domain-specific vocabularies, legacy datasets, or non-English place type terms. You can also use it simply to override the default mapping with your own preferences.
Each service-specific list should contain valid place type codes or labels expected by that gazetteer.
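As an illustration, a custom mapping file can be generated with the standard library. The sketch below adds a Spanish label, "ciudad", that reuses the per-service values from the "city" example above. The helper `write_places_map` and the file name are hypothetical, and whether a custom file replaces or extends the built-in mapping is an assumption you should verify.

```python
import json
import tempfile
from pathlib import Path

custom_map = {
    # Same per-service values as the "city" example, under a Spanish label
    "ciudad": {
        "geonames": "PPL",
        "wikidata": "Q515",
        "tgn": "cities",
        "whg": "p",
    }
}


def write_places_map(mapping, path):
    """Write a place type mapping to a UTF-8 JSON file and return its path."""
    path = Path(path)
    path.write_text(
        json.dumps(mapping, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return path


mapping_path = write_places_map(
    custom_map, Path(tempfile.gettempdir()) / "custom_places_map.json"
)
print(mapping_path)
```

The resulting path can then be passed to `PlaceResolver` as `places_map_json=str(mapping_path)`.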
## Wikidata Integration
This library queries the Wikidata MediaWiki API via the endpoint:
`https://www.wikidata.org/w/api.php`
It does not use the SPARQL endpoint (`https://query.wikidata.org/sparql`) because the MediaWiki API is faster and more reliable for simple place lookups. The library searches for entities by name and retrieves coordinates, country (P17), and administrative data from the entity information.
**Enhanced in v0.2.0**: WikidataQuery now provides better country and administrative entity data retrieval, with improved matching against the BaseQuery interface for consistency across all services.
> ⚠️ **Performance Note**: Wikidata API queries involve multiple HTTP requests per place (search + entity data). This process is relatively slow and not recommended for bulk resolution. Consider using GeoNames or WHG for large-scale batch processing.
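When a dataset repeats the same place names, one way to reduce the number of slow lookups is to resolve each distinct name only once and reuse the result. The sketch below shows the idea at the caller level. `resolve_unique` and `fake_resolve` are hypothetical, and the cache key ignores country code and place type, so extend the key if you use those filters.

```python
def resolve_unique(names, resolve):
    """Call resolve() once per distinct name and map results back to every input."""
    cache = {}
    results = []
    for name in names:
        key = name.strip().lower()  # normalize so "Lima" and "lima " share one lookup
        if key not in cache:
            cache[key] = resolve(name)
        results.append(cache[key])
    return results


# Stand-in for a slow resolver call; records how many lookups were made
calls = []


def fake_resolve(name):
    calls.append(name)
    return {"place": name.strip()}


resolved = resolve_unique(["Lima", "lima ", "Cusco", "Lima"], fake_resolve)
print(len(calls), "lookups for", len(resolved), "rows")
```

With a real resolver, `resolve` would be a callable such as `resolver.resolve`.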
## Contributing
Contributions are welcome! If you encounter a bug, need additional functionality, or have suggestions for improvement, feel free to open an issue or submit a pull request.
## License
This project is licensed under the GNU General Public License v3.0. See the [LICENSE](LICENSE) file for details.
## Acknowledgements
This library relies on open data sources and public APIs provided by GeoNames, WHG, Wikidata, and TGN. Special thanks to the maintainers of these projects for their commitment to accessible geographic knowledge.
Raw data
{
"_id": null,
"home_page": null,
"name": "georesolver",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "geocoding, georesolver, geonames, wikidata, tgn, whg",
"author": null,
"author_email": "Jairo Antonio Melo Florez <jairoantoniomelo@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/a2/5f/6058e2c02f053001775b27c9e09c8ec8d76d144c07987e9bf91ef3476180/georesolver-0.2.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-3.0-only",
"summary": "Multi-source place name to coordinates resolver using TGN, WHG, GeoNames, and Wikidata",
"version": "0.2.2",
"project_urls": {
"Documentation": "https://jairomelo.com/Georesolver/",
"Homepage": "https://github.com/jairomelo/Georesolver",
"Issues": "https://github.com/jairomelo/Georesolver/issues"
},
"split_keywords": [
"geocoding",
" georesolver",
" geonames",
" wikidata",
" tgn",
" whg"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "c43f5fe7651f02983cc9851aba5945cbf689aa97358168fc8bd1f9835609787c",
"md5": "93f15f9378c06c4c21de91533ff47d8e",
"sha256": "2330aaae97e1399f6397f412c67fd0b21d01e5b42b28289f3de88a957a8e8e08"
},
"downloads": -1,
"filename": "georesolver-0.2.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "93f15f9378c06c4c21de91533ff47d8e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 31793,
"upload_time": "2025-07-15T00:28:31",
"upload_time_iso_8601": "2025-07-15T00:28:31.616203Z",
"url": "https://files.pythonhosted.org/packages/c4/3f/5fe7651f02983cc9851aba5945cbf689aa97358168fc8bd1f9835609787c/georesolver-0.2.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a25f6058e2c02f053001775b27c9e09c8ec8d76d144c07987e9bf91ef3476180",
"md5": "ffc9576f033073e08082aa98ad59bb29",
"sha256": "219c2786b0ce261ff1b69822991e93701e521c03ab4d3cc88c774ab4b27504f7"
},
"downloads": -1,
"filename": "georesolver-0.2.2.tar.gz",
"has_sig": false,
"md5_digest": "ffc9576f033073e08082aa98ad59bb29",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 37754,
"upload_time": "2025-07-15T00:28:33",
"upload_time_iso_8601": "2025-07-15T00:28:33.239626Z",
"url": "https://files.pythonhosted.org/packages/a2/5f/6058e2c02f053001775b27c9e09c8ec8d76d144c07987e9bf91ef3476180/georesolver-0.2.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-15 00:28:33",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "jairomelo",
"github_project": "Georesolver",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "SPARQLWrapper",
"specs": [
[
"==",
"2.0.0"
]
]
},
{
"name": "RapidFuzz",
"specs": [
[
"==",
"3.13.0"
]
]
},
{
"name": "requests",
"specs": [
[
"==",
"2.32.4"
]
]
},
{
"name": "python-dotenv",
"specs": [
[
"==",
"1.1.0"
]
]
},
{
"name": "ratelimit",
"specs": [
[
"==",
"2.2.1"
]
]
},
{
"name": "requests-cache",
"specs": [
[
"==",
"1.2.1"
]
]
},
{
"name": "tqdm",
"specs": [
[
"==",
"4.67.1"
]
]
},
{
"name": "pandas",
"specs": [
[
"==",
"2.3.0"
]
]
},
{
"name": "pycountry",
"specs": [
[
"==",
"24.6.1"
]
]
}
],
"lcname": "georesolver"
}