# OSM Feature Extractor
[![PyPI version](https://badge.fury.io/py/osm-feature-extractor.svg)](https://badge.fury.io/py/osm-feature-extractor)

A lightweight application that automatically extracts features from an OpenStreetMap (OSM)
file and maps them to a user-defined GeoJSON file containing a collection of polygons.
The extracted features can include `count` for nodes, `length` for ways, and `area` for areas.
These features can then be used in machine learning applications built on OSM data.

For more details on the extracted features, see [FEATURES.md](FEATURES.md) and
the [OSM wiki](https://wiki.openstreetmap.org/wiki/Map_Features).
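
To make the three feature types concrete, here is a minimal pure-Python sketch of a `count`, a `length` and an `area` computed against a single polygon. It is an illustration only: it assumes planar coordinates, does not clip ways at the polygon boundary, and the helper names (`point_in_polygon`, `line_length`, `ring_area`) are hypothetical rather than part of this package.

```python
import math


def point_in_polygon(point, ring):
    """Ray-casting test: is `point` (x, y) inside the polygon `ring`?"""
    x, y = point
    inside = False
    # Walk over every edge (ring[i - 1], ring[i]); i = 0 gives the closing edge.
    for i in range(len(ring)):
        x1, y1 = ring[i - 1]
        x2, y2 = ring[i]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def line_length(coords):
    """Sum of straight-line segment lengths along a way."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def ring_area(ring):
    """Planar area of a simple polygon (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical toy data in an arbitrary planar coordinate system.
polygon = [(0, 0), (4, 0), (4, 4), (0, 4)]
nodes = [(1, 1), (2, 3), (5, 5)]              # the last node lies outside
way = [(1, 1), (1, 3), (3, 3)]
building = [(1, 1), (3, 1), (3, 2), (1, 2)]

features = {
    "count": sum(point_in_polygon(n, polygon) for n in nodes),
    "length": line_length(way),
    "area": ring_area(building),
}
print(features)  # {'count': 2, 'length': 4.0, 'area': 2.0}
```
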

Example generated dataframe:

![Example generated dataframe](osm_feature_extractor/utils/img/data_frame.png)

Data visualised on a map:

![Data visualised on a map](osm_feature_extractor/utils/img/data_kepler.png)

## Installation

    $ pip install osm-feature-extractor

## Usage
After installation, you can use the tool directly with the `osm_feature_extractor` command,
which supports two primary operations: `extract` and `analyze`.
### `extract`

The `extract` command requires at least three flags:

    $ osm_feature_extractor extract \
        --osm-file <path_to_osm_file> \
        --input-polygons-file <path_to_polygons_file> \
        --output-file <path_to_output_file>

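
When the extractor is called from a script, the same three-flag invocation can be assembled programmatically. The sketch below only builds the argument list from the flags shown above; the function name and the example paths (including the `.csv` output extension) are hypothetical.

```python
import shlex


def build_extract_command(osm_file, polygons_file, output_file):
    """Return the argv list for the three-flag `extract` invocation."""
    return [
        "osm_feature_extractor", "extract",
        "--osm-file", str(osm_file),
        "--input-polygons-file", str(polygons_file),
        "--output-file", str(output_file),
    ]


# Hypothetical example paths, used for illustration only.
cmd = build_extract_command("city.osm.pbf", "districts.geojson", "features.csv")
print(shlex.join(cmd))

# To actually run it, pass the list to subprocess:
#   import subprocess
#   subprocess.run(cmd, check=True)
```
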
To see all available flags, use the help command:

    $ osm_feature_extractor extract --help

Alternatively, you can provide a `.conf` file with the required parameters:

    $ osm_feature_extractor extract --conf-file <path_to_conf_file>

The configuration file should have the following format:

```ini
[user-defined]
osm_file: <path_to_osm_file>
input_polygons_file: <path_to_polygons_file>
output_file: <path_to_output_file>

[default]
process_base_data: True
process_osm_data: True
polygons_file: polygons.geojson
osm_extractor_files_dir: osm_extractor_files_dir
```
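
Before a long run it can help to check that a configuration file has the expected shape. The sketch below parses a filled-in copy of the format above with Python's standard `configparser` and verifies the three user-defined keys. It assumes the file is `configparser`-compatible, which the layout suggests but this README does not state; the paths are hypothetical.

```python
import configparser

# A filled-in copy of the format above; the paths are hypothetical.
CONF_TEXT = """\
[user-defined]
osm_file: data/city.osm.pbf
input_polygons_file: data/districts.geojson
output_file: out/features.csv

[default]
process_base_data: True
process_osm_data: True
polygons_file: polygons.geojson
osm_extractor_files_dir: osm_extractor_files_dir
"""

parser = configparser.ConfigParser()
parser.read_string(CONF_TEXT)  # use parser.read(path) for a file on disk

# The three parameters that mirror the required command-line flags.
required = ("osm_file", "input_polygons_file", "output_file")
missing = [key for key in required if not parser.has_option("user-defined", key)]
if missing:
    raise ValueError(f"missing required keys: {missing}")

user = dict(parser["user-defined"])
process_osm = parser.getboolean("default", "process_osm_data")
print(user["osm_file"], process_osm)
```
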
**Note**: _Processing large OSM files may take some time. It is recommended to use the CLI tool [osmium extract](https://docs.osmcode.org/osmium/latest/osmium-extract.html)
to reduce the OSM file to your area of interest before running the feature extractor._
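
One way to pick the area of interest is to take the bounding box of the input polygons and hand it to `osmium extract`. The sketch below computes that box from a GeoJSON `FeatureCollection` and builds the corresponding command using osmium's `--bbox` and `-o` options. The helper names, the sample polygon and the file names are hypothetical, and the command is only assembled here, not executed.

```python
import json


def iter_positions(coords):
    """Yield every (lon, lat) pair from a nested GeoJSON coordinates array."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def geojson_bbox(feature_collection):
    """Return (left, bottom, right, top) covering all feature geometries."""
    lons, lats = [], []
    for feature in feature_collection["features"]:
        for lon, lat in iter_positions(feature["geometry"]["coordinates"]):
            lons.append(lon)
            lats.append(lat)
    return min(lons), min(lats), max(lons), max(lats)


def osmium_extract_command(bbox, source, target):
    """Build an `osmium extract` argv list for the given bounding box."""
    left, bottom, right, top = bbox
    return ["osmium", "extract", f"--bbox={left},{bottom},{right},{top}",
            source, "-o", target]


# Hypothetical single-polygon collection; normally loaded with json.load().
polygons = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {},
   "geometry": {"type": "Polygon", "coordinates": [[
     [2.25, 48.81], [2.42, 48.81], [2.42, 48.91], [2.25, 48.91], [2.25, 48.81]
   ]]}}
]}
""")

bbox = geojson_bbox(polygons)
command = osmium_extract_command(bbox, "france.osm.pbf", "paris.osm.pbf")
print(" ".join(command))
```
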
### `analyze`

The `analyze` command gives a quick overview of the OSM file: the total number of nodes and ways,
the bounds, and the centroid. To use it, run:

    $ osm_feature_extractor analyze --osm-file <path_to_osm_file>

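
As a rough picture of what such an overview contains, the sketch below derives the same four quantities from a tiny in-memory sample. The data is made up, and the centroid is taken as the mean of the node coordinates, which is an assumption: this README does not say how the tool defines it.

```python
from statistics import fmean

# Hypothetical sample: nodes as (lon, lat), ways as lists of node indices.
nodes = [(2.0, 48.0), (2.5, 48.5), (3.0, 49.0), (2.5, 48.5)]
ways = [[0, 1, 2], [1, 3]]

lons = [lon for lon, _ in nodes]
lats = [lat for _, lat in nodes]

summary = {
    "nodes": len(nodes),
    "ways": len(ways),
    # Bounds as (min_lon, min_lat, max_lon, max_lat).
    "bounds": (min(lons), min(lats), max(lons), max(lats)),
    # Assumption: centroid taken as the mean node coordinate.
    "centroid": (fmean(lons), fmean(lats)),
}
print(summary)
```
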
Raw data
{
"_id": null,
"home_page": "https://github.com/diogomatoschaves/osm-feature-extractor",
"name": "osm-feature-extractor",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.12",
"maintainer_email": null,
"keywords": "osm, feature-augmentation, feature-extractor, osm-features",
"author": "Diogo Matos Chaves",
"author_email": "diogo_chaves@hotmail.com",
"download_url": "https://files.pythonhosted.org/packages/71/16/bf3da849f79b9e5f180898a88212235c0bfba0ca6e0f5ba849949156a51f/osm_feature_extractor-0.3.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Library to extract OSM features and map them to GeoJSON polygons.",
"version": "0.3.0",
"project_urls": {
"Homepage": "https://github.com/diogomatoschaves/osm-feature-extractor",
"Repository": "https://github.com/diogomatoschaves/osm-feature-extractor"
},
"split_keywords": [
"osm",
" feature-augmentation",
" feature-extractor",
" osm-features"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4ebb74a3a1606004edc567b1fd581976ca0fa06ecb6830db982e353690dc1871",
"md5": "9c0693fe89a3a4daee1687bfca4475ad",
"sha256": "e656db41745788ef302ef085e477f7831ad87e9ffedc10e412479d22bc15212e"
},
"downloads": -1,
"filename": "osm_feature_extractor-0.3.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9c0693fe89a3a4daee1687bfca4475ad",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.12",
"size": 15911750,
"upload_time": "2024-09-02T09:46:38",
"upload_time_iso_8601": "2024-09-02T09:46:38.960975Z",
"url": "https://files.pythonhosted.org/packages/4e/bb/74a3a1606004edc567b1fd581976ca0fa06ecb6830db982e353690dc1871/osm_feature_extractor-0.3.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7116bf3da849f79b9e5f180898a88212235c0bfba0ca6e0f5ba849949156a51f",
"md5": "b2cfa6a675c8ed079459d22b338e3332",
"sha256": "1930dda524a267df082bdd70b7f2945cef8e2b32565466a4cae89a3472a7ffe1"
},
"downloads": -1,
"filename": "osm_feature_extractor-0.3.0.tar.gz",
"has_sig": false,
"md5_digest": "b2cfa6a675c8ed079459d22b338e3332",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.12",
"size": 15877713,
"upload_time": "2024-09-02T09:46:42",
"upload_time_iso_8601": "2024-09-02T09:46:42.069072Z",
"url": "https://files.pythonhosted.org/packages/71/16/bf3da849f79b9e5f180898a88212235c0bfba0ca6e0f5ba849949156a51f/osm_feature_extractor-0.3.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-02 09:46:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "diogomatoschaves",
"github_project": "osm-feature-extractor",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "anyio",
"specs": [
[
"==",
"4.4.0"
]
]
},
{
"name": "appnope",
"specs": [
[
"==",
"0.1.4"
]
]
},
{
"name": "argon2-cffi",
"specs": [
[
"==",
"23.1.0"
]
]
},
{
"name": "argon2-cffi-bindings",
"specs": [
[
"==",
"21.2.0"
]
]
},
{
"name": "arrow",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "asttokens",
"specs": [
[
"==",
"2.4.1"
]
]
},
{
"name": "async-lru",
"specs": [
[
"==",
"2.0.4"
]
]
},
{
"name": "attrs",
"specs": [
[
"==",
"24.2.0"
]
]
},
{
"name": "babel",
"specs": [
[
"==",
"2.16.0"
]
]
},
{
"name": "backcall",
"specs": [
[
"==",
"0.2.0"
]
]
},
{
"name": "beautifulsoup4",
"specs": [
[
"==",
"4.12.3"
]
]
},
{
"name": "bleach",
"specs": [
[
"==",
"6.1.0"
]
]
},
{
"name": "certifi",
"specs": [
[
"==",
"2024.7.4"
]
]
},
{
"name": "cffi",
"specs": [
[
"==",
"1.17.0"
]
]
},
{
"name": "charset-normalizer",
"specs": [
[
"==",
"3.3.2"
]
]
},
{
"name": "click",
"specs": [
[
"==",
"8.1.7"
]
]
},
{
"name": "click-plugins",
"specs": [
[
"==",
"1.1.1"
]
]
},
{
"name": "cligj",
"specs": [
[
"==",
"0.7.2"
]
]
},
{
"name": "comm",
"specs": [
[
"==",
"0.2.2"
]
]
},
{
"name": "contourpy",
"specs": [
[
"==",
"1.2.1"
]
]
},
{
"name": "cycler",
"specs": [
[
"==",
"0.12.1"
]
]
},
{
"name": "debugpy",
"specs": [
[
"==",
"1.8.5"
]
]
},
{
"name": "decorator",
"specs": [
[
"==",
"5.1.1"
]
]
},
{
"name": "defusedxml",
"specs": [
[
"==",
"0.7.1"
]
]
},
{
"name": "executing",
"specs": [
[
"==",
"2.1.0"
]
]
},
{
"name": "fastjsonschema",
"specs": [
[
"==",
"2.20.0"
]
]
},
{
"name": "fiona",
"specs": [
[
"==",
"1.9.6"
]
]
},
{
"name": "fonttools",
"specs": [
[
"==",
"4.53.1"
]
]
},
{
"name": "fqdn",
"specs": [
[
"==",
"1.5.1"
]
]
},
{
"name": "geopandas",
"specs": [
[
"==",
"1.0.1"
]
]
},
{
"name": "h11",
"specs": [
[
"==",
"0.14.0"
]
]
},
{
"name": "httpcore",
"specs": [
[
"==",
"1.0.5"
]
]
},
{
"name": "httpx",
"specs": [
[
"==",
"0.27.2"
]
]
},
{
"name": "idna",
"specs": [
[
"==",
"3.7"
]
]
},
{
"name": "ipykernel",
"specs": [
[
"==",
"6.29.5"
]
]
},
{
"name": "ipython",
"specs": [
[
"==",
"8.27.0"
]
]
},
{
"name": "ipython-genutils",
"specs": [
[
"==",
"0.2.0"
]
]
},
{
"name": "ipywidgets",
"specs": [
[
"==",
"7.8.3"
]
]
},
{
"name": "isoduration",
"specs": [
[
"==",
"20.11.0"
]
]
},
{
"name": "jedi",
"specs": [
[
"==",
"0.19.1"
]
]
},
{
"name": "Jinja2",
"specs": [
[
"==",
"3.1.4"
]
]
},
{
"name": "json5",
"specs": [
[
"==",
"0.9.25"
]
]
},
{
"name": "jsonpointer",
"specs": [
[
"==",
"3.0.0"
]
]
},
{
"name": "jsonschema",
"specs": [
[
"==",
"4.23.0"
]
]
},
{
"name": "jsonschema-specifications",
"specs": [
[
"==",
"2023.12.1"
]
]
},
{
"name": "jupyter-events",
"specs": [
[
"==",
"0.10.0"
]
]
},
{
"name": "jupyter-lsp",
"specs": [
[
"==",
"2.2.5"
]
]
},
{
"name": "jupyter_client",
"specs": [
[
"==",
"8.6.2"
]
]
},
{
"name": "jupyter_core",
"specs": [
[
"==",
"5.7.2"
]
]
},
{
"name": "jupyter_server",
"specs": [
[
"==",
"2.14.2"
]
]
},
{
"name": "jupyter_server_terminals",
"specs": [
[
"==",
"0.5.3"
]
]
},
{
"name": "jupyterlab",
"specs": [
[
"==",
"4.2.5"
]
]
},
{
"name": "jupyterlab_pygments",
"specs": [
[
"==",
"0.3.0"
]
]
},
{
"name": "jupyterlab_server",
"specs": [
[
"==",
"2.27.3"
]
]
},
{
"name": "jupyterlab_widgets",
"specs": [
[
"==",
"1.1.9"
]
]
},
{
"name": "keplergl",
"specs": [
[
"==",
"0.3.0"
]
]
},
{
"name": "kiwisolver",
"specs": [
[
"==",
"1.4.5"
]
]
},
{
"name": "MarkupSafe",
"specs": [
[
"==",
"2.1.5"
]
]
},
{
"name": "matplotlib",
"specs": [
[
"==",
"3.9.2"
]
]
},
{
"name": "matplotlib-inline",
"specs": [
[
"==",
"0.1.7"
]
]
},
{
"name": "mistune",
"specs": [
[
"==",
"3.0.2"
]
]
},
{
"name": "nbclient",
"specs": [
[
"==",
"0.10.0"
]
]
},
{
"name": "nbconvert",
"specs": [
[
"==",
"7.16.4"
]
]
},
{
"name": "nbformat",
"specs": [
[
"==",
"5.10.4"
]
]
},
{
"name": "nest-asyncio",
"specs": [
[
"==",
"1.6.0"
]
]
},
{
"name": "notebook",
"specs": [
[
"==",
"7.2.2"
]
]
},
{
"name": "notebook_shim",
"specs": [
[
"==",
"0.2.4"
]
]
},
{
"name": "numpy",
"specs": [
[
"==",
"2.0.1"
]
]
},
{
"name": "ordered-set",
"specs": [
[
"==",
"4.1.0"
]
]
},
{
"name": "osmium",
"specs": [
[
"==",
"3.7.0"
]
]
},
{
"name": "overrides",
"specs": [
[
"==",
"7.7.0"
]
]
},
{
"name": "packaging",
"specs": [
[
"==",
"24.1"
]
]
},
{
"name": "pandas",
"specs": [
[
"==",
"2.2.2"
]
]
},
{
"name": "pandocfilters",
"specs": [
[
"==",
"1.5.1"
]
]
},
{
"name": "parso",
"specs": [
[
"==",
"0.8.4"
]
]
},
{
"name": "pexpect",
"specs": [
[
"==",
"4.9.0"
]
]
},
{
"name": "pickleshare",
"specs": [
[
"==",
"0.7.5"
]
]
},
{
"name": "pillow",
"specs": [
[
"==",
"10.4.0"
]
]
},
{
"name": "platformdirs",
"specs": [
[
"==",
"4.2.2"
]
]
},
{
"name": "prometheus_client",
"specs": [
[
"==",
"0.20.0"
]
]
},
{
"name": "prompt_toolkit",
"specs": [
[
"==",
"3.0.47"
]
]
},
{
"name": "psutil",
"specs": [
[
"==",
"6.0.0"
]
]
},
{
"name": "ptyprocess",
"specs": [
[
"==",
"0.7.0"
]
]
},
{
"name": "pure_eval",
"specs": [
[
"==",
"0.2.3"
]
]
},
{
"name": "pycparser",
"specs": [
[
"==",
"2.22"
]
]
},
{
"name": "Pygments",
"specs": [
[
"==",
"2.18.0"
]
]
},
{
"name": "pyogrio",
"specs": [
[
"==",
"0.9.0"
]
]
},
{
"name": "pyparsing",
"specs": [
[
"==",
"3.1.2"
]
]
},
{
"name": "pyproj",
"specs": [
[
"==",
"3.6.1"
]
]
},
{
"name": "python-dateutil",
"specs": [
[
"==",
"2.9.0.post0"
]
]
},
{
"name": "python-json-logger",
"specs": [
[
"==",
"2.0.7"
]
]
},
{
"name": "pyturf",
"specs": [
[
"==",
"0.6.10"
]
]
},
{
"name": "pytz",
"specs": [
[
"==",
"2024.1"
]
]
},
{
"name": "PyYAML",
"specs": [
[
"==",
"6.0.2"
]
]
},
{
"name": "pyzmq",
"specs": [
[
"==",
"26.2.0"
]
]
},
{
"name": "referencing",
"specs": [
[
"==",
"0.35.1"
]
]
},
{
"name": "requests",
"specs": [
[
"==",
"2.32.3"
]
]
},
{
"name": "rfc3339-validator",
"specs": [
[
"==",
"0.1.4"
]
]
},
{
"name": "rfc3986-validator",
"specs": [
[
"==",
"0.1.1"
]
]
},
{
"name": "rpds-py",
"specs": [
[
"==",
"0.20.0"
]
]
},
{
"name": "Rtree",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "scipy",
"specs": [
[
"==",
"1.14.0"
]
]
},
{
"name": "Send2Trash",
"specs": [
[
"==",
"1.8.3"
]
]
},
{
"name": "setuptools",
"specs": [
[
"==",
"74.0.0"
]
]
},
{
"name": "shapely",
"specs": [
[
"==",
"2.0.5"
]
]
},
{
"name": "six",
"specs": [
[
"==",
"1.16.0"
]
]
},
{
"name": "sniffio",
"specs": [
[
"==",
"1.3.1"
]
]
},
{
"name": "soupsieve",
"specs": [
[
"==",
"2.6"
]
]
},
{
"name": "stack-data",
"specs": [
[
"==",
"0.6.3"
]
]
},
{
"name": "terminado",
"specs": [
[
"==",
"0.18.1"
]
]
},
{
"name": "tinycss2",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "tornado",
"specs": [
[
"==",
"6.4.1"
]
]
},
{
"name": "traitlets",
"specs": [
[
"==",
"5.14.3"
]
]
},
{
"name": "traittypes",
"specs": [
[
"==",
"0.2.1"
]
]
},
{
"name": "types-python-dateutil",
"specs": [
[
"==",
"2.9.0.20240821"
]
]
},
{
"name": "tzdata",
"specs": [
[
"==",
"2024.1"
]
]
},
{
"name": "uri-template",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "urllib3",
"specs": [
[
"==",
"2.2.2"
]
]
},
{
"name": "wcwidth",
"specs": [
[
"==",
"0.2.13"
]
]
},
{
"name": "webcolors",
"specs": [
[
"==",
"24.8.0"
]
]
},
{
"name": "webencodings",
"specs": [
[
"==",
"0.5.1"
]
]
},
{
"name": "websocket-client",
"specs": [
[
"==",
"1.8.0"
]
]
},
{
"name": "widgetsnbextension",
"specs": [
[
"==",
"3.6.8"
]
]
}
],
"lcname": "osm-feature-extractor"
}