osm-rawdata


- Name: osm-rawdata
- Version: 0.4.1
- Summary: Make data extracts from OSM data.
- Upload time: 2024-12-09 19:55:00
- Requires Python: >=3.10
- License: AGPL-3.0-only
- Keywords: hot, osm, openstreetmap

# OSM RawData

<!-- markdownlint-disable -->
<p align="center">
  <img src="https://github.com/hotosm/fmtm/blob/main/images/hot_logo.png?raw=true" style="width: 200px;" alt="HOT">
</p>
<p align="center">
  <em>A python module for accessing OSM data in a postgres database.</em>
</p>
<p align="center">
  <a href="https://github.com/hotosm/osm-rawdata/actions/workflows/build.yml" target="_blank">
      <img src="https://github.com/hotosm/osm-rawdata/workflows/Build/badge.svg" alt="Build">
  </a>
  <a href="https://github.com/hotosm/osm-rawdata/actions/workflows/build-ci.yml" target="_blank">
      <img src="https://github.com/hotosm/osm-rawdata/workflows/Build CI Img/badge.svg" alt="CI Build">
  </a>
  <a href="https://github.com/hotosm/osm-rawdata/actions/workflows/docs.yml" target="_blank">
      <img src="https://github.com/hotosm/osm-rawdata/workflows/Publish Docs/badge.svg" alt="Publish Docs">
  </a>
  <a href="https://github.com/hotosm/osm-rawdata/actions/workflows/publish.yml" target="_blank">
      <img src="https://github.com/hotosm/osm-rawdata/workflows/Publish to PyPi.org/badge.svg" alt="Publish">
  </a>
  <a href="https://github.com/hotosm/osm-rawdata/actions/workflows/pytest.yml" target="_blank">
      <img src="https://github.com/hotosm/osm-rawdata/workflows/PyTest/badge.svg" alt="Test">
  </a>
  <a href="https://pypi.org/project/osm-rawdata" target="_blank">
      <img src="https://img.shields.io/pypi/v/osm-rawdata?color=%2334D058&label=pypi%20package" alt="Package version">
  </a>
  <a href="https://pypistats.org/packages/osm-rawdata" target="_blank">
      <img src="https://img.shields.io/pypi/dm/osm-rawdata.svg" alt="Downloads">
  </a>
  <a href="https://github.com/hotosm/osm-rawdata/blob/main/LICENSE.md" target="_blank">
      <img src="https://img.shields.io/github/license/hotosm/osm-rawdata.svg" alt="License">
  </a>
</p>

---

📖 **Documentation**: <a href="https://hotosm.github.io/osm-rawdata/" target="_blank">https://hotosm.github.io/osm-rawdata/</a>

🖥️ **Source Code**: <a href="https://github.com/hotosm/osm-rawdata" target="_blank">https://github.com/hotosm/osm-rawdata</a>

---

<!-- markdownlint-enable -->

This is a module to work with
[OpenStreetMap](https://www.openstreetmap.org) data using postgres and
a custom database schema. This code is derived from the [HOT Export
Tool](https://export.hotosm.org/en/v3/),
[osm-fieldwork](https://pypi.org/project/osm-fieldwork/),
[Underpass](https://github.com/hotosm/underpass), and the [Raw Data
API](https://github.com/hotosm/raw-data-api), which is the new FastAPI
backend for the HOT Export Tool.

Since multiple projects need to do data extracts from OpenStreetMap in
a flexible way, this was designed to have a single body of code to
maintain.

## Installation

To install osm-rawdata, you can use pip. Here are three options:

- Directly from the main branch:
  `pip install git+https://github.com/hotosm/osm-rawdata.git`

- Latest on PyPI:
  `pip install osm-rawdata`

- Including the packages required for `importer.py`:
  `pip install osm-rawdata[importer]`

> NOTE that `importer.py` will not work unless the extra dependencies
> are specified using `osm-rawdata[importer]`.

## Using the Container Image

- osm-rawdata scripts can be used via the pre-built container images.
- These images come with all dependencies bundled, so are simple to run.

Run a specific command:

```bash
docker run --rm -v $PWD:/data ghcr.io/hotosm/osm-rawdata:latest geofabrik <flags>
```

Run interactively (to use multiple commands):

```bash
docker run --rm -it -v $PWD:/data ghcr.io/hotosm/osm-rawdata:latest
```

> Note: the output directory should always be /data/... to persist data.

## The Database Schema

This project is heavily dependent on postgres and postgis. This schema
was optimized for data analysis more than for display purposes. The
traditional schema used for OSM shows how it has evolved over the
years: some tags are columns (usually empty), and others get put into
an **hstore** _tags_ column where they have to be accessed
directly. One big change in this database schema is that all the tags
are in a single column, which reduces the data size considerably and
makes them easier to query in a consistent manner. In the past a
developer had to keep track of what was a column and what was in the
_tags_ column, which was inefficient.

This schema has 4 tables, similar to the traditional ones. OSM data is
imported using [osm2pgsql](https://osm2pgsql.org/) with a
[Lua](http://www.lua.org/) script that creates the custom
schema. This module's usage is all read-only, as Underpass can keep
the raw data updated every minute, and we just want to access that
data.

Things get more interesting as this module supports both a local
database and a remote one. They use different query languages. To
simplify this, a configuration file is used, which then generates the
proper query syntax.
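
The idea of one configuration producing two query syntaxes can be sketched as follows. This is an illustration only, assuming the local database is queried with SQL and the remote side accepts a JSON request body; the `config` layout, the function names, the JSONB tags column, and the JSON payload shape are all hypothetical rather than this module's actual formats.

```python
import json

# Hypothetical parsed config: the tables to search and the tags to match.
# An empty list of values means "any value for this tag".
config = {
    "tables": ["ways_poly"],
    "where": {"building": ["yes"], "amenity": []},
}


def to_sql(cfg: dict) -> list[str]:
    """Render the config as one SQL query per table (local database)."""
    clauses = []
    for key, values in cfg["where"].items():
        if values:
            # Values are inlined only to keep the sketch short;
            # real code should use bound parameters instead.
            quoted = ", ".join(f"'{v}'" for v in values)
            clauses.append(f"tags->>'{key}' IN ({quoted})")
        else:
            clauses.append(f"tags->>'{key}' IS NOT NULL")
    where = " OR ".join(clauses)
    return [f"SELECT osm_id, tags FROM {t} WHERE {where}" for t in cfg["tables"]]


def to_json(cfg: dict) -> str:
    """Render the same config as a JSON request body (remote database)."""
    # Illustrative payload layout, not a confirmed remote API format.
    payload = {"filters": {"tags": {"all_geometry": {"join_or": cfg["where"]}}}}
    return json.dumps(payload)


print(to_sql(config)[0])
print(to_json(config))
```

Keeping the filter in a neutral data structure means the caller describes what to extract once, and the choice between local and remote only changes which renderer is used.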

## The Config File

This module reads two different formats that describe the eventual SQL
query. The YAML format was originally used by the Export Tool, but later
abandoned for a JSON format. The osm-fieldwork project adopted the YAML
format before this transition happened, so it uses an
enhanced version to define the queries.

The JSON format is also supported, both for parsing a config file and
for generating the JSON query from a YAML config file.

## The Files

### geofabrik.py

This is a simple utility to download a file from Geofabrik.
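
For orientation, a download of this kind can be sketched with the standard library alone. This is not the interface of `geofabrik.py`: the function names are hypothetical, and the URL pattern is assumed from the public Geofabrik download server's `<region>/<area>-latest.osm.pbf` layout.

```python
from urllib.request import urlretrieve

# Public Geofabrik download server.
BASE_URL = "https://download.geofabrik.de"


def geofabrik_url(region: str, area: str) -> str:
    """Build the URL of the latest extract for an area within a region."""
    return f"{BASE_URL}/{region}/{area}-latest.osm.pbf"


def download(region: str, area: str, outdir: str = ".") -> str:
    """Fetch the extract into outdir and return the local file path."""
    outfile = f"{outdir}/{area}-latest.osm.pbf"
    urlretrieve(geofabrik_url(region, area), outfile)
    return outfile


print(geofabrik_url("africa", "kenya"))
```

Calling `download("africa", "kenya", "/data")` would then fetch the file; extracts can be large, so the URL builder is kept separate to make it easy to check the target before downloading.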

### config.py

This class parses config files in either the JSON or YAML format,
and creates a data structure used later to generate the
database query.

### postgres.py

This class handles working with the postgres database. It sets up the
connections, and handles processing the results from the queries.
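
To give a feel for what processing query results can involve, here is a minimal sketch that turns rows into a GeoJSON FeatureCollection. It is not the code in `postgres.py`: it assumes each row is an `(osm_id, geometry, tags)` tuple in which the geometry is already a GeoJSON string (for example from postgis `ST_AsGeoJSON`) and the tags are a dict; `rows_to_geojson` is a hypothetical name.

```python
import json


def rows_to_geojson(rows: list[tuple[int, str, dict]]) -> dict:
    """Convert (osm_id, geometry_json, tags) rows to a FeatureCollection."""
    features = []
    for osm_id, geometry, tags in rows:
        features.append(
            {
                "type": "Feature",
                # The geometry arrives as a JSON string, so decode it.
                "geometry": json.loads(geometry),
                # Flatten the tags into the properties next to the id.
                "properties": {"osm_id": osm_id, **tags},
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Made-up sample row standing in for a database result.
sample = [(1, '{"type": "Point", "coordinates": [36.8, -1.3]}', {"amenity": "school"})]
print(json.dumps(rows_to_geojson(sample)))
```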


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "osm-rawdata",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "hot, osm, openstreetmap",
    "author": null,
    "author_email": "Rob Savoye <rob.savoye@hotosm.org>,Sam Woodcock <sam.woodcock@hotosm.org>",
    "download_url": "https://files.pythonhosted.org/packages/57/bd/1b565b702635b25bcd951f6ec83a63eaf1feab08ade65c97983ae7948179/osm-rawdata-0.4.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "AGPL-3.0-only",
    "summary": "Make data extracts from OSM data.",
    "version": "0.4.1",
    "project_urls": {
        "documentation": "https://hotosm.github.io/osm-rawdata",
        "homepage": "https://hotosm.github.io/osm-rawdata",
        "repository": "https://github.com/hotosm/osm-rawdata"
    },
    "split_keywords": [
        "hot",
        " osm",
        " openstreetmap"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d670dc9fa7c7da182bf51e3faf7d44dff29b647aa2835e1c70f8b1f5b5f2d278",
                "md5": "3b4c50e0919b75a53cedc3783220cb65",
                "sha256": "5a1dfda1fed66e41a9f9a8853af6e08e868d43d5e0855470cd84f27e9731eddc"
            },
            "downloads": -1,
            "filename": "osm_rawdata-0.4.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3b4c50e0919b75a53cedc3783220cb65",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 54890,
            "upload_time": "2024-12-09T19:54:59",
            "upload_time_iso_8601": "2024-12-09T19:54:59.826730Z",
            "url": "https://files.pythonhosted.org/packages/d6/70/dc9fa7c7da182bf51e3faf7d44dff29b647aa2835e1c70f8b1f5b5f2d278/osm_rawdata-0.4.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "57bd1b565b702635b25bcd951f6ec83a63eaf1feab08ade65c97983ae7948179",
                "md5": "2c115ef3112003af301633047a2cc469",
                "sha256": "47aadecf78ff13ae80779778e2e6ebcd7d5b71452c1787633e92a2d60527b977"
            },
            "downloads": -1,
            "filename": "osm-rawdata-0.4.1.tar.gz",
            "has_sig": false,
            "md5_digest": "2c115ef3112003af301633047a2cc469",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 43163,
            "upload_time": "2024-12-09T19:55:00",
            "upload_time_iso_8601": "2024-12-09T19:55:00.886063Z",
            "url": "https://files.pythonhosted.org/packages/57/bd/1b565b702635b25bcd951f6ec83a63eaf1feab08ade65c97983ae7948179/osm-rawdata-0.4.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-09 19:55:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hotosm",
    "github_project": "osm-rawdata",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "osm-rawdata"
}
        