# cjdb
[![MIT badge](https://img.shields.io/pypi/l/cjdb)](LICENSE) [![PyPI](https://img.shields.io/pypi/v/cjdb)](https://pypi.org/project/cjdb)
`cjdb` is a Python-based importer/exporter of [CityJSONL files (CityJSON Lines)](https://www.cityjson.org/cityjsonl/) to and from a PostgreSQL database.
It requires the [PostGIS](https://postgis.net/) extension.
## Installation
```bash
pip install cjdb
```
It is recommended to install it in an isolated environment.
## Usage
Check our [docs online](https://cityjson.github.io/cjdb/cjdb.html) or run:
```bash
cjdb --help
```
## Quickstart
Sample CityJSON data can be downloaded from [the 3DBAG download service](https://3dbag.nl/).
For example, [download the tile "9-284-556"](https://data.3dbag.nl/cityjson/v20230622/tiles/9/284/556/9-284-556.city.json), within which part of TU Delft is located.
Then, having downloaded the CityJSON file, you need a combination of [cjio](https://github.com/cityjson/cjio) (external CityJSON processing library) and cjdb to import it to a schema in a database. Here is a step-by-step guide:
1. Convert CityJSON to CityJSONL
```bash
cjio --suppress_msg 9-284-556.city.json export jsonl 9-284-556.jsonl
```
2. Create a new database called "testcjdb"
If PostgreSQL is installed, the `createdb` program should be available, so you can run `createdb testcjdb`.
Alternatively, you can use PgAdmin, [see how](https://postgis.net/workshops/postgis-intro/creating_db.html).
3. Import CityJSONL to the database in the schema "cjdb"
```bash
cjdb import -H localhost -U postgres -d testcjdb -s cjdb -f 9-284-556.jsonl
```
**Alternatively, steps 1 and 3 can be combined in a single command:**
```bash
cjio --suppress_msg 9-284-556.city.json export jsonl stdout | cjdb import -H localhost -U postgres -d testcjdb -s cjdb
```
The metadata and the objects can then be found in the tables in the specified schema (`cjdb` in this example).
The password can be specified in the `PGPASSWORD` environment variable. If not specified, the app will prompt for the password.
4. To export from the database, you have two options. You can export the whole database to a CityJSONL file with:
```bash
cjdb export -H localhost -U postgres -d testcjdb -s cjdb -o result.jsonl
```
or export only part of it, using a select query as input. The select query should return the ids of the objects to be exported:
```bash
cjdb export -H localhost -U postgres -d testcjdb -s cjdb -o result.jsonl -q "SELECT 'NL.IMBAG.Pand.1655100000500568' as object_id"
```
5. To convert from CityJSONFeatures back to CityJSON, you can use `cjio`:
```bash
cat /path/to/result.city.jsonl | cjio stdin save /path/to/output.city.json
```
## Using Docker
Build:
```bash
docker build -t cjdb:latest .
```
Run:
```bash
docker run --rm -it cjdb cjdb --help
```
To import files, use the `-v` option to mount a local directory in the container:
```bash
docker run -v {MYDIRECTORY}:/data --rm -it --network=host cjdb cjdb import -H localhost -U postgres -d postgres -W postgres -f /data/5870_ext.jsonl
```
## Important Notes
### Data model
The `cjdb` importer loads the data in accordance with a [specific data model](cjdb/model/README.md).
For example SQL queries on the tables, see [here](cjdb/model/BASICQUERIES.md).
### Indexes
Some indexes are created when a new schema is created (refer to [Data Model](cjdb/model/README.md)).
In addition to these indexes, the user can add more indexes on certain CityObject attributes with the `-x/--attr-index` or the `-px/--partial-attr-index` flags.
We recommend these additional indexes for attributes that are frequently queried.
The second flag creates a partial index with a `not null` condition on the attribute.
This saves disk space when indexing an attribute that is not present in all the imported CityObjects.
This is often the case with CityJSON, because a single dataset can contain different object types with different attributes.
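
To illustrate what a partial index with a `not null` condition looks like, here is a minimal Python sketch that only builds the SQL text. The table name `city_object`, the index naming scheme, the attribute `bouwjaar`, and the helper `attr_index_sql` are assumptions made for this example; the statements that `cjdb` actually issues may differ.

```python
def attr_index_sql(schema, attribute, partial=False, table="city_object"):
    """Build a CREATE INDEX statement on one key of the JSONB `attributes` column.

    `table` and the index name are hypothetical; adapt them to the real data model.
    """
    # Double single quotes so the attribute name is a valid SQL string literal.
    key = attribute.replace("'", "''")
    expr = f"(attributes->>'{key}')"
    # Keep only characters that are safe in an identifier for the index name.
    safe = "".join(c if c.isalnum() else "_" for c in attribute)
    sql = (
        f"CREATE INDEX IF NOT EXISTS {table}_{safe}_idx "
        f'ON "{schema}"."{table}" ({expr})'
    )
    if partial:
        # Partial index: rows lacking the attribute are left out of the index.
        sql += f" WHERE {expr} IS NOT NULL"
    return sql + ";"


print(attr_index_sql("cjdb", "bouwjaar"))
print(attr_index_sql("cjdb", "bouwjaar", partial=True))
```

The only difference between the two statements is the trailing `WHERE ... IS NOT NULL` clause, which is what keeps objects without the attribute out of the index.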
### Structuring the database and its schemas
It is recommended to group together semantically coherent objects, by importing them to the same database schema.
One database can have different schemas.
While the current data model supports importing any types of CityJSON objects together (for example `Building` and `SolitaryVegetationObject`), the data becomes harder for the user to manage.
An example of this would be having different attributes for the same CityObject type (which should be consistent for data coming from the same source).
### Input == CityJSONFeature
The importer works only with [*CityJSONL* files](https://www.cityjson.org/specs/#text-sequences-and-streaming-with-cityjsonfeature), which are CityJSON files decomposed into their *features* (`CityJSONFeature`).
The easiest way to create these from a CityJSON file is with [cjio](https://github.com/cityjson/cjio), by following [these instructions](https://github.com/cityjson/cjio#stdin-and-stdout).
The importer supports 3 kinds of input:
1. a single CityJSONL file (only those as the output of cjio currently work)
1. a directory of CityJSONL files (all files with *jsonl* extensions are located and imported)
1. STDIN using the pipe operator: `cat file.jsonl | cjdb ...`
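
As a rough illustration of the expected layout, the sketch below reads a CityJSONL stream in Python. It assumes that the first line is a `CityJSON` object carrying the metadata and that every following line is a `CityJSONFeature`. The function name `read_cityjsonl` and the sample data are made up for this example, and this is not how `cjdb` itself parses its input.

```python
import io
import json


def read_cityjsonl(stream):
    """Return (header, features) from a text stream of CityJSON Lines."""
    header = json.loads(stream.readline())
    if header.get("type") != "CityJSON":
        raise ValueError("first line must be a CityJSON object")
    features = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        feature = json.loads(line)
        if feature.get("type") != "CityJSONFeature":
            raise ValueError("expected a CityJSONFeature on every following line")
        features.append(feature)
    return header, features


# Hypothetical two-line sample: one metadata line and one feature.
sample = io.StringIO(
    '{"type": "CityJSON", "version": "1.1", "CityObjects": {}, "vertices": []}\n'
    '{"type": "CityJSONFeature", "id": "b1", "CityObjects": {"b1": {"type": "Building"}}, "vertices": []}\n'
)
header, features = read_cityjsonl(sample)
print(header["version"], [f["id"] for f in features])
```

The same function works on an open file or on `sys.stdin`, which mirrors the file and STDIN input kinds listed above.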
### Coordinate Reference Systems
The `cjdb` importer does not allow inconsistent CRSs (coordinate reference systems) within the same database schema. For storing data in different CRSs, you have to create different schemas.
The data needs to be either harmonized beforehand, or the `--transform` flag can be used upon import, to reproject all the geometries to the CRS of the existing schema.
Specifying a 2D CRS (instead of a 3D one) will cause the Z-coordinates to remain unchanged.
**Note:** reprojections slow down the import significantly.
**Note:** Source data with missing `"metadata"/"referenceSystem"` cannot be reprojected due to unknown source reference system.
You can use the `-I/--srid` flag to set the SRID of the input file.
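
To check beforehand which CRS a file declares, a small Python sketch such as the following can read the EPSG code from the first line of a CityJSONL file. It assumes that `referenceSystem` uses the OGC URL form of CityJSON 1.1 (for example `https://www.opengis.net/def/crs/EPSG/0/7415`); the helper name `srid_from_header` is hypothetical and is not part of `cjdb`.

```python
import json
import re


def srid_from_header(header):
    """Return the EPSG code declared in a CityJSON header, or None if absent."""
    reference = header.get("metadata", {}).get("referenceSystem")
    if reference is None:
        return None  # no CRS declared, so the SRID has to be supplied by hand
    match = re.search(r"/EPSG/[^/]+/(\d+)$", reference)
    return int(match.group(1)) if match else None


# Hypothetical first line of a CityJSONL file.
first_line = (
    '{"type": "CityJSON", "version": "1.1", "metadata": '
    '{"referenceSystem": "https://www.opengis.net/def/crs/EPSG/0/7415"}}'
)
print(srid_from_header(json.loads(first_line)))
```

When the function returns `None`, the file has no usable CRS declaration, which is the situation where the `-I/--srid` flag is needed.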
### 3D reprojections
[`pyproj`](https://pyproj4.github.io/pyproj/stable/) is used for CRS reprojections.
While it supports 3D CRS transformations between different systems, sometimes downloading additional [grids](https://pyproj4.github.io/pyproj/stable/transformation_grids.html) is required.
The importer will attempt to download the grids needed for the reprojection, with the following message:
```
Attempting to download additional grids required for CRS transformation.
This can also be done manually, and the files should be put in this folder:
{pyproj_directory}
```
If that fails, the user will have to download the required grids and put them in the printed `{pyproj_directory}` themselves.
### CityJSON Extensions
If [CityJSON Extensions](https://www.cityjson.org/extensions/) are present in the imported files, they can be found listed in the `extensions` column in the `cj_metadata` table.
The [CityJSON specifications](https://www.cityjson.org/specs/#extensions) mention 3 different extendable features, and the `cjdb` importer deals with them as follows:
1. Complex attributes
No action is taken. These attributes end up in the `attributes` JSONB column.
2. Additional root properties
Additional root properties are placed in the `extra properties` JSONB column in the `cj_metadata` table.
3. Additional CityObject type
Additional CityObject types are appended to the list of allowed CityJSON objects.
### CityJSON GeometryTemplate
[Geometry templates](https://www.cityjson.org/specs/1.1.2/#geometry-templates)
are resolved for each object geometry, so that the object in the table ends up with its real-world coordinates (instead of vertex references or relative template coordinates).
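
The arithmetic behind this resolution can be sketched in Python as follows. The sketch assumes the CityJSON conventions as we read them from the specification: integer vertices are turned into real-world coordinates with the `transform` object (`scale` and `translate`), and a template vertex is multiplied by the row-major 4x4 `transformationMatrix` and then offset by the reference point. The function names are hypothetical, and the actual `cjdb` implementation may be organised differently.

```python
def real_world(vertex, transform):
    """Apply the CityJSON `transform` (scale, then translate) to an integer vertex."""
    return [
        v * s + t
        for v, s, t in zip(vertex, transform["scale"], transform["translate"])
    ]


def resolve_template_vertex(template_vertex, matrix, reference_point):
    """Place one template vertex in the world.

    `matrix` is the 16-value, row-major `transformationMatrix`;
    `reference_point` is already in real-world coordinates.
    """
    x, y, z = template_vertex
    homogeneous = (x, y, z, 1.0)
    # Only the first three rows matter for an affine transformation.
    rows = [matrix[i * 4:(i + 1) * 4] for i in range(3)]
    moved = [sum(m * c for m, c in zip(row, homogeneous)) for row in rows]
    return [m + r for m, r in zip(moved, reference_point)]


# Hypothetical values: a matrix that doubles the template size.
transform = {"scale": [0.5, 0.5, 0.5], "translate": [100.0, 200.0, 0.0]}
anchor = real_world([4, 8, 2], transform)
matrix = [2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1]
print(resolve_template_vertex([1.0, 0.0, 0.5], matrix, anchor))
```

Applying this to every vertex of a template yields boundaries that no longer depend on the template or on vertex indices.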
### Data validation
The importer does not validate the structure of the file. It is assumed that the input file is schema-valid ([CityJSON validator](https://validator.cityjson.org/)).
It sends out warnings when:
- CityObject types appear which are defined neither in the main CityJSON specification nor in any of the supplied extensions.
- the specified target CRS does not have the Z-axis defined
- the source dataset does not have a CRS defined at all
### Repeated object IDs
The importer does not check whether an object with a specific ID already exists in the database; every imported object gets a new id. However, at the time of import the importer will detect previously imported files with the same filename. The user can choose to run the import with either the `-g, --ignore-repeated-file` option, to import files with the same filename under a different id, or `--overwrite`, to overwrite *all* previously imported objects with this filename.
## Contributors
This project started as a group project in the [MSc Geomatics at TUDelft](https://geomatics.tudelft.nl/).
The original code for the project can be found [here](https://github.com/leoleonsio/cjdb), and the authors were:
[@cynthiacai56](https://github.com/cynthiacai56), [@LanYan1110](https://github.com/LanYan1110), [@YitongXia](https://github.com/YitongXia), [@Topher2k](https://github.com/Topher2k), [@siebren014](https://github.com/siebren014), [@leoleonsio](https://github.com/leoleonsio)
This version has been improved and will be maintained by [@GinaStavropoulou](https://github.com/GinaStavropoulou) and [@hugoledoux](https://github.com/hugoledoux).