nycdb


* Name: nycdb
* Version: 0.3.3
* Summary: Database of NYC housing data
* Upload time: 2024-01-30 01:22:10
* Requires Python: >=3.8
* License: AGPL-3.0-or-later
* Keywords: nyc, housing, data, postgresql

# NYCDB

**a tool for building a database of NYC housing data**

This is a Python library and cli tool for installing, updating and managing NYCDB, a postgres database of NYC Housing Data.

For more background on this project and links to download copies of the full database dump, visit https://github.com/nycdb/nycdb. We use the term **nycdb** to refer to both the Python software and the running copy of the postgres database.

## Using the cli tool

You will need Python 3.8+ and Postgres. The latest version of nycdb can be installed from PyPI with pip: `python3 -m pip install nycdb`

If the installation is successful, you can view a summary of the tool's options by running `nycdb --help`.

To print a list of datasets: `nycdb --list-datasets`

`nycdb`'s main job is to download datasets and import them into postgres. It does not manage the database for you. You can use the flags `-U/--user`, `-D/--database`, `-P/--password`, and `-H/--host` to instruct nycdb to connect to the correct database. See `nycdb --help` for the defaults.
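When nycdb is driven from a script, the connection flags can be assembled programmatically. The following is a minimal sketch, not part of nycdb itself: the helper name `nycdb_command` and the connection values are hypothetical, and it assumes the flags may follow the dataset name on the command line.

```python
def nycdb_command(action, dataset, user=None, database=None,
                  password=None, host=None):
    """Build an argv list for one nycdb invocation.

    Only flags named in this README are used. Any connection option left
    as None is omitted, so nycdb falls back to its own default.
    """
    cmd = ["nycdb", f"--{action}", dataset]
    options = (
        ("--user", user),
        ("--database", database),
        ("--password", password),
        ("--host", host),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, value]
    return cmd


# Placeholder connection values, not nycdb's defaults.
print(nycdb_command("load", "hpd_violations",
                    user="housing", database="housing_db", host="localhost"))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with passwords or host names.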

Example: downloading, loading, and verifying the dataset **hpd_violations**:

``` sh
nycdb --download hpd_violations
nycdb --load hpd_violations
nycdb --verify hpd_violations
```
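The same three steps can be scripted from Python. This is a sketch under stated assumptions rather than an official API: the function name `import_dataset` is hypothetical, and it assumes that nycdb reports failure through a non-zero exit status, so that `check=True` stops the sequence at the first failing step.

```python
import subprocess

# The three steps from the example above, in the order they must run.
STEPS = ("download", "load", "verify")


def import_dataset(dataset, run=subprocess.run):
    """Run nycdb --download, --load and --verify for one dataset.

    check=True makes subprocess.run raise CalledProcessError on a
    non-zero exit status, so later steps are skipped after a failure.
    The `run` parameter exists only so the function can be exercised
    without nycdb installed.
    """
    for step in STEPS:
        run(["nycdb", f"--{step}", dataset], check=True)
```

With nycdb installed and a reachable database, `import_dataset("hpd_violations")` would issue the same commands as the shell example.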

To delete a previously imported dataset (for example, if you'd like to import an updated version of it), use the `--drop` option. For example, to delete the dataset **hpd_violations**: `nycdb --drop hpd_violations`.

You can also verify all datasets: `nycdb --verify-all`

By default, the downloaded data files are stored in `./data`. Use `--root-dir` to change the location of the data directory.

You can export a `.sql` file for any dataset by using the `--dump` option.

## Development

There are two development workflows: one using python virtual environments and one using docker.

### Using docker and docker-compose

Clone the nycdb repository to your computer, open the terminal, and set your working directory to the location of the cloned nycdb folder using `cd <filepath>`.

To get started, run `docker-compose up`.

The first run takes longer because Docker has to download and build the images. It
will start a Postgres server on port 5432 of your local machine. You can press
<kbd>CTRL</kbd>-<kbd>C</kbd> at any point to stop the server.

In a separate terminal, you can now use the nycdb cli: `docker-compose run nycdb --help`.

You will not have any data loaded when you create your local instance of the db. Use options like `--download` and `--load` to add datasets to your local database, for example: `docker-compose run nycdb --download <dataset>`

You can also open a python3 shell with `docker-compose run --entrypoint=python3 nycdb`, or run the test suite with `docker-compose run --entrypoint="pytest tests" nycdb`.

You may also develop on nycdb itself:

* Any changes you make to the tool's source code will automatically be reflected
  in future invocations of `nycdb` and the test suite.
* The postgres database server is forwarded to localhost:5432 which you can connect to via a desktop client if you like.
* If you don't have a desktop Postgres client, you can always run
  `nycdb --dbshell` to interactively inspect the database with [psql](http://postgresguide.com/utilities/psql.html).

To update the database after adding new packages or dev dependencies, just run `docker-compose up --build --force-recreate --no-deps`. This command will take a bit longer than the regular `docker-compose up` command, but will reinstall packages within the docker container without removing any downloaded files or database data from the docker database.

To stop the database run `docker-compose down`. The downloaded files and database data are stored in docker volumes and are not automatically removed.

However, if you ever want to wipe the database, run `docker-compose down -v`.

### Python3 virtual environments

If you have postgres installed separately, you can use this alternative method without docker:

Set up and activate a virtual environment:

``` sh
python3 -m venv venv
source venv/bin/activate
```

Install nycdb: `pip install -e ./src`

As long as the virtual environment is activated, you can use `nycdb` directly in your shell.

### Adding New Datasets

See the [guide here](ADDING_NEW_DATASETS.md) for the steps to add a new dataset.

### A Note on PostGIS

To use datasets with spatial information such as `boundaries`, you will need to install the extension [PostGIS](https://postgis.net/) and have the program `shp2pgsql` installed and available on your path. `shp2pgsql` is typically installed alongside the PostGIS installation.

The extension can be enabled in your database by executing the SQL `CREATE EXTENSION postgis`.

When `docker-compose up` is executed for the first time, the PostGIS extension will be enabled automatically.
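Outside of docker, it can help to confirm that `shp2pgsql` is actually on your path before loading a spatial dataset. The following pre-flight check is a small sketch using only the Python standard library; the helper name `missing_programs` is hypothetical and not part of nycdb.

```python
import shutil


def missing_programs(names, which=shutil.which):
    """Return the programs from `names` that cannot be found on PATH.

    shutil.which returns None when no matching executable is found.
    The `which` parameter exists only to make the function easy to test.
    """
    return [name for name in names if which(name) is None]


missing = missing_programs(["shp2pgsql"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("shp2pgsql is available")
```

This only checks for the executable. Whether the PostGIS extension is enabled in the target database still has to be verified in Postgres itself.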

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "nycdb",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "nyc,housing,data,postgresql",
    "author": "",
    "author_email": "nycdb <nycdb@riseup.net>",
    "download_url": "https://files.pythonhosted.org/packages/fa/40/576971358f42215ef6f86c16b2f7369c55d5c68df4bfc2bdd320cf4c48a2/nycdb-0.3.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "AGPL-3.0-or-later",
    "summary": "Database of NYC housing data",
    "version": "0.3.3",
    "project_urls": {
        "homepage": "https://github.com/nycdb/nycdb",
        "repository": "https://github.com/nycdb/nycdb"
    },
    "split_keywords": [
        "nyc",
        "housing",
        "data",
        "postgresql"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "db30fa4af14243160240c51dc9dd9858c1ba55204f7f81af358c3b587b3b6c77",
                "md5": "5263b8ec1d8a0d9f8b7dc68e7bb890c3",
                "sha256": "c2486cc9ffa6b25b1cfc42e9bcc0426f8f4ce62c484ee922bfececf69aa9f99f"
            },
            "downloads": -1,
            "filename": "nycdb-0.3.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5263b8ec1d8a0d9f8b7dc68e7bb890c3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 84597,
            "upload_time": "2024-01-30T01:22:08",
            "upload_time_iso_8601": "2024-01-30T01:22:08.307546Z",
            "url": "https://files.pythonhosted.org/packages/db/30/fa4af14243160240c51dc9dd9858c1ba55204f7f81af358c3b587b3b6c77/nycdb-0.3.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fa40576971358f42215ef6f86c16b2f7369c55d5c68df4bfc2bdd320cf4c48a2",
                "md5": "ad2afcf0595e45a08cb3440226953557",
                "sha256": "ec895482a48a33f25d27f6dfac55c3925838e9af614aff6c5ed8013320cb69e4"
            },
            "downloads": -1,
            "filename": "nycdb-0.3.3.tar.gz",
            "has_sig": false,
            "md5_digest": "ad2afcf0595e45a08cb3440226953557",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 49888,
            "upload_time": "2024-01-30T01:22:10",
            "upload_time_iso_8601": "2024-01-30T01:22:10.739287Z",
            "url": "https://files.pythonhosted.org/packages/fa/40/576971358f42215ef6f86c16b2f7369c55d5c68df4bfc2bdd320cf4c48a2/nycdb-0.3.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-30 01:22:10",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nycdb",
    "github_project": "nycdb",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "nycdb"
}
        