intake-duckdb


Name: intake-duckdb
Version: 0.1.2
Summary: DuckDB plugin for Intake
Upload time: 2023-04-19 20:49:41
Author: Blake Rosenthal
Requires Python: >=3.8
License: Copyright (c) 2023, Anaconda, Inc. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Keywords: intake, duckdb, duck
# Intake-DuckDB

[![Build Status](https://github.com/intake/intake-duckdb/actions/workflows/main.yaml/badge.svg)](https://github.com/intake/intake-duckdb/actions)
[![Documentation Status](https://readthedocs.org/projects/intake-duckdb/badge/?version=latest)](http://intake-duckdb.readthedocs.io/en/latest/?badge=latest)

DuckDB Plugin for Intake

## Installation

From PyPI:
```shell
pip install intake-duckdb
```

Or from conda-forge:
```shell
conda install -c conda-forge intake-duckdb
```

## Usage

Load an entire table into a DataFrame:
```python
import intake

# Second argument is a table name; read() returns the whole table.
source = intake.open_duckdb("path/to/dbfile", "tablename")
df = source.read()
```
Or pass a custom SQL query in [valid DuckDB query syntax](https://duckdb.org/docs/sql/query_syntax/select) instead of a table name:
```python
import intake

source = intake.open_duckdb("path/to/dbfile", "SELECT col1, col2 FROM tablename")
df = source.read()
```

You can also iterate over a table in chunks:
```python
import intake

source_chunked = intake.open_duckdb("path/to/dbfile", "tablename", chunks=10)
source_chunked.discover()
for chunk in source_chunked.read_chunked():
    # do something with each chunk
    ...
```
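
As one illustration of what the loop body might do, the sketch below counts rows across chunks without holding them all in memory. It assumes each chunk is a pandas DataFrame; `count_rows` and `demo_chunks` are hypothetical names, and the hand-built frames stand in for `source_chunked.read_chunked()` so the snippet runs without a database.

```python
import pandas as pd


def count_rows(chunks):
    """Return the total number of rows across an iterable of DataFrames."""
    total = 0
    for chunk in chunks:
        # Only one chunk is referenced at a time.
        total += len(chunk)
    return total


# Stand-in for source_chunked.read_chunked(): two small made-up chunks.
demo_chunks = [
    pd.DataFrame({"col1": [1, 2]}),
    pd.DataFrame({"col1": [3]}),
]
total_rows = count_rows(demo_chunks)
```

With a real source, the call would be `count_rows(source_chunked.read_chunked())`.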

DuckDB catalog: create an Intake catalog from a DuckDB backend, with one source per table:
```python
import intake

cat = intake.open_duckdb_cat("path/to/dbfile")

# list the sources in 'cat'
list(cat)

df = cat["tablename"].read()
# collect every chunk of the table into a list
df_chunks = list(cat["tablename"](chunks=10).read_chunked())
```

Run DuckDB queries against other Intake sources in the same catalog, provided those sources produce pandas DataFrames:
```yaml
# cat.yaml
sources:
  csv_source:
    args:
      urlpath: https://data.csv
    description: Remote CSV source
    driver: csv

  duck_source:
    args:
      targets:
        - csv_source
      sql_expr: SELECT col FROM csv_source LIMIT 10
    description: Source referencing other sources in catalog
    driver: duckdb_transform
```
```python
import intake

cat = intake.open_catalog("cat.yaml")
# read() runs sql_expr against the target source and returns the result
df = cat.duck_source.read()
```

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "intake-duckdb",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Blake Rosenthal <brosenthal@anaconda.com>",
    "keywords": "intake,duckdb,duck",
    "author": "Blake Rosenthal",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/12/d8/c0f699275b203fff063575db365c5c4f975fe96dc89f3436628163b1dde7/intake-duckdb-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Copyright (c) 2023, Anaconda, Inc. All rights reserved.  Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:  Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.  Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ",
    "summary": "DuckDB plugin for Intake",
    "version": "0.1.2",
    "split_keywords": [
        "intake",
        "duckdb",
        "duck"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2341dd3031b3a834ff8723ffdcd0430a30a253f75c1cc64de719652e9290d675",
                "md5": "f5c4059ad854080467c778407be0a884",
                "sha256": "0897f57eae10ad9b1f1f573ca80dff02ed8cf2ebf8e45736093f981416efb43e"
            },
            "downloads": -1,
            "filename": "intake_duckdb-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f5c4059ad854080467c778407be0a884",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 7117,
            "upload_time": "2023-04-19T20:49:39",
            "upload_time_iso_8601": "2023-04-19T20:49:39.514956Z",
            "url": "https://files.pythonhosted.org/packages/23/41/dd3031b3a834ff8723ffdcd0430a30a253f75c1cc64de719652e9290d675/intake_duckdb-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "12d8c0f699275b203fff063575db365c5c4f975fe96dc89f3436628163b1dde7",
                "md5": "036d276e9c4a42ea4be9cea8cd0bf6db",
                "sha256": "7c395b8c5fb8bf1f9765e9481b81df8092d05d2fca991100e7b9ee0f7f03d5c4"
            },
            "downloads": -1,
            "filename": "intake-duckdb-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "036d276e9c4a42ea4be9cea8cd0bf6db",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 5336,
            "upload_time": "2023-04-19T20:49:41",
            "upload_time_iso_8601": "2023-04-19T20:49:41.187171Z",
            "url": "https://files.pythonhosted.org/packages/12/d8/c0f699275b203fff063575db365c5c4f975fe96dc89f3436628163b1dde7/intake-duckdb-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-19 20:49:41",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "intake-duckdb"
}
        