graphique


* Name: graphique
* Version: 2.0
* Summary: GraphQL service for python dataframes and parquet datasets.
* Upload time: 2025-10-16 02:37:12
* Requires Python: >=3.10
* Keywords: graphql, arrow, parquet

[![image](https://img.shields.io/pypi/v/graphique.svg)](https://pypi.org/project/graphique/)
![image](https://img.shields.io/pypi/pyversions/graphique.svg)
[![image](https://pepy.tech/badge/graphique)](https://pepy.tech/project/graphique)
![image](https://img.shields.io/pypi/status/graphique.svg)
[![build](https://github.com/coady/graphique/actions/workflows/build.yml/badge.svg)](https://github.com/coady/graphique/actions/workflows/build.yml)
[![image](https://codecov.io/gh/coady/graphique/branch/main/graph/badge.svg)](https://codecov.io/gh/coady/graphique/)
[![CodeQL](https://github.com/coady/graphique/actions/workflows/github-code-scanning/codeql/badge.svg)](https://github.com/coady/graphique/actions/workflows/github-code-scanning/codeql)
[![image](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![image](https://mypy-lang.org/static/mypy_badge.svg)](https://mypy-lang.org/)

[GraphQL](https://graphql.org) service for [ibis](https://ibis-project.org) dataframes, [arrow](https://arrow.apache.org) tables, and [parquet](https://parquet.apache.org) datasets. The schema for a query API is derived automatically.

## Version 2
When this project started, there was no out-of-core execution engine with performance comparable to [PyArrow](https://arrow.apache.org/docs/python/index.html), so graphique effectively included its own, built on PyArrow datasets and [Acero](https://arrow.apache.org/docs/python/api/acero.html).

Since then, the ecosystem has grown considerably with [DuckDB](https://duckdb.org), [DataFusion](https://datafusion.apache.org), and [Ibis](https://ibis-project.org). As of version 2, graphique is based on `ibis`, which provides a common dataframe API for multiple backends. This gives graphique a default but configurable backend.

As a major version upgrade, version 2 includes incompatible changes from version 1. However, the overall API remains largely the same.

## Usage
There is an example app which reads a parquet dataset.
```console
env PARQUET_PATH=... uvicorn graphique.service:app
```

Open http://localhost:8000/ to try out the API in [GraphiQL](https://github.com/graphql/graphiql/tree/main/packages/graphiql#readme). There is a test fixture at `./tests/fixtures/zipcodes.parquet`.

```console
env PARQUET_PATH=... strawberry export-schema graphique.service:app.schema
```
This outputs the GraphQL schema.
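
Besides GraphiQL, the service can be queried programmatically with an ordinary GraphQL-over-HTTP POST. The following is a minimal sketch using only the Python standard library. It assumes the example app is serving at `http://localhost:8000/` as above; `graphql_request` and `run_query` are illustrative helper names, not part of graphique.

```python
import json
from urllib.request import Request, urlopen

URL = "http://localhost:8000/"  # assumed: example app started with uvicorn's defaults


def graphql_request(query: str, url: str = URL) -> Request:
    """Build a GraphQL-over-HTTP POST request with a JSON body."""
    body = json.dumps({"query": query}).encode()
    return Request(url, data=body, headers={"Content-Type": "application/json"})


def run_query(query: str, url: str = URL) -> dict:
    """Send the query and decode the JSON response (requires a running server)."""
    with urlopen(graphql_request(query, url)) as response:
        return json.load(response)


request = graphql_request("{ count }")  # `count`: number of rows in the table
# run_query("{ count }") would return the decoded response once the service is up
```

Building the request separately from sending it makes the payload easy to inspect without a server running.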

### Configuration
The example app uses [Starlette's config](https://www.starlette.io/config/), which reads settings from environment variables or a `.env` file.

* PARQUET_PATH: path to the parquet directory or file
* FEDERATED = '': field name to extend type `Query` with a federated `Table`
* METRICS = False: include timings from apollo tracing extension
* COLUMNS = None: list of names, or mapping of aliases, of columns to select
* FILTERS = None: json `filter` query for which rows to read at startup

Configuration options exist to provide a convenient no-code solution, but are subject to change in the future. Using a custom app is recommended for production usage.
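
As a rough sketch of how these settings might be supplied from Python rather than a shell, the helper below prepares the command and environment for the example app. The name `example_app_command` is hypothetical, and the sketch assumes `METRICS` is parsed as a boolean by Starlette's config, which accepts `true` or `false`.

```python
import os
import sys


def example_app_command(parquet_path: str, metrics: bool = False) -> tuple[list[str], dict[str, str]]:
    """Return the command and environment for running the example app."""
    # copy the current environment and overlay the graphique settings
    env = dict(os.environ, PARQUET_PATH=parquet_path, METRICS=str(metrics).lower())
    command = [sys.executable, "-m", "uvicorn", "graphique.service:app"]
    return command, env


command, env = example_app_command("tests/fixtures/zipcodes.parquet", metrics=True)
# subprocess.run(command, env=env) would start the server with this configuration
```

The same two variables could equally be written to a `.env` file in the working directory.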

### App
For more options, create a custom [ASGI](https://asgi.readthedocs.io/en/latest/index.html) app. Call graphique's `GraphQL` on an ibis [Table](https://ibis-project.org/reference/expression-tables) or arrow [Dataset](https://arrow.apache.org/docs/python/api/dataset.html).
Supply a mapping of names to datasets to serve multiple roots and to enable federation.

```python
import ibis
from graphique import GraphQL

# "data.parquet" is a placeholder path; any ibis.read_* function works,
# as does ibis.connect(...).table(...) or pyarrow.dataset.dataset(...)
source = ibis.read_parquet("data.parquet")
# apply initial projections or filters to `source`
app = GraphQL(source)  # Table is root query type

# alternative: Tables on federated fields ("name" is a placeholder field name)
# app = GraphQL.federated({"name": source}, keys={"name": []})
```

Start it like any other ASGI app.

```console
uvicorn <module>:app
```

### API
#### types
* `Dataset`: interface for an ibis table or arrow dataset.
* `Table`: implements the `Dataset` interface. Adds typed `row`, `columns`, and `filter` fields from introspecting the schema.
* `Column`: interface for an ibis column. Each data type has a corresponding column implementation: Boolean, Int, BigInt, Float, Decimal, Date, Datetime, Time, Duration, Base64, String, Array, Struct. All columns have a `values` field for their list of scalars. Additional fields vary by type.
* `Row`: scalar fields. Tables are column-oriented, and graphique encourages that usage for performance. A single `row` field is provided for convenience, but a field for a list of rows is not. Requesting parallel columns is far more efficient.
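
To illustrate the column-oriented style, here is a small sketch that builds a query for parallel columns and, if rows are really needed, transposes the response on the client. The column names `state` and `city` and the response data are made up for illustration; `columns_query` and `rows_from_columns` are hypothetical helpers, not graphique functions.

```python
def columns_query(names: list[str]) -> str:
    """Request `count` plus the `values` of each named column."""
    fields = " ".join(f"{name} {{ values }}" for name in names)
    return f"{{ count columns {{ {fields} }} }}"


def rows_from_columns(columns: dict) -> list[dict]:
    """Transpose a `columns` response into a list of row dicts."""
    names = list(columns)
    values = (columns[name]["values"] for name in names)
    return [dict(zip(names, row)) for row in zip(*values)]


query = columns_query(["state", "city"])  # hypothetical column names
# made-up response data, in the shape the query above asks for
data = {"columns": {"state": {"values": ["CA", "NY"]}, "city": {"values": ["Fresno", "Albany"]}}}
rows = rows_from_columns(data["columns"])
```

Keeping the transposition on the client leaves the service free to return each column as a single contiguous list.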

#### selection
* `slice`: contiguous selection of rows
* `filter`: select rows by predicates
* `join`: join tables by key columns
* `take`: rows by index
* `dropNull`: remove rows with nulls

#### projection
* `project`: project columns with expressions
* `columns`: provides a field for every `Column` in the schema
* `column`: access a column of any type by name
* `row`: provides a field for each scalar of a single row
* `cast`: cast column types
* `fillNull`: fill null values

#### aggregation
* `group`: group by given columns, and aggregate the others
* `distinct`: group with all columns
* `runs`: provisionally group by adjacency
* `unnest`: unnest an array column
* `count`: number of rows

#### ordering
* `order`: sort table by given columns
* options `limit` and `dense`: select rows with smallest or largest values

### Performance
Performance is dependent on the [ibis backend](https://ibis-project.org/backends/duckdb), which defaults to [duckdb](https://duckdb.org/). There are no internal Python loops. Scalars do not become Python types until serialized.

[PyArrow](https://arrow.apache.org/docs/python/) is also used for partitioned dataset optimizations, and for any feature which ibis does not support. Table fields are lazily evaluated up until scalars are reached, and automatically cached as needed for multiple fields.
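
The general pattern of lazy evaluation with caching can be sketched in plain Python. This is not graphique's implementation, only an illustration of the idea: work is deferred until values are requested, then reused by every field that needs them. `LazyColumn` and `expensive` are hypothetical names.

```python
from functools import cached_property


class LazyColumn:
    """Holds a recipe for computing values; runs it once, on first access."""

    def __init__(self, compute):
        self._compute = compute  # zero-argument callable, not run yet

    @cached_property
    def values(self) -> list:
        return self._compute()  # result is stored on the instance afterwards


calls = []


def expensive() -> list:
    calls.append("run")  # record each time the work is actually performed
    return [3, 1, 2]


column = LazyColumn(expensive)  # nothing is evaluated at construction
smallest, largest = min(column.values), max(column.values)  # two uses, one evaluation
```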

## Installation
```console
pip install graphique[server]
```

## Dependencies
* ibis-framework (with duckdb or other backend)
* strawberry-graphql[asgi,cli]
* pyarrow
* isodate
* uvicorn (or other [ASGI server](https://asgi.readthedocs.io/en/latest/implementations.html))

## Tests
100% branch coverage.

```console
pytest [--cov]
```

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "graphique",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "graphql, arrow, parquet",
    "author": null,
    "author_email": "Aric Coady <aric.coady@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/2f/b7/125e01e96e94de272ffaf03e87c49f7f61d62cb68247e33cf5c07129d63d/graphique-2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "GraphQL service for python dataframes and parquet datasets.",
    "version": "2.0",
    "project_urls": {
        "Changelog": "https://github.com/coady/graphique/blob/main/CHANGELOG.md",
        "Documentation": "https://coady.github.io/graphique",
        "Homepage": "https://github.com/coady/graphique",
        "Issues": "https://github.com/coady/graphique/issues"
    },
    "split_keywords": [
        "graphql",
        " arrow",
        " parquet"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c2e8f70b2b50aeb39f98796bc26ffcf7cff7c500b33bac77e4f62da3097a6dfb",
                "md5": "2fa1a839fd7969ab4a97eb875a4befcc",
                "sha256": "9e819da37d48555cb318fd6dc8fd0b32fed5e0abd15c026ce04750f53d7d45c7"
            },
            "downloads": -1,
            "filename": "graphique-2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2fa1a839fd7969ab4a97eb875a4befcc",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 23215,
            "upload_time": "2025-10-16T02:37:10",
            "upload_time_iso_8601": "2025-10-16T02:37:10.656650Z",
            "url": "https://files.pythonhosted.org/packages/c2/e8/f70b2b50aeb39f98796bc26ffcf7cff7c500b33bac77e4f62da3097a6dfb/graphique-2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2fb7125e01e96e94de272ffaf03e87c49f7f61d62cb68247e33cf5c07129d63d",
                "md5": "2508959aa9f54f931d781a431e755028",
                "sha256": "a80f3febc63f53af8ec6bf564633c2f989119896636e86e15937463adf1c5bfb"
            },
            "downloads": -1,
            "filename": "graphique-2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "2508959aa9f54f931d781a431e755028",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 29919,
            "upload_time": "2025-10-16T02:37:12",
            "upload_time_iso_8601": "2025-10-16T02:37:12.206136Z",
            "url": "https://files.pythonhosted.org/packages/2f/b7/125e01e96e94de272ffaf03e87c49f7f61d62cb68247e33cf5c07129d63d/graphique-2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-16 02:37:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "coady",
    "github_project": "graphique",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "graphique"
}
        