udata-hydra-csvapi 0.2.3

- Summary: API for CSV converted by udata-hydra
- Author: data.gouv.fr
- License: MIT
- Requires Python: <4.0,>=3.11
- Upload time: 2024-11-28 14:24:28

# Api-tabular

This connects to [hydra](https://github.com/datagouv/hydra) and serves the converted CSVs as an API.

## Run locally

Start [hydra](https://github.com/datagouv/hydra) via `docker compose`.

Launch this project:

```shell
docker compose up
```

You can now access the raw PostgREST API on http://localhost:8080.

Now you can launch the proxy (i.e. the app):

```shell
poetry install
poetry run adev runserver -p8005 api_tabular/app.py        # API for the CSV files apified by udata-hydra
poetry run adev runserver -p8005 api_tabular/metrics.py    # API for udata's metrics
```

You can then query PostgREST through the proxy using a `resource_id` (see below). The test `resource_id` is `aaaaaaaa-1111-bbbb-2222-cccccccccccc`.

## API

### Meta information on a resource

```shell
curl http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/
```

```json
{
  "created_at": "2023-04-21T22:54:22.043492+00:00",
  "url": "https://data.gouv.fr/datasets/example/resources/fake.csv",
  "links": [
    {
      "href": "/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/profile/",
      "type": "GET",
      "rel": "profile"
    },
    {
      "href": "/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/data/",
      "type": "GET",
      "rel": "data"
    },
    {
      "href": "/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/swagger/",
      "type": "GET",
      "rel": "swagger"
    }
  ]
}
```
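
The `links` array can be used to navigate to the other endpoints. Here is a minimal Python sketch, assuming the third-party `requests` library is installed and the proxy from the section above is running:

```python
import requests

BASE = "http://localhost:8005"
RESOURCE_ID = "aaaaaaaa-1111-bbbb-2222-cccccccccccc"

# Fetch the meta information for the test resource.
meta = requests.get(f"{BASE}/api/resources/{RESOURCE_ID}/").json()
print(meta["created_at"], meta["url"])

# Build a {rel: href} map from the "links" array and follow the profile link.
links = {link["rel"]: link["href"] for link in meta["links"]}
profile = requests.get(BASE + links["profile"]).json()
print(profile["profile"]["header"])
```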

### Profile (csv-detective output) for a resource

```shell
curl http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/profile/
```

```json
{
  "profile": {
    "header": [
        "id",
        "score",
        "decompte",
        "is_true",
        "birth",
        "liste"
    ]
  },
  "...": "..."
}
```

### Data for a resource (i.e. the resource API)

```shell
curl http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/data/
```

```json
{
  "data": [
    {
        "__id": 1,
        "id": " 8c7a6452-9295-4db2-b692-34104574fded",
        "score": 0.708,
        "decompte": 90,
        "is_true": false,
        "birth": "1949-07-16",
        "liste": "[0]"
    },
    ...
  ],
  "links": {
      "profile": "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/profile/",
      "swagger": "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/swagger/",
      "next": "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/data/?page=2&page_size=20",
      "prev": null
  },
  "meta": {
      "page": 1,
      "page_size": 20,
      "total": 1000
  }
}
```
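
For example, a short Python sketch (again assuming the `requests` library) that fetches the first page and reads the pagination metadata:

```python
import requests

BASE = "http://localhost:8005"
RESOURCE_ID = "aaaaaaaa-1111-bbbb-2222-cccccccccccc"

payload = requests.get(f"{BASE}/api/resources/{RESOURCE_ID}/data/").json()
meta = payload["meta"]
print(f"{meta['total']} rows in total, {meta['page_size']} per page")
for row in payload["data"]:
    print(row["__id"], row["score"], row["birth"])
```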

This endpoint can be queried with the following operators as query string parameters (replace `column_name` with the name of an actual column), provided the column type allows it (see the swagger for each column's allowed parameters):

```
# sort by column
column_name__sort=asc
column_name__sort=desc

# exact value
column_name__exact=value

# differs
column_name__differs=value

# contains (for strings only)
column_name__contains=value

# in (value in list)
column_name__in=value1,value2,value3

# less
column_name__less=value

# greater
column_name__greater=value

# strictly less
column_name__strictly_less=value

# strictly greater
column_name__strictly_greater=value

# group by values
column_name__groupby

# count values
column_name__count

# mean / average
column_name__avg

# minimum
column_name__min

# maximum
column_name__max

# sum
column_name__sum
```

> /!\ WARNING: aggregation requests are only available for resources that are listed in the `ALLOW_AGGREGATION` list of the config file.

> NB: passing an aggregation operator (`count`, `avg`, `min`, `max`, `sum`) returns a column named `<column_name>__<operator>` (for instance, `?birth__groupby&score__sum` returns a list of dicts with the keys `birth` and `score__sum`).

For instance:
```shell
curl "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/data/?score__greater=0.9&decompte__exact=13"
```
returns
```json
{
  "data": [
    {
      "__id": 52,
      "id": " 5174f26d-d62b-4adb-a43a-c3b6288fa2f6",
      "score": 0.985,
      "decompte": 13,
      "is_true": false,
      "birth": "1980-03-23",
      "liste": "[0]"
    },
    {
      "__id": 543,
      "id": " 8705df7c-8a6a-49e2-9514-cf2fb532525e",
      "score": 0.955,
      "decompte": 13,
      "is_true": true,
      "birth": "1965-02-06",
      "liste": "[0, 1, 2]"
    }
  ],
  "links": {
    "profile": "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/profile/",
    "swagger": "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/swagger/",
    "next": null,
    "prev": null
  },
  "meta": {
    "page": 1,
    "page_size": 20,
    "total": 2
  }
}
```
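
The same filtered request can be built in Python by passing the operators as query parameters, which avoids quoting the `&` separators by hand. A minimal sketch, assuming the third-party `requests` library:

```python
import requests

BASE = "http://localhost:8005"
RESOURCE_ID = "aaaaaaaa-1111-bbbb-2222-cccccccccccc"

payload = requests.get(
    f"{BASE}/api/resources/{RESOURCE_ID}/data/",
    params={"score__greater": 0.9, "decompte__exact": 13},
).json()
# With the test resource this matches the two rows shown above.
print(payload["meta"]["total"], [row["__id"] for row in payload["data"]])
```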

With filters and aggregators (filtering is always applied **before** aggregation, regardless of the order of the parameters):
```shell
curl "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/data/?decompte__groupby&birth__less=1996&score__avg"
```
i.e. `decompte` and the average of `score` for all rows where `birth <= "1996"`, grouped by `decompte`. This returns
```json
{
    "data": [
        {
            "decompte": 55,
            "score__avg": 0.7123333333333334
        },
        {
            "decompte": 27,
            "score__avg": 0.6068888888888889
        },
        {
            "decompte": 23,
            "score__avg": 0.4603333333333334
        },
        ...
    ]
}
```

Pagination is handled with the `page` and `page_size` query parameters:
```shell
curl "http://localhost:8005/api/resources/aaaaaaaa-1111-bbbb-2222-cccccccccccc/data/?page=2&page_size=30"
```
```
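
To retrieve a full resource, you can follow the `next` link from page to page until it is `null`, as in this Python sketch (assuming `requests`):

```python
import requests

BASE = "http://localhost:8005"
RESOURCE_ID = "aaaaaaaa-1111-bbbb-2222-cccccccccccc"

url = f"{BASE}/api/resources/{RESOURCE_ID}/data/?page_size=30"
rows = []
while url:
    payload = requests.get(url).json()
    rows.extend(payload["data"])
    url = payload["links"]["next"]  # None (JSON null) on the last page
print(f"fetched {len(rows)} rows")
```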


## Contributing

### Pre-commit hook

This repository uses a [pre-commit](https://pre-commit.com/) hook which lints and formats code before each commit.
Please install it with:
```shell
poetry run pre-commit install
```

### Lint and format code

To lint, format and sort imports, this repository uses [Ruff](https://astral.sh/ruff/).
You can run the following command to lint and format the code:
```shell
poetry run ruff check --fix && poetry run ruff format
```

### Releases

The release process uses [bump'X](https://github.com/datagouv/bumpx).

            
