kedro-datasets


| Field | Value |
| --- | --- |
| Name | kedro-datasets |
| Version | 5.1.0 |
| Summary | Kedro-Datasets is where you can find all of Kedro's data connectors. |
| upload_time | 2024-10-18 15:20:08 |
| author | Kedro |
| maintainer | None |
| home_page | None |
| docs_url | None |
| requires_python | >=3.10 |
| license | Apache Software License (Apache 2.0) |
| requirements | No requirements were recorded. |

# Kedro-Datasets

<!-- Note that the contents of this file are also used in the documentation, see docs/source/index.md -->

[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/kedro-org/kedro-plugins/blob/main/LICENSE.md)
[![Python Version](https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12-blue.svg)](https://pypi.org/project/kedro-datasets/)
[![PyPI Version](https://badge.fury.io/py/kedro-datasets.svg)](https://pypi.org/project/kedro-datasets/)
[![Code Style: Black](https://img.shields.io/badge/code%20style-black-black.svg)](https://github.com/ambv/black)

Welcome to `kedro_datasets`, the home of Kedro's data connectors. Here you will find the `AbstractDataset` implementations, created by QuantumBlack and external contributors, that power Kedro's `DataCatalog`.

## Installation

`kedro-datasets` is a Python plugin. To install it:

```bash
pip install kedro-datasets
```

### Install dependencies at a group level

Datasets are organised into groups, e.g. `pandas`, `spark` and `pickle`. Each group contains a collection of datasets, e.g. `pandas.CSVDataset`, `pandas.ParquetDataset` and more. You can install the dependencies for an entire group of datasets as follows:

```bash
pip install "kedro-datasets[<group>]"
```

This installs Kedro-Datasets and the dependencies of every dataset in the group. For example, if your workflow depends on the datasets in the `pandas` group, run `pip install "kedro-datasets[pandas]"` to install Kedro-Datasets and the dependencies for the datasets in the [`pandas` group](https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets/kedro_datasets/pandas).

### Install dependencies at a type level

To limit installation to the dependencies of a single dataset:

```bash
pip install "kedro-datasets[<group>-<dataset>]"
```

For example, your workflow might require the `pandas.ExcelDataset`, so to install its dependencies, run `pip install "kedro-datasets[pandas-exceldataset]"`.
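
To see which extras an installed release actually declares, you can read the package metadata. The following is a minimal sketch using only the standard library; the helper name `available_extras` is hypothetical and not part of Kedro-Datasets.

```python
from importlib import metadata


def available_extras(distribution: str) -> list[str]:
    """Return the extras declared by an installed distribution, sorted by name."""
    try:
        info = metadata.metadata(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    # "Provides-Extra" appears once per extra in the core metadata.
    return sorted(info.get_all("Provides-Extra") or [])


# Prints an empty list when kedro-datasets is not installed.
extras = available_extras("kedro-datasets")
print(extras)
```

Any name in the printed list can be used inside the square brackets of the `pip install` commands above.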

```{note}
From `kedro-datasets` version 3.0.0 onwards, the names of the optional dataset-level dependencies have been normalised to follow [PEP 685](https://peps.python.org/pep-0685/). The '.' character has been replaced with a '-' character and the names are in lowercase. For example, if you had `kedro-datasets[pandas.ExcelDataset]` in your requirements file, it would have to be changed to `kedro-datasets[pandas-exceldataset]`.
```
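
If you have older requirements files to migrate, the renaming can be scripted. The sketch below applies the normalisation rule from PEP 685 (collapse runs of `-`, `_` and `.` into a single `-`, then lowercase) to the extras of a requirement string. The function names are hypothetical, and the parsing is deliberately simple: it assumes a single `[...]` block and does not handle environment markers inside it.

```python
import re


def normalise_extra(name: str) -> str:
    """Normalise one extra name as described in PEP 685."""
    return re.sub(r"[-_.]+", "-", name).lower()


def normalise_requirement(requirement: str) -> str:
    """Rewrite the extras in a requirement such as 'pkg[a.B,c_D]>=1.0'."""
    requirement = requirement.strip()
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]*)\](.*)", requirement)
    if match is None:
        # No extras block, so there is nothing to normalise.
        return requirement
    package, extras, rest = match.groups()
    normalised = ",".join(
        normalise_extra(extra.strip()) for extra in extras.split(",") if extra.strip()
    )
    return f"{package}[{normalised}]{rest}"


print(normalise_requirement("kedro-datasets[pandas.ExcelDataset]"))
# kedro-datasets[pandas-exceldataset]
```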

## What `AbstractDataset` implementations are supported?

We support a range of data connectors, including CSV, Excel, Parquet, Feather, HDF5, JSON, Pickle, SQL Tables, SQL Queries, Spark DataFrames and more. There is also support for working with images.

These data connectors are built on the APIs of `pandas`, `spark`, `networkx`, `matplotlib`, `yaml` and more.

[The Data Catalog](https://docs.kedro.org/en/stable/data/data_catalog.html) allows you to work with a range of file formats on local file systems, network file systems, cloud object stores, and Hadoop.

Here is a full list of [supported data connectors and APIs](https://docs.kedro.org/projects/kedro-datasets/en/kedro-datasets-2.0.0/api/kedro_datasets.html).
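
As an illustration of how a connector is used directly from Python, the sketch below saves a small DataFrame with `pandas.CSVDataset` and loads it back. It assumes `pandas` and the `pandas` dataset group are installed; the helper name `csv_round_trip`, the file name and the sample data are made up for the example, and the import guard exists only so the snippet also runs where those packages are missing.

```python
import importlib.util
import tempfile
from pathlib import Path


def csv_round_trip():
    """Save a small DataFrame with CSVDataset and load it back."""
    import pandas as pd
    from kedro_datasets.pandas import CSVDataset

    frame = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
    with tempfile.TemporaryDirectory() as tmp:
        # The dataset wraps a file path; save() writes it and load() reads it.
        dataset = CSVDataset(filepath=str(Path(tmp) / "example.csv"))
        dataset.save(frame)
        return dataset.load()


required = ("pandas", "kedro_datasets")
if all(importlib.util.find_spec(name) is not None for name in required):
    loaded = csv_round_trip()
    print(loaded)
else:
    loaded = None
    print('Missing dependencies, try: pip install "kedro-datasets[pandas]"')
```

In a Kedro project you would normally declare the dataset in the Data Catalog instead of constructing it by hand; the direct form is useful for quick checks and tests.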

## How can I create my own `AbstractDataset` implementation?

Take a look at our [instructions on how to create your own `AbstractDataset` implementation](https://docs.kedro.org/en/stable/data/how_to_create_a_custom_dataset.html).

## Can I contribute?

Yes! Want to help build Kedro-Datasets? Check out our guide to [contributing](https://github.com/kedro-org/kedro-plugins/blob/main/kedro-datasets/CONTRIBUTING.md).

## What licence do you use?

Kedro-Datasets is licensed under the [Apache 2.0](https://github.com/kedro-org/kedro-plugins/blob/main/LICENSE.md) License.

## Python version support policy
* The [Kedro-Datasets](https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets) package follows the [NEP 29](https://numpy.org/neps/nep-0029-deprecation_policy.html) Python version support policy.

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "kedro-datasets",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": "Kedro",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/3c/f4/6492a56b1f5b0d6c8dfbf55b0793e7f68a2089883522484abac33d1ed722/kedro_datasets-5.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License (Apache 2.0)",
    "summary": "Kedro-Datasets is where you can find all of Kedro's data connectors.",
    "version": "5.1.0",
    "project_urls": {
        "Documentation": "https://docs.kedro.org",
        "Source": "https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets",
        "Tracker": "https://github.com/kedro-org/kedro-plugins/issues"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3f51bdb760c9b5c23854c73c4dae0d3757ef1f9b9423f2b15199db26dcdcd6c1",
                "md5": "614b19bbc374dd9f56c7416a0342bc74",
                "sha256": "6636b6fb7d469a04b38e1b37d898a31705a761687967439dea2f2ff9fa0e10ae"
            },
            "downloads": -1,
            "filename": "kedro_datasets-5.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "614b19bbc374dd9f56c7416a0342bc74",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 191478,
            "upload_time": "2024-10-18T15:20:06",
            "upload_time_iso_8601": "2024-10-18T15:20:06.442905Z",
            "url": "https://files.pythonhosted.org/packages/3f/51/bdb760c9b5c23854c73c4dae0d3757ef1f9b9423f2b15199db26dcdcd6c1/kedro_datasets-5.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3cf46492a56b1f5b0d6c8dfbf55b0793e7f68a2089883522484abac33d1ed722",
                "md5": "fafb58f541a2ef0e58e57aff55216af5",
                "sha256": "f958c3c8c4d7f1c97ebf36d747255374d312d430fad4643b1d7f9eed1bcf574a"
            },
            "downloads": -1,
            "filename": "kedro_datasets-5.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "fafb58f541a2ef0e58e57aff55216af5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 107856,
            "upload_time": "2024-10-18T15:20:08",
            "upload_time_iso_8601": "2024-10-18T15:20:08.407916Z",
            "url": "https://files.pythonhosted.org/packages/3c/f4/6492a56b1f5b0d6c8dfbf55b0793e7f68a2089883522484abac33d1ed722/kedro_datasets-5.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-18 15:20:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kedro-org",
    "github_project": "kedro-plugins",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "kedro-datasets"
}
        