deepecho


| Field                   | Value                                                        |
| ----------------------- | ------------------------------------------------------------ |
| Name                    | deepecho                                                     |
| Version                 | 0.6.0                                                        |
| home_page               | None                                                         |
| Summary                 | Create sequential synthetic data of mixed types using a GAN. |
| upload_time             | 2024-04-11 02:18:36                                          |
| maintainer              | None                                                         |
| docs_url                | None                                                         |
| author                  | None                                                         |
| requires_python         | `<3.13,>=3.8`                                                |
| license                 | BSL-1.1                                                      |
| keywords                | deepecho, DeepEcho                                           |
| VCS                     |                                                              |
| bugtrack_url            |                                                              |
| requirements            | No requirements were recorded.                               |
| Travis-CI               | No Travis.                                                   |
| coveralls test coverage | No coveralls.                                                |

<div align="center">
<br/>
<p align="center">
    <i>This repository is part of <a href="https://sdv.dev">The Synthetic Data Vault Project</a>, a project from <a href="https://datacebo.com">DataCebo</a>.</i>
</p>

[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
[![PyPi Shield](https://img.shields.io/pypi/v/deepecho.svg)](https://pypi.python.org/pypi/deepecho)
[![Tests](https://github.com/sdv-dev/DeepEcho/workflows/Run%20Tests/badge.svg)](https://github.com/sdv-dev/DeepEcho/actions?query=workflow%3A%22Run+Tests%22+branch%3Amain)
[![Downloads](https://pepy.tech/badge/deepecho)](https://pepy.tech/project/deepecho)
[![Coverage Status](https://codecov.io/gh/sdv-dev/DeepEcho/branch/main/graph/badge.svg)](https://codecov.io/gh/sdv-dev/DeepEcho)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/sdv-dev/DeepEcho/main?filepath=tutorials/timeseries_data)
[![Slack](https://img.shields.io/badge/Slack%20Workspace-Join%20now!-36C5F0?logo=slack)](https://bit.ly/sdv-slack-invite)

<div align="left">
<br/>
<p align="center">
<a href="https://github.com/sdv-dev/DeepEcho">
<img align="center" width=40% src="https://github.com/sdv-dev/SDV/blob/stable/docs/images/DeepEcho-DataCebo.png"></img>
</a>
</p>
</div>

</div>

# Overview

**DeepEcho** is a **Synthetic Data Generation** Python library for **mixed-type**, **multivariate
time series**. It provides:

1. Multiple models based both on **classical statistical modeling** of time series and the latest
   in **Deep Learning** techniques.
2. A robust [benchmarking framework](https://github.com/sdv-dev/SDGym) for evaluating these methods
   on multiple datasets and with multiple metrics.
3. Ability for **Machine Learning researchers** to submit new methods following our `model` and
   `sample` API and get evaluated.

| Important Links                               |                                                                      |
| --------------------------------------------- | -------------------------------------------------------------------- |
| :computer: **[Website]**                      | Check out the SDV Website for more information about the project.    |
| :orange_book: **[SDV Blog]**                  | Regular publishing of useful content about Synthetic Data Generation. |
| :book: **[Documentation]**                    | Quickstarts, User and Development Guides, and API Reference.         |
| :octocat: **[Repository]**                    | The link to the GitHub Repository of this library.                   |
| :keyboard: **[Development Status]**           | This software is in its Pre-Alpha stage.                             |
| [![][Slack Logo] **Community**][Community]    | Join our Slack Workspace for announcements and discussions.          |
| [![][MyBinder Logo] **Tutorials**][Tutorials] | Run the SDV Tutorials in a Binder environment.                       |

[Website]: https://sdv.dev
[SDV Blog]: https://sdv.dev/blog
[Documentation]: https://sdv.dev/SDV
[Repository]: https://github.com/sdv-dev/DeepEcho
[License]: https://github.com/sdv-dev/DeepEcho/blob/main/LICENSE
[Development Status]: https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha
[Slack Logo]: https://github.com/sdv-dev/SDV/blob/stable/docs/images/slack.png
[Community]: https://bit.ly/sdv-slack-invite
[MyBinder Logo]: https://github.com/sdv-dev/SDV/blob/stable/docs/images/mybinder.png
[Tutorials]: https://mybinder.org/v2/gh/sdv-dev/DeepEcho/main?filepath=tutorials

# Install

**DeepEcho** is part of the **SDV** project and is automatically installed alongside it. For
details about this process, please visit the [SDV Installation Guide](
https://sdv.dev/SDV/getting_started/install.html).

Optionally, **DeepEcho** can also be installed as a standalone library using the following commands:

**Using `pip`:**

```bash
pip install deepecho
```

**Using `conda`:**

```bash
conda install -c pytorch -c conda-forge deepecho
```

For more installation options, please visit the [DeepEcho Installation Guide](INSTALL.md).
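
Before installing, it can help to confirm that your interpreter falls inside the range declared in the package metadata (`<3.13,>=3.8` for version 0.6.0). The following is a minimal sketch that uses only the standard library. The helper names `python_is_supported` and `installed_version` are illustrative and are not part of DeepEcho.

```python
import sys
from importlib import metadata


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter version is >=3.8 and <3.13."""
    return (3, 8) <= tuple(version_info[:2]) < (3, 13)


def installed_version(package='deepecho'):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == '__main__':
    print('Python supported:', python_is_supported())
    print('deepecho version:', installed_version())
```

If `installed_version()` returns `None`, the package is not visible to the current environment, and one of the install commands above still needs to be run.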

# Quickstart

**DeepEcho** is included as part of [SDV](https://sdv.dev/SDV) to model and sample synthetic
time series. In most cases, usage through SDV is recommended, since it provides additional
functionality that is not available here. For more details about how to use DeepEcho
within SDV, please visit the corresponding User Guide:

* [SDV TimeSeries User Guide](https://sdv.dev/SDV/user_guides/timeseries/par.html)

## Standalone usage

**DeepEcho** can also be used as a standalone library.

In this short quickstart, we show how to learn a mixed-type multivariate time series
dataset and then generate synthetic data that resembles it.

We will start by loading the data and preparing the instance of our model.

```python3
from deepecho import PARModel
from deepecho.demo import load_demo

# Load demo data
data = load_demo()

# Define data types for all the columns
data_types = {
    'region': 'categorical',
    'day_of_week': 'categorical',
    'total_sales': 'continuous',
    'nb_customers': 'count',
}

model = PARModel(cuda=False)
```
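
A typo in a column name or a type name in `data_types` is easy to miss, so a quick sanity check before fitting can be useful. The sketch below is plain Python and does not call DeepEcho. The helper `check_data_types` is hypothetical, the list of column names is assembled from this quickstart (the actual demo table may contain others), and `KNOWN_TYPES` only lists the three type names that appear above.

```python
# Type names used in this quickstart; the library may accept others.
KNOWN_TYPES = {'categorical', 'continuous', 'count'}


def check_data_types(columns, data_types, entity_columns, sequence_index=None):
    """Report unknown type names and mismatches between columns and data_types."""
    # Entity columns and the sequence index are structural, not modeled values.
    structural = set(entity_columns)
    if sequence_index is not None:
        structural.add(sequence_index)

    return {
        'unknown_types': {c: t for c, t in data_types.items() if t not in KNOWN_TYPES},
        'undeclared': [c for c in columns if c not in structural and c not in data_types],
        'not_in_data': [c for c in data_types if c not in columns],
    }


# Column names assumed from the quickstart code.
columns = ['store_id', 'date', 'region', 'day_of_week', 'total_sales', 'nb_customers']
data_types = {
    'region': 'categorical',
    'day_of_week': 'categorical',
    'total_sales': 'continuous',
    'nb_customers': 'count',
}

report = check_data_types(columns, data_types, entity_columns=['store_id'], sequence_index='date')
print(report)
```

With a real table, `columns` would come from `list(data.columns)`; all three entries of the report should be empty before fitting.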

If we want to use different settings for our model, like increasing the number
of epochs or enabling CUDA, we can pass the arguments when creating the model:

```python  # keep this as python (without the 3) to avoid using it in test-readme
model = PARModel(epochs=1024, cuda=True)
```

Note that for small datasets like the one used in this demo, CUDA introduces more overhead
than it recovers through parallelization, so training is more efficient without CUDA, even
if it is available.
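
One way to act on this trade-off is to decide the `cuda` flag from the size of the table and from whether a GPU is actually usable. The sketch below assumes PyTorch may or may not be importable and falls back to CPU when it is not. The helper `choose_cuda` is hypothetical, and its `min_rows` default is an arbitrary placeholder rather than a measured threshold.

```python
def choose_cuda(num_rows, min_rows=100_000):
    """Return True only for larger tables and when CUDA is usable.

    `min_rows` is an arbitrary placeholder; tune it by timing your own data.
    """
    if num_rows < min_rows:
        return False
    try:
        import torch  # imported lazily so small tables never pay for it
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


print(choose_cuda(num_rows=500))
```

The result can then be passed to the constructor, for example `PARModel(cuda=choose_cuda(len(data)))`.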

Once we have created our instance, we are ready to learn the data and generate
new synthetic data that resembles it:

```python3
# Learn a model from the data
model.fit(
    data=data,
    entity_columns=['store_id'],
    context_columns=['region'],
    data_types=data_types,
    sequence_index='date'
)

# Sample new data
model.sample(num_entities=5)
```

The output will be a table of synthetic time series data with the same properties as
the demo data that we used as input.
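
To make the roles of `entity_columns` and `sequence_index` concrete: conceptually, the long-format table is treated as one sequence per entity, ordered by the sequence index. The plain-Python sketch below illustrates that idea with made-up rows. It does not reproduce DeepEcho's internals, and `split_into_sequences` is a hypothetical helper.

```python
from collections import defaultdict


def split_into_sequences(rows, entity_column, sequence_index):
    """Group rows by entity and sort each group by the sequence index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[entity_column]].append(row)
    return {
        entity: sorted(group, key=lambda row: row[sequence_index])
        for entity, group in groups.items()
    }


# Made-up rows that only mimic the column names of the quickstart.
rows = [
    {'store_id': 68, 'date': '2020-06-02', 'total_sales': 1247.7},
    {'store_id': 12, 'date': '2020-06-01', 'total_sales': 890.1},
    {'store_id': 68, 'date': '2020-06-01', 'total_sales': 1255.2},
]

sequences = split_into_sequences(rows, 'store_id', 'date')
for store, sequence in sequences.items():
    print(store, [row['date'] for row in sequence])
```

The same grouping applies to the sampled output: each synthetic `store_id` identifies one generated sequence.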

# What's next?

For more details about **DeepEcho** and all its possibilities and features, please check and
run the [tutorials](tutorials).

If you want to see how we evaluate the performance and quality of our models, please have a
look at the [SDGym Benchmarking framework](https://github.com/sdv-dev/SDGym).

Also, please feel welcome to visit [our contributing guide](CONTRIBUTING.rst) to help
us develop new features or cool ideas!

---


<div align="center">
<a href="https://datacebo.com"><img align="center" width=40% src="https://github.com/sdv-dev/SDV/blob/stable/docs/images/DataCebo.png"></img></a>
</div>
<br/>
<br/>

[The Synthetic Data Vault Project](https://sdv.dev) was first created at MIT's [Data to AI Lab](
https://dai.lids.mit.edu/) in 2016. After 4 years of research and traction with enterprise, we
created [DataCebo](https://datacebo.com) in 2020 with the goal of growing the project.
Today, DataCebo is the proud developer of SDV, the largest ecosystem for
synthetic data generation & evaluation. It is home to multiple libraries that support synthetic
data, including:

* 🔄 Data discovery & transformation. Reverse the transforms to reproduce realistic data.
* 🧠 Multiple machine learning models -- ranging from Copulas to Deep Learning -- to create tabular,
  multi table and time series data.
* 📊 Measuring quality and privacy of synthetic data, and comparing different synthetic data
  generation models.

[Get started using the SDV package](https://sdv.dev/SDV/getting_started/install.html) -- a fully
integrated solution and your one-stop shop for synthetic data. Or, use the standalone libraries
for specific needs.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "deepecho",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.8",
    "maintainer_email": null,
    "keywords": "deepecho, DeepEcho",
    "author": null,
    "author_email": "\"DataCebo, Inc.\" <info@sdv.dev>",
    "download_url": "https://files.pythonhosted.org/packages/9f/d4/ace5480822b16830d04469b9d926f381aea8a55f98b1d1e7a3912b371843/deepecho-0.6.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSL-1.1",
    "summary": "Create sequential synthetic data of mixed types using a GAN.",
    "version": "0.6.0",
    "project_urls": {
        "Chat": "https://bit.ly/sdv-slack-invite",
        "Issue Tracker": "https://github.com/sdv-dev/Deepecho/issues",
        "Source Code": "https://github.com/sdv-dev/Deepecho/",
        "Twitter": "https://twitter.com/sdv_dev"
    },
    "split_keywords": [
        "deepecho",
        " deepecho"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7d47fb053543057ddaf662faf5cbe3a01fd56dbc4c9117bf0c8c9117ebaf26c1",
                "md5": "3043b5117e99aef815a73b65fcc5de7b",
                "sha256": "f99ebf99619262616d80a9d4a80a8e3f09a02acb9e38ed689757383a18779926"
            },
            "downloads": -1,
            "filename": "deepecho-0.6.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3043b5117e99aef815a73b65fcc5de7b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.8",
            "size": 27792,
            "upload_time": "2024-04-11T02:18:35",
            "upload_time_iso_8601": "2024-04-11T02:18:35.466632Z",
            "url": "https://files.pythonhosted.org/packages/7d/47/fb053543057ddaf662faf5cbe3a01fd56dbc4c9117bf0c8c9117ebaf26c1/deepecho-0.6.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9fd4ace5480822b16830d04469b9d926f381aea8a55f98b1d1e7a3912b371843",
                "md5": "babe777121293f862da3b09c717d9376",
                "sha256": "fb55288828c2059b45695d275e2a1d72a312e4110253873ac46674735e8421a6"
            },
            "downloads": -1,
            "filename": "deepecho-0.6.0.tar.gz",
            "has_sig": false,
            "md5_digest": "babe777121293f862da3b09c717d9376",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.8",
            "size": 30388,
            "upload_time": "2024-04-11T02:18:36",
            "upload_time_iso_8601": "2024-04-11T02:18:36.927356Z",
            "url": "https://files.pythonhosted.org/packages/9f/d4/ace5480822b16830d04469b9d926f381aea8a55f98b1d1e7a3912b371843/deepecho-0.6.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-11 02:18:36",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "sdv-dev",
    "github_project": "Deepecho",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "deepecho"
}
        