meerschaum


| Field | Value |
|---|---|
| Name | meerschaum |
| Version | 2.1.7 |
| Home page | https://meerschaum.io |
| Summary | Sync Time-Series Pipes with Meerschaum |
| Upload time | 2024-01-08 03:49:40 |
| Author | Bennett Meares |
| Requires Python | >=3.8 |
| License | Apache Software License 2.0 |
| Requirements | No requirements were recorded. |

<img src="https://meerschaum.io/assets/banner_1920x320.png" alt="Meerschaum banner" style="width: 100%"/>

| PyPI | GitHub | Info | Stats |
|---|---|---|---|
| ![PyPI]( https://img.shields.io/pypi/v/meerschaum?color=%2300cc66&label=Version ) | ![GitHub Repo stars](https://img.shields.io/github/stars/bmeares/Meerschaum?style=social) | ![License](https://img.shields.io/github/license/bmeares/Meerschaum?label=License) | ![Number of plugins]( https://img.shields.io/badge/dynamic/json?color=3098c1&label=Public%20Plugins&query=num_plugins&url=https%3A%2F%2Fapi.mrsm.io%2Finfo ) |
| ![PyPI - Python Version]( https://img.shields.io/pypi/pyversions/meerschaum?label=Python&logo=python&logoColor=%23ffffff ) | ![GitHub Sponsors](https://img.shields.io/github/sponsors/bmeares?color=eadf15&label=Sponsors) | [![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/meerschaum)](https://artifacthub.io/packages/search?repo=meerschaum) | ![Number of registered users]( https://img.shields.io/badge/dynamic/json?color=3098c1&label=Registered%20Users&query=num_users&url=https%3A%2F%2Fapi.mrsm.io%2Finfo ) |

<p align="center">
<img src="https://meerschaum.io/files/images/demo.gif" alt="Meerschaum demo" height="450px">
</p>

## What is Meerschaum?
Meerschaum is a tool for quickly synchronizing time-series data streams called **pipes**. With Meerschaum, you can have a data visualization stack running in minutes.

<p align="center">
<img src="https://meerschaum.io/assets/screenshots/weather_pipes.png"/>
</p>

## Why Meerschaum?

Two words: *incremental updates*. Fetch the data you need, and Meerschaum will handle the rest.

If you've worked with time-series data, you know the headaches that come with ETL.
Data engineering often gets in analysts' way, and when work needs to get done, every minute spent on pipelining is time taken away from real analysis.

Rather than copying and pasting your ETL scripts, simply build pipes with Meerschaum! [Meerschaum gives you the tools to design your data streams how you like](https://towardsdatascience.com/easy-time-series-etl-for-data-scientists-with-meerschaum-5aade339b398), and don't worry: you can always incorporate Meerschaum into your existing systems!
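
To make the idea of incremental updates concrete, here is a minimal, library-free sketch. It is not Meerschaum's implementation; the helper `incremental_rows` and the sample rows are hypothetical and only illustrate keeping rows newer than the latest timestamp already stored.

```python
from datetime import datetime


def incremental_rows(fetched, last_synced):
    """Return only the rows strictly newer than the last synced timestamp."""
    if last_synced is None:
        # Nothing stored yet, so every fetched row is new.
        return list(fetched)
    return [row for row in fetched if row["dt"] > last_synced]


# Hypothetical data: two rows are already stored, the source now has three.
stored = [
    {"dt": datetime(2022, 7, 1), "val": 10},
    {"dt": datetime(2022, 7, 2), "val": 11},
]
fetched = stored + [{"dt": datetime(2022, 7, 3), "val": 12}]

# Find the newest stored timestamp, then append only what came after it.
last_synced = max((row["dt"] for row in stored), default=None)
new_rows = incremental_rows(fetched, last_synced)
stored.extend(new_rows)

print(len(new_rows), len(stored))
```

Only the row for 2022-07-03 is appended, so repeated runs against the same source do not duplicate data.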

### Want to Learn More?

You can find a wealth of information at [meerschaum.io](https://meerschaum.io)!

Additionally, below are several articles published about Meerschaum:

- Interview featured in [*Console 100 - The Open Source Newsletter*](https://console.substack.com/p/console-100)
- [*A Data Scientist's Guide to Fetching COVID-19 Data in 2022*](https://towardsdatascience.com/a-data-scientists-guide-to-fetching-covid-19-data-in-2022-d952b4697) (Towards Data Science)
- [*Time-Series ETL with Meerschaum*](https://towardsdatascience.com/easy-time-series-etl-for-data-scientists-with-meerschaum-5aade339b398) (Towards Data Science)
- [*How I automatically extract my M1 Finance transactions*](https://bmeares.medium.com/how-i-automatically-extract-my-m1-finance-transactions-b43cef857bc7)

## Installation

For a more thorough setup guide, visit the [Getting Started](https://meerschaum.io/get-started/) page at [meerschaum.io](https://meerschaum.io).

### TL;DR

```bash
pip install -U --user meerschaum
mrsm stack up -d db grafana
mrsm bootstrap pipes
```

## Usage

Please visit [meerschaum.io](https://meerschaum.io) for setup, usage, and troubleshooting information. You can find technical documentation at [docs.meerschaum.io](https://docs.meerschaum.io), and here is a complete list of the [Meerschaum actions](https://meerschaum.io/reference/actions/).

### CLI
```bash
### Install the NOAA weather plugin.
mrsm install plugin noaa

### Register a new pipe to the built-in SQLite DB.
### You can instead run `bootstrap pipe` for a wizard.
### Enter 'KATL' for Atlanta when prompted.
mrsm register pipe -c plugin:noaa -m weather -l atl -i sql:local

### Pull data and create the table "plugin_noaa_weather_atl".
mrsm sync pipes -l atl -i sql:local
```
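
The table name in the last comment appears to be built from the connector, metric, and location keys. The sketch below reproduces that naming pattern from the two examples in this README (`plugin_noaa_weather_atl` and `foo_bar`); the function `default_target` is a hypothetical helper for illustration, not part of the Meerschaum API.

```python
def default_target(connector_keys, metric_key, location_key=None):
    """Join the pipe's keys with underscores, as the README examples suggest."""
    # Assumption: the ':' in connector keys such as 'plugin:noaa' becomes '_'.
    parts = [connector_keys.replace(":", "_"), metric_key]
    if location_key is not None:
        parts.append(location_key)
    return "_".join(parts)


print(default_target("plugin:noaa", "weather", "atl"))
print(default_target("foo", "bar"))
```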

### Python API

```python
import meerschaum as mrsm
pipe = mrsm.Pipe(
    'foo', 'bar',              ### Connector and metric labels.
    target   = 'MyTableName!', ### Table name. Defaults to 'foo_bar'.
    instance = 'sql:local',    ### Built-in SQLite DB. Defaults to 'sql:main'.
    columns  = {
        'datetime': 'dt',      ### Column for the datetime index.
        'id'      : 'id',      ### Column for the ID index (optional).
    },
)
### Pass a DataFrame (or a list of dictionaries) to create the table and indices.
pipe.sync([{'dt': '2022-07-01', 'id': 1, 'val': 10}])

### Duplicate rows are ignored.
pipe.sync([{'dt': '2022-07-01', 'id': 1, 'val': 10}])
assert len(pipe.get_data()) == 1

### Rows with existing keys (datetime and/or id) are updated.
pipe.sync([{'dt': '2022-07-01', 'id': 1, 'val': 100}])
assert len(pipe.get_data()) == 1

### Translates to this query for SQLite:
###
### SELECT *
### FROM "MyTableName!"
### WHERE "dt" >= datetime('2022-01-01', '0 minute')
###   AND "dt" <  datetime('2023-01-01', '0 minute')
###   AND "id" IN ('1')
df = pipe.get_data(
    begin  = '2022-01-01',
    end    = '2023-01-01',
    params = {'id': [1]},
)

### Shape of the DataFrame:
###           dt  id  val
### 0 2022-07-01   1  100

### Drop the table and remove the pipe's metadata.
pipe.delete()
```
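
The sync behavior described in the comments above (duplicates ignored, rows with existing keys updated, range and `params` filters on reads) can be modeled with a small in-memory stand-in. This is a sketch under stated assumptions, not how Meerschaum stores data: `ToyPipe` is hypothetical, keeps rows in a dictionary keyed on the datetime and ID columns, and expects ISO-formatted date strings.

```python
from datetime import datetime


class ToyPipe:
    """Hypothetical in-memory model of upserting on (datetime, id)."""

    def __init__(self, datetime_col="dt", id_col="id"):
        self.datetime_col = datetime_col
        self.id_col = id_col
        self.rows = {}  # maps (datetime, id) -> row

    def sync(self, docs):
        for doc in docs:
            row = dict(doc)
            row[self.datetime_col] = datetime.fromisoformat(row[self.datetime_col])
            key = (row[self.datetime_col], row[self.id_col])
            # Same key: the newer row replaces the old one (an upsert).
            self.rows[key] = row

    def get_data(self, begin=None, end=None, params=None):
        begin_dt = datetime.fromisoformat(begin) if begin else None
        end_dt = datetime.fromisoformat(end) if end else None
        result = []
        for (dt, _), row in sorted(self.rows.items(), key=lambda item: item[0]):
            if begin_dt is not None and dt < begin_dt:
                continue  # begin is inclusive
            if end_dt is not None and dt >= end_dt:
                continue  # end is exclusive
            if params and any(row[col] not in allowed for col, allowed in params.items()):
                continue  # each params entry acts like an IN filter
            result.append(row)
        return result


toy = ToyPipe()
toy.sync([{"dt": "2022-07-01", "id": 1, "val": 10}])
toy.sync([{"dt": "2022-07-01", "id": 1, "val": 10}])   # duplicate, no new row
toy.sync([{"dt": "2022-07-01", "id": 1, "val": 100}])  # same key, value updated
rows = toy.get_data(begin="2022-01-01", end="2023-01-01", params={"id": [1]})
print(rows)
```

Keying on the index columns is what makes repeated syncs idempotent: the same document can arrive any number of times without growing the table.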

### Simple Plugin

```python
# ~/.config/meerschaum/plugins/example.py

__version__ = '1.0.0'
required = ['requests']

def register(pipe, **kw):
    return {
        'columns': {
            'datetime': 'dt',
            'id'      : 'id',
        },
    }

def fetch(pipe, **kw):
    import requests, datetime, random
    response = requests.get('http://date.jsontest.com/')

    ### The fetched JSON has the following shape:
    ### {
    ###     "date": "07-01-2022",
    ###     "milliseconds_since_epoch": 1656718801566,
    ###     "time": "11:40:01 PM"
    ### }
    data = response.json()
    ### Convert milliseconds to whole seconds before building the timestamp.
    timestamp = datetime.datetime.fromtimestamp(
        data['milliseconds_since_epoch'] // 1000
    )

    ### You may also return a Pandas DataFrame.
    return [{
        "dt"   : timestamp,
        "id"   : random.randint(1, 4),
        "value": random.uniform(1, 100),
    }]
```
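
The timestamp conversion in `fetch()` can be checked offline against the sample payload from the comments. The sketch below is an alternative, not the plugin's own code: the helper `parse_epoch_millis` is hypothetical, and unlike the plugin (which returns a naive local time truncated to whole seconds) it returns a timezone-aware UTC datetime and keeps the milliseconds.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_epoch_millis(millis):
    """Convert integer milliseconds since the Unix epoch to an aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding on the millisecond part.
    return EPOCH + timedelta(milliseconds=millis)


# Sample payload copied from the plugin's comments.
sample = {
    "date": "07-01-2022",
    "milliseconds_since_epoch": 1656718801566,
    "time": "11:40:01 PM",
}

timestamp = parse_epoch_millis(sample["milliseconds_since_epoch"])
print(timestamp.isoformat())  # 2022-07-01T23:40:01.566000+00:00
```

The result agrees with the payload's own `date` and `time` fields, which is a quick sanity check that the epoch value is in milliseconds rather than seconds.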

## Features

- 📊 **Built for Data Scientists and Analysts**  
  - Integrate with Pandas, Grafana and other popular [data analysis tools](https://meerschaum.io/reference/data-analysis-tools/).
  - Persist your dataframes and always get the latest data.
- โšก๏ธ **Production-Ready, Batteries Included**  
  - [Synchronization engine](https://meerschaum.io/reference/pipes/syncing/) concurrently updates many time-series data streams.
  - One-click deploy a [TimescaleDB and Grafana stack](https://meerschaum.io/reference/stack/) for prototyping.
  - Serve data to your entire organization through the power of `uvicorn`, `gunicorn`, and `FastAPI`.
- 🔌 **Easily Expandable**  
  - Ingest any data source with a simple [plugin](https://meerschaum.io/reference/plugins/writing-plugins/). Just return a DataFrame, and Meerschaum handles the rest.
  - [Add any function as a command](https://meerschaum.io/reference/plugins/types-of-plugins/#action-plugins) to the Meerschaum system.
  - Include Meerschaum in your projects with its [easy-to-use Python API](https://docs.meerschaum.io).
- ✨ **Tailored for Your Experience**  
  - Rich CLI makes managing your data streams surprisingly enjoyable!
  - Web dashboard for those who prefer a more graphical experience.
  - Manage your database connections with [Meerschaum connectors](https://meerschaum.io/reference/connectors/).
  - Utility commands with sensible syntax let you control many pipes with grace.
- 💼 **Portable from the Start**  
  - The environment variable `$MRSM_ROOT_DIR` lets you emulate multiple installations and group together your [instances](https://meerschaum.io/reference/connectors/#instances-and-repositories).
  - No dependencies required; anything needed will be installed into a virtual environment.
  - [Specify required packages for your plugins](https://meerschaum.io/reference/plugins/writing-plugins/), and users will get those packages in a virtual environment.
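
As a rough illustration of the `$MRSM_ROOT_DIR` point above, the sketch below builds a per-project environment that could be handed to a subprocess. The helper `env_for_root` and the directory path are hypothetical, and the `mrsm` command is left commented out so the snippet runs without Meerschaum installed.

```python
import os
from pathlib import Path


def env_for_root(root_dir, base_env=None):
    """Copy an environment mapping and point MRSM_ROOT_DIR at root_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["MRSM_ROOT_DIR"] = str(Path(root_dir).expanduser())
    return env


# Hypothetical project directory; base_env is passed explicitly for a predictable demo.
project_env = env_for_root("~/projects/weather/mrsm_root", base_env={"PATH": "/usr/bin"})
print(project_env["MRSM_ROOT_DIR"])

# Assumed usage with a separate installation root:
# import subprocess
# subprocess.run(["mrsm", "show", "pipes"], env=project_env)
```

Copying the environment rather than mutating `os.environ` keeps each project's root directory isolated from the others in the same process.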

## Support Meerschaum's Development

For consulting services and to support Meerschaum's development, please consider sponsoring me on [GitHub Sponsors](https://github.com/sponsors/bmeares).

Additionally, you can always [buy me a coffee ☕](https://www.buymeacoffee.com/bmeares)!

### License

Copyright 2021 Bennett Meares

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

            
