buckaroo

- Name: buckaroo
- Version: 0.7.12
- home_page: None
- Summary: Buckaroo - GUI Data wrangling for pandas
- upload_time: 2024-11-13 19:11:49
- maintainer: None
- docs_url: None
- author: Paddy Mullen
- requires_python: >=3.9
- license: Copyright (c) 2019 Bloomberg All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- keywords: ipython, jupyter, widgets, pandas
- requirements: No requirements were recorded.
- Travis-CI: No Travis.

# Buckaroo - The Data Table for Jupyter

Buckaroo is a modern data table for Jupyter that expedites the most common exploratory data analysis tasks. The most basic data analysis task, looking at the raw data, is cumbersome with the existing pandas tooling.  Buckaroo starts with a modern, performant data table that displays up to 10k rows, is sortable, has value formatting, and scrolls.  On top of the core table experience, extra features like summary stats, histograms, smart sampling, auto-cleaning, and a low code UI are added.  All of the functionality has sensible defaults that can be overridden to customize the experience for your workflow.

<img width="1002" alt="Polars-Buckaroo" src="https://github.com/paddymul/buckaroo/assets/40453/f48b701b-dfc4-4470-8588-05b6a9f33eec">


## Try it with JupyterLite
Play with Buckaroo without any installation.
[Full Tour](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=Full-tour.ipynb)
[![lite-badge](https://jupyterlite.rtfd.io/en/latest/_static/badge.svg)](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=Full-tour.ipynb)

## Quick start

Run `pip install buckaroo`, then restart your Jupyter server.

The following code shows Buckaroo on a simple DataFrame:

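```python
import pandas as pd
import buckaroo  # importing buckaroo registers it as the default DataFrame display

# Leave the DataFrame as the last expression in a notebook cell to render it with Buckaroo
pd.DataFrame({'a': [1, 2, 10, 30, 50, 60, 50],
              'b': ['foo', 'foo', 'bar', pd.NA, pd.NA, pd.NA, pd.NA]})
```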

When you run `import buckaroo` in a Jupyter notebook, Buckaroo becomes the default display method for pandas and Polars DataFrames.


## Compatibility

Buckaroo works in the following notebook environments:

- `jupyter lab` (version >=3.6.0)
- `jupyter notebook` (version >=7.0)
- `VS Code notebooks` (with extra install)
- [Jupyter Lite](https://paddymul.github.io/buckaroo-examples/lab/index.html)
- `Google Colab` (with special initialization code, currently broken)


Buckaroo works with the following DataFrame libraries:
- `pandas` (version >=1.3.5)
- `polars` optional
- `geopandas` optional
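
The version floors listed above can be checked from a notebook before importing Buckaroo. The following is a minimal sketch using only the standard library. The helper names `parse_version` and `meets_minimum` are hypothetical, and the parser is deliberately simple: it reads only the leading numeric components, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '1.3.5' or '2.0.0rc1' into a tuple of leading integers, e.g. (1, 3, 5)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True when the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '7' and '7.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions copied from the compatibility lists above.
MINIMUMS = {"pandas": "1.3.5", "jupyterlab": "3.6.0", "notebook": "7.0"}

for package, minimum in MINIMUMS.items():
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        print(f"{package}: not installed")
        continue
    status = "ok" if meets_minimum(installed, minimum) else f"needs >= {minimum}"
    print(f"{package} {installed}: {status}")
```

Only one of `jupyterlab` or `notebook` needs to be present, so a "not installed" line for the other is expected.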


# Learn More

Buckaroo has extensive docs and tests. The best way to learn about the system is from the feature example videos on YouTube.

## Interactive Styling Gallery

The interactive [styling gallery](https://py.cafe/app/paddymul/buckaroo-gallery) lets you see different styling configurations.  You can live edit code and play with different configs.

## Videos 
- [Extending Buckaroo](https://www.youtube.com/watch?v=GPl6_9n31NE)
- [Styling Buckaroo](https://www.youtube.com/watch?v=cbwJyo_PzKY)
- [GeoPandas Support](https://youtu.be/8WBhoNjDJsA)

## Example Notebooks

The following examples are loaded into a JupyterLite environment with Buckaroo installed.
- [Full Tour](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=Full-tour.ipynb) Start here. This gives a broad overview of Buckaroo's features.
- [Histogram Demo](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=Histograms-demo.ipynb) Explanation of the embedded histograms of Buckaroo.
- [Styling Gallery](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=styling-gallery.ipynb) Examples of all of the different formatters and styling available for the table.
- [Extending Buckaroo](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=Extending.ipynb) Broad overview of how to add post processing methods and custom styling methods to Buckaroo.
- [Styling Howto](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=styling-howto.ipynb) In-depth explanation of how to write custom styling methods.
- [Pluggable Analysis Framework](https://paddymul.github.io/buckaroo-examples/lab/index.html?path=Pluggable-Analysis-Framework.ipynb) How to add new summary stats to Buckaroo.
- [Solara Buckaroo](https://github.com/paddymul/buckaroo/blob/main/example-notebooks/Solara-Buckaroo.ipynb) Using Buckaroo with Solara.
- [GeoPandas with Buckaroo](https://github.com/paddymul/buckaroo/blob/main/example-notebooks/GeoPandas.ipynb)

# Features

## High performance table
The core data grid of Buckaroo is based on [AG-Grid](https://www.ag-grid.com/). It loads thousands of cells in less than a second, with highly customizable display, formatting, and scrolling.  You no longer have to use `df.head()` to poke at portions of your data.

## Fixed width formatting by default

By default numeric columns are formatted to use a fixed width font and commas are added.  This allows quick visual confirmation of magnitudes in a column.
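
To illustrate why this helps, here is a small sketch in plain Python (not Buckaroo's formatter; `format_column` is a hypothetical helper). It adds thousands separators and right-aligns every value to a common width, which is what makes magnitudes line up in a fixed width font.

```python
def format_column(values, decimals=0):
    """Format numbers with thousands separators, right-aligned to a common width."""
    texts = [f"{v:,.{decimals}f}" for v in values]
    width = max((len(t) for t in texts), default=0)
    return [t.rjust(width) for t in texts]


for line in format_column([1, 2500, 1234567, 42]):
    print(line)
```

When printed in a monospaced font, the digits of each value fall in the same columns, so a value that is an order of magnitude larger visibly sticks out to the left.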

## Histograms

[Histograms](https://buckaroo-data.readthedocs.io/en/latest/articles/histograms.html) for every column give you a very quick overview of the distribution of values, including uniques and N/A.
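
As a rough illustration of the kind of information such a histogram carries, the sketch below buckets a numeric column into equal-width bins and counts missing and unique values separately. It is plain Python written for this README, not Buckaroo's histogram code. The function name `column_histogram`, the `bins` parameter, and the use of `None` for missing values are assumptions made for the example.

```python
def column_histogram(values, bins=5):
    """Bucket non-missing numeric values into equal-width bins; count missing separately."""
    present = [v for v in values if v is not None]
    summary = {
        "na": len(values) - len(present),
        "unique": len(set(present)),
        "counts": [0] * bins,
    }
    if not present:
        return summary
    low, high = min(present), max(present)
    span = (high - low) or 1  # avoid dividing by zero when all values are equal
    for v in present:
        # The maximum value would land one past the end, so clamp it into the last bin.
        index = min(int((v - low) / span * bins), bins - 1)
        summary["counts"][index] += 1
    return summary


print(column_histogram([1, 2, 10, 30, 50, 60, 50, None], bins=3))
```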

## Summary stats
The summary stats view can be toggled by clicking on the `0` below the `Σ` icon.  Summary stats are similar to `df.describe()` and are extensible.

## Intelligent sampling

Buckaroo displays entire DataFrames of up to 10k rows.  Displaying more than that would make the table too slow.  When a DataFrame has more than 10k rows, Buckaroo samples a random set of 10k rows and also adds the rows with the 5 most extreme values for each column.
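
The sampling strategy described above can be sketched in plain Python as follows. This is an illustration only, not Buckaroo's implementation. The function name `sample_with_extremes` and its parameters are hypothetical, rows are modeled as dictionaries, and "most extreme" is assumed to mean both the smallest and the largest values of each column.

```python
import random


def sample_with_extremes(rows, columns, max_rows=10_000, n_extreme=5, seed=None):
    """Return sorted row indices: a random sample plus each column's extreme rows."""
    if len(rows) <= max_rows:
        return list(range(len(rows)))  # small enough to show in full
    rng = random.Random(seed)
    keep = set(rng.sample(range(len(rows)), max_rows))
    for col in columns:
        # Rank row indices by this column's value, skipping missing values.
        ranked = sorted(
            (i for i, row in enumerate(rows) if row[col] is not None),
            key=lambda i: rows[i][col],
        )
        keep.update(ranked[:n_extreme])                         # smallest values
        keep.update(ranked[max(len(ranked) - n_extreme, 0):])  # largest values
    return sorted(keep)


rows = [{"a": i, "b": -i} for i in range(50_000)]
picked = sample_with_extremes(rows, ["a", "b"], seed=0)
print(len(picked), picked[0], picked[-1])
```

Because the extreme rows are added on top of the random sample, the result can be slightly larger than `max_rows`.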

## Sorting

All of the rows shown in the table are sortable by clicking on a column name. Further clicks change the sort direction, then disable sorting for that column.  Because extreme values are included with sample rows, you can see outlier values too.
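
The click behavior amounts to a three-state cycle per column. The sketch below models it in plain Python. The names `click_header` and `apply_sort` are hypothetical, and the order of the cycle (ascending first, then descending, then unsorted) is an assumption, since the text above only says that further clicks change direction and then disable the sort.

```python
# Assumed cycle: unsorted -> ascending -> descending -> unsorted.
SORT_CYCLE = {None: "asc", "asc": "desc", "desc": None}


def click_header(state, column):
    """Advance the sort state; state is (column, direction) or (None, None)."""
    current_column, direction = state
    if column != current_column:
        direction = None  # clicking a different column starts its cycle from the top
    next_direction = SORT_CYCLE[direction]
    return (column, next_direction) if next_direction else (None, None)


def apply_sort(rows, state):
    """Return the rows ordered according to the current sort state."""
    column, direction = state
    if column is None:
        return list(rows)
    return sorted(rows, key=lambda row: row[column], reverse=(direction == "desc"))


state = (None, None)
rows = [{"a": 10}, {"a": 1}, {"a": 50}]
for _ in range(3):
    state = click_header(state, "a")
    print(state, [row["a"] for row in apply_sort(rows, state)])
```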

## Extensibility at the core

Buckaroo summary stats are built on the [Pluggable Analysis Framework](https://buckaroo-data.readthedocs.io/en/latest/articles/pluggable.html) that allows individual summary stats to be overridden, and new summary stats to be built in terms of existing summary stats.  Care is taken to prevent errors in summary stats from preventing display of a dataframe.
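
The two ideas in this paragraph, stats defined in terms of other stats and errors that do not block display, can be sketched in a few lines of plain Python. This is not the Pluggable Analysis Framework API (see the linked article for that). The names `compute_stats` and `STATS` are hypothetical and exist only to show the pattern.

```python
def compute_stats(values, stat_funcs):
    """Run stat functions in order; each gets the values and the stats computed so far.

    A failing stat is recorded as None so the remaining stats still run.
    """
    results, errors = {}, {}
    for name, func in stat_funcs.items():
        try:
            results[name] = func(values, results)
        except Exception as exc:  # isolate the failure to this one stat
            results[name] = None
            errors[name] = repr(exc)
    return results, errors


STATS = {
    "length": lambda vals, stats: len(vals),
    "total": lambda vals, stats: sum(vals),
    # Built from existing stats instead of being recomputed from the raw values.
    "mean": lambda vals, stats: stats["total"] / stats["length"],
}

results, errors = compute_stats([1, 2, 10, 30, 50, 60, 50], STATS)
print(results, errors)

# Overriding a single stat leaves the others untouched.
custom = {**STATS, "mean": lambda vals, stats: round(stats["total"] / stats["length"])}
print(compute_stats([1, 2, 4], custom)[0])
```

With an empty column, `mean` raises a division error, is reported in `errors`, and the other stats are still returned.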

## Low code UI (beta)

Buckaroo has a simple low code UI with Python code generation. This view can be toggled by clicking on the `0` below the ` λ ` icon.

## Auto cleaning (beta)

Buckaroo can [automatically clean](https://buckaroo-data.readthedocs.io/en/latest/articles/auto_clean.html) dataframes to handle common data problems (for example, a single string in a column of ints, or recognizing date times).  This feature is in beta.  You can access it by invoking Buckaroo as `BuckarooWidget(df, auto_clean=True)`.
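
To make the "single string in a column of ints" case concrete, here is a minimal sketch in plain Python. It is not Buckaroo's auto-cleaning code. The function name `clean_int_column` and the 90% threshold (`min_fraction`) are assumptions chosen for the example.

```python
def clean_int_column(values, min_fraction=0.9):
    """Coerce a mostly-integer column to ints, replacing stray values with None.

    The column is returned unchanged unless at least `min_fraction` of its
    non-missing values parse as integers.
    """
    def to_int(value):
        try:
            return int(str(value).strip())
        except ValueError:
            return None

    present = [v for v in values if v is not None]
    parsed_ok = sum(to_int(v) is not None for v in present)
    if not present or parsed_ok / len(present) < min_fraction:
        return list(values)  # not confidently an integer column, leave it alone
    return [None if v is None else to_int(v) for v in values]


print(clean_int_column([1, "2", 3, "n/a", 5, 6, 7, 8, 9, 10, 11]))
print(clean_int_column(["apple", "pear", 3]))
```

The threshold is the important design choice: cleaning only applies when the column is overwhelmingly one type, so a genuinely mixed column is not silently emptied.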

## Development installation

For a development installation:

```bash
git clone https://github.com/paddymul/buckaroo.git
cd buckaroo
# We build against JupyterLab 3.6.5 because JupyterLab 4.0 has different JS typing that conflicts.
# The resulting package still works in JupyterLab 4.
pip install build twine pytest sphinx polars mypy jupyterlab==3.6.5 pandas-stubs geopolars pyarrow
pip install -ve .
```

Enabling development install for Jupyter notebook:


Enabling development install for JupyterLab:

```bash
jupyter labextension develop . --overwrite
```

Note for developers: the `--symlink` argument on Linux or OS X allows one to modify the JavaScript code in-place. This feature is not available with Windows.
### Developing the JS side

There are a series of examples of the components in [examples/ex](./examples/ex).



Instructions
```bash
npm install
npm run dev
```


## Contributions

We :heart: contributions.

Have you had a good experience with this project? Why not share some love and contribute code, or just let us know about any issues you had with it?

We welcome issue reports [here](../../issues); be sure to choose the proper issue template for your issue, so that we can be sure you're providing the necessary information.


            
