baribal


- Name: baribal
- Version: 0.2.0
- Summary: Helper functions for pandas data analysis, inspired by R
- Upload time: 2025-02-06 21:30:28
- Requires Python: >=3.9
- Keywords: data-analysis, glimpse, pandas, r
![](images/logo%20baribal.png)

# baribal 🐻

[![Build Status](https://img.shields.io/github/actions/workflow/status/gpenessot/baribal/main.yml?branch=main)](https://github.com/gpenessot/baribal/actions)
[![PyPI version](https://img.shields.io/pypi/v/baribal)](https://pypi.org/project/baribal/)
[![PyPI downloads](https://img.shields.io/pypi/dm/baribal)](https://pypi.org/project/baribal/)
[![Coverage](https://img.shields.io/codecov/c/github/gpenessot/baribal)](https://codecov.io/gh/gpenessot/baribal)
[![License](https://img.shields.io/github/license/gpenessot/baribal)](https://github.com/gpenessot/baribal/blob/main/LICENSE)
[![Python Versions](https://img.shields.io/pypi/pyversions/baribal)](https://pypi.org/project/baribal/)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)

A Python package extending pandas and polars with helper functions for simpler exploratory data analysis and data wrangling, inspired by R's tidyverse packages.

## Why Baribal?

While pandas and polars are incredibly powerful, some R functions like `glimpse()`, `tabyl()`, or `clean_names()` make data exploration and manipulation particularly smooth. Baribal brings these functionalities to Python, helping you to:

- Get quick, insightful overviews of your DataFrames
- Perform common data cleaning tasks with less code
- Handle missing values more intuitively
- Generate summary statistics with minimal effort
- Optimize memory usage with smart type inference

## Features

### Core Functions

#### 🔍 `glimpse()`
R-style enhanced DataFrame preview that works with both pandas and polars:

```python
import pandas as pd
import baribal as bb

df = pd.DataFrame({
    'id': range(1, 6),
    'name': ['John Doe', 'Jane Smith', 'Bob Wilson', 'Alice Brown', 'Charlie Davis'],
    'age': [25, 30, 35, 28, 42],
    'score': [92.5, 88.0, None, 95.5, 90.0]
})

bb.glimpse(df)
```

Output:
```
Observations: 5
Variables: 4
DataFrame type: pandas
$ id    <int> 1, 2, 3, 4, 5
$ name  <chr> "John Doe", "Jane Smith", "Bob Wilson", "Alice Brown", "Charlie Davis"
$ age   <int> 25, 30, 35, 28, 42
$ score <num> 92.5, 88.0, NA, 95.5, 90.0
```
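
For readers curious how such a preview can be assembled, here is a minimal sketch in plain pandas. It is not baribal's implementation: the function name `glimpse_sketch`, the `max_values` parameter, and the dtype-to-tag mapping are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical mapping from NumPy dtype kinds to R-style type tags.
_TYPE_TAGS = {"i": "int", "u": "int", "f": "num", "b": "lgl", "O": "chr", "M": "dttm"}


def glimpse_sketch(df: pd.DataFrame, max_values: int = 5) -> str:
    """Return a glimpse-like text summary: one line per column."""
    lines = [f"Observations: {len(df)}", f"Variables: {df.shape[1]}"]
    # Pad column names to a common width so the type tags line up.
    width = max((len(str(col)) for col in df.columns), default=0)
    for col in df.columns:
        series = df[col]
        tag = _TYPE_TAGS.get(series.dtype.kind, str(series.dtype))
        values = []
        for value in series.head(max_values):
            if pd.isna(value):
                values.append("NA")          # missing values shown R-style
            elif isinstance(value, str):
                values.append(f'"{value}"')  # strings are quoted
            else:
                values.append(str(value))
        lines.append(f"$ {str(col):<{width}} <{tag}> {', '.join(values)}")
    return "\n".join(lines)


demo = pd.DataFrame({"id": [1, 2, 3], "score": [92.5, None, 95.5]})
summary_text = glimpse_sketch(demo)
print(summary_text)
```

Transposing the view (one row per column) is what keeps wide DataFrames readable, since the output grows downward instead of being truncated sideways.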

#### 📊 `tabyl()`
Enhanced cross-tabulations with integrated statistics:

```python
import baribal as bb

# Assumes df has categorical columns named 'category' and 'status'

# Single variable frequency table
result, _ = bb.tabyl(df, 'category')

# Two-way cross-tabulation with chi-square statistics
result, stats = bb.tabyl(df, 'category', 'status')
```
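
To make the statistics concrete, the sketch below builds a one-way frequency table, a two-way table, and a Pearson chi-square statistic with plain pandas. It does not show what `tabyl()` returns; the `survey` data and every variable name are invented for illustration.

```python
import pandas as pd

# Made-up example data with two categorical columns.
survey = pd.DataFrame({
    "category": ["A", "A", "B", "B", "A", "B"],
    "status": ["open", "closed", "open", "open", "closed", "open"],
})

# One-way frequency table: counts and proportions per category.
counts = survey["category"].value_counts()
one_way = pd.DataFrame({"n": counts, "percent": counts / counts.sum()})

# Two-way table of observed counts.
observed = pd.crosstab(survey["category"], survey["status"])

# Expected counts under independence: row_total * col_total / grand_total.
row_totals = observed.sum(axis=1).to_numpy()
col_totals = observed.sum(axis=0).to_numpy()
grand_total = observed.to_numpy().sum()
expected = pd.DataFrame(
    row_totals[:, None] * col_totals[None, :] / grand_total,
    index=observed.index,
    columns=observed.columns,
)

# Pearson chi-square statistic and its degrees of freedom.
chi2 = float((((observed - expected) ** 2) / expected).to_numpy().sum())
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
print(observed)
print(f"chi2={chi2:.3f}, dof={dof}")
```

If SciPy is installed, `scipy.stats.chi2_contingency(observed)` also returns a p-value. Note that it applies Yates' continuity correction to 2x2 tables by default, so its statistic can differ from the uncorrected value computed here.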

### Data Cleaning

#### 🧹 `clean_names()`
Smart column name cleaning with multiple case styles:

```python
import pandas as pd
import baribal as bb

df = pd.DataFrame({
    "First Name": [],
    "Last.Name": [],
    "Email@Address": [],
    "Phone #": []
})

# Snake case (default)
bb.clean_names(df)
# → columns become: ['first_name', 'last_name', 'email_address', 'phone']

# Camel case
bb.clean_names(df, case='camel')
# → columns become: ['firstName', 'lastName', 'emailAddress', 'phone']

# Pascal case
bb.clean_names(df, case='pascal')
# → columns become: ['FirstName', 'LastName', 'EmailAddress', 'Phone']
```
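
The transformation itself can be approximated with the standard library. The sketch below is one possible approach, not baribal's code: `to_snake` and `to_camel` are hypothetical helpers that split a label into alphanumeric runs and rejoin them.

```python
import re


def _words(name: str) -> list[str]:
    """Split a label into runs of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", name)


def to_snake(name: str) -> str:
    """Lower-case each word and join the words with underscores."""
    return "_".join(word.lower() for word in _words(name))


def to_camel(name: str) -> str:
    """Lower-case the first word and capitalize the rest."""
    words = _words(name)
    if not words:
        return ""
    head, *tail = words
    return head.lower() + "".join(word.capitalize() for word in tail)


raw = ["First Name", "Last.Name", "Email@Address", "Phone #"]
snake = [to_snake(label) for label in raw]
camel = [to_camel(label) for label in raw]
print(snake)
print(camel)
```

With pandas, helpers like these can be passed directly to `df.rename(columns=to_snake)`. This sketch ignores accented characters and does not deduplicate labels that collapse to the same name, which a production cleaner would need to handle.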

#### 🔄 `rename_all()`
Batch rename columns using patterns:

```python
import baribal as bb

# Using regex pattern
bb.rename_all(df, r'Col_(\d+)')  # Extracts numbers from column names

# Using case transformation
bb.rename_all(df, lambda x: x.lower())  # Convert all to lowercase
```
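
As a rough illustration of the two renaming modes, the sketch below accepts either a callable or a regex with one capture group. The helper `rename_labels` is hypothetical, and its handling of non-matching labels (left unchanged) is an assumption, not documented baribal behavior.

```python
import re
from typing import Callable, Iterable, Union


def rename_labels(labels: Iterable[str], rule: Union[str, Callable[[str], str]]) -> list[str]:
    """Rename labels with a callable, or with the first capture group of a regex."""
    if callable(rule):
        return [rule(label) for label in labels]
    pattern = re.compile(rule)
    renamed = []
    for label in labels:
        match = pattern.search(label)
        # Use the first capture group when the pattern matches and defines one;
        # otherwise keep the original label (an assumption of this sketch).
        if match and pattern.groups:
            renamed.append(match.group(1))
        else:
            renamed.append(label)
    return renamed


by_regex = rename_labels(["Col_1", "Col_2", "other"], r"Col_(\d+)")
by_callable = rename_labels(["Alpha", "BETA"], str.lower)
print(by_regex)
print(by_callable)
```

The resulting list can be assigned back with `df.columns = rename_labels(df.columns, rule)` in pandas.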

### Analysis Tools

#### 🔍 `missing_summary()`
Comprehensive missing values analysis:

```python
import baribal as bb

summary = bb.missing_summary(df)
# Returns DataFrame with missing value statistics for each column
```
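
A per-column report of this kind can be sketched in a few lines of pandas. The function and column names below (`missing_summary_sketch`, `n_missing`, `pct_missing`, `dtype`) are assumptions for illustration and may not match the columns baribal returns.

```python
import pandas as pd


def missing_summary_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Count and rank missing values per column."""
    is_missing = df.isna()
    report = pd.DataFrame({
        "n_missing": is_missing.sum(),
        # The mean of a boolean mask is the share of missing rows.
        "pct_missing": is_missing.mean().mul(100).round(1),
        "dtype": df.dtypes.astype(str),
    })
    # Show the most incomplete columns first.
    return report.sort_values("n_missing", ascending=False)


demo = pd.DataFrame({
    "age": [25, 30, 35, 28],
    "score": [92.5, None, 95.5, None],
})
report = missing_summary_sketch(demo)
print(report)
```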

## Installation

```bash
pip install baribal
```

## Dependencies

- Python >= 3.9
- pandas >= 1.0.0
- polars >= 0.20.0 (optional)
- numpy
- scipy

## Development

This project uses modern Python development tools:
- `uv` for fast, reliable package management
- `ruff` for lightning-fast linting and formatting
- `pytest` for testing

To set up the development environment:

```bash
make install
```

To run tests:

```bash
make test
```

## Contributing

Contributions are welcome, whether that means:
- Suggesting new R-inspired features
- Improving documentation
- Adding test cases
- Reporting bugs

Please check out our [Contributing Guidelines](CONTRIBUTING.md) for details on our git commit conventions and development process.

## License

MIT License

## Acknowledgments

Inspired by various R packages including:
- `dplyr`
- `janitor`
- `tibble`
- `naniar`
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "baribal",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "data-analysis, glimpse, pandas, r",
    "author": null,
    "author_email": "Ga\u00ebl Penessot <gael.penessot@data-decision.io>",
    "download_url": "https://files.pythonhosted.org/packages/4d/34/02499b277a307fa7783d1f2bc4374fb930316a4527f9b5a06fb089874869/baribal-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Helper functions for pandas data analysis, inspired by R",
    "version": "0.2.0",
    "project_urls": null,
    "split_keywords": [
        "data-analysis",
        " glimpse",
        " pandas",
        " r"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e23c06a51da1a38b1aab09b8edc4dcf7f7a56c14d4ca465f73d748eb2b6acc8d",
                "md5": "9f1b797ce7f6e437338fdc3256c05a50",
                "sha256": "a189885495f15935c1423c0de23b135c539e081cc7e82c1c4211e803e37f071e"
            },
            "downloads": -1,
            "filename": "baribal-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9f1b797ce7f6e437338fdc3256c05a50",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 11440,
            "upload_time": "2025-02-06T21:30:27",
            "upload_time_iso_8601": "2025-02-06T21:30:27.010732Z",
            "url": "https://files.pythonhosted.org/packages/e2/3c/06a51da1a38b1aab09b8edc4dcf7f7a56c14d4ca465f73d748eb2b6acc8d/baribal-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4d3402499b277a307fa7783d1f2bc4374fb930316a4527f9b5a06fb089874869",
                "md5": "e4d04bd80ee12a71e1d7a6bd00cadc27",
                "sha256": "bf341fa6307ad9aece5739e5a0bbf068fa7844e08e1df6e6af7a7f9d3239905b"
            },
            "downloads": -1,
            "filename": "baribal-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e4d04bd80ee12a71e1d7a6bd00cadc27",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 52301,
            "upload_time": "2025-02-06T21:30:28",
            "upload_time_iso_8601": "2025-02-06T21:30:28.047253Z",
            "url": "https://files.pythonhosted.org/packages/4d/34/02499b277a307fa7783d1f2bc4374fb930316a4527f9b5a06fb089874869/baribal-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-06 21:30:28",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "baribal"
}
        