| Field | Value |
|---|---|
| Name | baribal |
| Version | 0.2.0 |
| home_page | None |
| Summary | Helper functions for pandas data analysis, inspired by R |
| upload_time | 2025-02-06 21:30:28 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.9 |
| license | None |
| keywords | data-analysis, glimpse, pandas, r |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# baribal 🐻
A Python package extending pandas and polars with helper functions for simpler exploratory data analysis and data wrangling, inspired by R's tidyverse packages.
## Why Baribal?
While pandas and polars are incredibly powerful, some R functions like `glimpse()`, `tabyl()`, or `clean_names()` make data exploration and manipulation particularly smooth. Baribal brings these functionalities to Python, helping you to:
- Get quick, insightful overviews of your DataFrames
- Perform common data cleaning tasks with less code
- Handle missing values more intuitively
- Generate summary statistics with minimal effort
- Optimize memory usage with smart type inference
## Features
### Core Functions
#### 🔍 `glimpse()`
R-style enhanced DataFrame preview that works with both pandas and polars:
```python
import pandas as pd
import baribal as bb
df = pd.DataFrame({
    'id': range(1, 6),
    'name': ['John Doe', 'Jane Smith', 'Bob Wilson', 'Alice Brown', 'Charlie Davis'],
    'age': [25, 30, 35, 28, 42],
    'score': [92.5, 88.0, None, 95.5, 90.0]
})
bb.glimpse(df)
```
Output:
```
Observations: 5
Variables: 4
DataFrame type: pandas
$ id <int> 1, 2, 3, 4, 5
$ name <chr> "John Doe", "Jane Smith", "Bob Wilson", "Alice Brown", "Charlie Davis"
$ age <int> 25, 30, 35, 28, 42
$ score <num> 92.5, 88.0, NA, 95.5, 90.0
```
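To make the shape of that output concrete, here is a minimal sketch of a glimpse-style summary written in plain pandas. The helper name `glimpse_sketch` and its `max_values` parameter are hypothetical and are not part of baribal; the sketch also prints raw pandas dtype names rather than the R-style `<int>`/`<chr>` abbreviations shown above.

```python
import pandas as pd


def glimpse_sketch(df: pd.DataFrame, max_values: int = 5) -> str:
    """Build a glimpse-style text summary (hypothetical helper, not baribal's code)."""
    lines = [f"Observations: {len(df)}", f"Variables: {df.shape[1]}"]
    for name in df.columns:
        col = df[name]
        # Preview the first few values, rendering missing entries as NA.
        preview = ", ".join(
            "NA" if pd.isna(v) else str(v) for v in col.head(max_values)
        )
        lines.append(f"$ {name} <{col.dtype}> {preview}")
    return "\n".join(lines)


sample = pd.DataFrame({
    "id": range(1, 4),
    "score": [92.5, None, 95.5],
})
print(glimpse_sketch(sample))
```

One line per column keeps wide frames readable, which is the main appeal of the transposed layout.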
#### 📊 `tabyl()`
Enhanced cross-tabulations with integrated statistics:
```python
import baribal as bb
# Assumes df has 'category' and 'status' columns
# Single variable frequency table
result, _ = bb.tabyl(df, 'category')
# Two-way cross-tabulation with chi-square statistics
result, stats = bb.tabyl(df, 'category', 'status')
```
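For readers who want to see what such a table involves, the sketch below builds a one-way frequency table and a two-way cross-tabulation in plain pandas, then computes the Pearson chi-square statistic by hand. The sample data and the column names `n` and `percent` are assumptions made for illustration; baribal's actual return values may be structured differently.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["a", "a", "b", "b", "a", "b"],
    "status": ["open", "closed", "open", "open", "closed", "open"],
})

# One-way frequency table: counts and proportions per category.
counts = df["category"].value_counts()
freq = pd.DataFrame({"n": counts, "percent": counts / counts.sum()})

# Two-way table of observed counts.
table = pd.crosstab(df["category"], df["status"])

# Pearson chi-square statistic, without continuity correction.
observed = table.to_numpy(dtype=float)
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals @ col_totals / observed.sum()
chi2 = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(table)
print(f"chi2={chi2:.2f}, dof={dof}")
```

`scipy.stats.chi2_contingency` computes the same test along with a p-value, but note that it applies Yates' continuity correction to 2x2 tables by default, so its statistic can differ from the uncorrected value computed here.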
### Data Cleaning
#### 🧹 `clean_names()`
Smart column name cleaning with multiple case styles:
```python
import pandas as pd
import baribal as bb

df = pd.DataFrame({
    "First Name": [],
    "Last.Name": [],
    "Email@Address": [],
    "Phone #": []
})
# Snake case (default)
bb.clean_names(df)
# → columns become: ['first_name', 'last_name', 'email_address', 'phone']
# Camel case
bb.clean_names(df, case='camel')
# → columns become: ['firstName', 'lastName', 'emailAddress', 'phone']
# Pascal case
bb.clean_names(df, case='pascal')
# → columns become: ['FirstName', 'LastName', 'EmailAddress', 'Phone']
```
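The three case styles can be illustrated with a small standard-library sketch. The function `clean_name` below is hypothetical and is not baribal's implementation; it simply splits each label on non-alphanumeric characters and rejoins the pieces, which is enough to reproduce the column names listed above.

```python
import re


def clean_name(name: str, case: str = "snake") -> str:
    """Normalize one column label (hypothetical helper, not baribal's code)."""
    # Split on anything that is not a letter or digit and drop empty pieces.
    words = [w.lower() for w in re.split(r"[^0-9A-Za-z]+", name) if w]
    if not words:
        return ""
    if case == "snake":
        return "_".join(words)
    if case == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    if case == "pascal":
        return "".join(w.capitalize() for w in words)
    raise ValueError(f"unsupported case: {case!r}")


columns = ["First Name", "Last.Name", "Email@Address", "Phone #"]
snake = [clean_name(c) for c in columns]
camel = [clean_name(c, "camel") for c in columns]
pascal = [clean_name(c, "pascal") for c in columns]
print(snake, camel, pascal, sep="\n")
```

Because `DataFrame.rename` accepts a function for `columns`, a helper like this could be applied to a pandas frame with `df.rename(columns=clean_name)`.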
#### 🔄 `rename_all()`
Batch rename columns using patterns:
```python
import baribal as bb
# Using regex pattern
bb.rename_all(df, r'Col_(\d+)') # Extracts numbers from column names
# Using case transformation
bb.rename_all(df, lambda x: x.lower()) # Convert all to lowercase
```
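The README does not spell out how the regex form treats names that do not match, so the following is only a sketch of one plausible reading, written with plain pandas and `re`. The helper `extract_group` is hypothetical: it keeps the first captured group when the whole name matches and leaves other names untouched.

```python
import re

import pandas as pd

df = pd.DataFrame({"Col_1": [1], "Col_2": [2], "Other": [3]})
pattern = re.compile(r"Col_(\d+)")


def extract_group(name: str) -> str:
    """Return the first captured group on a full match, else the name unchanged."""
    match = pattern.fullmatch(name)
    return match.group(1) if match else name


# Regex-based rename: 'Col_1' -> '1', 'Col_2' -> '2', 'Other' stays as is.
renamed = df.rename(columns=extract_group)

# Function-based rename: lowercase every column name.
lowered = df.rename(columns=str.lower)

print(list(renamed.columns), list(lowered.columns))
```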
### Analysis Tools
#### 🔍 `missing_summary()`
Comprehensive missing values analysis:
```python
import baribal as bb
summary = bb.missing_summary(df)
# Returns DataFrame with missing value statistics for each column
```
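As a rough idea of what a per-column missing-value summary contains, here is a minimal plain-pandas sketch. The column names `n_missing` and `pct_missing` and the sample data are assumptions; the statistics baribal actually reports may differ.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "age": [25, None, 35, 28],
    "score": [92.5, None, None, 95.5],
})

# Count missing values per column, then express them as a percentage of rows.
n_missing = df.isna().sum()
summary = pd.DataFrame({
    "n_missing": n_missing,
    "pct_missing": (n_missing / len(df) * 100).round(1),
})
print(summary)
```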
## Installation
```bash
pip install baribal
```
## Dependencies
- Python >= 3.9
- pandas >= 1.0.0
- polars >= 0.20.0 (optional)
- numpy
- scipy
## Development
This project uses modern Python development tools:
- `uv` for fast, reliable package management
- `ruff` for lightning-fast linting and formatting
- `pytest` for testing
To set up the development environment:
```bash
make install
```
To run tests:
```bash
make test
```
## Contributing
Contributions are welcome, whether that means:
- Suggesting new R-inspired features
- Improving documentation
- Adding test cases
- Reporting bugs
Please check out our [Contributing Guidelines](CONTRIBUTING.md) for details on our git commit conventions and development process.
## License
MIT License
## Acknowledgments
Inspired by various R packages including:
- `dplyr`
- `janitor`
- `tibble`
- `naniar`
Raw data
{
"_id": null,
"home_page": null,
"name": "baribal",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "data-analysis, glimpse, pandas, r",
"author": null,
"author_email": "Ga\u00ebl Penessot <gael.penessot@data-decision.io>",
"download_url": "https://files.pythonhosted.org/packages/4d/34/02499b277a307fa7783d1f2bc4374fb930316a4527f9b5a06fb089874869/baribal-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Helper functions for pandas data analysis, inspired by R",
"version": "0.2.0",
"project_urls": null,
"split_keywords": [
"data-analysis",
" glimpse",
" pandas",
" r"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "e23c06a51da1a38b1aab09b8edc4dcf7f7a56c14d4ca465f73d748eb2b6acc8d",
"md5": "9f1b797ce7f6e437338fdc3256c05a50",
"sha256": "a189885495f15935c1423c0de23b135c539e081cc7e82c1c4211e803e37f071e"
},
"downloads": -1,
"filename": "baribal-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9f1b797ce7f6e437338fdc3256c05a50",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 11440,
"upload_time": "2025-02-06T21:30:27",
"upload_time_iso_8601": "2025-02-06T21:30:27.010732Z",
"url": "https://files.pythonhosted.org/packages/e2/3c/06a51da1a38b1aab09b8edc4dcf7f7a56c14d4ca465f73d748eb2b6acc8d/baribal-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "4d3402499b277a307fa7783d1f2bc4374fb930316a4527f9b5a06fb089874869",
"md5": "e4d04bd80ee12a71e1d7a6bd00cadc27",
"sha256": "bf341fa6307ad9aece5739e5a0bbf068fa7844e08e1df6e6af7a7f9d3239905b"
},
"downloads": -1,
"filename": "baribal-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "e4d04bd80ee12a71e1d7a6bd00cadc27",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 52301,
"upload_time": "2025-02-06T21:30:28",
"upload_time_iso_8601": "2025-02-06T21:30:28.047253Z",
"url": "https://files.pythonhosted.org/packages/4d/34/02499b277a307fa7783d1f2bc4374fb930316a4527f9b5a06fb089874869/baribal-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-06 21:30:28",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "baribal"
}