BambooTools


| Field           | Value                                                                          |
|-----------------|--------------------------------------------------------------------------------|
| Name            | BambooTools                                                                    |
| Version         | 0.4.0                                                                          |
| Home page       | https://github.com/KwstasMCPU/BambooTools                                      |
| Summary         | Pandas extension to enhance your data analysis.                                |
| Upload time     | 2023-11-04 08:05:02                                                            |
| Author          | Konstantinos Maravegias                                                        |
| Requires Python | >=3.8                                                                          |
| License         | MIT                                                                            |
| Keywords        | bambootools, pandas, pandas extensions, data analysis, data science, analytics |
| Docs URL        | None                                                                           |
| Requirements    | No requirements were recorded.                                                 |

# BambooTools

BambooTools is a Python library designed to enhance your data analysis workflows. Built as an extension to the widely used pandas library, BambooTools provides one-liner methods for outlier detection and for investigating missing values and duplicate records.

With BambooTools, you can easily identify and handle outliers in your data, enabling more accurate analyses and predictions. The library also offers a completeness summary feature, which provides a quick and efficient way to assess the completeness of your dataset.

## Installation

Install from PyPI:

```bash
pip install BambooTools
```

Install from source:

```bash
pip install git+https://github.com/KwstasMCPU/BambooTools
```

# Usage

You can find examples in the `bin\examples.py` file. I have illustrated some below as well.
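
The examples below call methods through a `bbt` namespace on the DataFrame. pandas provides `pandas.api.extensions.register_dataframe_accessor` for attaching such namespaces, and the minimal sketch below shows the mechanism. That BambooTools registers `bbt` this way is an assumption; the accessor name `demo_sketch`, the class, and its method are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd


# Hypothetical accessor name and class, used only for illustration.
@pd.api.extensions.register_dataframe_accessor("demo_sketch")
class DemoSketchAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is attached to
        self._obj = pandas_obj

    def non_null_counts(self):
        # Number of non-missing values per column
        return self._obj.notna().sum()


example = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", "y"]})
counts = example.demo_sketch.non_null_counts()
print(counts)
```

Once the class is registered, every DataFrame exposes the namespace, which is why importing a module can be enough to make its methods available.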

## Completeness summary

`completeness()` returns a completeness summary table, stating the count and ratio of complete (non-null) values for each column:

```python
from bambootools import bambootools
import pandas as pd
import numpy as np

df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
                              'Parrot', 'Parrot',
                              'Lama', 'Falcon'],
                   'Max Speed': [380, 370,
                                 24, 26,
                                 np.nan, np.nan],
                   'Weight': [np.nan, 2,
                              1.5, np.nan,
                              80, 2.2]
                   })
# check the completeness of the dataset per column
print(df.bbt.completeness())
```
|           | complete values | completeness ratio | total |
|-----------|-----------------|--------------------|-------|
| Animal    | 6               | 1.0                | 6     |
| Max Speed | 4               | 0.6666666666666666 | 6     |
| Weight    | 4               | 0.6666666666666666 | 6     |

Passing a list of categorical columns via `by` returns the completeness per category:
```python
# check the completeness of the dataset per category
print(df.bbt.completeness(by=['Animal']))
```
|        | Max Speed       |                    |       | Weight          |                    |       |
|--------|-----------------|--------------------|-------|-----------------|--------------------|-------|
| Animal | complete values | completeness ratio | total | complete values | completeness ratio | total |
| Falcon | 2               | 0.6666666666666666 | 3     | 2               | 0.6666666666666666 | 3     |
| Lama   | 0               | 0.0                | 1     | 1               | 1.0                | 1     |
| Parrot | 2               | 1.0                | 2     | 1               | 0.5                | 2     |
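
For readers who want to see what these numbers mean, the tables above can be approximated with plain pandas. This is a minimal sketch, not BambooTools' implementation; the function name `completeness_sketch` is hypothetical, and the column labels are simply copied from the output shown above.

```python
import numpy as np
import pandas as pd


def completeness_sketch(frame):
    # Hypothetical helper: non-null count, ratio and row total per column
    complete = frame.notna().sum()
    total = len(frame)
    return pd.DataFrame({
        "complete values": complete,
        "completeness ratio": complete / total,
        "total": total,
    })


df = pd.DataFrame({"Animal": ["Falcon", "Falcon", "Parrot", "Parrot", "Lama", "Falcon"],
                   "Max Speed": [380, 370, 24, 26, np.nan, np.nan],
                   "Weight": [np.nan, 2, 1.5, np.nan, 80, 2.2]})

per_column = completeness_sketch(df)
# Non-null counts per category: groupby(...).count() ignores missing values
per_group_counts = df.groupby("Animal").count()
print(per_column)
print(per_group_counts)
```

The per-group ratio follows by dividing these counts by `df.groupby("Animal").size()`.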

## Missing values correlation matrix
`missing_corr_matrix()` returns a matrix that helps pinpoint relationships between the missing values of different columns. It calculates the conditional probability of a record's value being NaN in a specific column, given that the same record has a missing value in a different column.

For a dataset with two columns `'A', 'B'`, the conditional probability of a record having a missing value in column `'A'`, given that its value in column `'B'` is missing, is:

$$P(A \text{ is NULL } | B \text{ is NULL}) = \frac{P(A \text{ is NULL } \cap B \text{ is NULL})}{P(B \text{ is NULL})}$$

*Note:* The matrix alone will not tell the whole story. Additional metrics, such as the dataset's completeness, can help determine whether a relationship exists.

```python
# Generate a bigger dataset
# Set a seed for reproducibility
np.random.seed(0)

# Define the number of records
n_records = 50

# Define the categories for the 'animal' column
animals = ['cat', 'dog', 'lama']

# Generate random data
df = pd.DataFrame({
    'animal': np.random.choice(animals, n_records),
    'color': np.random.choice(['black', 'white', 'brown', 'gray'], n_records),
    'weight': np.random.randint(1, 100, n_records),
    'tail length': np.random.randint(1, 50, n_records),
    'height': np.random.randint(10, 500, n_records)
})

# Insert NULL values in the 'animal', 'color', 'weight', 'tail length' and 'height' columns
for col, n_nulls in zip(df.columns, [2, 15, 20, 48, 17]):
    null_indices = np.random.choice(df.index, n_nulls, replace=False)
    df.loc[null_indices, col] = np.nan

# missing values correlations
print(df.bbt.missing_corr_matrix())
```
|             | animal   | color    | weight   | tail length | height   |
|-------------|----------|----------|----------|-------------|----------|
| animal      | NaN      | 0.5      | 0.5      | 1           | 0        |
| color       | 0.066667 | NaN      | 0.333333 | 1           | 0.4      |
| weight      | 0.05     | 0.25     | NaN      | 0.95        | 0.25     |
| tail length | 0.041667 | 0.3125   | 0.395833 | NaN         | 0.354167 |
| height      | 0        | 0.352941 | 0.294118 | 1           | NaN      |
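
The conditional probability defined above can be computed directly from a boolean missingness mask. The sketch below is one way to do it in plain pandas and is not taken from the BambooTools source; `conditional_missing_sketch` is a hypothetical name. It places the conditioning column on the rows, which is consistent with the table above (for example, 0.5 in row `animal`, column `color`, where `animal` has only 2 missing values).

```python
import numpy as np
import pandas as pd


def conditional_missing_sketch(frame):
    # 1 where a value is missing, 0 otherwise
    nulls = frame.isna().astype(int)
    # joint.loc[r, c] = number of records missing in both r and c
    joint = nulls.T.dot(nulls)
    # Divide each row r by the number of records missing in r,
    # giving P(c is NULL | r is NULL)
    probs = joint.div(nulls.sum(), axis=0).astype(float)
    values = probs.to_numpy(copy=True)
    # A column conditioned on itself is not informative
    np.fill_diagonal(values, np.nan)
    return pd.DataFrame(values, index=probs.index, columns=probs.columns)


toy = pd.DataFrame({"A": [np.nan, np.nan, np.nan, 1.0],
                    "B": [np.nan, 2.0, 2.0, 2.0]})
matrix = conditional_missing_sketch(toy)
print(matrix)
```

In the toy frame, `A` is missing in three records and `B` in one of those three, so the entry in row `A`, column `B` is 1/3, while the entry in row `B`, column `A` is 1.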

## Outlier summary

`outlier_summary()` returns a summary of the outliers found in the dataset, based on a specific method (e.g. IQR).
It reports the number of outliers below and above the boundaries calculated by that method.
```python
import seaborn as sns

penguins = sns.load_dataset("penguins")
# identify outliers using the interquartile range (IQR) approach
print(penguins.bbt.outlier_summary('iqr', factor=1))
```
|                   | n_outliers_upper | n_outliers_lower | n_non_outliers | n_total_outliers | total_records |
|-------------------|------------------|------------------|----------------|------------------|---------------|
| bill_depth_mm     | 0                | 0                | 342            | 0                | 342           |
| bill_length_mm    | 2                | 0                | 340            | 2                | 342           |
| body_mass_g       | 4                | 0                | 338            | 4                | 342           |
| flipper_length_mm | 0                | 0                | 342            | 0                | 342           |

You can also get the summary per group:

```python
# outliers per category
print(penguins.bbt.outlier_summary(method='iqr', by=['sex', 'species'], factor=1))
```
|                         |                   | n_non_outliers | n_outliers_lower | n_outliers_upper | n_total_outliers | total_records |
|-------------------------|-------------------|----------------|------------------|------------------|------------------|---------------|
| ('Female', 'Adelie')    | bill_depth_mm     | 71             | 1                | 1                | 2                | 73            |
| ('Female', 'Adelie')    | bill_length_mm    | 71             | 1                | 1                | 2                | 73            |
| ('Female', 'Adelie')    | body_mass_g       | 73             | 0                | 0                | 0                | 73            |
| ('Female', 'Adelie')    | flipper_length_mm | 65             | 5                | 3                | 8                | 73            |
| ('Female', 'Chinstrap') | bill_depth_mm     | 33             | 0                | 1                | 1                | 34            |
| ('Female', 'Chinstrap') | bill_length_mm    | 23             | 5                | 6                | 11               | 34            |
| ('Female', 'Chinstrap') | body_mass_g       | 31             | 2                | 1                | 3                | 34            |
| ('Female', 'Chinstrap') | flipper_length_mm | 33             | 1                | 0                | 1                | 34            |
| ('Female', 'Gentoo')    | bill_depth_mm     | 57             | 0                | 1                | 1                | 58            |
| ('Female', 'Gentoo')    | bill_length_mm    | 57             | 0                | 1                | 1                | 58            |
| ('Female', 'Gentoo')    | body_mass_g       | 57             | 1                | 0                | 1                | 58            |
| ('Female', 'Gentoo')    | flipper_length_mm | 56             | 1                | 1                | 2                | 58            |
| ('Male', 'Adelie')      | bill_depth_mm     | 64             | 3                | 6                | 9                | 73            |
| ('Male', 'Adelie')      | bill_length_mm    | 65             | 3                | 5                | 8                | 73            |
| ('Male', 'Adelie')      | body_mass_g       | 73             | 0                | 0                | 0                | 73            |
| ('Male', 'Adelie')      | flipper_length_mm | 67             | 4                | 2                | 6                | 73            |
| ('Male', 'Chinstrap')   | bill_depth_mm     | 33             | 1                | 0                | 1                | 34            |
| ('Male', 'Chinstrap')   | bill_length_mm    | 32             | 0                | 2                | 2                | 34            |
| ('Male', 'Chinstrap')   | body_mass_g       | 29             | 2                | 3                | 5                | 34            |
| ('Male', 'Chinstrap')   | flipper_length_mm | 32             | 1                | 1                | 2                | 34            |
| ('Male', 'Gentoo')      | bill_depth_mm     | 56             | 2                | 3                | 5                | 61            |
| ('Male', 'Gentoo')      | bill_length_mm    | 51             | 5                | 5                | 10               | 61            |
| ('Male', 'Gentoo')      | body_mass_g       | 59             | 1                | 1                | 2                | 61            |
| ('Male', 'Gentoo')      | flipper_length_mm | 59             | 2                | 0                | 2                | 61            |

## Outlier boundaries

`outlier_bounds()` returns the lower and upper boundary values; any value below the lower boundary or above the upper boundary is considered an outlier:
```python
print(penguins.bbt.outlier_bounds(method='iqr', by=['sex', 'species'], factor=1))
```
|            |               | bill_length_mm | bill_length_mm | bill_depth_mm | bill_depth_mm | flipper_length_mm | flipper_length_mm | body_mass_g | body_mass_g |
|------------|---------------|----------------|----------------|---------------|---------------|-------------------|-------------------|-------------|-------------|
|            |               | lower          | upper          | lower         | upper         | lower             | upper             | lower       | upper       |
| **sex**    | **species**   |                |                |               |               |                   |                   |             |             |
| **Female** | **Adelie**    | 33             | 41.7           | 15.7          | 19.6          | 179               | 197               | 2800        | 3925        |
| **Female** | **Chinstrap** | 43.475         | 49.325         | 15.95         | 19.1          | 178.75            | 204.25            | 3031.25     | 4025        |
| **Female** | **Gentoo**    | 40.825         | 49.9           | 13            | 15.4          | 205               | 220               | 4050        | 5287.5      |
| **Male**   | **Adelie**    | 36.5           | 44             | 17.4          | 20.7          | 181               | 205               | 3300        | 4800        |
| **Male**   | **Chinstrap** | 48.125         | 53.9           | 17.8          | 20.8          | 189               | 210               | 3362.5      | 4468.75     |
| **Male**   | **Gentoo**    | 45.7           | 52.9           | 14.3          | 17            | 211               | 232               | 4900        | 6100        |
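
As a rough illustration of where such boundaries come from, the conventional IQR rule places them at `Q1 - factor * IQR` and `Q3 + factor * IQR`. The sketch below applies that rule in plain pandas; whether BambooTools computes its quantiles in exactly this way is an assumption, and `iqr_bounds_sketch` is a hypothetical name.

```python
import pandas as pd


def iqr_bounds_sketch(values, factor=1.5):
    # Conventional IQR rule; pandas interpolates quantiles linearly by default
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr


sample = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 100])
lower, upper = iqr_bounds_sketch(sample, factor=1.5)

# Count the values falling outside each boundary
n_outliers_lower = int((sample < lower).sum())
n_outliers_upper = int((sample > upper).sum())
print(lower, upper, n_outliers_lower, n_outliers_upper)
```

A smaller `factor` narrows the interval and flags more values, which is why the examples above, run with `factor=1`, are stricter than the common default of 1.5.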

## Duplication summary

`duplication_summary()` returns metrics regarding the duplicate records of the given dataset. It states the number of total rows, unique rows, unique rows without duplications, unique records with duplications and total duplicated records:

```python
print(penguins.bbt.duplication_summary(subset=['sex',
                                               'species',
                                               'island']))
```
|                                     | counts |
|-------------------------------------|--------|
| total records                       | 344    |
| unique records                      | 13     |
| unique records without duplications | 1      |
| unique records with duplications    | 12     |
| total duplicated records            | 343    |
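
The relationship between these five counts can be sketched with a plain pandas `groupby`. This is an illustration under assumptions rather than the library's code: `duplication_summary_sketch` is a hypothetical name, and treating missing keys as their own group (`dropna=False`) is a guess at how records with NaN values are handled.

```python
import pandas as pd


def duplication_summary_sketch(frame, subset):
    # Size of every distinct combination of the subset columns
    sizes = frame.groupby(subset, dropna=False).size()
    duplicated = sizes[sizes > 1]
    return pd.Series({
        "total records": len(frame),
        "unique records": len(sizes),
        "unique records without duplications": int((sizes == 1).sum()),
        "unique records with duplications": len(duplicated),
        "total duplicated records": int(duplicated.sum()),
    })


toy = pd.DataFrame({"k": ["a", "a", "b", "c", "c", "c", None]})
summary = duplication_summary_sketch(toy, ["k"])
print(summary)
```

Under this reading, the total number of records equals the duplicated records plus the unique records without duplications, which matches the table above (343 + 1 = 344).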

## Duplication frequency table
`duplication_frequency_table()` generates a table that states the frequency of records with duplications. It categorizes the duplicated records according to their number of duplications and reports the frequency of each category.

In the example below, we notice that there are 2 cases of 5 identical records.

```python
print(penguins.bbt.duplication_frequency_table(subset=['sex',
                                                       'species',
                                                       'island']))
```
| n identical bins | frequency | sum of duplications | percentage to total duplications |
|------------------|-----------|---------------------|----------------------------------|
| 2                | 0         | 0                   | 0                                |
| 3                | 0         | 0                   | 0                                |
| 4                | 0         | 0                   | 0                                |
| 5                | 2         | 10                  | 0.029154519                      |
| [6, 10)          | 0         | 0                   | 0                                |
| [10, 15)         | 0         | 0                   | 0                                |
| [15, 50)         | 8         | 214                 | 0.623906706                      |
| 50>              | 2         | 119                 | 0.346938776                      |
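
A table of this shape can be approximated by binning the group sizes with `pd.cut`. The sketch below is only an illustration: `duplication_frequency_sketch` is a hypothetical name, the bin edges are read off the table above, and the handling of missing keys (`dropna=False`) is an assumption.

```python
import numpy as np
import pandas as pd


def duplication_frequency_sketch(frame, subset):
    # Keep only the combinations that occur more than once
    sizes = frame.groupby(subset, dropna=False).size()
    dup_sizes = sizes[sizes > 1]
    # Left-closed bins: [2, 3), [3, 4), [4, 5), [5, 6), [6, 10), ...
    edges = [2, 3, 4, 5, 6, 10, 15, 50, np.inf]
    bins = pd.cut(dup_sizes, bins=edges, right=False)
    # observed=False keeps the empty bins in the result
    table = dup_sizes.groupby(bins, observed=False).agg(["count", "sum"])
    table.columns = ["frequency", "sum of duplications"]
    table["share of duplications"] = table["sum of duplications"] / dup_sizes.sum()
    return table


toy = pd.DataFrame({"k": list("aa") + ["b"] + list("ccccc") + list("dd")})
freq = duplication_frequency_sketch(toy, ["k"])
print(freq)
```

In the toy frame, two keys occur twice and one key occurs five times, so the first bin has frequency 2 and the `[5, 6)` bin has frequency 1.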

# Contributing

Contributions are more than welcome! You can contribute in several ways:

* Bug reports and bug fixes
* Recommendations for new features and implementation of those
* Writing new tests or improving existing ones, to ensure quality

**Before contributing, opening an issue is recommended.**

It is also recommended to install the package in ["development mode"](https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#working-in-development-mode) while working on it. *When installed as editable, a project can be edited in-place without reinstallation.*

To install the Python package in "editable"/"development" mode, change to the project's root directory and run:

```bash
pip install -e .
pip install -r requirements-dev.txt # this will install the development dependencies (e.g. pytest)
```

Alternatively, to install the package and the development dependencies with a single command, run:

```bash
pip install -e ".[dev]"
```

To ensure that the development workflow is followed, please also set up the pre-commit hooks:

```bash
pre-commit install
```

## General Guidelines

1. Fork the repository on GitHub.
2. Clone the forked repository to your local machine.
3. Create a new branch from the `develop` branch for your feature or bug fix.
4. Implement your changes.
   - It is recommended to write tests and examples for them in `tests\test_bambootols.py` and `bin\examples.py` respectively.
5. Create a Pull Request. Link it to the issue you have opened.

# Credits

Special thanks to [danikavu](https://github.com/danikavu) for the code reviews.

            
