icanexplain


- Name: icanexplain
- Version: 0.3.0
- Summary: Explain why metrics change by unpacking them
- Upload time: 2024-10-08 18:33:24
- Author: Max Halford
- Requires Python: <4.0,>=3.10

# icanexplain

<p>
<!-- Tests -->
<a href="https://github.com/carbonfact/icanexplain/actions/workflows/unit-tests.yml">
    <img src="https://github.com/carbonfact/icanexplain/actions/workflows/unit-tests.yml/badge.svg" alt="tests">
</a>

<!-- Code quality -->
<a href="https://github.com/carbonfact/icanexplain/actions/workflows/code-quality.yml">
    <img src="https://github.com/carbonfact/icanexplain/actions/workflows/code-quality.yml/badge.svg" alt="code_quality">
</a>

<!-- Documentation -->
<a href="https://carbonfact.github.io/icanexplain">
    <img src="https://img.shields.io/website?label=docs&style=flat-square&url=https%3A%2F%2Fcarbonfact.github.io/icanexplain%2F" alt="documentation">
</a>

<!-- PyPI -->
<a href="https://pypi.org/project/icanexplain">
    <img src="https://img.shields.io/pypi/v/icanexplain.svg?label=release&color=blue" alt="pypi">
</a>

<!-- License -->
<a href="https://opensource.org/license/apache-2-0/">
    <img src="https://img.shields.io/github/license/carbonfact/icanexplain" alt="license">
</a>
</p>

_Explain why metrics change by unpacking them_

This library is here to help with the difficult task of explaining why a metric changes. It's particularly useful for analysts, data scientists, analytics engineers, and business intelligence professionals who need to understand the drivers of a metric's change.

This README provides a short introduction. For more information, please refer to the [documentation](https://carbonfact.github.io/icanexplain).

Check out [this blog post](https://maxhalford.github.io/blog/kpi-evolution-decomposition/) for an in-depth explanation.

## Quickstart

Let's say you're an analyst at an Airbnb-like company. You're tasked with analyzing year-over-year revenue growth. You have obtained the following dataset:

```py
>>> import pandas as pd
>>> fmt_currency = lambda x: '' if pd.isna(x) else '${:,.0f}'.format(x)

>>> revenue = pd.DataFrame.from_dict([
...     {'year': 2019, 'bookings': 1_000, 'revenue_per_booking': 200},
...     {'year': 2020, 'bookings': 1_000, 'revenue_per_booking': 220},
...     {'year': 2021, 'bookings': 1_500, 'revenue_per_booking': 220},
...     {'year': 2022, 'bookings': 1_700, 'revenue_per_booking': 225},
... ])
>>> (
...     revenue
...     .assign(bookings=revenue.bookings.apply('{:,d}'.format))
...     .assign(revenue_per_booking=revenue.revenue_per_booking.apply(fmt_currency))
...     .set_index('year')
... )
     bookings revenue_per_booking
year
2019    1,000                $200
2020    1,000                $220
2021    1,500                $220
2022    1,700                $225

```

It's quite straightforward to calculate the revenue for each year, and then to measure the year-over-year growth:

```py
>>> (
...     revenue
...     .assign(revenue=revenue.eval('bookings * revenue_per_booking'))
...     .assign(growth=lambda x: x.revenue.diff())
...     .assign(bookings=revenue.bookings.apply('{:,d}'.format))
...     .assign(revenue_per_booking=revenue.revenue_per_booking.apply(fmt_currency))
...     .assign(revenue=lambda x: x.revenue.apply(fmt_currency))
...     .assign(growth=lambda x: x.growth.apply(fmt_currency))
...     .set_index('year')
... )
     bookings revenue_per_booking   revenue    growth
year
2019    1,000                $200  $200,000
2020    1,000                $220  $220,000   $20,000
2021    1,500                $220  $330,000  $110,000
2022    1,700                $225  $382,500   $52,500

```

Growth can be due to two factors: an increase in the number of bookings, or an increase in the revenue per booking. The icanexplain library can decompose the growth into these two factors. First, let's install the package:

```sh
pip install icanexplain
```

Then, we can use the `SumExplainer` to decompose the growth:

```py
>>> import icanexplain as ice
>>> explainer = ice.SumExplainer(
...     fact='revenue_per_booking',
...     period='year',
...     count='bookings'
... )
>>> explanation = explainer(revenue)
>>> explanation.map(fmt_currency)
        inner       mix
year
2020  $20,000        $0
2021       $0  $110,000
2022   $7,500   $45,000

```
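The table is consistent with the following split of the growth, where `n_t` is the number of bookings and `p_t` the revenue per booking in year `t`. This identity is inferred from the numbers above; it is an assumption about how the two columns are defined, not a formula taken from the library's source:

```latex
\underbrace{n_t p_t - n_{t-1} p_{t-1}}_{\text{growth}}
  = \underbrace{n_{t-1}\,(p_t - p_{t-1})}_{\text{inner}}
  + \underbrace{(n_t - n_{t-1})\,p_t}_{\text{mix}}
```

For 2022 this gives 1,500 × (225 − 220) = 7,500 for the inner term and (1,700 − 1,500) × 225 = 45,000 for the mix term, which add up to the $52,500 growth.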

Here's how to interpret this explanation:

- From 2019 to 2020, the revenue growth was entirely due to an increase in the revenue per booking. The number of bookings was exactly the same. Therefore, the $20,000 is entirely due to the inner effect (increase in revenue per booking).
- From 2020 to 2021, the revenue growth was entirely due to an increase in the number of bookings. The revenue per booking was exactly the same. Therefore, the $110,000 is entirely due to the mix effect (increase in bookings).
- From 2021 to 2022, there was a $52,500 revenue growth. Both factors contributed: the revenue per booking went up by $5 (from $220 to $225), and the number of bookings rose by 200. The inner effect is $7,500 while the mix effect is $45,000, so most of the increase is due to the higher number of bookings.

Here's a visual representation of this last interpretation:

<p align="center">
  <img src="https://github.com/user-attachments/assets/19a10291-18d3-42aa-ad45-17af32f01e8f" alt="example" width="70%"/>
</p>
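
To make the two effects more concrete, here is a minimal sketch that reproduces the numbers of the explanation table by hand, in plain Python. The `decompose` function and the way it splits the growth are assumptions inferred from the table above (price change weighted by last year's bookings, booking change weighted by this year's price); this is not icanexplain's actual implementation.

```py
# Hand-rolled decomposition of year-over-year revenue growth.
# Assumption: this split is inferred from the README's output table,
# it is not taken from icanexplain's source code.

rows = [
    {'year': 2019, 'bookings': 1_000, 'revenue_per_booking': 200},
    {'year': 2020, 'bookings': 1_000, 'revenue_per_booking': 220},
    {'year': 2021, 'bookings': 1_500, 'revenue_per_booking': 220},
    {'year': 2022, 'bookings': 1_700, 'revenue_per_booking': 225},
]


def decompose(rows):
    """Split each year's revenue growth into an inner and a mix effect."""
    effects = {}
    # Pair every year with the year that precedes it.
    for prev, curr in zip(rows, rows[1:]):
        # Inner effect: change in revenue per booking, applied to last year's bookings.
        inner = prev['bookings'] * (
            curr['revenue_per_booking'] - prev['revenue_per_booking']
        )
        # Mix effect: change in bookings, valued at this year's revenue per booking.
        mix = (curr['bookings'] - prev['bookings']) * curr['revenue_per_booking']
        effects[curr['year']] = {'inner': inner, 'mix': mix}
    return effects


effects = decompose(rows)
for year, effect in effects.items():
    print(year, effect)
```

For each year, the two effects add up to the growth column computed earlier, which is a quick sanity check for any decomposition of this kind.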

## Contributing

Feel free to reach out to [max@carbonfact.com](mailto:max@carbonfact.com) if you want to know more and/or contribute 🤗

Check out the [contribution guidelines](CONTRIBUTING.md) to get started.

## License

icanexplain is free and open-source software licensed under the Apache License, Version 2.0.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "icanexplain",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": "Max Halford",
    "author_email": "maxhalford25@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/4f/b2/a9385967918b8f8db6c7c0643fbf10bbe599fa9e352d2c0fc96535bff010/icanexplain-0.3.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Explain why metrics change by unpacking them",
    "version": "0.3.0",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ad923eee8b689f60a7c576ffa37ea9eb702fee94a57ff4a6b8f8f094f042ba96",
                "md5": "0ec378c845c0a92a5f47e6ca33ca4596",
                "sha256": "18164992289c73fafcdbaa903f2b67c32826f8ae63bd34a29d8049b23713d58b"
            },
            "downloads": -1,
            "filename": "icanexplain-0.3.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0ec378c845c0a92a5f47e6ca33ca4596",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 2408344,
            "upload_time": "2024-10-08T18:33:18",
            "upload_time_iso_8601": "2024-10-08T18:33:18.800918Z",
            "url": "https://files.pythonhosted.org/packages/ad/92/3eee8b689f60a7c576ffa37ea9eb702fee94a57ff4a6b8f8f094f042ba96/icanexplain-0.3.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4fb2a9385967918b8f8db6c7c0643fbf10bbe599fa9e352d2c0fc96535bff010",
                "md5": "2acbe7579c37f6b67e9a5d1c799f1cec",
                "sha256": "1f6483b631cac9631b70738c2c80976e86422f5ddea0a8fad8207a503b4f3e72"
            },
            "downloads": -1,
            "filename": "icanexplain-0.3.0.tar.gz",
            "has_sig": false,
            "md5_digest": "2acbe7579c37f6b67e9a5d1c799f1cec",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 2410755,
            "upload_time": "2024-10-08T18:33:24",
            "upload_time_iso_8601": "2024-10-08T18:33:24.172748Z",
            "url": "https://files.pythonhosted.org/packages/4f/b2/a9385967918b8f8db6c7c0643fbf10bbe599fa9e352d2c0fc96535bff010/icanexplain-0.3.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-08 18:33:24",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "icanexplain"
}
        