# Confidence Interval Tools
A small Python library for calculating and drawing confidence intervals.
1. [Requirements](#requirements)
2. [Status](#status)
3. [Documentation](#documentation)
+ [Installation](#installation)
+ [Usage](#usage)
+ [Classes and methods](#classes-and-methods)
4. [Roadmap](#roadmap)
5. [Contribution](#contribution)
## Requirements
```python
Python^3.12 ## might also work with lower versions, but untested
pandas^2.2
matplotlib^3.9
seaborn^0.13
scipy^1.14
numpy^1.26
```
Note that, to avoid blocking installation on machines with older dependency versions, the declared dependency requirements were
lowered to the nearest major version.
## Status
> [!WARNING]
> The project is in a very early development phase. Expect important changes between updates.
**Latest version**: 0.1.6
**Updated**: August 2024
**Changes since previous version**:
+ implemented the `ste_ci` method for calculating confidence intervals based on the standard error.
+ added documentation
## Documentation
> [!NOTE]
> **Last documentation update**: August 2024
### Installation
This project is published on PyPI under the name `confidence_interval_tools`.
#### With pip
```bash
pip install confidence_interval_tools
```
For updating to the latest available version:
```bash
pip install -U confidence_interval_tools
```
To force a specific version (for example 0.2.0):
```bash
pip install --force-reinstall -v "confidence_interval_tools==0.2.0"
```
#### With poetry
```bash
poetry add confidence_interval_tools@latest
```
For updating:
```bash
poetry update
```
#### In a Jupyter notebook (notice the exclamation mark)
```python
!pip install -U confidence_interval_tools
```
### Usage
Methods and classes can be imported directly, for example:
```python
from confidence_interval_tools import CI_Drawer
## [...]
a = CI_Drawer(data=data, x="x", y="y", kind=["bars", "area"], ci_type="std")
```
However, for the sake of readability and traceability, it might be better to import (and alias) the whole package at once:
```python
import confidence_interval_tools as cit
## [...]
a = cit.CI_Drawer(data=data, x="x", y="y", kind=["bars", "area"], ci_type="std")
```
As this package aims to be a complement to `Seaborn` and `Matplotlib`, we recommend reading the respective documentation of these two packages:
+ [Seaborn](https://seaborn.pydata.org/tutorial.html)
+ [Matplotlib](https://matplotlib.org/stable/users/index.html)
And additionally:
+ [Pandas](https://pandas.pydata.org/docs/user_guide/index.html)
+ [Numpy](https://numpy.org/doc/stable/user/)
+ [Scipy](https://docs.scipy.org/doc/scipy/tutorial/index.html)
### Classes and methods
#### Main module
> `CI_Drawer` (>=0.1.5)
>
> "A class for drawing a confidence interval in whatever way you prefer."
>
> **Arguments**:
> + `data` (pandas.DataFrame, optional): a pandas dataframe containing the necessary information to draw confidence intervals. If **data** is provided, **x**, **y**, **lower**, **upper**, and **std** can be given as column names.
> + `x` (str | *data type*, optional): column name or list / array / series with information about the horizontal coordinate of the data. If not provided, it will be assumed to be [1, 2, 3, 4, ...].
> + `y` (str | *data type*, optional): column name or list / array / series with information about the vertical coordinate of the data. Usually required unless **lower** and **upper** are provided directly.
> + `lower` (str | *data type*, optional): bypass the internal calculation by directly providing values for the lower bound of each confidence interval.
> + `upper` (str | *data type*, optional): bypass the internal calculation by directly providing values for the upper bound of each confidence interval.
> + `kind` ("lines" | "bars" | "area" | "scatterplot" | "none", optional): a selection of what kind of confidence interval is to be drawn. The default is "none" (does nothing). Several kinds can be selected at once and passed as a list or tuple, e.g., ["area", "bars"].
> + `ci_type` ("std" | "ste", optional): the type of calculation used for the confidence intervals. Currently available types are: standard deviation (std), standard error (ste). The default is set to "std".
> + `std` (str | *data type*, optional): bypass the internal calculation for the standard deviation by providing pre-calculated values.
> + `std_multiplier` (*numerical type*, optional): constant to be used as a multiplier of the standard deviation or standard error when a normal approximation is done. Currently used for "std" and "ste" CI types. Default is 1.96 (i.e., alpha risk level of 5%, two-sided).
> + `orientation` ("horizontal" | "vertical", optional): orientation of the confidence interval, i.e., whether a confidence interval should be calculated for each value of **x** ("vertical"), or each value of **y** ("horizontal").
> + CI lines options:
> + `draw_lines` (bool, optional): manual toggle for the drawing of CI lines. Same as using **kind="lines"**.
> + `draw_lower_line` (bool, optional): manual toggle for the drawing of a line for the lower bound of the confidence interval.
> + `draw_upper_line` (bool, optional): manual toggle for the drawing of a line for the upper bound of the confidence interval.
> + `lines_style` (*matplotlib linestyles type*, optional): style for the CI lines. Follows the same syntax as [Matplotlib linestyles](https://matplotlib.org/stable/gallery/lines_bars_and_markers/linestyles.html). Default: "solid".
> + `lower_line_style` (*matplotlib linestyles type*, optional): specify a different linestyle for the lower bound. Cf: **lines_style**.
> + `upper_line_style` (*matplotlib linestyles type*, optional): specify a different linestyle for the upper bound. Cf: **lines_style**.
> + `lines_color` (*matplotlib colors type*, optional): colo(u)r of the CI lines. See the list of available [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). Default: "black".
> + `lower_line_color` (*matplotlib colors type*, optional): specify a different colo(u)r for the lower bound. Cf: **lines_color**.
> + `upper_line_color` (*matplotlib colors type*, optional): specify a different colo(u)r for the upper bound. Cf: **lines_color**.
> + `lines_linewidth` (*numerical type*, optional): linewidth for the CI lines. Default: 1 (pt).
> + `lower_line_linewidth` (*numerical type*, optional): specify the linewidth for the lower bound. Cf: **lines_linewidth**.
> + `upper_line_linewidth` (*numerical type*, optional): specify the linewidth for the upper bound. Cf: **lines_linewidth**.
> + `lines_alpha` (*numerical type*, optional): opacity / transparency value (a.k.a. "alpha channel") for the CI lines. Must be a decimal value between 0 and 1. Default: 0.8.
> + `lower_line_alpha` (*numerical type*, optional): specify the opacity for the lower bound. Cf: **lines_alpha**.
> + `upper_line_alpha` (*numerical type*, optional): specify the opacity for the upper bound. Cf: **lines_alpha**.
> + CI bars options:
> + `draw_bars` (bool, optional): manual toggle for the drawing of CI bars. Same as using **kind="bars"**.
> + `draw_bar_ends` (bool, optional): whether to draw the perpendicular ends of the CI bars. Default: True when **draw_bars** is activated. Can be "abused" to draw the ends without drawing the actual body of the CI bars.
> + `draw_lower_bar_end` (bool, optional): specify whether to draw the perpendicular ends of the CI bars for the lower bound.
> + `draw_upper_bar_end` (bool, optional): specify whether to draw the perpendicular ends of the CI bars for the upper bound.
> + `bars_style` (*matplotlib linestyles type*, optional): linestyle for the CI bars. See [Matplotlib linestyles](https://matplotlib.org/stable/gallery/lines_bars_and_markers/linestyles.html). Default: "solid".
> + `bars_color` (*matplotlib colors type*, optional): colo(u)r of the CI bars. See [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). Default: "black".
> + `bars_linewidth` (*numerical type*, optional): linewidth for the CI bars. Default: 1 (pt).
> + `bars_alpha` (*numerical type*, optional): opacity of the CI bars. Default: 1.
> + `bar_ends_style` (*matplotlib linestyles type*, optional): specify the linestyle used for the perpendicular ends of the CI bars. The default is "solid" and is independent from the linestyle of the main body of the bars.
> + `bar_ends_color` (*matplotlib colors type*, optional): specify the colo(u)r of both ends of the CI bars. Cf: **bars_color**.
> + `lower_bar_end_color` (*matplotlib colors type*, optional): specify a colo(u)r for the lower bound.
> + `upper_bar_end_color` (*matplotlib colors type*, optional): specify a colo(u)r for the upper bound.
> + `bar_ends_width` (*numerical type*, optional): specify a fixed width for the perpendicular ends of the CI bars. Currently relative to the scale of the data, might change in the future (see [roadmap](#roadmap)). Takes priority over the **bar_ends_ratio** if specified.
> + `bar_ends_ratio` (*numerical type*, optional): width of the perpendicular ends of the CI bars, expressed as a proportion of the average distance between two adjacent x (or y) coordinates. Values greater than 1 should result in overlaps between adjacent CI bars, which is usually not a desired behaviour. Default: 0.3.
> + `hide_bars_center_portion` (bool, optional): when set to True, the middle part of the CI bars will not be drawn, so as to avoid obscuring the plot (for example if a central tendency was already plotted). Default: False.
> + `bars_center_portion_length` (*numerical type*, optional): length of the central portion (i.e., the "middle part") of the CI bars. Currently relative to the scale of the data. Takes priority over **bars_center_portion_ratio** when specified. Used with **hide_bars_center_portion**.
> + `bars_center_portion_ratio` (*numerical type*, optional): length of the central portion of the CI bars, expressed as a proportion of the bars' length. Used with **hide_bars_center_portion**. Default: 0.5.
> + CI area options:
> + `fill_area` (bool, optional): manual toggle for the drawing of the confidence interval as a shaded area. Same as using **kind="area"**.
> + `fill_color` (*matplotlib colors type*, optional): colo(u)r used for the shading of the CI area. See [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). Default: "lavender".
> + `fill_alpha` (*numerical type*, optional): opacity of the shaded area. Default: 0.4.
> + options for the scatterplot of the lower and upper bounds:
> + `plot_limits` (bool, optional): manual toggle for plotting the lower and upper bounds of the confidence intervals as separate datapoints. Same as using **kind="scatterplot"**.
> + `plot_lower_limit` (bool, optional): whether to plot the lower bound.
> + `plot_upper_limit` (bool, optional): whether to plot the upper bound.
> + `plot_marker` (*matplotlib markers type*, optional): marker to be used when plotting the lower and upper bounds. See the list of [Matplotlib markers](https://matplotlib.org/stable/api/markers_api.html). Default: see **lower_plot_marker** and **upper_plot_marker**.
> + `lower_plot_marker` (*matplotlib markers type*, optional): marker to be used when plotting the lower bound. Cf: **plot_marker**.
> + `upper_plot_marker` (*matplotlib markers type*, optional): marker to be used when plotting the upper bound. Cf: **plot_marker**.
> + `plot_color` (*matplotlib colors type*, optional): colo(u)r of the markers. See [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). Default: "black".
> + `lower_plot_color` (*matplotlib colors type*, optional): specify a colo(u)r for the lower bound. Cf: **plot_color**.
> + `upper_plot_color` (*matplotlib colors type*, optional): specify a colo(u)r for the upper bound. Cf: **plot_color**.
> + `plot_alpha` (*numerical type*, optional): opacity of the markers used when plotting the lower and upper bounds. Default: 0.8.
> + `lower_plot_alpha` (*numerical type*, optional): specify the opacity for the lower bound.
> + `upper_plot_alpha` (*numerical type*, optional): specify the opacity for the upper bound.
> + `plot_size` (*numerical type*, optional): size of the markers (in pt square) when plotting the lower and upper bounds. Default: None (let Seaborn / Matplotlib decide).
> + `lower_plot_size` (*numerical type*, optional): specify a size for the markers of the lower bound. Cf: **plot_size**.
> + `upper_plot_size` (*numerical type*, optional): specify a size for the markers of the upper bound. Cf: **plot_size**.
> + `ax` (matplotlib.axes.Axes, optional): a matplotlib Axes object to be used for drawing the confidence intervals. Default: last used object, identified with matplotlib.pyplot.gca().
>
> **Returns**: a new instance of the CI_Drawer class.
>
> **Instance attributes and methods**:
> + `.data` (pandas.DataFrame): a copy of the dataframe passed as argument.
> + `.x`, `.y` (pandas.Series): a copy of the x and y data passed as arguments.
> + `.lower`, `.upper` (pandas.Series): series containing the (calculated or specified) lower bounds and upper bounds.
> + `.unique_x`, `.unique_y` (pandas.Series): series containing the unique values filtered from x and y respectively.
> + `.std` (pandas.Series): series containing the (calculated or specified) standard deviation for each unique value of x (vertical CI) or y (horizontal CI).
> + `.mean` (pandas.Series): series containing the calculated mean for each unique value of x or y.
> + `.median` (pandas.Series): series containing the calculated median for each unique value of x or y.
> + `.q1`, `.q3` (pandas.Series): series containing the calculated first and third quartiles for each unique value of x or y.
> + `.as_dataframe()` (pandas.DataFrame): returns a dataframe containing most of the information listed above.
> + `.params` (dict): dictionary containing most of the parameters used for deciding what to draw and how to draw.
> + `.draw()` (None): method for drawing (or, redrawing) the confidence intervals with the given parameters.
> + `.help()` (None): method to return a help message in an interpreter or a jupyter notebook. Not yet implemented.
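
The relationship between **bar_ends_ratio** and **bar_ends_width** described above can be illustrated with a small standalone sketch. It is not the library's code: `bar_end_width_sketch` is a hypothetical helper, and taking the mean gap between sorted unique coordinates is only one reading of "average distance between two adjacent x (or y) coordinates".

```python
def bar_end_width_sketch(unique_coords, bar_ends_ratio=0.3, bar_ends_width=None):
    """Width of a CI bar end in data units (hypothetical helper, not the library's API)."""
    if bar_ends_width is not None:
        # A fixed width, when given, takes priority over the ratio.
        return bar_ends_width
    coords = sorted(unique_coords)
    if len(coords) < 2:
        raise ValueError("need at least two distinct coordinates to derive a spacing")
    # Assumption: the average distance is the mean gap between adjacent sorted coordinates.
    mean_gap = (coords[-1] - coords[0]) / (len(coords) - 1)
    return bar_ends_ratio * mean_gap


# Unevenly spaced x values: the gaps are 1, 2 and 3, so the mean gap is 2.
width = bar_end_width_sketch([1, 2, 4, 7])
```

Under this reading, a ratio above 1 makes each bar end wider than the typical spacing, which is why overlaps between neighbouring bars are to be expected.
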
> `std_ci` (>=0.1.5)
>
> "Upper and lower bounds of a CI based on standard deviation (normal approximation around mean)"
>
> **Arguments**:
> + `v` (*data type*): a one-dimensional data vector (for example, all y values for a unique value of x)
> + `std_multiplier` (*numerical type*): a number by which the standard deviation is multiplied to yield the confidence interval.
>
> **Returns**: a tuple, of the form (\<lower bound\>, \<upper bound\>).
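
As a rough illustration of what a standard-deviation-based interval looks like, here is a minimal sketch using only the standard library. `std_ci_sketch` is a hypothetical name, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the library's own `std_ci` may differ in both respects.

```python
import statistics


def std_ci_sketch(v, std_multiplier=1.96):
    """Return (lower, upper) as mean -/+ std_multiplier * standard deviation."""
    values = list(v)
    center = statistics.fmean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    spread = statistics.stdev(values)
    return center - std_multiplier * spread, center + std_multiplier * spread


# All y values observed for one unique value of x.
lower, upper = std_ci_sketch([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], std_multiplier=2.0)
```

The interval is symmetric around the mean, and its half-width scales linearly with the multiplier.
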
> `ste_ci` (>=0.1.6)
>
> "Upper and lower bounds of a CI based on standard error (normal approximation around mean)"
>
> **Arguments**:
> + `v` (*data type*): a one-dimensional data vector (for example, all y values for a unique value of x)
> + `ste_multiplier` (*numerical type*): a number by which the standard error is multiplied to yield the confidence interval.
>
> **Returns**: a tuple, of the form (\<lower bound\>, \<upper bound\>).
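
For comparison with the standard-deviation case, a standard-error-based interval can be sketched as follows. This is an illustration under assumptions, not the library's implementation: `ste_ci_sketch` is a hypothetical name, and the standard error is taken to be the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics


def ste_ci_sketch(v, ste_multiplier=1.96):
    """Return (lower, upper) as mean -/+ ste_multiplier * standard error."""
    values = list(v)
    center = statistics.fmean(values)
    # Assumption: standard error = sample standard deviation / sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return center - ste_multiplier * standard_error, center + ste_multiplier * standard_error


sample = [10.0, 12.0, 14.0, 16.0]
lower, upper = ste_ci_sketch(sample, ste_multiplier=2.0)
```

Because of the division by sqrt(n), this interval narrows as more observations are collected for the same x, whereas a standard-deviation-based interval does not.
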
> `vectorized_to_df` (>=0.1.5)
>
> "General utility function, to return a dataframe calculated with several vectors, from a function accepting a single vector"
>
> **Arguments**:
> + `func` (callable): a callable function (such as **std_ci** or **ste_ci**), accepting a vector (pandas Series) as argument and returning a tuple.
> + `*args`, `**kwargs`: any other positional or keyword argument to be passed to the function.
>
> **Returns**: a pandas.DataFrame built from the output of **func** for each individual vector (stacked vertically).
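
The idea of applying a single-vector function to several vectors and stacking the results can be sketched in plain Python. Everything here is hypothetical and only mirrors the description above: `apply_per_group` and `mean_pm_std` are illustrative names, grouping **y** by the values of **x** is an assumption, and tuples stand in for dataframe rows. The actual signature of `vectorized_to_df` is not reproduced.

```python
import statistics
from collections import defaultdict


def mean_pm_std(v, multiplier=1.96):
    """Stand-in for a function such as std_ci: one vector in, one (lower, upper) tuple out."""
    center = statistics.fmean(v)
    spread = statistics.stdev(v)
    return center - multiplier * spread, center + multiplier * spread


def apply_per_group(func, x, y, *args, **kwargs):
    """Group y by the values of x, apply func to each group, and stack the results as rows."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    # One (x, lower, upper) row per unique x, in sorted order; extra arguments go to func.
    return [(key, *func(groups[key], *args, **kwargs)) for key in sorted(groups)]


x = [1, 1, 1, 2, 2, 2]
y = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
rows = apply_per_group(mean_pm_std, x, y, multiplier=1.0)
```

Each row holds one unique x value followed by the tuple returned for that group, which is the shape a per-x table of lower and upper bounds would take.
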
## Roadmap
Features to be added, changes to be implemented in future versions:
+ Add more methods for the calculation of confidence intervals
+ Ensure the support of Matplotlib's parametrized linestyles.
+ Create and expose a submodule for drawing methods (e.g., lines, bars, etc.)
+ Add the capability to draw and configure boxes (as in boxplots)? ... See solutions by Seaborn and Matplotlib for now.
+ Write more detailed, wiki-like documentation, either on GitLab or a separate website like readthedocs.com
+ Support passing a configuration dictionary to avoid re-typing all arguments every time.
+ Add support for providing a nominal alpha risk level, in complement of the std_multiplier argument.
+ Add a clipping option, for bounded scales.
+ Add a rounding option, for categorical scales.
+ Express bar_ends_width in pt or a similarly convenient unit of measurement, independent of the scale of the data.
+ Add support for individual values of bars_center_portion_length and bars_center_portion_ratio.
+ Improve loading time if possible.
+ _Maybe more to come..._
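
Until a nominal alpha risk level is supported, a value for `std_multiplier` can be derived by hand from the normal distribution. The sketch below uses only the standard library; `multiplier_from_alpha` is a hypothetical helper name and is not part of this package.

```python
from statistics import NormalDist


def multiplier_from_alpha(alpha=0.05):
    """Two-sided standard normal quantile for a nominal alpha risk level."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Two-sided interval: alpha / 2 of the probability mass is left in each tail.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


z_95 = multiplier_from_alpha(0.05)  # close to the default of 1.96
z_99 = multiplier_from_alpha(0.01)
```

The result can then be passed as `std_multiplier`, for example `std_multiplier=z_99` for a 99% interval under the normal approximation.
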
## Contribution
Feel free to contribute, report bugs, suggest features, etc., on [GitLab](https://gitlab.com/aufildelanuit/confidence-interval-tools).
Raw data
{
"_id": null,
"home_page": "https://gitlab.com/aufildelanuit/confidence-interval-tools",
"name": "confidence-interval-tools",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.9",
"maintainer_email": null,
"keywords": "data science, confidence intervals, graphing, extreme values",
"author": "Yohann OPOLKA",
"author_email": "yohann.opolka@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/51/3b/24b320d2e15de7d48c84d19717272b0a6f6fb7ab2ec2b6dbbbe8868b44b3/confidence_interval_tools-0.1.7.tar.gz",
"platform": null,
"description": "# Confidence Interval Tools\n\nA small python library for calculating and drawing confidence intervals. \n\n\n1. [Requirements](#requirements)\n2. [Status](#status)\n3. [Documentation](#documentation)\n + [Installation](#installation)\n + [Usage](#usage) \n + [Classes and methods](#classes-and-methods)\n4. [Roadmap](#roadmap)\n5. [Contribution](#contribution)\n\n\n## Requirements \n\n```python\nPython^3.12 ## might also work with lower versions, but untested \npandas^2.2 \nmatplotlib^3.9 \nseaborn^0.13 \nscipy^1.14 \nnumpy^1.26\n```\n\nNote that to avoid blocking the install on machines with lower dependency versions, the dependency requirements were \nlowered to the nearest major version.\n\n\n## Status\n\n> [!WARNING] \n> The project is in a very early development phase. Expect important changes between updates. \n\n**Latest version**: 0.1.6\n\n**Updated**: August 2024\n\n**Changes since previous version**:\n+ implemented of the `ste_ci` method for calculating the standard error.\n+ added documentation\n\n\n## Documentation\n\n> [!NOTE] \n> **Last documentation update**: August 2024 \n\n### Installation\n\nThis project is published in PyPi under the name `confidence_interval_tools`. 
\n\n#### With pip\n\n```bash\npip install confidence_interval_tools\n```\n\nFor updating to the latest available version:\n\n```bash\npip install -U confidence_interval_tools\n```\n\nTo force a specific version (for example 0.2.0):\n\n```bash\npip install --force-reinstall -v \"confidence_interval_tools==0.2.0\"\n```\n\n#### With poetry\n\n```bash\npoetry add confidence_interval_tools@latest\n```\n\nFor updating:\n\n```bash\npoetry update\n```\n\n#### In a Jupyter notebook (notice the exclamation mark)\n\n```python\n!pip install -U confidence_interval_tools\n```\n\n\n### Usage\n\nMethods and classes can be imported directly, for example:\n\n```python\nfrom confidence_interval_tools import CI_Drawer\n\n## [...]\n\na = CI_Drawer(data=data, x=\"x\", y=\"y\", kind=[\"bars\", \"area\"], ci_type=\"std\")\n```\n\nHowever, for the sake of readability and traceability, it might be better to import (and alias) the whole package at once:\n\n```python\nimport confidence_interval_tools as cit\n\n## [...]\n\na = cit.CI_Drawer(data=data, x=\"x\", y=\"y\", kind=[\"bars\", \"area\"], ci_type=\"std\")\n```\n\nAs this package aims to be a complement to `Seaborn` and `Matplotlib`, we recommend reading the respective documentation of these two packages:\n+ [Seaborn](https://seaborn.pydata.org/tutorial.html)\n+ [Matplotlib](https://matplotlib.org/stable/users/index.html)\n\nAnd additionally:\n+ [Pandas](https://pandas.pydata.org/docs/user_guide/index.html)\n+ [Numpy](https://numpy.org/doc/stable/user/)\n+ [Scipy](https://docs.scipy.org/doc/scipy/tutorial/index.html)\n\n### Classes and methods\n\n#### Main module\n\n> `CI_Drawer` (>=0.1.5)\n>\n> \"A class for drawing a confidence interval in whatever way you prefer.\"\n>\n> **Arguments**:\n> + `data` (pandas.DataFrame, optional): a pandas dataframe containing the necessary information to draw confidence intervals. 
If **data** is provided, **x**, **y**, **lower**, **upper**, and **std** can be given as column names.\n> + `x` (str | *data type*, optional): column name or list / array / series with information about the horizontal coordinate of the data. If not provided, it will be assumed to be [1, 2, 3, 4, ...].\n> + `y` (str | *data type*, optional): column name or list / array / series with information about the vertical coordinate of the data. Usually required unless **lower** and **upper** are provided directly.\n> + `lower` (str | *data type*, optional): bypass the internal calculation by directly providing values for the lower bound of each confidence interval.\n> + `upper` (str | *data type*, optional): bypass the internal calculation by directly providing values for the upper bound of each confidence interval.\n> + `kind` (\"lines\" | \"bars\" | \"area\" | \"scatterplot\" | \"none\", optional): a selection of what kind of confidence interval is to be drawn. The default is \"none\" (does nothing). Several kinds can be seleted at once and passed as a list or tuple, e.g., [\"area\", \"bars\"].\n> + `ci_type` (\"std\" | \"ste\", optional): the type of calculation used for the confidence intervals. Currently available types are: standard deviation (std), standard error (ste). The default is set to \"std\".\n> + `std` (str | *data type*, optional): bypass the internal calculation for the standard deviation by providing pre-calculated values.\n> + `std_multiplier` (*numerical type*, optional): constant to be used as a multiplier of the standard deviation or standard error when a normal approximation is done. Currently used for \"std\" and \"ste\" CI types. Default is 1.96 (i.e., alpha risk level of 5%, two-sided). 
\n> + `orientation` (\"horizontal\" | \"vertical\", optional): orientation of the confidence interval, i.e., whether a confidence interval should be calculated for each value of **x** (\"vertical\"), or each value of **y** (\"horizontal\").\n> + CI lines options: \n> + `draw_lines` (bool, optional): manual toggle for the drawing of CI lines. Same as using **kind=\"lines\"**.\n> + `draw_lower_line` (bool, optional): manual toggle for the drawing of a line for the lower bound of the confidence interval.\n> + `draw_upper_line` (bool, optional): manual toggle for the drawing of a line for the upper bound of the confidence interval.\n> + `lines_style` (*matplotlib linestyles type*, optional): style for the CI lines. Follows the same syntax as [Matplotlib linestyles](https://matplotlib.org/stable/gallery/lines_bars_and_markers/linestyles.html). Default: \"solid\".\n> + `lower_line_style` (*matplotlib linestyles type*, optional): specify a different linestyle for the lower bound. Cf: **lines_style**.\n> + `upper_line_style` (*matplotlib linestyles type*, optional): specify a different linestyle for the upper bound. Cf: **lines_style**.\n> + `lines_color` (*matplotlib colors type*, optional): colo(u)r of the CI lines. See the lst of available [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). Default: \"black\".\n> + `lower_line_color` (*matplotlib colors type*, optional): specify a different colo(u)r for the lower bound. Cf: **lines_color**.\n> + `upper_line_color` (*matplotlib colors type*, optional): specify a different colo(u)r for the upper bound. Cf: **lines_color**.\n> + `lines_linewidth` (*numerical type*, optional): linewidth for the CI lines. Default: 1 (pt).\n> + `lower_line_linewidth` (*numerical type*, optional): specify the linewidth for the lower bound. Cf: **lines_linewidth**.\n> + `upper_line_linewidth` (*numerical type*, optional): specify the linewidth for the upper bound. 
Cf: **lines_linewidth**.\n> + `lines_alpha` (*numerical type*, optional): opacity / transparency value (a.k.a. \"alpha channel\") for the CI lines. Must be a decimal value between 0 and 1. Default: 0.8.\n> + `lower_line_alpha` (*numerical type*, optional): specify the opacity for the lower bound. Cf: **lines_alpha**.\n> + `upper_line_alpha` (*numerical type*, optional): specify the opacity for the upper bound. Cf: **lines_alpha**.\n> + CI bars options:\n> + `draw_bars` (bool, optional): manual toggle for the drawing of CI bars. Same as using **kind=\"bars\"**.\n> + `draw_bar_ends` (bool, optional): whether to draw the perpendicular ends of the CI bars. Default: True when **draw_bars** is activated. Can be \"abused\" to draw the ends without drawing the actual body of the CI bars.\n> + `draw_lower_bar_end` (bool, optional): specify whether to draw the perpendicular ends of the CI bars for the lower bound.\n> + `draw_upper_bar_end` (bool, optional): specify whether to draw the perpendicular ends of the CI bars for the upper bound.\n> + `bars_style` (*matplotlib linestyles type*, optional): linestyle for the CI bars. See [Matplotlib linestyles](https://matplotlib.org/stable/gallery/lines_bars_and_markers/linestyles.html). Default: \"solid\".\n> + `bars_color` (*matplotlib colors type*, optional): colo(u)r of the CI bars. See [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). Default: \"black\".\n> + `bars_linewidth` (*numerical type*, optional): linewidth for the CI bars. Default: 1 (pt).\n> + `bars_alpha` (*numerical type*, optional): opacity of the CI bars. Default: 1.\n> + `bar_ends_style` (*matplotlib linestyles type*, optional): specify the linestyle used for the perpendicular ends of the CI bars. The default is \"solid\" and is independent from the linestyle of the main body of the bars.\n> + `bar_ends_color` (*matplotlib colors type*, optional): specify the colo(u)r of both ends of the CI bars. 
CF: **bars_color**.\n> + `lower_bar_end_color` (*matplotlib colors type*, optional): specify a colo(u)r for the lower bound.\n> + `upper_bar_end_color` (*matplotlib colors type*, optional): specify a colo(u)r for the upper bound.\n> + `bar_ends_width` (*numerical type*, optional): specify a fixed width for the perpendicular ends of the CI bars. Currently relative to the scale of the data, might change in the future (see [roadmap](#roadmap)). Takes priority over the **bar_ends_ratio** if specified.\n> + `bar_ends_ratio` (*numerical type*, optional): width of the perpendicular ends of the CI bars, expressed as a proportion of the average distance between two adjacent x (or y) coordinates. Values greater than 1 should result in overlaps between adjacent CI bars, which is usually not a desired behaviour. Default: 0.3.\n> + `hide_bars_center_portion` (bool, optional): when set to True, the middle part of the CI bars will not be drawn, so as to avoid obscuring the plot (for example if a central tendency was already plotted). Default: False.\n> + `bars_center_portion_length` (*numerical type*, optional): length of the central portion (i.e., the \"middle part) of the CI bars. Currently relative to the scale of the data. Takes priority over **bars_center_portion_ratio** when specified. Used with **hide_bars_center_portion**.\n> + `bars_center_portion_ratio` (*numerical type*, optional): length of the central portion of the CI bars, expressed as a proportion of the bars' length. Used with **hide_bars_center_portion**. Default: 0.5.\n> + CI area options:\n> + `fill_area` (bool, optional): manual toggle for the drawing of the confidence interval as a shaded area. Same as using **kind=\"area\"**.\n> + `fill_color` (*matplotlib colors type*, optional): colo(u)r used for the shading of the CI area. See [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). 
Default: \"lavender\".\n> + `fill_alpha` (*numerical type*, optional): opacity of the shaded area. Default: 0.4.\n> + options for the scatterplot of the lowers and upper bounds:\n> + `plot_limits` (bool, optional): manual toggle for plotting the lower and upper bounds of the confidence intervals as separate datapoints. Same as using **kind=\"scatterplot\"**.\n> + `plot_lower_limit` (bool, optional): whether to plot the lower bound.\n> + `plot_upper_limit` (bool, optional): whether to plot the upper bound.\n> + `plot_marker` (*matplotlib markers type*, optional): marker to be used when plotting the lower and upper bounds. See the list of [Matplotlib markers](https://matplotlib.org/stable/api/markers_api.html). Default: see **lower_plot_marker** and **upper_plot_marker**.\n> + `lower_plot_marker` (*matplotlib markers type*, optional): marker to be used when plotting the lower bound. Cf: **plot_marker**.\n> + `upper_plot_marker` (*matplotlib markers type*, optional): marker to be used when plotting the upper bound. Cf: **plot_marker**.\n> + `plot_color` (*matplotlib colors type*, optional): colo(u)r of the markers. See [Matplotlib named colo(u)rs](https://matplotlib.org/stable/gallery/color/named_colors.html). Default: \"black\".\n> + `lower_plot_color` (*matplotlib colors type*, optional): specify a colo(u)r for the lower bound. Cf: **plot_color**.\n> + `upper_plot_color` (*matplotlib colors type*, optional): specify a colo(u)r for the upper bound. Cf: **plot_color**.\n> + `plot_alpha` (*numerical type*, optional): opacity of the markers used when plotting the lower and upper bounds. Default: 0.8.\n> + `lower_plot_alpha` (*numerical type*, optional): specify the opacity for the lower bound.\n> + `upper_plot_alpha` (*numerical type*, optional): specify the opacity for the upper bound.\n> + `plot_size` (*numerical type*, optional): size of the markers (in pt square) when plotting the lower and upper bounds. Default: None (let Seaborn / Matplotlib decide). 
\n> + `lower_plot_size` (*numerical type*, optional): specify a size for the markers of the lower bound. Cf: **plot_size**.\n> + `upper_plot_size` (*numerical type*, optional): specify a size for the markers of the upper bound. Cf: **plot_size**.\n> + ax (matplotlib.axes.Axes, optional): a matplotlib Axes object to be used for drawing the confidence intervals. Defaut: last used object, identified with matplotlib.pyplot.gca(). \n>\n> **Returns**: a new instance of the CI_Drawer class.\n>\n> **Instance attributes and methods**:\n> + `.data` (pandas.DataFrame): a copy of the dataframe passed as argument.\n> + `.x`, `.y` (pandas.Series): a copy of the x and y data passed as arguments.\n> + `.lower`, `.upper` (pandas.Series): series containing the (calculated or specified) lower bounds and upper bounds.\n> + `.unique_x`, `.unique_y` (pandas.Series): series containing the unique values filtered from x and y respectively.\n> + `.std` (pandas.Series): series containing the (calculated or specified) standard deviation for each unique value of x (vertical CI) or y (horizontal CI).\n> + `.mean` (pandas.Series): series containing the calculated mean for each unique value of x or y.\n> + `.median` (pandas.Series): series containing the calculated median for each unique value of or y.\n> + `.q1`, `.q3` (pandas.Series): series containing the calculated first and third quartiles for each unique value of x or y.\n> + `.as_datafrae()` (pandas.DataFrame): returns a dataframe containing most of the information listed above.\n> + `.params` (dict): dictionary containing most of the parameters used for deciding what to draw and how to draw.\n> + `.draw()` (None): method for drawing (or, redrawing) the confidence intervals with the given parameters. \n> + `.help()` (None): method to return a help message in an interpreter or a jupyter notebook. 
Not yet implemented.\n\n\n> `std_ci` (>=0.1.5) \n>\n> \"Upper and lower bounds of a CI based on standard deviation (normal approximation around mean)\"\n>\n> **Arguments**:\n> + `v` (*data type*): a one-dimensional data vector (for example, all y values for a unique value of x) \n> + `std_multiplier` (*numerical type*): a number by which the standard deviation is multiplied to yield the confidence interval. \n> \n> **Returns**: a tuple, of the form (\\<lower bound\\>, \\<upper bound\\>).\n\n\n> `ste_ci` (>=0.1.6)\n>\n> \"Upper and lower bounds of a CI based on standard error (normal approximation around mean)\"\n>\n> **Arguments**:\n> + `v` (*data type*): a one-dimensional data vector (for example, all y values for a unique value of x) \n> + `ste_multiplier` (*numerical type*): a number by which the standard error is multiplied to yield the confidence interval. \n> \n> **Returns**: a tuple, of the form (\\<lower bound\\>, \\<upper bound\\>).\n\n\n> `vectorized_to_df` (>=0.1.5)\n>\n> \"General utility function, to return a dataframe calculated with several vectors, from a function accepting a single vector\"\n>\n> **Arguments**:\n> + `func` (callable): a callable function (such as **std_ci** or **ste_ci**), accepting a vector (pandas Series) as argument and returning a tuple.\n> + `*args`, `**kwargs`: any other positional or keyword argument to be passed to the function.\n>\n> **Returns**: a pandas.DataFrame built from the output of **func** for each individual vector (stacked vertically). \n\n\n## Roadmap\n\nFeatures to be added, changes to be implemented in future versions:\n+ Add more methods for the calculation of confidence intervals\n+ Ensure the support of Matplotlib's parametrized linestyles.\n+ Create and expose a submodule for drawing methods (e.g., lines, bars, etc.)\n+ Add the capability to draw and configure boxes (as in boxplots)? ... 
See solutions by Seaborn and Matplotlib for now.\n+ Write more detailed, wiki-like documentation, either on GitLab or on a separate website like readthedocs.com \n+ Support passing a configuration dictionary to avoid re-typing all arguments every time.\n+ Add support for providing a nominal alpha risk level, as a complement to the std_multiplier argument.\n+ Add a clipping option, for bounded scales.\n+ Add a rounding option, for categorical scales.\n+ Express bar_ends_width in pt or a similarly convenient unit of measurement, independent of the scale of the data.\n+ Add support for individual values of bars_center_portion_length and bars_center_portion_ratio.\n+ Improve loading time if possible.\n+ _Maybe more to come..._ \n\n\n## Contribution\n\nFeel free to contribute, report bugs, suggest features, etc., on [GitLab](https://gitlab.com/aufildelanuit/confidence-interval-tools).",
"bugtrack_url": null,
"license": "MIT",
"summary": "A small package for calculating and drawing confidence intervals.",
"version": "0.1.7",
"project_urls": {
"Homepage": "https://gitlab.com/aufildelanuit/confidence-interval-tools",
"Repository": "https://gitlab.com/aufildelanuit/confidence-interval-tools"
},
"split_keywords": [
"data science",
" confidence intervals",
" graphing",
" extreme values"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1102aa221e75b39f7176d2af7920b895818681e83680228bf4bf5dc16aa287cd",
"md5": "f9a4f3a18f9e378f616c885e6ec76290",
"sha256": "16893a875528080c2b56a379264cacca39c06ec5a3877ef35d97ec8ecd098c91"
},
"downloads": -1,
"filename": "confidence_interval_tools-0.1.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f9a4f3a18f9e378f616c885e6ec76290",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.9",
"size": 20840,
"upload_time": "2024-10-26T17:18:05",
"upload_time_iso_8601": "2024-10-26T17:18:05.649775Z",
"url": "https://files.pythonhosted.org/packages/11/02/aa221e75b39f7176d2af7920b895818681e83680228bf4bf5dc16aa287cd/confidence_interval_tools-0.1.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "513b24b320d2e15de7d48c84d19717272b0a6f6fb7ab2ec2b6dbbbe8868b44b3",
"md5": "ddcebd70362135b8539996272fb5d88b",
"sha256": "2338778e262b315163ac2e240b423ae288dc3d41dbb7664885f050210b7f08ed"
},
"downloads": -1,
"filename": "confidence_interval_tools-0.1.7.tar.gz",
"has_sig": false,
"md5_digest": "ddcebd70362135b8539996272fb5d88b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.9",
"size": 23711,
"upload_time": "2024-10-26T17:18:07",
"upload_time_iso_8601": "2024-10-26T17:18:07.696102Z",
"url": "https://files.pythonhosted.org/packages/51/3b/24b320d2e15de7d48c84d19717272b0a6f6fb7ab2ec2b6dbbbe8868b44b3/confidence_interval_tools-0.1.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-26 17:18:07",
"github": false,
"gitlab": true,
"bitbucket": false,
"codeberg": false,
"gitlab_user": "aufildelanuit",
"gitlab_project": "confidence-interval-tools",
"lcname": "confidence-interval-tools"
}