stat-analysis


Name: stat-analysis
Version: 1.0.0
Home page: https://github.com/Hermann-web/some-common-statistical-methods
Summary: A Python library providing hands on implementation of a collection of common statistical methods for data analysis.
Upload time: 2024-04-04 19:33:43
Maintainer: Hermann Agossou
Docs URL: None
Author: Hermann Agossou
Requires Python: <4.0,>=3.9
License: Apache License
Keywords: statistics, data analysis, confidence intervals, hypothesis testing, model estimation, regression, statistical learning
Requirements: statsmodels, numpy, pandas, seaborn
Travis CI: none
Coveralls test coverage: none

# Statistical Analysis Toolkit

Welcome to `statanalysis`, a collection of statistical methods and tools for data analysis. It grew out of a Coursera certificate in statistics and implements the concepts covered there: prediction metrics, regression analysis, hypothesis testing, confidence intervals, population parameter estimation, and model estimation.

Built in Python, `statanalysis` provides modules and utilities aimed at beginners in statistics, data science, and research. While working through the Coursera certificate, I chose to solidify my knowledge by implementing the methods myself instead of relying solely on existing modules. I believe there is no better way to understand a statistical formula than by implementing it in code, documenting it thoroughly, and validating the results through tests.

So I rewrote common statistical learning tools and created a repository that offers direct access to my implementations, aiming for simplicity without compromising accuracy. Furthermore, these implementations are tested against established libraries such as [scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html), [statsmodels](https://www.statsmodels.org/stable/index.html), and [scikit-learn](https://scikit-learn.org/stable/modules/classes.html), so their results can be checked against widely used reference implementations.

Whether you're a novice or an experienced data analyst, `statanalysis` aims to simplify and enhance your statistical analysis journey. Dive in and explore a wealth of statistical methods and techniques designed to streamline your analytical processes and empower your insights.

## Features

1. **Utility Functions:**
   - **Module:** `utils_md`
   - **Description:** The `utils_md` module provides a collection of helper functions for various statistical tasks, including data preprocessing, standard deviation estimation, and computation of probabilities and percentiles.

1. **Hypothesis Validation:**
   - **Module:** `hyp_vali_md`
   - **Description:** The `hyp_vali_md` module includes functions for hypothesis validation, such as checking residuals and coefficients and conducting hypothesis tests. Features encompass:
     - **Constraint Checking:** Functions for verifying constraints, such as checking if values fall within specific ranges.
     - **Hypothesis Sample Size:** Tools for ensuring minimum sample sizes for hypothesis testing scenarios.

1. **Confidence Interval Estimation:**
   - **Module:** `conf_inte_md`
   - **Description:** The `conf_inte_md` module offers methods for estimating confidence intervals for population parameters, such as proportions and means. Features include:
     - **One-sample Proportion:** Functions for estimating confidence intervals for population proportions based on a single sample.
     - **Two-sample Mean:** Methods for computing confidence intervals for the difference between two population means, considering paired and unpaired data.

1. **Hypothesis Testing:**
   - **Module:** `hyp_testi_md`
   - **Description:** This module encompasses a comprehensive suite of functions for hypothesis testing, covering a variety of scenarios:
     - **Testing Population Proportions:** Methods for assessing hypotheses related to population proportions using z-tests.
     - **Comparing Means:** Functions for conducting hypothesis tests to compare means between two or more populations, employing t-tests and ANOVA.

1. **Model Estimation:**
   - **Module:** `mdl_esti_md`
   - **Description:** The `mdl_esti_md` module houses classes and functions dedicated to model estimation. Notable features include:
     - **Linear Regression:** Implementation of linear regression models, including ordinary least squares (OLS) and robust regression.
     - **Logistic Regression:** Classes for logistic regression analysis, enabling binary classification tasks with probability predictions.
     - **Multiple Regression:** Tools for conducting multiple regression analysis, facilitating the exploration of relationships between multiple independent variables and a dependent variable.
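
To make the model-estimation item above concrete, here is a minimal sketch of ordinary least squares with a single predictor, using only the standard library. The function name `fit_simple_ols` is hypothetical and is not part of `mdl_esti_md`; it only illustrates the kind of closed-form computation such a module can wrap.

```python
from statistics import fmean


def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals.

    Hypothetical helper for illustration; not the library's API.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = fmean(x), fmean(y)
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must contain at least two distinct values")
    slope = s_xy / s_xx
    # The fitted line always passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Points lying exactly on y = 1 + 2x
intercept, slope = fit_simple_ols([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

With several predictors the same idea is usually expressed in matrix form and solved with a linear-algebra routine rather than by hand.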

## Repository Structure

The repository is organized into two main folders:

1. **`statanalysis/` Folder:**

   This folder contains the following modules:

   - **`utils_md`:** Module for utility functions, offering a collection of helper functions for statistical tasks.
   - **`hyp_vali_md`:** Module for hypothesis validation, containing functions for checking residuals and coefficients and for conducting hypothesis tests.
   - **`conf_inte_md`:** Module for confidence interval estimation, providing methods for estimating confidence intervals for proportions and means.
   - **`hyp_testi_md`:** Module for hypothesis testing, including functions for conducting hypothesis tests on proportions and means.
   - **`mdl_esti_md`:** Module for model estimation, including classes and functions for linear regression, logistic regression, and multiple regression.

2. **`tests/` Folder:**

   This folder features tests for all methods mentioned above.
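
As a rough illustration of this validation style, the sketch below checks a hand-written estimator against a trusted reference. Both `sample_std` and the test function are hypothetical names, and the standard library's `statistics.stdev` stands in for the external libraries the repository compares against.

```python
import math
import statistics


def sample_std(values):
    """Hand-written sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def test_sample_std_matches_reference():
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    # The expected value comes from the reference implementation,
    # not from the code under test.
    assert math.isclose(sample_std(data), statistics.stdev(data), rel_tol=1e-12)


test_sample_std_matches_reference()
```

Comparing with a relative tolerance rather than exact equality keeps such a test robust to harmless floating-point differences between implementations.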

## Usage

To use the statistical analysis functionality provided by this library, either clone the repository or install it from PyPI, depending on your needs.

### **Clone the Repository:**

Clone the repository to your local machine using the following command:

```bash
git clone https://github.com/hermann-web/some-common-statistical-methods
```

### **Install the Library from PyPI:**

Install the library from PyPI using pip:

```bash
pip install stat-analysis
```

Choose the option that best suits your needs and get started with your statistical analysis.

### **Import Modules or Functions:**

In your Python script, import the desired modules or functions using the following syntax:

```python
from statanalysis import utils_md, hyp_vali_md, conf_inte_md, hyp_testi_md, mdl_esti_md
```

### **Perform Statistical Analysis:**

Utilize the imported functions and classes to perform a wide range of statistical analysis tasks on your data. For example:

```python
# Example: Compute a confidence interval for a population proportion
confidence_interval = conf_inte_md.IC_PROPORTION_ONE(sample_size=100, parameter=0.5, confidence=0.95)
```
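
For readers who want to see what such an interval involves, here is a minimal standard-library sketch of the normal-approximation (Wald) interval for a proportion. It assumes a sample proportion of 0.5 observed on 100 units. The function `wald_proportion_interval` is a hypothetical name, and this is not necessarily the method `IC_PROPORTION_ONE` uses internally.

```python
from math import sqrt
from statistics import NormalDist


def wald_proportion_interval(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion.

    Hypothetical helper for illustration; not the library's API.
    """
    if not 0 <= p_hat <= 1:
        raise ValueError("p_hat must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Estimated standard error of the sample proportion.
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = wald_proportion_interval(0.5, 100, 0.95)
print(round(low, 3), round(high, 3))  # 0.402 0.598
```

This approximation is only reasonable when the sample is large enough, and the bounds are not clipped to [0, 1], so it can misbehave for proportions near 0 or 1.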

Leverage advanced statistical techniques and methodologies provided by the modules to analyze your data effectively.

Additionally, if you prefer to browse the documentation in a more structured format, refer to the documentation files included in the repository, which provide detailed information about the library's functionality and usage. There is a [detailed one](./docs/detailled-docu.md) and a [more concise one](./docs/concise-docu.md).

## Additional Information

- The repository includes a comprehensive test suite in the [tests](./tests/) folder to validate the accuracy and consistency of the implemented methods against industry-standard libraries like scipy.stats, statsmodels, and scikit-learn.
- The module is available on PyPI for easy installation and use in various statistical analysis projects.
- For detailed explanations and references, refer to the respective sections in the code files.
- Further insights and explanations on statistical concepts can be found in the provided links.
- For inquiries or assistance regarding the repository, please contact [Hermann Agossou](mailto:hermannagossou7[at]gmail.com).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Hermann-web/some-common-statistical-methods",
    "name": "stat-analysis",
    "maintainer": "Hermann Agossou",
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": "agossouhermann7@gmail.com",
    "keywords": "statistics, data analysis, confidence intervals, hypothesis testing, model estimation, regression, statistical learning",
    "author": "Hermann Agossou",
    "author_email": "agossouhermann7@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/a4/37/334ee9ed963ec63145f52a1c80698bc8438da0a38854679fa0106d7157b9/stat_analysis-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License",
    "summary": "A Python library providing hands on implementation of a collection of common statistical methods for data analysis.",
    "version": "1.0.0",
    "project_urls": {
        "Documentation": "https://github.com/Hermann-web/some-common-statistical-methods/blob/main/docs/detailled-docu.md",
        "Homepage": "https://github.com/Hermann-web/some-common-statistical-methods",
        "Repository": "https://github.com/Hermann-web/some-common-statistical-methods"
    },
    "split_keywords": [
        "statistics",
        " data analysis",
        " confidence intervals",
        " hypothesis testing",
        " model estimation",
        " regression",
        " statistical learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "272edcb82db08a8d0f63bf345ad7af56b1d3658031b8fa4720b6a95fbf139ac7",
                "md5": "0d1212491fcb8a73a316e242a6f6c8ce",
                "sha256": "7aff9e0128aec649cd5d991b53ad064371913f67416e0b595d57f4543551ec62"
            },
            "downloads": -1,
            "filename": "stat_analysis-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0d1212491fcb8a73a316e242a6f6c8ce",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 49165,
            "upload_time": "2024-04-04T19:33:41",
            "upload_time_iso_8601": "2024-04-04T19:33:41.014590Z",
            "url": "https://files.pythonhosted.org/packages/27/2e/dcb82db08a8d0f63bf345ad7af56b1d3658031b8fa4720b6a95fbf139ac7/stat_analysis-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a437334ee9ed963ec63145f52a1c80698bc8438da0a38854679fa0106d7157b9",
                "md5": "4aa2d8a52ae9819d31d2dd7d3367797d",
                "sha256": "5bae3c5a15d56e7c6d2c23445f4fa23ed7e2d280a51a1e31fe9e9b902dfe2e72"
            },
            "downloads": -1,
            "filename": "stat_analysis-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "4aa2d8a52ae9819d31d2dd7d3367797d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 40551,
            "upload_time": "2024-04-04T19:33:43",
            "upload_time_iso_8601": "2024-04-04T19:33:43.389935Z",
            "url": "https://files.pythonhosted.org/packages/a4/37/334ee9ed963ec63145f52a1c80698bc8438da0a38854679fa0106d7157b9/stat_analysis-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-04 19:33:43",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Hermann-web",
    "github_project": "some-common-statistical-methods",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "statsmodels",
            "specs": []
        },
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "seaborn",
            "specs": []
        }
    ],
    "lcname": "stat-analysis"
}
        