fisher-scoring

Name	fisher-scoring
Version	2.0.5
home_page	None
Summary	A Python implementation of the Fisher Scoring algorithm for proportion and incidence rate modeling.
upload_time	2025-07-18 20:21:18
maintainer	None
docs_url	None
author	None
requires_python	>=3.9
license	None
keywords	fisher scoring logistic regression maximum likelihood classification machine learning
requirements	No requirements were recorded.

# Fisher Scoring with Python

**Author:** [xRiskLab](https://github.com/xRiskLab)<br>
**Version:** v2.0.5<br>
**License:** [MIT License](https://opensource.org/licenses/MIT) (2025)

![Title](https://github.com/xRiskLab/fisher-scoring/raw/main/docs/images/title.png)

This repository contains optimized Python implementations of the Fisher Scoring algorithm for logistic regression and count regression models. As of version 2.0, the core algorithms are significantly faster thanks to optimized matrix operations and reduced memory usage, which shortens training time on larger datasets.

```python
# Install first: `pip install fisher-scoring` (or `%pip install fisher-scoring` in a notebook)
import numpy as np
from fisher_scoring import LogisticRegression, RobustLogisticRegression, PoissonRegression

# X_train, y_train, X_test and the *_contaminated variants are assumed to be
# arrays prepared beforehand; they are not defined in this snippet.

# Binary Classification
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
model.display_summary()  # Rich formatted output

# Robust Classification (outlier-resistant)
robust_model = RobustLogisticRegression(epsilon_contamination=0.05)
robust_model.fit(X_train_contaminated, y_train_contaminated)
robust_model.display_summary()  # Rich formatted output with robustness metrics

# Count Data with Rate Modeling
exposure_times = np.random.uniform(0.5, 3.0, len(y_train))
offset = np.log(exposure_times)  # Log exposure for rate modeling

poisson_model = PoissonRegression(offset=offset, information="empirical")
poisson_model.fit(X_train, y_train)
poisson_model.display_summary()  # Rich formatted output
```

## Overview

### Introduction

This repository contains a Python package with scikit-learn compatible implementations of the Fisher Scoring algorithm for various modeling problems.

The package provides logistic regression (maximum likelihood estimation for binary, multiclass, and imbalanced binary problems) for modeling proportions (risk or prevalence), robust logistic regression for outlier-resistant classification, and Poisson and Negative Binomial log-linear regression for modeling incidence rates.

1. Binary classification problems: **Logistic Regression**.
2. Robust binary classification problems: **Robust Logistic Regression**.
3. Multi-class classification problems: **Multinomial Logistic Regression**.
4. Imbalanced classification problems: **Focal Loss Logistic Regression**.
5. Count modeling problems: **Poisson Regression** and **Negative Binomial Regression**.

### Fisher Scoring Algorithm

The Fisher Scoring algorithm is an iterative optimization technique that computes maximum likelihood estimates using the expected or observed Fisher information matrix. As a second-order method, it avoids the need for a learning rate and typically converges more stably than gradient descent.
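The update described above can be sketched for binary logistic regression in plain NumPy. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `fisher_scoring_logistic`, its defaults, and the simulated data are all hypothetical. Each iteration solves `information @ step = score`, so no learning rate appears.

```python
import numpy as np

def fisher_scoring_logistic(X, y, max_iter=100, epsilon=1e-10):
    """Fit logistic regression by Fisher scoring (hypothetical helper)."""
    X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))          # predicted probabilities
        score = X.T @ (y - p)                        # gradient of the log-likelihood
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # expected Fisher information
        step = np.linalg.solve(info, score)          # scoring step, no learning rate
        beta = beta + step
        if np.max(np.abs(step)) < epsilon:           # stop once the update is negligible
            break
    return beta

# Simulated data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
true_beta = np.array([-0.5, 1.0, -2.0])
p_true = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = rng.binomial(1, p_true)

beta_hat = fisher_scoring_logistic(X, y)
print(beta_hat)
```

At convergence the score vector is numerically zero, which is the defining condition of the maximum likelihood estimate.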

There are two types of information matrices used in the Fisher Scoring algorithm:

* **Expected Information Matrix**: Relies on predicted probabilities, providing an efficient approximation for the information matrix.
* **Empirical Information Matrix**: Uses ground truth labels to calculate the information matrix, often resulting in more reliable inference metrics.

The inverse of the information matrix yields standard errors for the estimates, from which the package computes model statistics such as Wald statistics, p-values, and confidence intervals at a chosen significance level.
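To make the two matrix types and the resulting statistics concrete, here is a hedged sketch for logistic regression. It assumes the usual definitions: the expected matrix weights each row by `p * (1 - p)`, and the empirical matrix sums outer products of per-observation scores, which gives weights `(y - p) ** 2`. The helper `wald_inference` and the simulated data are hypothetical and may differ from how the package computes these quantities.

```python
import math
from statistics import NormalDist

import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(400), rng.normal(size=400)])  # intercept + one feature
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.3 + 1.2 * X[:, 1]))))

# Fit by Fisher scoring so the statistics are evaluated at the MLE.
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    beta = beta + np.linalg.solve(X.T @ (X * (p * (1 - p))[:, None]), X.T @ (y - p))

def wald_inference(X, y, beta, information="expected", significance=0.05):
    """Standard errors, Wald z, two-sided p-values, and confidence limits."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    if information == "expected":
        weights = p * (1.0 - p)   # variance implied by the predicted probabilities
    else:
        weights = (y - p) ** 2    # squared residuals, which use the observed labels
    info = X.T @ (X * weights[:, None])
    se = np.sqrt(np.diag(np.linalg.inv(info)))
    wald_z = beta / se
    p_values = np.array([math.erfc(abs(z) / math.sqrt(2.0)) for z in wald_z])
    z_crit = NormalDist().inv_cdf(1.0 - significance / 2.0)
    return se, wald_z, p_values, beta - z_crit * se, beta + z_crit * se

for kind in ("expected", "empirical"):
    se, wald_z, p_values, lower, upper = wald_inference(X, y, beta, information=kind)
    print(kind, np.round(se, 3), np.round(p_values, 4))
```

The two choices generally give similar but not identical standard errors.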

Source: [Limitations of the Empirical Fisher Approximation for Natural Gradient Descent](https://arxiv.org/pdf/1905.12558).

### Implementation Notes

- **Multinomial Logistic Regression**  
  The `MultinomialLogisticRegression` model differs from standard statistical multinomial logistic regression by using all classes rather than $K - 1$. This approach allows multi-class classification problems to be converted to binary problems by calculating $1 - P_{Class=1}$.

- **Focal Loss Regression**  
  The `FocalLossRegression` class optimizes a non-standard focal log-likelihood function. Focal loss, originally developed for object detection, introduces a focusing parameter $\gamma$ (*gamma*) that down-weights easy-to-classify samples, so learning concentrates on difficult-to-classify examples, which are often the minority class.

  Source: [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002).
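As an illustration of the focusing effect, the sketch below evaluates a focal log-likelihood of the form $(1 - p_t)^{\gamma} \log p_t$ from the cited paper, where $p_t$ is the probability assigned to the true class. The function name `focal_log_likelihood` and the example values are hypothetical; the package's exact objective may differ in detail.

```python
import numpy as np

def focal_log_likelihood(y, p, gamma=2.0, eps=1e-12):
    """Sum of (1 - p_t)^gamma * log(p_t) over observations (hypothetical helper)."""
    p = np.clip(p, eps, 1.0 - eps)       # guard against log(0)
    p_t = np.where(y == 1, p, 1.0 - p)   # probability assigned to the true class
    return np.sum((1.0 - p_t) ** gamma * np.log(p_t))

y = np.array([1, 1, 0, 0])
p = np.array([0.95, 0.30, 0.10, 0.60])   # first and third are "easy" examples

p_t = np.where(y == 1, p, 1.0 - p)
print("modulating weights:", (1.0 - p_t) ** 2.0)  # easy examples get weights near 0
print("gamma=0:", focal_log_likelihood(y, p, gamma=0.0))
print("gamma=2:", focal_log_likelihood(y, p, gamma=2.0))
```

With `gamma=0` the expression reduces to the ordinary Bernoulli log-likelihood; larger values shrink the contribution of well-classified observations.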

## Models

### Logistic Regression

The `LogisticRegression` class is a custom implementation of logistic regression using the Fisher scoring algorithm. It provides methods for fitting the model, making predictions, and computing model statistics, including standard errors, Wald statistics, p-values, and confidence intervals.

**Parameters:**
- `epsilon`: Convergence threshold for the algorithm.
- `max_iter`: Maximum number of iterations for the algorithm.
- `information`: Type of information matrix to use ('expected' or 'empirical').
- `use_bias`: Include a bias term in the model.
- `significance`: Significance level for computing confidence intervals.

**Methods:**
- `fit(X, y)`: Fit the model to the data.
- `predict(X)`: Predict target labels for input data.
- `predict_proba(X)`: Predict class probabilities for input data.
- `predict_ci(X)`: Predict class probabilities with confidence intervals.
- `get_params()`: Get model parameters.
- `set_params(**params)`: Set model parameters.
- `summary()`: Get a summary of model parameters, standard errors, p-values, and confidence intervals.
- `display_summary()`: Display a summary of model parameters, standard errors, p-values, and confidence intervals.
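How `predict_ci` builds its intervals is not specified here. One common construction, shown below as an assumption, forms the interval on the logit scale from the coefficient covariance matrix and then maps it back through the sigmoid. The names `predict_ci_sketch`, `beta`, and `cov`, and all numeric values, are hypothetical.

```python
from statistics import NormalDist

import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def predict_ci_sketch(X, beta, cov, significance=0.05):
    """Probabilities with intervals built on the logit scale, then mapped back."""
    eta = X @ beta                                          # linear predictor
    se_eta = np.sqrt(np.einsum("ij,jk,ik->i", X, cov, X))   # sqrt(x' Cov x) per row
    z = NormalDist().inv_cdf(1.0 - significance / 2.0)
    return sigmoid(eta), sigmoid(eta - z * se_eta), sigmoid(eta + z * se_eta)

# Hypothetical fitted coefficients and covariance (intercept, slope)
beta = np.array([-0.5, 1.0])
cov = np.array([[0.04, 0.01], [0.01, 0.09]])
X_new = np.array([[1.0, 0.0], [1.0, 2.0]])

prob, lower, upper = predict_ci_sketch(X_new, beta, cov)
print(np.column_stack([lower, prob, upper]))
```

Building the interval on the logit scale keeps both limits inside (0, 1).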

### Robust Logistic Regression

The `RobustLogisticRegression` class implements robust logistic regression using the Fisher scoring algorithm with epsilon-contamination for outlier resistance. This method down-weights observations that are unlikely under the main model, providing robustness against data contamination and outliers.
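The down-weighting idea can be illustrated with a standard epsilon-contamination mixture. This is a sketch under assumptions: each label is taken to come from the main model with probability `1 - epsilon_contamination` and from a Bernoulli(`contamination_prob`) contamination distribution otherwise, and the weight is the posterior probability of the main model. The package's exact weighting formula is not stated in this document and may differ; `robust_weights` is a hypothetical helper.

```python
import numpy as np

def robust_weights(y, p, epsilon_contamination=0.05, contamination_prob=0.5):
    """Posterior probability that each observation comes from the main model."""
    lik_main = np.where(y == 1, p, 1.0 - p)   # likelihood under the logistic model
    lik_cont = np.where(y == 1, contamination_prob, 1.0 - contamination_prob)
    main = (1.0 - epsilon_contamination) * lik_main
    return main / (main + epsilon_contamination * lik_cont)

y = np.array([1, 0, 1, 0])
p = np.array([0.9, 0.1, 0.02, 0.98])  # the last two labels contradict the model

w = robust_weights(y, p)
print(np.round(w, 3))
```

Observations the main model finds unlikely receive weights well below 1, and setting `epsilon_contamination=0` recovers ordinary logistic regression with all weights equal to 1.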

**Parameters:**
- `epsilon_contamination`: Contamination level (0 ≤ ε ≤ 1). Higher values provide more robustness but may reduce efficiency (default: 0.05).
- `contamination_prob`: Probability for contamination distribution (default: 0.5).
- `tol`: Convergence tolerance for parameter updates.
- `max_iter`: Maximum number of iterations for the algorithm.
- `information`: Type of information matrix to use ('expected' or 'empirical').
- `use_bias`: Include a bias term in the model.
- `significance`: Significance level for computing confidence intervals.

**Methods:**
- `fit(X, y)`: Fit the robust model to the data with automatic outlier down-weighting.
- `predict(X)`: Predict target labels for input data.
- `predict_proba(X)`: Predict class probabilities for input data.
- `predict_ci(X)`: Predict class probabilities with confidence intervals.
- `get_params()`: Get model parameters.
- `set_params(**params)`: Set model parameters.
- `summary()`: Get a summary of model parameters, standard errors, p-values, confidence intervals, and robust weights.
- `display_summary()`: Display a comprehensive summary including robustness metrics (epsilon contamination, average/minimum robust weights).

**Key Features:**
- **Outlier Resistance**: Automatic down-weighting of observations unlikely under the main model.
- **Robust Weights**: Access to individual observation weights, which indicate the observations treated as outliers.
- **Fisher Scoring Framework**: Consistent with other models using both expected and empirical information matrices.
- **Statistical Inference**: Complete inference statistics with robust standard errors and confidence intervals.
- **Rich Output**: Formatted summaries with robust-specific metrics and diagnostics.

### Multinomial Logistic Regression

The `MultinomialLogisticRegression` class implements the Fisher Scoring algorithm for multinomial logistic regression, suitable for multi-class classification tasks.
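For orientation, class probabilities in a multinomial model that keeps one coefficient column per class (all $K$ classes, as noted under Implementation Notes) can be computed with a softmax. The sketch below is illustrative only; `softmax_proba` and the coefficient values are hypothetical and not taken from the package.

```python
import numpy as np

def softmax_proba(X, B):
    """Class probabilities with one coefficient column per class (all K classes)."""
    scores = X @ B
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilise the exponentials
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)

X = np.array([[1.0, 0.5], [1.0, -1.0]])             # intercept + one feature
B = np.array([[0.2, -0.1, 0.0], [1.0, 0.3, -0.7]])  # shape (features, K=3)

P = softmax_proba(X, B)
print(P)
print(P.argmax(axis=1))  # predicted class per row
```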

**Parameters:**
- `epsilon`: Convergence threshold for the algorithm.
- `max_iter`: Maximum number of iterations for the algorithm.
- `information`: Type of information matrix to use ('expected' or 'empirical').
- `use_bias`: Include a bias term in the model.
- `significance`: Significance level for computing confidence intervals.
- `verbose`: Enable verbose output.

**Methods:**
- `fit(X, y)`: Fit the model to the data.
- `predict(X)`: Predict target labels for input data.
- `predict_proba(X)`: Predict class probabilities for input data.
- `predict_ci(X)`: Predict class probabilities with confidence intervals.
- `summary(class_idx)`: Get a summary of model parameters, standard errors, p-values, and confidence intervals for a specific class.
- `display_summary(class_idx)`: Display a summary of model parameters, standard errors, p-values, and confidence intervals for a specific class.

The algorithm is in a beta version and may require further testing and optimization to speed up matrix operations.

### Focal Loss Regression

The `FocalLossRegression` class implements the Fisher Scoring algorithm with focal loss, designed for imbalanced classification problems where the positive class is rare.

**Parameters:**
- `gamma`: Focusing parameter for focal loss.
- `epsilon`: Convergence threshold for the algorithm.
- `max_iter`: Maximum number of iterations for the algorithm.
- `information`: Type of information matrix to use ('expected' or 'empirical').
- `use_bias`: Include a bias term in the model.
- `verbose`: Enable verbose output.

**Methods:**
- `fit(X, y)`: Fit the model to the data.
- `predict(X)`: Predict target labels for input data.
- `predict_proba(X)`: Predict class probabilities for input data.
- `predict_ci(X)`: Predict class probabilities with confidence intervals.
- `get_params()`: Get model parameters.
- `set_params(**params)`: Set model parameters.
- `summary()`: Get a summary of model parameters, standard errors, p-values, and confidence intervals.
- `display_summary()`: Display a summary of model parameters, standard errors, p-values, and confidence intervals.

### Poisson Regression

The `PoissonRegression` class implements the Fisher Scoring algorithm for Poisson regression, suitable for modeling count data and incidence rates. It features robust matrix operations with an automatic fallback to the pseudo-inverse for numerical stability.
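The fallback pattern mentioned above can be sketched with NumPy alone. The helper name `safe_inverse` is hypothetical; the package's internal function may be named and structured differently.

```python
import numpy as np

def safe_inverse(matrix):
    """Invert a matrix, falling back to the Moore-Penrose pseudo-inverse."""
    try:
        return np.linalg.inv(matrix)
    except np.linalg.LinAlgError:
        # Raised for exactly singular input, e.g. perfectly collinear features
        return np.linalg.pinv(matrix)

singular = np.array([[1.0, 2.0], [2.0, 4.0]])  # second row is twice the first
print(safe_inverse(singular))
```

Note that `np.linalg.inv` only raises for exactly singular matrices; a nearly singular matrix is still inverted, possibly with large numerical error.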

**Parameters:**
- `max_iter`: Maximum number of iterations for optimization.
- `epsilon`: Convergence tolerance.
- `use_bias`: Whether to include an intercept term.
- `offset`: Offset term for rate modeling (e.g., log exposure times).
- `significance`: Significance level for confidence intervals.
- `information`: Type of information matrix to use ('expected' or 'empirical').

**Methods:**
- `fit(X, y)`: Fit the model to the data.
- `predict(X, offset=None)`: Predict mean values with optional custom offset.
- `calculate_st_errors(X)`: Calculate standard errors for the coefficients.
- `summary()`: Get comprehensive model statistics including coefficients, standard errors, p-values, and confidence intervals.
- `display_summary()`: Display a formatted summary with Rich styling.

**Key Features:**
- **Offset Support**: Full support for rate modeling with log exposure times.
- **Information Matrix Choice**: Both expected and empirical Fisher information matrices supported.
- **Robust Implementation**: Safe matrix inversion with automatic pseudo-inverse fallback.
- **Statistical Summaries**: Complete inference statistics with Wald tests and confidence intervals.
- **Validated Accuracy**: Mathematical correctness verified against statsmodels to machine precision.
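To show how an offset enters rate modeling, the following sketch fits a Poisson model with a log link by Fisher scoring, adding the log exposure to the linear predictor. It is a minimal illustration with hypothetical names (`fit_poisson`) and simulated data, not the package's code.

```python
import numpy as np

def fit_poisson(X, y, offset, max_iter=100, epsilon=1e-10):
    """Poisson regression with log link and offset, fitted by Fisher scoring."""
    X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = np.exp(X @ beta + offset)    # expected count = rate * exposure
        score = X.T @ (y - mu)            # gradient of the log-likelihood
        info = X.T @ (X * mu[:, None])    # expected information for the log link
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < epsilon:
            break
    return beta

# Simulated incidence data: rate = exp(0.2 + 0.5 * x) per unit of exposure
rng = np.random.default_rng(42)
x = rng.normal(size=2000)
exposure = rng.uniform(0.5, 3.0, size=2000)
y = rng.poisson(exposure * np.exp(0.2 + 0.5 * x))

beta_hat = fit_poisson(x[:, None], y, offset=np.log(exposure))
print(beta_hat)  # estimates of the intercept and slope of the log rate
```

Because the offset has a fixed coefficient of 1, the fitted coefficients describe the rate per unit of exposure rather than the raw count.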

### Negative Binomial Regression

The `NegativeBinomialRegression` class implements the Fisher Scoring algorithm for Negative Binomial regression, suitable for overdispersed count data. This version adds more robust matrix operations and comprehensive statistical inference, and fixes critical implementation bugs.
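As a rough sketch of Fisher scoring for overdispersed counts, the code below assumes the NB2 parameterization with a fixed dispersion `alpha`, a log link, and variance `mu + alpha * mu**2`. Under those assumptions the score is `X.T @ ((y - mu) / (1 + alpha * mu))` and the expected information uses weights `mu / (1 + alpha * mu)`. The function `fit_negative_binomial` is hypothetical, and the package may use a different parameterization (for example, its `phi` parameter is not modeled here).

```python
import numpy as np

def fit_negative_binomial(X, y, alpha=1.0, offset=0.0, max_iter=100, epsilon=1e-10):
    """NB2 regression with fixed alpha, fitted by Fisher scoring (sketch)."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = np.exp(X @ beta + offset)
        denom = 1.0 + alpha * mu
        score = X.T @ ((y - mu) / denom)           # gradient of the log-likelihood
        info = X.T @ (X * (mu / denom)[:, None])   # expected Fisher information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < epsilon:
            break
    return beta

# Simulated overdispersed counts with mean exp(1.0 + 0.3 * x)
rng = np.random.default_rng(7)
n, alpha = 3000, 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
mu_true = np.exp(1.0 + 0.3 * X[:, 1])
size = 1.0 / alpha
y = rng.negative_binomial(size, size / (size + mu_true))

beta_hat = fit_negative_binomial(X, y, alpha=alpha)
print(beta_hat)
```

Setting `alpha` close to zero makes the weights approach `mu`, which recovers the Poisson update.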

**Parameters:**
- `max_iter`: Maximum number of iterations for optimization.
- `epsilon`: Convergence tolerance.
- `use_bias`: Whether to include an intercept term.
- `alpha`: Fixed dispersion parameter (overdispersion adjustment).
- `phi`: Constant scale parameter.
- `offset`: Offset term for the linear predictor.
- `significance`: Significance level for confidence intervals.
- `information`: Type of information matrix to use ('expected' or 'empirical').

**Methods:**
- `fit(X, y)`: Fit the model to the data.
- `predict(X, offset=None)`: Predict mean values with proper offset handling.
- `calculate_st_errors(X)`: Calculate standard errors with corrected implementation.
- `summary()`: Get comprehensive model statistics including coefficients, standard errors, p-values, and confidence intervals.
- `display_summary()`: Display a formatted summary with Rich styling.

**Key Improvements:**
- **Fisher Scoring Conversion**: Converted from IWLS to proper Fisher scoring for consistency.
- **Information Matrix Choice**: Both expected and empirical Fisher information matrices supported (empirical recommended for numerical stability).
- **Bug Fixes**: Fixed missing offset in prediction and standard error calculations.
- **Robust Implementation**: Safe matrix inversion with automatic pseudo-inverse fallback.
- **Statistical Summaries**: Complete inference statistics with Wald tests and confidence intervals.
- **Enhanced Reliability**: Comprehensive testing ensures mathematical correctness.

## Utilities

### Visualization

The package includes a utility function for visualizing observed vs predicted probabilities for count data, which can be useful for users working with Poisson and Negative Binomial models.

**Function:**
- `plot_observed_vs_predicted(y, mu, max_count=15, alpha=None, title="Observed vs Predicted Probabilities", model_name="Model", ax=None, plot_params=None)`: Plot observed vs predicted probabilities for count data.

**Parameters:**
- `y`: Observed count data.
- `mu`: Predicted mean values from the model.
- `max_count`: Maximum count to consider for probabilities.
- `alpha`: Overdispersion parameter for Negative Binomial. If None, assumes Poisson (alpha=0).
- `title`: Title for the plot.
- `model_name`: Name of the model for labeling.
- `ax`: Matplotlib axis to plot on.
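The quantities such a plot compares can be computed without any plotting library. The sketch below covers only the Poisson case (`alpha=None`) and assumes the predicted probability of each count is the Poisson probability averaged over all fitted means; `observed_vs_predicted` is a hypothetical helper, not the package function.

```python
import math

import numpy as np

def observed_vs_predicted(y, mu, max_count=15):
    """Observed frequencies and average Poisson probabilities for counts 0..max_count."""
    counts = np.arange(max_count + 1)
    observed = np.array([np.mean(y == k) for k in counts])
    log_fact = np.array([math.lgamma(k + 1) for k in counts])  # log(k!)
    # log pmf for every (observation, count) pair: -mu + k * log(mu) - log(k!)
    log_pmf = -mu[:, None] + counts[None, :] * np.log(mu[:, None]) - log_fact[None, :]
    predicted = np.exp(log_pmf).mean(axis=0)
    return counts, observed, predicted

rng = np.random.default_rng(0)
y_demo = rng.poisson(2.0, size=5000)
mu_demo = np.full(5000, 2.0)  # stand-in for model predictions

counts, observed, predicted = observed_vs_predicted(y_demo, mu_demo)
print(np.round(observed[:5], 3))
print(np.round(predicted[:5], 3))
```

Large gaps between the two series, especially at zero or in the right tail, often indicate overdispersion that a Negative Binomial model may capture better.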

## Change Log

- **v2.0.5**
  - **New**: Added `RobustLogisticRegression` class with epsilon-contamination for outlier-resistant classification.
  - **Enhanced**: Poisson and Negative Binomial regression with empirical Fisher information matrix support.
  - **Enhanced**: Converted Negative Binomial from IWLS to proper Fisher scoring for consistency.
  - **Added**: Comprehensive offset support for Poisson regression rate modeling.
  - **Fixed**: Critical bugs in Negative Binomial prediction and standard error calculations.
  - **Added**: `summary()` and `display_summary()` methods with rich statistical output.
  - **Validated**: Mathematical correctness verified against statsmodels to machine precision.

- **v2.0.4**
  - Added a beta version of Poisson and Negative Binomial regression using Fisher Scoring.
  - Changed naming conventions for simplicity and consistency.
  - Changed poetry to uv for packaging.

- **v2.0.3**
  - Added inference of mean responses with confidence intervals for all algorithms.
  - Focal logistic regression now supports all model statistics, including standard errors, Wald statistics, p-values, and confidence intervals.

- **v2.0.2**
  - **Bug Fixes**: Fixed the `MultinomialLogisticRegression` class to have flexible NumPy data types.

- **v2.0.1**
  - **Bug Fixes**: Removed the debug print statement from the `LogisticRegression` class.

- **v2.0**
  - **Performance Improvements**: Optimized matrix calculations for substantial speed and memory efficiency improvements across all models, with up to ~290x faster training. Performance gains per model:
    - *Multinomial Logistic Regression*: Training time reduced from 125.10s to 0.43s (~290x speedup).
    - *Logistic Regression*: Training time reduced from 0.24s to 0.05s (~5x speedup).
    - *Focal Loss Logistic Regression*: Training time reduced from 0.26s to 0.01s (~26x speedup).
  - **Bug Fixes**: `verbose` parameter in Focal Loss Logistic Regression now functions as expected, providing accurate logging during training.

- **v0.1.4**
  - Updated log likelihood for Multinomial Regression and minor changes to Logistic Regression for integration with scikit-learn.

- **v0.1.3**
  - Added coefficients, standard errors, p-values, and confidence intervals for Multinomial Regression.

- **v0.1.2**
  - Updated NumPy dependency.

- **v0.1.1**
  - Added support for Python 3.9+ 🐍.

- **v0.1.0**
  - Initial release of Fisher Scoring Logistic, Multinomial, and Focal Loss Regression.

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "fisher-scoring",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "fisher scoring, logistic regression, maximum likelihood, classification, machine learning",
    "author": null,
    "author_email": "xRiskLab <contact@xrisklab.ai>",
    "download_url": "https://files.pythonhosted.org/packages/6e/ff/d75c858c9aa586823bcde39f60ee42b73a234018137f9b23f16bab015065/fisher_scoring-2.0.5.tar.gz",
    "platform": null,
    "description": "# Fisher Scoring with Python\n\n**Author:** [xRiskLab](https://github.com/xRiskLab)<br>\n**Version:** v2.0.5<br>\n**License:** [MIT License](https://opensource.org/licenses/MIT) (2025)\n\n![Title](https://github.com/xRiskLab/fisher-scoring/raw/main/docs/images/title.png)\n\nThis repository contains optimized Python implementations of the Fisher Scoring algorithm for various logistic regression models. With version 2.0, the core algorithms are now significantly faster due to optimized matrix operations and reduced memory usage, providing faster convergence for larger datasets.\n\n```python\n%pip install fisher-scoring\nfrom fisher_scoring import LogisticRegression, RobustLogisticRegression, PoissonRegression\n\n# Binary Classification\nmodel = LogisticRegression()\nmodel.fit(X_train, y_train)\npredictions = model.predict(X_test)\nmodel.display_summary()  # Rich formatted output\n\n# Robust Classification (outlier-resistant)\nrobust_model = RobustLogisticRegression(epsilon_contamination=0.05)\nrobust_model.fit(X_train_contaminated, y_train_contaminated)\nrobust_model.display_summary()  # Rich formatted output with robustness metrics\n\n# Count Data with Rate Modeling  \nimport numpy as np\nexposure_times = np.random.uniform(0.5, 3.0, len(y_train))\noffset = np.log(exposure_times)  # Log exposure for rate modeling\n\npoisson_model = PoissonRegression(offset=offset, information=\"empirical\")\npoisson_model.fit(X_train, y_train)\npoisson_model.display_summary()  # Rich formatted output\n```\n\n## Overview\n\n### Introduction\n\nThis repository contains a Python package with scikit-learn compatible implementations of the Fisher Scoring algorithm for various modeling problems.\n\nThe packages provides implementations of logistic regression (MLE for binary, multiclass, and binary imbalanced) for proportions (risk or prevalence), robust logistic regression for outlier-resistant classification, and Poisson and Negative Binomial regression for log-linear 
regression for incidence rates.\n\n1. Binary classification problems: **Logistic Regression**.\n2. Robust binary classification problems: **Robust Logistic Regression**.\n3. Multi-class classification problems: **Multinomial Logistic Regression**.\n4. Imbalanced classification problems: **Focal Loss Logistic Regression**.\n5. Count modeling problems: **Poisson Regression** and **Negative Binomial Regression**.\n\n### Fisher Scoring Algorithm\n\nThe Fisher Scoring algorithm is an iterative optimization technique that estimates maximum likelihood estimates by leveraging the expected or observed Fisher information matrix. This second-order optimization method allows to avoid the use of learning rates and provides more stable convergence compared to gradient descent.\n\nThere are two types of information matrices used in the Fisher Scoring algorithm:\n\n* **Expected Information Matrix**: Relies on predicted probabilities, providing an efficient approximation for the information matrix.\n* **Empirical Information Matrix**: Uses ground truth labels to calculate the information matrix, often resulting in more reliable inference metrics.\n\nThese information matrices are used to derive standard errors of estimates to calculate detailed model statistics, including Wald statistics, p-values, and confidence intervals at a chosen level.\n\nSource: [Limitations of the Empirical Fisher Approximation for Natural Gradient Descent](https://arxiv.org/pdf/1905.12558).\n\n### Implementation Notes\n\n- **Multinomial Logistic Regression**  \n  The `MultinomialLogisticRegression` model differs from standard statistical multinomial logistic regression by using all classes rather than $K - 1$. 
This approach allows multi-class classification problems to be converted to binary problems by calculating $1 - P_{Class=1}$.\n\n- **Focal Loss Regression**  \n  The `FocalLossRegression` class employs a non-standard focal log-likelihood function in its optimization process leveraging $\\gamma$ to focus on difficult-to-classify examples.\n  The focal loss function, originally developed for object detection, prioritizes difficult-to-classify examples\u2014often the minority class\u2014by reducing the contribution of easy-to-classify samples. It introduces a focusing parameter, *gamma*, which down-weights the influence of easily classified instances, thereby concentrating learning on challenging cases.\n\n  Source: [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002).\n\n## Models\n\n### Logistic Regression\n\nThe `LogisticRegression` class is a custom implementation of logistic regression using the Fisher scoring algorithm. It provides methods for fitting the model, making predictions, and computing model statistics, including standard errors, Wald statistics, p-values, and confidence intervals.\n\n**Parameters:**\n- `epsilon`: Convergence threshold for the algorithm.\n- `max_iter`: Maximum number of iterations for the algorithm.\n- `information`: Type of information matrix to use ('expected' or 'empirical').\n- `use_bias`: Include a bias term in the model.\n- `significance`: Significance level for computing confidence intervals.\n\n**Methods:**\n- `fit(X, y)`: Fit the model to the data.\n- `predict(X)`: Predict target labels for input data.\n- `predict_proba(X)`: Predict class probabilities for input data.\n- `predict_ci(X)`: Predict class probabilities with confidence intervals.\n- `get_params()`: Get model parameters.\n- `set_params(**params)`: Set model parameters.\n- `summary()`: Get a summary of model parameters, standard errors, p-values, and confidence intervals.\n- `display_summary()`: Display a summary of model parameters, standard 
errors, p-values, and confidence intervals.\n\n### Robust Logistic Regression\n\nThe `RobustLogisticRegression` class implements robust logistic regression using the Fisher scoring algorithm with epsilon-contamination for outlier resistance. This method down-weights observations that are unlikely under the main model, providing robustness against data contamination and outliers.\n\n**Parameters:**\n- `epsilon_contamination`: Contamination level (0 \u2264 \u03b5 \u2264 1). Higher values provide more robustness but may reduce efficiency (default: 0.05).\n- `contamination_prob`: Probability for contamination distribution (default: 0.5).\n- `tol`: Convergence tolerance for parameter updates.\n- `max_iter`: Maximum number of iterations for the algorithm.\n- `information`: Type of information matrix to use ('expected' or 'empirical').\n- `use_bias`: Include a bias term in the model.\n- `significance`: Significance level for computing confidence intervals.\n\n**Methods:**\n- `fit(X, y)`: Fit the robust model to the data with automatic outlier down-weighting.\n- `predict(X)`: Predict target labels for input data.\n- `predict_proba(X)`: Predict class probabilities for input data.\n- `predict_ci(X)`: Predict class probabilities with confidence intervals.\n- `get_params()`: Get model parameters.\n- `set_params(**params)`: Set model parameters.\n- `summary()`: Get a summary of model parameters, standard errors, p-values, confidence intervals, and robust weights.\n- `display_summary()`: Display a comprehensive summary including robustness metrics (epsilon contamination, average/minimum robust weights).\n\n**Key Features:**\n- **Outlier Resistance**: Automatic down-weighting of observations unlikely under the main model.\n- **Robust Weights**: Access to individual observation weights showing outlier identification.\n- **Fisher Scoring Framework**: Consistent with other models using both expected and empirical information matrices.\n- **Statistical Inference**: Complete inference 
statistics with robust standard errors and confidence intervals.\n- **Rich Output**: Beautiful formatted summaries with robust-specific metrics and diagnostics.\n\n### Multinomial Logistic Regression\n\nThe `MultinomialLogisticRegression` class implements the Fisher Scoring algorithm for multinomial logistic regression, suitable for multi-class classification tasks.\n\n**Parameters:**\n- `epsilon`: Convergence threshold for the algorithm.\n- `max_iter`: Maximum number of iterations for the algorithm.\n- `information`: Type of information matrix to use ('expected' or 'empirical').\n- `use_bias`: Include a bias term in the model.\n- `significance`: Significance level for computing confidence intervals.\n- `verbose`: Enable verbose output.\n\n**Methods:**\n- `fit(X, y)`: Fit the model to the data.\n- `predict(X)`: Predict target labels for input data.\n- `predict_proba(X)`: Predict class probabilities for input data.\n- `predict_ci(X)`: Predict class probabilities with confidence intervals.\n- `summary(class_idx)`: Get a summary of model parameters, standard errors, p-values, and confidence intervals for a specific class.\n- `display_summary(class_idx)`: Display a summary of model parameters, standard errors, p-values, and confidence intervals for a specific class.\n\nThe algorithm is in a beta version and may require further testing and optimization to speed up matrix operations.\n\n### Focal Loss Regression\n\nThe `FocalLossRegression` class implements the Fisher Scoring algorithm with focal loss, designed for imbalanced classification problems where the positive class is rare.\n\n**Parameters:**\n- `gamma`: Focusing parameter for focal loss.\n- `epsilon`: Convergence threshold for the algorithm.\n- `max_iter`: Maximum number of iterations for the algorithm.\n- `information`: Type of information matrix to use ('expected' or 'empirical').\n- `use_bias`: Include a bias term in the model.\n- `verbose`: Enable verbose output.\n\n**Methods:**\n- `fit(X, y)`: Fit the 
model to the data.\n- `predict(X)`: Predict target labels for input data.\n- `predict_proba(X)`: Predict class probabilities for input data.\n- `predict_ci(X)`: Predict class probabilities with confidence intervals.\n- `get_params()`: Get model parameters.\n- `set_params(**params)`: Set model parameters.\n- `summary()`: Get a summary of model parameters, standard errors, p-values, and confidence intervals.\n- `display_summary()`: Display a summary of model parameters, standard errors, p-values, and confidence intervals.\n\n### Poisson Regression\n\nThe `PoissonRegression` class implements the Fisher Scoring algorithm for Poisson regression, suitable for modeling count data and incidence rates. Features robust matrix operations with automatic fallback to pseudo-inverse for numerical stability.\n\n**Parameters:**\n- `max_iter`: Maximum number of iterations for optimization.\n- `epsilon`: Convergence tolerance.\n- `use_bias`: Whether to include an intercept term.\n- `offset`: Offset term for rate modeling (e.g., log exposure times).\n- `significance`: Significance level for confidence intervals.\n- `information`: Type of information matrix to use ('expected' or 'empirical').\n\n**Methods:**\n- `fit(X, y)`: Fit the model to the data.\n- `predict(X, offset=None)`: Predict mean values with optional custom offset.\n- `calculate_st_errors(X)`: Calculate standard errors for the coefficients.\n- `summary()`: Get comprehensive model statistics including coefficients, standard errors, p-values, and confidence intervals.\n- `display_summary()`: Display beautiful formatted summary with Rich styling.\n\n**Key Features:**\n- **Offset Support**: Full support for rate modeling with log exposure times.\n- **Information Matrix Choice**: Both expected and empirical Fisher information matrices supported.\n- **Robust Implementation**: Safe matrix inversion with automatic pseudo-inverse fallback.\n- **Statistical Summaries**: Complete inference statistics with Wald tests and confidence 
intervals.\n- **Validated Accuracy**: Mathematical correctness verified against statsmodels with machine precision accuracy.\n\n### Negative Binomial Regression\n\nThe `NegativeBinomialRegression` class implements the Fisher Scoring algorithm for Negative Binomial regression, suitable for overdispersed count data. Features enhanced robustness with comprehensive statistical inference and fixed critical implementation bugs.\n\n**Parameters:**\n- `max_iter`: Maximum number of iterations for optimization.\n- `epsilon`: Convergence tolerance.\n- `use_bias`: Whether to include an intercept term.\n- `alpha`: Fixed dispersion parameter (overdispersion adjustment).\n- `phi`: Constant scale parameter.\n- `offset`: Offset term for the linear predictor.\n- `significance`: Significance level for confidence intervals.\n- `information`: Type of information matrix to use ('expected' or 'empirical').\n\n**Methods:**\n- `fit(X, y)`: Fit the model to the data.\n- `predict(X, offset=None)`: Predict mean values with proper offset handling.\n- `calculate_st_errors(X)`: Calculate standard errors with corrected implementation.\n- `summary()`: Get comprehensive model statistics including coefficients, standard errors, p-values, and confidence intervals.\n- `display_summary()`: Display beautiful formatted summary with Rich styling.\n\n**Key Improvements:**\n- **Fisher Scoring Conversion**: Converted from IWLS to proper Fisher scoring for consistency.\n- **Information Matrix Choice**: Both expected and empirical Fisher information matrices supported (empirical recommended for numerical stability).\n- **Bug Fixes**: Fixed missing offset in prediction and standard error calculations.\n- **Robust Implementation**: Safe matrix inversion with automatic pseudo-inverse fallback.\n- **Statistical Summaries**: Complete inference statistics with Wald tests and confidence intervals.\n- **Enhanced Reliability**: Comprehensive testing ensures mathematical correctness.\n\n## Utilities\n\n### 
Visualization\n\nThe package includes a utility function for visualizing observed vs predicted probabilities for count data, which can be useful for users working with Poisson and Negative Binomial models.\n\n**Function:**\n- `plot_observed_vs_predicted(y, mu, max_count=15, alpha=None, title=\"Observed vs Predicted Probabilities\", model_name=\"Model\", ax=None, plot_params=None)`: Plot observed vs predicted probabilities for count data.\n\n**Parameters:**\n- `y`: Observed count data.\n- `mu`: Predicted mean values from the model.\n- `max_count`: Maximum count to consider for probabilities.\n- `alpha`: Overdispersion parameter for Negative Binomial. If None, assumes Poisson (alpha=0).\n- `title`: Title for the plot.\n- `model_name`: Name of the model for labeling.\n- `ax`: Matplotlib axis to plot on.\n\n## Change Log\n\n- **v2.0.5**\n  - **New**: Added `RobustLogisticRegression` class with epsilon-contamination for outlier-resistant classification.\n  - **Enhanced**: Poisson and Negative Binomial regression with empirical Fisher information matrix support.\n  - **Enhanced**: Converted Negative Binomial from IWLS to proper Fisher scoring for consistency.\n  - **Added**: Comprehensive offset support for Poisson regression rate modeling.\n  - **Fixed**: Critical bugs in Negative Binomial prediction and standard error calculations.\n  - **Added**: `summary()` and `display_summary()` methods with rich statistical output.\n  - **Validated**: Mathematical correctness verified against statsmodels with machine precision accuracy.\n\n- **v2.0.4**\n  - Added a beta version of Poisson and Negative Binomial regression using Fisher Scoring.\n  - Changed naming conventions for simplicity and consistency.\n  - Changed poetry to uv for packaging.\n\n- **v2.0.3**\n  - Added a new functionality of inference of mean responses with confidence intervals for all algorithms.\n  - Focal logistic regression now supports all model statistics, including standard errors, Wald statistics, 
p-values, and confidence intervals.\n\n- **v2.0.2**\n  - **Bug Fixes**: Fixed the `MultinomialLogisticRegression` class to accept flexible NumPy data types.\n\n- **v2.0.1**\n  - **Bug Fixes**: Removed the debug print statement from the `LogisticRegression` class.\n\n- **v2.0**\n  - **Performance Improvements**: Optimized matrix calculations for substantial speed and memory efficiency improvements across all models. With streamlined operations, this version trains up to ~290x faster. Performance gains per model:\n    - *Multinomial Logistic Regression*: Training time reduced from 125.10s to 0.43s (~290x speedup).\n    - *Logistic Regression*: Training time reduced from 0.24s to 0.05s (~5x speedup).\n    - *Focal Loss Logistic Regression*: Training time reduced from 0.26s to 0.01s (~26x speedup).\n  - **Bug Fixes**: The `verbose` parameter in Focal Loss Logistic Regression now functions as expected, providing accurate logging during training.\n\n- **v0.1.4**\n  - Updated the log likelihood for Multinomial Regression and made minor changes to Logistic Regression for integration with scikit-learn.\n\n- **v0.1.3**\n  - Added coefficients, standard errors, p-values, and confidence intervals for Multinomial Regression.\n\n- **v0.1.2**\n  - Updated NumPy dependency.\n\n- **v0.1.1**\n  - Added support for Python 3.9+ \ud83d\udc0d.\n\n- **v0.1.0**\n  - Initial release of Fisher Scoring Logistic, Multinomial, and Focal Loss Regression.\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python implementation of the Fisher Scoring algorithm for proportion and incidence rate modeling.",
    "version": "2.0.5",
    "project_urls": {
        "Homepage": "https://github.com/xRiskLab/fisher-scoring",
        "Repository": "https://github.com/xRiskLab/fisher-scoring"
    },
    "split_keywords": [
        "fisher scoring",
        " logistic regression",
        " maximum likelihood",
        " classification",
        " machine learning"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9492b6fb1560256e7984fb9606461fdb87990a19c2e4ebe9ef1c75cc65588c14",
                "md5": "a02bad0f680713649ac111d8739f92d2",
                "sha256": "1580a3e7fc2c81a89ca624fafaa253b79519999c04039270f87ed8e9b8c25f0b"
            },
            "downloads": -1,
            "filename": "fisher_scoring-2.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a02bad0f680713649ac111d8739f92d2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 30248,
            "upload_time": "2025-07-18T20:21:17",
            "upload_time_iso_8601": "2025-07-18T20:21:17.839064Z",
            "url": "https://files.pythonhosted.org/packages/94/92/b6fb1560256e7984fb9606461fdb87990a19c2e4ebe9ef1c75cc65588c14/fisher_scoring-2.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6effd75c858c9aa586823bcde39f60ee42b73a234018137f9b23f16bab015065",
                "md5": "534d649cf49630603bb94ae1738451e2",
                "sha256": "36793ceba96179329ac96070f9ec316c4e92f03e2eed8f37c09188e89ef26cc9"
            },
            "downloads": -1,
            "filename": "fisher_scoring-2.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "534d649cf49630603bb94ae1738451e2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 33472,
            "upload_time": "2025-07-18T20:21:18",
            "upload_time_iso_8601": "2025-07-18T20:21:18.866103Z",
            "url": "https://files.pythonhosted.org/packages/6e/ff/d75c858c9aa586823bcde39f60ee42b73a234018137f9b23f16bab015065/fisher_scoring-2.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-18 20:21:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "xRiskLab",
    "github_project": "fisher-scoring",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "fisher-scoring"
}
