icost


| Field | Value |
|---|---|
| Name | icost |
| Version | 0.1.1 |
| home_page | None |
| Summary | Instance-complexity based cost-sensitive learning |
| upload_time | 2025-08-28 12:27:42 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.8 |
| license | MIT |
| keywords | machine-learning, cost-sensitive learning, class-imbalance, scikit-learn |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
            
# iCost

iCost is a Python library for instance-level cost-sensitive learning, fully compatible with scikit-learn. It extends traditional cost-sensitive classification by dynamically adjusting sample costs based on instance complexity. It implements several weighting strategies and works with any scikit-learn classifier that supports `sample_weight`.
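Whether a given classifier supports `sample_weight` can be checked before using it as a base model. The snippet below is a minimal sketch of such a check using only the standard library; the two toy estimator classes are hypothetical and exist only to exercise the function. scikit-learn ships a comparable helper, `sklearn.utils.validation.has_fit_parameter`.

```python
import inspect


def accepts_sample_weight(estimator) -> bool:
    """Return True if estimator.fit declares a `sample_weight` parameter."""
    return "sample_weight" in inspect.signature(estimator.fit).parameters


# Hypothetical toy estimators, used only to demonstrate the check.
class WeightedToy:
    def fit(self, X, y, sample_weight=None):
        return self


class UnweightedToy:
    def fit(self, X, y):
        return self


print(accepts_sample_weight(WeightedToy()))    # True
print(accepts_sample_weight(UnweightedToy()))  # False
```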

### Requirements:
[![Python](https://img.shields.io/badge/Python-3.8%2B-blue)](https://www.python.org/downloads/)
[![scikit-learn](https://img.shields.io/badge/scikit--learn-0.24%2B-orange)](https://scikit-learn.org/stable/)
[![numpy](https://img.shields.io/badge/numpy-1.19%2B-ff69b4)](https://numpy.org/)
[![pandas](https://img.shields.io/badge/pandas-1.1%2B-yellow)](https://pandas.pydata.org/)
![Seaborn](https://img.shields.io/badge/Seaborn-Data%20Visualization-blue)
![Matplotlib](https://img.shields.io/badge/Matplotlib-Data%20Visualization-orange)


 

### Key Features:
- Support for any scikit-learn compatible classifier as the base model.
- Multiple strategies for cost-sensitive learning:
    - `ncs` → no cost (baseline).
    - `org` → original sklearn-style cost-sensitive learning (all minority samples weighted by the imbalance ratio).
    - `mst` → MST-based (minimum spanning tree) categorization of minority samples as linked vs. pure.
    - `neighbor` → neighbor-based categorization with three sub-modes.
- Neighbor-based categorization (5-NN):
    - Mode 1 → safe, pure, border.
    - Mode 2 → safe, border, outlier.
    - Mode 3 → fine-grained categories g1–g6 with user-defined penalties.
- Utility function: `categorize_minority_class` for direct analysis of minority-class samples.
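To make the neighbor-based idea concrete, here is a minimal pure-Python sketch that counts, for each minority sample, how many of its k nearest neighbors belong to another class, and then maps that count to a category. The function names and the thresholds in `categorize` are assumptions made for illustration; they are not taken from the iCost source. The demo uses `k=3` to keep the toy data small, whereas the library is described as using 5-NN.

```python
import math


def opposite_neighbor_counts(X, y, minority_label, k=5):
    """For each minority sample, count how many of its k nearest neighbors
    (Euclidean distance, excluding itself) belong to another class."""
    counts = {}
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority_label:
            continue
        # Rank every other sample by its distance to sample i.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )
        counts[i] = sum(1 for j in others[:k] if y[j] != minority_label)
    return counts


def categorize(count, k=5):
    """Hypothetical thresholds: at most 1 opposite neighbor -> safe,
    all k opposite -> outlier, anything in between -> border."""
    if count <= 1:
        return "safe"
    if count < k:
        return "border"
    return "outlier"


# Toy data on a line: a minority cluster, one border sample, one outlier.
xs = [0.0, 1.0, 2.1, 3.3, 5.0, 5.8, 6.5, 7.7, 19.0, 20.0, 21.5, 22.7]
X = [(x, 0.0) for x in xs]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0]

counts = opposite_neighbor_counts(X, y, minority_label=1, k=3)
categories = {i: categorize(c, k=3) for i, c in counts.items()}
print(categories)
# {0: 'safe', 1: 'safe', 2: 'safe', 3: 'safe', 4: 'border', 9: 'outlier'}
```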

  
## Synopsis

In imbalanced classification tasks, the standard weighted classifier applies the same increased weight to every minority-class misclassification. This approach is available in the standard scikit-learn implementation.
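For reference, the heuristic that scikit-learn documents for `class_weight="balanced"` is `n_samples / (n_classes * count(class))`. The sketch below reimplements that formula in plain Python for illustration; the helper name `balanced_class_weights` is hypothetical.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight per class: n_samples / (n_classes * count(class))."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# 90 majority samples (label 0) and 10 minority samples (label 1).
y = [0] * 90 + [1] * 10
weights = balanced_class_weights(y)
print(weights[1])  # 5.0
print(weights[0])  # roughly 0.556
```

With this formula, the minority-to-majority weight ratio equals the imbalance ratio (90 / 10 = 9 here), and every minority sample receives the same weight regardless of where it lies.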

However, should the same weight be applied to all minority-class samples indiscriminately? Some minority-class samples lie close to the decision boundary and are difficult to classify, while others lie far away from it and are easy to classify. Some instances are noisy, completely surrounded by majority-class instances. Applying the same higher misclassification cost to all minority-class samples is therefore hard to justify: it can distort the decision boundary significantly, resulting in more misclassifications.

The proposed solution is to apply the cost only to certain samples, or to apply different costs depending on each sample's level of difficulty. This can improve prediction performance across different imbalanced scenarios.
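A minimal sketch of this idea follows: majority samples keep a weight of 1.0, and each minority sample receives a cost that depends on its category. The helper `instance_weights` and the cost values are illustrative assumptions, not iCost's actual internals.

```python
def instance_weights(y, categories, minority_label, category_cost):
    """Build per-sample weights: 1.0 for majority samples and a
    category-specific cost for each minority sample."""
    weights = []
    for i, label in enumerate(y):
        if label != minority_label:
            weights.append(1.0)
        else:
            weights.append(category_cost[categories[i]])
    return weights


y = [0, 0, 0, 0, 1, 1, 1]
# Sample index -> category, for minority samples only (stand-in values).
categories = {4: "safe", 5: "border", 6: "outlier"}
# Hypothetical costs: only border samples are penalized more heavily.
category_cost = {"safe": 1.0, "border": 4.0, "outlier": 1.0}

w = instance_weights(y, categories, minority_label=1, category_cost=category_cost)
print(w)  # [1.0, 1.0, 1.0, 1.0, 1.0, 4.0, 1.0]
```

A list like `w` can be passed as `sample_weight` to the `fit` method of any scikit-learn estimator that supports it.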

For more information, please refer to the following paper:

### Paper

arXiv: https://doi.org/10.48550/arXiv.2409.13007

The paper is currently under review.

## Installation

[![PyPI version](https://img.shields.io/pypi/v/icost?color=blue&label=install%20with%20pip)](https://pypi.org/project/icost/)


```
pip install icost
```


## Usage Example

```
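from icost import iCost
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic imbalanced data (about 90% / 10%) for illustration; substitute your own split
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)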

# Example with neighbor-mode cost assignment
clf = iCost(
    base_classifier=SVC(kernel="rbf", probability=True),
    method="neighbor",
    neighbor_mode=2          # Mode 1, 2, or 3
)

clf.fit(X_train, y_train)
print("Test Accuracy:", clf.score(X_test, y_test))

# Example with mode=3 (custom penalties for g1..g6)
clf3 = iCost(
    base_classifier=SVC(),
    method="neighbor",
    neighbor_mode=3,
    neighbor_costs=[1.0, 2.0, 5.0, 5.0, 3.0, 1.0]  # g1..g6
)
clf3.fit(X_train, y_train)
```
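The six values in `neighbor_costs` line up with the six possible outcomes of a 5-NN check, since a sample can have between 0 and 5 opposite-class neighbors. The sketch below assumes that g1 to g6 correspond to 0 to 5 opposite-class neighbors, in that order; this mapping is an inference for illustration and is not confirmed by the README. The helper `cost_for` is hypothetical.

```python
# Same penalty list as in the usage example above (g1..g6).
neighbor_costs = [1.0, 2.0, 5.0, 5.0, 3.0, 1.0]


def cost_for(opposite_count, costs):
    """Look up the penalty for a sample with the given number of
    opposite-class neighbors (assumed mapping: count 0 -> g1, ..., 5 -> g6)."""
    if not 0 <= opposite_count < len(costs):
        raise ValueError("opposite_count must be between 0 and len(costs) - 1")
    return costs[opposite_count]


opp_counts = [0, 2, 5, 4]  # stand-in neighbor counts for four minority samples
sample_costs = [cost_for(c, neighbor_costs) for c in opp_counts]
print(sample_costs)  # [1.0, 5.0, 1.0, 3.0]
```

Under this reading, the example list penalizes mid-range counts (likely border samples) most, while keeping the cost low for samples with no opposite-class neighbors and for those fully surrounded by the other class.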

### Helper Function

You can analyze minority samples directly with:

```
import pandas as pd
from icost import categorize_minority_class

df = pd.read_csv("your_dataset.csv")
min_idx, groups, opp_counts = categorize_minority_class(
    df,
    minority_label=1,
    mode=1,
    show_summary=True
)
```

### Output:

```
Category summary (minority samples):
  safe: 45
  pure: 28
  border: 62
```
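If the per-sample categories are needed as proportions rather than raw counts, they can be tallied with the standard library. The sketch below assumes that `groups` is a sequence with one category label per minority sample; the README does not document the return types, so stand-in data matching the summary above is used instead of a real call.

```python
from collections import Counter

# Stand-in for the `groups` return value (assumed: one label per minority sample).
groups = ["safe"] * 45 + ["pure"] * 28 + ["border"] * 62

summary = Counter(groups)
total = sum(summary.values())
shares = {name: n / total for name, n in summary.items()}

for name, n in summary.items():
    print(f"{name}: {n} ({shares[name]:.1%})")
# safe: 45 (33.3%)
# pure: 28 (20.7%)
# border: 62 (45.9%)
```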

## Structure

```
icost/
├── __init__.py               # Makes icost a package; exposes iCost and helpers
├── __version__.py            # Stores the package version (e.g., 0.1.0)
├── icost.py                  # Main iCost class (methods: ncs, org, mst, neighbor)
├── mst_linked_ind.py         # MST-based helper:
│                             #   - Identifies 'linked' vs 'pure' minority samples
│                             #   - Used for MST variant of iCost
└── categorize_minority_v2.py # Neighbor-based helper:
                              #   - Categorizes minority samples with 5-NN
                              #   - Supports modes (safe, pure, border, outlier, g1–g6)
                              #   - Provides summary statistics
```
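As a rough illustration of the MST-based helper, the sketch below builds a minimum spanning tree over all samples with Prim's algorithm and labels a minority sample as "linked" when an MST edge connects it directly to a sample of another class, and "pure" otherwise. This definition of linked vs. pure is an assumption made for the example; the actual logic in `mst_linked_ind.py` may differ, and both function names are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax distances from the new tree vertex to the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def linked_vs_pure(points, labels, minority_label):
    """Assumed rule: a minority sample is 'linked' if some MST edge joins it
    to a sample of another class, otherwise it is 'pure'."""
    linked = set()
    for i, j in mst_edges(points):
        for a, b in ((i, j), (j, i)):
            if labels[a] == minority_label and labels[b] != minority_label:
                linked.add(a)
    minority = [i for i, lab in enumerate(labels) if lab == minority_label]
    return {i: ("linked" if i in linked else "pure") for i in minority}


# Toy data on a line: three minority samples followed by three majority samples.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.5, 0.0), (5.0, 0.0), (6.0, 0.0)]
labels = [1, 1, 1, 0, 0, 0]

result = linked_vs_pure(points, labels, minority_label=1)
print(result)  # {0: 'pure', 1: 'pure', 2: 'linked'}
```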

### Other files in the repo

 - README.md → Documentation and usage instructions.
 - LICENSE → Project license (MIT).
 - pyproject.toml → Build configuration for packaging and PyPI upload.
 - icost_usage_example → Tests to check functionality.


## Screenshots

![App Screenshot](https://github.com/newaz-aa/Modified_Cost_Sensitive_Classifier/blob/main/Figures/categorization.png)

![App Screenshot](https://github.com/newaz-aa/Modified_Cost_Sensitive_Classifier/blob/main/Figures/icsot_lr.png)


## BibTeX Citation
If you use this module, please cite the paper:

```
@misc{newaz2024icostnovelinstancecomplexity,
      title={iCost: A Novel Instance Complexity Based Cost-Sensitive Learning Framework for Imbalanced Classification}, 
      author={Asif Newaz and Asif Ur Rahman Adib and Taskeed Jabid},
      year={2024},
      eprint={2409.13007},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2409.13007}, 
}
```

### License

This project is licensed under the MIT License.

### Note

The library is under active development, and I plan to add further features soon.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "icost",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "machine-learning, cost-sensitive learning, class-imbalance, scikit-learn",
    "author": null,
    "author_email": "Asif Newaz <eee.asifnewaz@iut-dhaka.edu>",
    "download_url": "https://files.pythonhosted.org/packages/4c/5b/8e948b64560d9173637fdb5146c71e8ac5fc326541e054f104a6b9f0473d/icost-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Instance-complexity based cost-sensitive learning",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/newaz-aa/icost",
        "Issues": "https://github.com/newaz-aa/icost/issues"
    },
    "split_keywords": [
        "machine-learning",
        " cost-sensitive learning",
        " class-imbalance",
        " scikit-learn"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "755b093ec1dd14764f03e83f74bdc6a54c305896fcb297b0d51c697161f0b05d",
                "md5": "7494a14650d016635bf0cdc287314a81",
                "sha256": "2b19cbd94b77f8f6da454916db9057ea893af865c88d802b5350e37d7da0cb14"
            },
            "downloads": -1,
            "filename": "icost-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7494a14650d016635bf0cdc287314a81",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 11557,
            "upload_time": "2025-08-28T12:27:40",
            "upload_time_iso_8601": "2025-08-28T12:27:40.599186Z",
            "url": "https://files.pythonhosted.org/packages/75/5b/093ec1dd14764f03e83f74bdc6a54c305896fcb297b0d51c697161f0b05d/icost-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4c5b8e948b64560d9173637fdb5146c71e8ac5fc326541e054f104a6b9f0473d",
                "md5": "c3c0d61c7a38f35cedf772700c61883f",
                "sha256": "a45a7bad8676e2d2630f34d2baaf21c02a1bfe6aa6d592a257dd8dad2a365bbd"
            },
            "downloads": -1,
            "filename": "icost-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "c3c0d61c7a38f35cedf772700c61883f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 13120,
            "upload_time": "2025-08-28T12:27:42",
            "upload_time_iso_8601": "2025-08-28T12:27:42.219512Z",
            "url": "https://files.pythonhosted.org/packages/4c/5b/8e948b64560d9173637fdb5146c71e8ac5fc326541e054f104a6b9f0473d/icost-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-28 12:27:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "newaz-aa",
    "github_project": "icost",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "icost"
}
        