# iCost
iCost is a Python library for instance-level cost-sensitive learning, fully compatible with scikit-learn. It extends traditional cost-sensitive classification by adjusting the cost of each sample according to its instance complexity. It offers several cost-assignment strategies and works with any scikit-learn classifier that supports `sample_weight`.
### Requirements:
- [Python](https://www.python.org/downloads/) 3.8 or later
- [scikit-learn](https://scikit-learn.org/stable/)
- [NumPy](https://numpy.org/)
- [pandas](https://pandas.pydata.org/)


### Key Features:
- Support for any scikit-learn compatible classifier as the base model.
- Multiple strategies for cost-sensitive learning:
  - `ncs` → no cost (baseline).
  - `org` → original sklearn-style cost-sensitive learning (all minority samples weighted by the imbalance ratio).
  - `mst` → MST-based categorization of minority samples as linked vs. pure.
  - `neighbor` → neighbor-based categorization with three sub-modes.
- Neighbor-based categorization (5-NN):
  - Mode 1 → safe, pure, border.
  - Mode 2 → safe, border, outlier.
  - Mode 3 → fine-grained categories g1–g6 with user-defined penalties.
- Utility function: `categorize_minority_class` for direct analysis of minority-class samples.
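The neighbor-based categorization can be pictured with a minimal sketch like the one below. It is not iCost's implementation: the function names `count_majority_neighbors` and `label_groups` are hypothetical, and the thresholds that map neighbor counts to safe, border, and outlier are assumptions chosen only for illustration. The sketch counts, for each minority sample, how many of its 5 nearest neighbors belong to another class.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def count_majority_neighbors(X, y, minority_label=1, k=5):
    """Count, for each minority sample, how many of its k nearest
    neighbors carry a different class label (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == minority_label)
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    # Calling kneighbors() without a query returns the neighbors of the
    # fitted points themselves and excludes each point from its own list.
    neighbor_idx = nn.kneighbors(return_distance=False)[minority_idx]
    opp_counts = (y[neighbor_idx] != minority_label).sum(axis=1)
    return minority_idx, opp_counts


def label_groups(opp_counts, k=5):
    """Map neighbor counts to group names.
    ASSUMPTION: these cut-offs are illustrative, not iCost's own."""
    opp_counts = np.asarray(opp_counts)
    return np.select(
        [opp_counts == 0, opp_counts == k],
        ["safe", "outlier"],
        default="border",
    )


if __name__ == "__main__":
    # Ten majority points, a tight minority cluster, and one minority
    # point sitting in the middle of the majority class.
    X = [[float(v)] for v in range(10)] + [[float(v)] for v in range(100, 106)] + [[4.5]]
    y = [0] * 10 + [1] * 6 + [1]
    idx, counts = count_majority_neighbors(X, y)
    print(dict(zip(idx.tolist(), label_groups(counts).tolist())))
```

Counting opposite-class neighbors gives a single integer per minority sample, which is why a handful of modes (three coarse groups or six fine-grained ones) can all be derived from the same 5-NN query.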
## Synopsis
In imbalanced classification tasks, the standard weighted classifier applies the same increased misclassification cost to every minority-class sample. scikit-learn supports this approach out of the box (for example, through the `class_weight` parameter of many classifiers).
However, this raises a question: should the same weight be applied to all minority-class samples indiscriminately? Some minority-class samples lie close to the decision boundary and are difficult to classify, while others lie far away from it and are easy to classify. Some instances are noisy, completely surrounded by instances from the majority class. Applying the same higher misclassification cost to all of these samples is hard to justify: it can distort the decision boundary significantly, resulting in more misclassifications.
The proposed solution is to apply the cost only to certain samples, or to apply different costs depending on each sample's level of difficulty. This improves prediction performance in different imbalanced scenarios.
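To make the contrast concrete, here is a minimal sketch that uses only scikit-learn, not iCost itself. It builds two `sample_weight` vectors: one that gives every minority sample the imbalance ratio, and one that gives the higher cost only to minority samples with a mixed 5-NN neighborhood. The rule "only borderline samples get the higher cost" and the 1 to 4 neighbor range are assumptions made for illustration; the costs iCost actually assigns may differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Synthetic imbalanced data (roughly 90% / 10%).
X, y = make_classification(
    n_samples=600, n_features=5, weights=[0.9, 0.1], random_state=0
)
minority = 1
is_min = y == minority
imbalance_ratio = (~is_min).sum() / is_min.sum()

# Class-level weighting: every minority sample gets the same higher cost.
uniform_w = np.where(is_min, imbalance_ratio, 1.0)

# Instance-level weighting (ASSUMED rule, for illustration only):
# count opposite-class points among the 5 nearest neighbors and raise
# the cost only for minority samples with a mixed neighborhood.
nn = NearestNeighbors(n_neighbors=5).fit(X)
neighbor_idx = nn.kneighbors(return_distance=False)  # excludes the point itself
opp = (y[neighbor_idx] != minority).sum(axis=1)
borderline = is_min & (opp >= 1) & (opp <= 4)

instance_w = np.ones(len(y))
instance_w[borderline] = imbalance_ratio

# Any estimator whose fit() accepts sample_weight can consume either vector.
clf_uniform = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=uniform_w)
clf_instance = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=instance_w)

print("minority samples given the higher cost:",
      int(borderline.sum()), "of", int(is_min.sum()))
```

Both models are trained the same way; only the weight vector changes, which is why the approach works with any classifier that supports `sample_weight`.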
For more information, please refer to the following paper:
### Paper
arXiv: https://doi.org/10.48550/arXiv.2409.13007
The paper is currently under review.
## Installation
[icost on PyPI](https://pypi.org/project/icost/)
```
pip install icost
```
## Usage Example
```
from icost import iCost, categorize_minority_class
from sklearn.svm import SVC

# X_train, y_train, X_test, y_test are assumed to be an existing
# train/test split of an imbalanced dataset.

# Example with neighbor-mode cost assignment
clf = iCost(
    base_classifier=SVC(kernel="rbf", probability=True),
    method="neighbor",
    neighbor_mode=2,  # Mode 1, 2, or 3
)

clf.fit(X_train, y_train)
print("Test Accuracy:", clf.score(X_test, y_test))

# Example with mode=3 (custom penalties for g1..g6)
clf3 = iCost(
    base_classifier=SVC(),
    method="neighbor",
    neighbor_mode=3,
    neighbor_costs=[1.0, 2.0, 5.0, 5.0, 3.0, 1.0],  # g1..g6
)
clf3.fit(X_train, y_train)
```
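The usage example expects `X_train`, `y_train`, `X_test`, and `y_test` to exist already. One way to produce them for a quick experiment is sketched below with scikit-learn's synthetic data generator and a stratified split. The dataset sizes and the 90/10 class ratio are arbitrary choices, and whether iCost prefers NumPy arrays or pandas objects as input is not stated here, so treat the array types as an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary problem with roughly a 90% / 10% class split.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=42,
)

# stratify=y keeps the minority proportion similar in both splits,
# which matters when the minority class is small.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

print("train minority fraction:", y_train.mean())
print("test minority fraction:", y_test.mean())
```

On imbalanced data, plain accuracy can look high even when the minority class is mostly missed, so it may be worth checking a class-aware metric such as `sklearn.metrics.balanced_accuracy_score` alongside `score`.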
### Helper Function
You can analyze minority samples directly with:
```
import pandas as pd
from icost import categorize_minority_class

df = pd.read_csv("your_dataset.csv")
min_idx, groups, opp_counts = categorize_minority_class(
    df,
    minority_label=1,
    mode=1,
    show_summary=True,
)
```
### Output:
```
Category summary (minority samples):
  safe: 45
  pure: 28
  border: 62
```
## Structure
```
icost/
├── __init__.py                 # Makes icost a package; exposes iCost and helpers
├── __version__.py              # Stores the package version (e.g., 0.1.0)
├── icost.py                    # Main iCost class (methods: ncs, org, mst, neighbor)
├── mst_linked_ind.py           # MST-based helper:
│                               #   - Identifies 'linked' vs 'pure' minority samples
│                               #   - Used for MST variant of iCost
└── categorize_minority_v2.py   # Neighbor-based helper:
                                #   - Categorizes minority samples with 5-NN
                                #   - Supports modes (safe, pure, border, outlier, g1–g6)
                                #   - Provides summary statistics
```
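As a rough illustration of what an MST-based split into linked and pure minority samples could look like, the sketch below builds a Euclidean minimum spanning tree over all samples with SciPy. It assumes that "linked" means a minority sample shares an MST edge with a sample of another class and that "pure" means it does not; this reading, and the function name `mst_linked_vs_pure`, are assumptions and not taken from `mst_linked_ind.py`.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def mst_linked_vs_pure(X, y, minority_label=1):
    """Split minority samples into 'linked' and 'pure' index arrays
    (hypothetical helper; the definition of 'linked' is an assumption)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Dense pairwise Euclidean distances. SciPy treats zero entries as
    # missing edges, so exact duplicate points would not be connected.
    dist = squareform(pdist(X))
    mst = minimum_spanning_tree(dist).tocoo()

    linked = np.zeros(len(y), dtype=bool)
    for i, j in zip(mst.row, mst.col):
        if y[i] != y[j]:  # this MST edge joins two different classes
            linked[i] = True
            linked[j] = True

    is_min = y == minority_label
    return np.flatnonzero(is_min & linked), np.flatnonzero(is_min & ~linked)


if __name__ == "__main__":
    # Eight points on a line: four majority samples, then four minority samples.
    X = [[float(v)] for v in range(8)]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    linked_idx, pure_idx = mst_linked_vs_pure(X, y)
    print("linked:", linked_idx.tolist(), "pure:", pure_idx.tolist())
```

The dense distance matrix keeps the sketch short but grows quadratically with the number of samples, so a sparse neighbor graph would be a more practical starting point for large datasets.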
### Other files in the repo
- `README.md` → Documentation and usage instructions.
- `LICENSE` → Project license (MIT).
- `pyproject.toml` → Build configuration for packaging and PyPI upload.
- `icost_usage_example` → Tests to check functionality.
## BibTeX Citation
If you plan to use this module, please cite the paper:
```
@misc{newaz2024icostnovelinstancecomplexity,
      title={iCost: A Novel Instance Complexity Based Cost-Sensitive Learning Framework for Imbalanced Classification},
      author={Asif Newaz and Asif Ur Rahman Adib and Taskeed Jabid},
      year={2024},
      eprint={2409.13007},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2409.13007},
}
```
### License
This project is licensed under the MIT License.
### Note
The library is under active development, and I plan to add more features soon.
Raw data
{
"_id": null,
"home_page": null,
"name": "icost",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "machine-learning, cost-sensitive learning, class-imbalance, scikit-learn",
"author": null,
"author_email": "Asif Newaz <eee.asifnewaz@iut-dhaka.edu>",
"download_url": "https://files.pythonhosted.org/packages/4c/5b/8e948b64560d9173637fdb5146c71e8ac5fc326541e054f104a6b9f0473d/icost-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Instance-complexity based cost-sensitive learning",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/newaz-aa/icost",
"Issues": "https://github.com/newaz-aa/icost/issues"
},
"split_keywords": [
"machine-learning",
" cost-sensitive learning",
" class-imbalance",
" scikit-learn"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "755b093ec1dd14764f03e83f74bdc6a54c305896fcb297b0d51c697161f0b05d",
"md5": "7494a14650d016635bf0cdc287314a81",
"sha256": "2b19cbd94b77f8f6da454916db9057ea893af865c88d802b5350e37d7da0cb14"
},
"downloads": -1,
"filename": "icost-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7494a14650d016635bf0cdc287314a81",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 11557,
"upload_time": "2025-08-28T12:27:40",
"upload_time_iso_8601": "2025-08-28T12:27:40.599186Z",
"url": "https://files.pythonhosted.org/packages/75/5b/093ec1dd14764f03e83f74bdc6a54c305896fcb297b0d51c697161f0b05d/icost-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "4c5b8e948b64560d9173637fdb5146c71e8ac5fc326541e054f104a6b9f0473d",
"md5": "c3c0d61c7a38f35cedf772700c61883f",
"sha256": "a45a7bad8676e2d2630f34d2baaf21c02a1bfe6aa6d592a257dd8dad2a365bbd"
},
"downloads": -1,
"filename": "icost-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "c3c0d61c7a38f35cedf772700c61883f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 13120,
"upload_time": "2025-08-28T12:27:42",
"upload_time_iso_8601": "2025-08-28T12:27:42.219512Z",
"url": "https://files.pythonhosted.org/packages/4c/5b/8e948b64560d9173637fdb5146c71e8ac5fc326541e054f104a6b9f0473d/icost-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-28 12:27:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "newaz-aa",
"github_project": "icost",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "icost"
}