ChemIC-ml


- Name: ChemIC-ml
- Version: 1.3.1
- Home page: https://github.com/ontochem/ChemIC.git
- Summary: Chemical images classification project. Program for training the deep neural network model and web service for classification chemical images
- Upload time: 2024-09-02 19:43:56
- Author: Dr. Aleksei Krasnov
- Requires Python: <3.12,>=3.10
- License: MIT

# Chemical Image Classifier (ChemIC) v1.3.1
[![License](https://img.shields.io/badge/License-MIT-brightgreen.svg)](https://opensource.org/licenses/MIT)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-blue.svg)](https://GitHub.com/ontochem/ChemIC/graphs/commit-activity)
[![GitHub issues](https://img.shields.io/github/issues/ontochem/ChemIC.svg)](https://github.com/ontochem/ChemIC/issues)
[![GitHub contributors](https://img.shields.io/github/contributors/ontochem/ChemIC.svg)](https://github.com/ontochem/ChemIC/graphs/contributors)
[![DOI](https://zenodo.org/badge/DOI/10.1039/D3DD00228D.svg)](https://doi.org/10.1039/D3DD00228D)



## Table of Contents
- [Project Description](#project-description)
- [Prepare Workspace Environment with Conda](#prepare-workspace-environment-with-conda)
- [Model Construction](#model-construction)
- [Models Download](#models-download)
- [Usage: Web Service for Chemical Image Classification](#usage-web-service-for-chemical-image-classification)
- [Jupyter Notebook](#jupyter-notebook)
- [Author](#author)
- [Citation](#citation)
- [References](#references)
- [License](#license)

## Project Description
The Chemical Image Classifier (ChemIC) project trains and serves a convolutional neural network (CNN) model that classifies chemical images into one of four predefined classes:
1. Images with a single chemical structure
2. Images with chemical reactions
3. Images with multiple chemical structures
4. Images with no chemical structures

The package consists of three main components:

### A) CNN Model for Image Classification ([chemic_train_eval.py](chemic_train_eval.py))
- Trains a deep learning model to classify images into the four predefined classes.
- Utilizes a pre-trained ResNet-50 model and includes steps for data preparation, model training, evaluation, and testing.

### B) Web Service for Chemical Image Classification ([app.py](chemic/app.py))
- Provides a FastAPI web application for classifying chemical images using the trained ResNet-50 model.
- Exposes an endpoint `/classify_image` for accepting chemical images and returning the predicted class.

### C) Image Classification Client ([client.py](chemic/client.py))
- Interacts with the ChemIC web server. The client can send to the server:
  - The path to an individual image file
  - The path to a directory with multiple images
  - Base64 encoded image data

  The server classifies the images and returns the recognition results to the client.

## Prepare Workspace Environment with Conda
```bash
# 1. Create and activate the conda environment
conda create --name chemic "python<3.12"
conda activate chemic

# 2. Install ChemIC-ml
# 2.1 From PyPI
pip install ChemIC-ml

# 2.2 Or, install from the GitHub repository
pip install git+https://github.com/ontochem/ChemIC.git

# 2.3 Or, install in editable mode from the GitHub repository
git clone https://github.com/ontochem/ChemIC.git
cd ChemIC
pip install -r requirements.txt
pip install -e .
```
- The `-e` flag installs the package in "editable" mode, so changes to the cloned source take effect without reinstalling.
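
The package metadata declares `requires_python` as `<3.12,>=3.10`, so it can help to confirm the interpreter version of the activated environment before installing. The following is a minimal sketch; the helper name `is_supported_python` is illustrative and not part of ChemIC-ml.

```python
import sys

# Bounds taken from the package metadata: requires_python "<3.12,>=3.10".
MIN_VERSION = (3, 10)  # inclusive lower bound
MAX_VERSION = (3, 12)  # exclusive upper bound


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies in [3.10, 3.12)."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {current} is {status} by ChemIC-ml 1.3.1")
```

Running this inside the `chemic` environment should report a supported version if the environment was created with the `"python<3.12"` constraint and conda resolved it to 3.10 or 3.11.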
 
## Model Construction
First, download the archive with manually labeled images, available as part of the supplementary materials from Zenodo: [dataset_for_image_classifier.zip](https://zenodo.org/records/13378718). Unzip the archive:
```bash
unzip dataset_for_image_classifier.zip
```
To train, validate, and test the model, and to save the trained model, run the following command in the CLI:
```bash
python chemic_train_eval.py --dataset_dir /path/to/data --checkpoint_path /path/to/checkpoint.pth --models_dir /path/to/models
```
* `--dataset_dir`: Directory containing the dataset (with train, test, and validation subdirectories).
* `--checkpoint_path`: Path to the existing model checkpoint file.
* `--models_dir`: Directory to save newly trained models.

This command executes the training and evaluation using the specified paths.
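
Before starting a long training run, it may be worth checking that the unzipped dataset has the layout that `--dataset_dir` expects. The sketch below assumes the subdirectories are literally named `train`, `test`, and `validation`, which is inferred from the option description above and may differ from what `chemic_train_eval.py` actually looks for; `missing_splits` is a hypothetical helper.

```python
from pathlib import Path

# Assumed subdirectory names, inferred from the --dataset_dir description.
# Check chemic_train_eval.py for the names it actually expects.
EXPECTED_SPLITS = ("train", "test", "validation")


def missing_splits(dataset_dir, expected=EXPECTED_SPLITS):
    """Return the expected split subdirectories that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    dataset_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_splits(dataset_dir)
    if missing:
        print(f"Missing subdirectories in {dataset_dir}: {', '.join(missing)}")
    else:
        print(f"{dataset_dir} contains all expected splits")
```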

## Models Download
Download the pre-trained models from Zenodo as an archive: [models.zip](https://doi.org/10.5281/zenodo.10709886). 
Unzip it into the `chemic/models` directory. The models directory should contain the pre-trained model `chemical_image_classifier_resnet50.pth` for chemical image classification.
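
The unzip step can also be scripted. The sketch below assumes the archive has already been downloaded to a local file (for example `models.zip`); the internal folder layout of the archive is not documented here, so the hypothetical helper `extract_models` searches recursively for the model file after extraction.

```python
import zipfile
from pathlib import Path

# File name of the pre-trained model, as stated in this section.
MODEL_FILENAME = "chemical_image_classifier_resnet50.pth"


def extract_models(archive_path, models_dir="chemic/models"):
    """Extract archive_path into models_dir and return paths to the model file."""
    target = Path(models_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # The archive may contain a nested folder, so search recursively.
    return sorted(target.rglob(MODEL_FILENAME))


if __name__ == "__main__":
    found = extract_models("models.zip")  # assumed local file name
    if found:
        print(f"Model available at: {found[0]}")
    else:
        print(f"{MODEL_FILENAME} was not found after extraction")
```

If the model file ends up in a nested subfolder, it may need to be moved directly into `chemic/models`, since that is where this README says it should be located.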

## Usage: Web Service for Chemical Image Classification
### 1. Start the FastAPI Web Server in Production Mode
Run the following command in a terminal:
```bash
uvicorn chemic.app:app --host 127.0.0.1 --port 5010 --workers 1 --timeout-keep-alive 3600
```
* `--workers 1`: Specifies the number of worker processes. Adjust based on your server's capabilities.
* `--host 127.0.0.1 --port 5010`: Binds the application to the specified address and port. Modify as needed.
* `--timeout-keep-alive 3600`: Closes idle Keep-Alive connections if no new data is received within this many seconds (Uvicorn's default is 5). It is not a limit on request processing time. Adjust as necessary.
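
When the server is started from a script, it can be useful to wait until the port accepts connections before sending images. The following sketch uses only the standard library; `wait_for_port` is a hypothetical helper, and the defaults mirror the `--host` and `--port` values above. It only checks TCP reachability, not that the model has loaded.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=5010, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_port():
        print("ChemIC server port is accepting connections")
    else:
        print("Server did not become reachable in time")
```

For an application-level check, the `healthcheck()` method of `ChemClassifierClient` shown later in this README is the more direct option.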

### 2. Use the Frontend Web Interface
In another terminal window, run the following command:
```bash
streamlit run chemic_frontendapp.py --server.address=0.0.0.0 --server.port=5009
```
This command will launch the ChemIC user web interface.

### 3. Classify Images Using the `client.py` Module via CLI
```bash
python chemic/client.py --image_path /path/to/images --export_dir /path/to/export
```
Or, to send base64-encoded image data instead of a path:
```bash
python chemic/client.py --image_data "<base64_encoded_string>" --export_dir /path/to/export
```
* `--image_path` is the path to the image file or directory with images for classification.
* `--image_data` is the base64 encoded image data.
* `--export_dir` is the export directory for the results.
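
The value for `--image_data` can be produced from an image file with the standard library. This is a minimal sketch that assumes the client expects standard (RFC 4648) base64, which matches the `iVBORw0KGgo...` prefix used in the Python example in this README; `encode_image_file` is a hypothetical helper, not part of ChemIC-ml.

```python
import base64
from pathlib import Path


def encode_image_file(image_path):
    """Read an image file and return its contents as base64-encoded bytes."""
    return base64.b64encode(Path(image_path).read_bytes())


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        # Print as text so it can be pasted as the --image_data argument.
        print(encode_image_file(sys.argv[1]).decode("ascii"))
    else:
        print("usage: python encode_image.py /path/to/image.png")
```

For large images the encoded string can exceed practical command-line lengths, in which case passing `--image_path` is likely the simpler route.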

### 4. Alternatively, Use the Client for Classification in Your Python Code
```python
from chemic.client import ChemClassifierClient

client = ChemClassifierClient(server_url='http://127.0.0.1:5010')

# Check the health of the server
health_status = client.healthcheck().get('status')
print(f"Health Status: {health_status}")

# Use image path or directory. Replace with the actual path to your image file
image_path = '<path to the image file or directory with images for classification>'
recognition_results = client.classify_image(image_path)

# OR use base64-encoded image data. Replace with your base64-encoded image data:
base64_data = b'iVBORw0KGgoAAAANSUhEUgA....'
recognition_results = client.classify_image(image_data=base64_data)

# Recognition results are returned as a list of dictionaries, for example:
print(recognition_results)
# [
#   {
#     'image_id': 'image_name_1.png',
#     'predicted_label': 'single chemical structure',
#     'classifier_package': 'ChemIC-ml_1.3.1',
#     'classifier_model': 'ResNet_50',
#   },
#   {
#     'image_id': 'image_name_2.png',
#     'predicted_label': 'multiple chemical structures',
#     'classifier_package': 'ChemIC-ml_1.3.1',
#     'classifier_model': 'ResNet_50',
#   },
#   ...
# ]
```
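
Because the results are plain dictionaries, they can be summarized with the standard library. The sketch below counts images per predicted label; `count_labels` is a hypothetical helper, and the example data only imitates the structure shown above.

```python
from collections import Counter


def count_labels(recognition_results):
    """Count how many images were assigned to each predicted label."""
    return Counter(item["predicted_label"] for item in recognition_results)


# Illustrative data shaped like the classifier output; not real results.
example_results = [
    {"image_id": "image_name_1.png", "predicted_label": "single chemical structure"},
    {"image_id": "image_name_2.png", "predicted_label": "multiple chemical structures"},
    {"image_id": "image_name_3.png", "predicted_label": "single chemical structure"},
]

for label, count in count_labels(example_results).most_common():
    print(f"{label}: {count}")
```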

## Jupyter Notebook
The [client_image_classifier.ipynb](notebooks/client_image_classifier.ipynb) notebook in the `notebooks` directory provides an easy-to-use interface for classifying images. Follow the steps outlined in the notebook to perform image classification.

## Author
Dr. Aleksei Krasnov
a.krasnov@digital-science.com
OntoChem GmbH, part of Digital Science

## Citation
- A. Krasnov, S. Barnabas, T. Böhme, S. Boyer, L. Weber, Comparing software tools for optical chemical structure recognition, Digital Discovery (2024). https://doi.org/10.1039/D3DD00228D
- L. Weber, A. Krasnov, S. Barnabas, T. Böhme, S. Boyer, Comparing Optical Chemical Structure Recognition Tools, ChemRxiv. (2023). https://doi.org/10.26434/chemrxiv-2023-d6kmg-v2

## References
- A. Krasnov, Images dataset for Chemical Images Classifier model. https://zenodo.org/records/13378718
- A. Krasnov, Chemical Image Classifier Model. https://zenodo.org/records/10709886

## License
This project is licensed under the MIT License; see the [LICENSE.md](LICENSE.md) file for details.

            
