# *Cavity Detection Tool* (CADET)
[CADET](https://tomasplsek.github.io/CADET/) is a machine learning pipeline trained to identify surface brightness depressions (*X-ray cavities*) in noisy *Chandra* images of early-type galaxies and galaxy clusters. The pipeline consists of a convolutional neural network trained to produce pixel-wise cavity predictions and a DBSCAN clustering algorithm that decomposes the predictions into individual cavities. The pipeline is described in detail in [Plšek et al. 2023](https://arxiv.org/abs/2304.05457).

The convolutional network consists of 5 convolutional blocks, each resembling an Inception layer. It was implemented using the *Keras* library, and its development was inspired by [Fort et al. 2017](https://ui.adsabs.harvard.edu/abs/2017arXiv171200523F/abstract) and [Secká 2019](https://is.muni.cz/th/rnxoz/?lang=en;fakulta=1411). For the clustering, we used the *Scikit-learn* implementation of Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
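
As a rough illustration of the decomposition step, the sketch below thresholds a pixel-wise prediction map and groups the remaining pixels with Scikit-learn's `DBSCAN`. The function name `decompose_prediction` and the values of `threshold`, `eps` and `min_samples` are assumptions chosen for illustration; they are not necessarily the settings used inside `pycadet`.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def decompose_prediction(y_pred, threshold=0.5, eps=1.5, min_samples=3):
    """Split a 2D pixel-wise prediction map into labelled cavity candidates.

    Returns an integer map of the same shape: 0 marks background (or DBSCAN
    noise), and 1, 2, ... mark individual clusters of connected pixels.
    All default parameter values are illustrative assumptions.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    labels_map = np.zeros(y_pred.shape, dtype=int)

    # (row, col) coordinates of pixels above the discrimination threshold
    coords = np.argwhere(y_pred > threshold)
    if len(coords) == 0:
        return labels_map

    # Cluster pixel coordinates; DBSCAN labels noise points as -1
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)

    # Shift labels by one so that noise becomes 0 and clusters start at 1
    labels_map[coords[:, 0], coords[:, 1]] = labels + 1
    return labels_map
```

With `eps=1.5`, pixels that touch along an edge or a diagonal fall into the same cluster, so two well-separated blobs in the prediction map end up with different labels.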

## Python package
The CADET pipeline has been released as a standalone Python3 package [`pycadet`](https://pypi.org/project/pycadet/), which can be installed using pip:
```console
$ pip3 install pycadet
```
or from source:
```console
$ pip3 install git+https://github.com/tomasplsek/CADET.git
```
The `pycadet` package requires the following libraries (which should be installed automatically with the package):
```
numpy
scipy
astropy
matplotlib
pyds9
scikit-learn>=1.1
tensorflow>=2.8
```
For Conda environments, it is recommended to install the dependencies beforehand as some of the packages can be tricky to install in an existing environment (especially `tensorflow`) and on some machines (especially new Macs). For machines with dedicated NVIDIA GPUs, `tensorflow-gpu` can be installed to allow the CADET model to leverage the GPU for faster inference.
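
Before installing into an existing environment, it may help to see which of the listed dependencies are already present. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the script reports installed versions without checking them against the minimum versions listed above.

```python
from importlib import metadata

# Distribution names taken from the dependency list above
REQUIRED = [
    "numpy",
    "scipy",
    "astropy",
    "matplotlib",
    "pyds9",
    "scikit-learn",
    "tensorflow",
]


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Calling `installed_versions()` and printing the resulting dictionary shows at a glance which packages still need to be installed.
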
An example notebook showing how to use the `pycadet` package can be found here:
<a target="_blank" href="https://colab.research.google.com/github/tomasplsek/CADET/blob/main/example/CADET.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" style="margin-bottom:-4px"/>
</a>
## DS9 Plugin
The CADET pipeline can also be used as a [SAOImageDS9](https://ds9.si.edu/) plugin, which is installed together with the `pycadet` Python package. The CADET plugin requires that SAOImageDS9 is already installed on the system. To avoid conflicts (e.g. with the CIAO installation of DS9), it is recommended to install `pycadet` using a system installation of Python3 rather than a Conda environment.
After installation, the CADET plugin should be available in the *Analysis* menu of DS9. Clicking the *CADET* option opens a new window where the user can set two options: whether the prediction should be averaged over multiple input images shifted by +/- 1 pixel (*Shift*), and whether the prediction should be decomposed into individual cavities (*Decompose*). When decomposing into individual cavities, the user can also set a pair of discrimination thresholds: the first (*Threshold1*) is used for volume error calibration and the second (*Threshold2*) for false positive rate calibration (for more information, see [Plšek et al. 2023](https://arxiv.org/abs/2304.05457)).
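
The idea behind the *Shift* option can be sketched as follows: the image is shifted by -1, 0 and +1 pixel along each axis, a prediction is made for every shifted copy, each prediction is shifted back, and the results are averaged. This is only an illustration under stated assumptions. The function `shift_averaged_prediction` and its `predict` argument are hypothetical, and `np.roll` wraps pixels around the image edges, which the plugin may well handle differently.

```python
import numpy as np


def shift_averaged_prediction(image, predict, shifts=(-1, 0, 1)):
    """Average predictions over copies of a 2D image shifted by a few pixels.

    `predict` is any callable that maps a 2D array to a 2D array of the
    same shape (an assumption made for this sketch).
    """
    image = np.asarray(image, dtype=float)
    total = np.zeros_like(image, dtype=float)
    count = 0
    for dy in shifts:
        for dx in shifts:
            # Shift the input (np.roll wraps around the edges)
            shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
            pred = predict(shifted)
            # Undo the shift so that all predictions are aligned
            total += np.roll(pred, shift=(-dy, -dx), axis=(0, 1))
            count += 1
    return total / count
```

With a loaded Keras model, `predict` could be a small wrapper such as `lambda img: model.predict(img.reshape(1, 128, 128, 1)).reshape(128, 128)`, assuming the model returns one 128x128 map per input image.
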
If the CADET plugin does not appear in the *Analysis* menu, it can be added manually by opening *Edit* > *Preferences* > *Analysis* and adding a path to the following file [DS9CADET.ds9.ans](https://github.com/tomasplsek/CADET/raw/main/pycadet/DS9CADET.ds9.ans) (after the installation it should be located in `~/.ds9/`). The plugin is inspired by the [pyds9plugin](https://github.com/vpicouet/pyds9plugin/tree/master) library.

## Online CADET interface
A simplified version of the CADET pipeline is available via a <a href="https://huggingface.co/spaces/Plsek/CADET" target=_blank>web interface</a> hosted on HuggingFace Spaces. The input image should be centred on the galaxy centre and cropped to a square shape. It is also recommended to remove point sources from the image and fill them with the surrounding background level using Poisson statistics ([dmfilth](https://cxc.cfa.harvard.edu/ciao/ahelp/dmfilth.html) within [CIAO](https://cxc.harvard.edu/ciao/)). Furthermore, compared to the `pycadet` package, the web interface performs only a single thresholding of the raw pixel-wise prediction, which is easily adjustable using a slider.

## Convolutional part
The convolutional part of the pipeline can be used separately to produce raw pixel-wise predictions. Since the convolutional network was implemented using the functional *Keras* API, the architecture is stored together with the trained weights in the HDF5 format ([`CADET.hdf5`](https://github.com/tomasplsek/CADET/raw/main/pycadet/CADET.hdf5)). The trained model can then simply be loaded using the `load_model` *TensorFlow* function:
```python
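from tensorflow.keras.models import load_model

# Load the architecture and trained weights from the downloaded HDF5 file
model = load_model("CADET.hdf5")
# X is the input image array, reshaped to (N, 128, 128, 1)
y_pred = model.predict(X)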
```
The raw CADET model accepts only 128x128 images. Furthermore, to maintain compatibility with *Keras*, the input needs to be reshaped as `X.reshape(1, 128, 128, 1)` for a single image or as `X.reshape(-1, 128, 128, 1)` for multiple images.
Alternatively, the CADET model can be imported from HuggingFace's [model hub](https://huggingface.co/Plsek/CADET-v1):
```python
from huggingface_hub import from_pretrained_keras
model = from_pretrained_keras("Plsek/CADET-v1")
y_pred = model.predict(X)
```
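
Both loading routes above expect the input in the shape described earlier. The sketch below shows one possible way to bring a larger 2D image into that shape by cropping the central 128x128 pixels and adding the batch and channel axes. The helper name `prepare_input` is hypothetical, and the sketch assumes the source of interest sits at the centre of the array; it performs no rebinning or intensity scaling.

```python
import numpy as np


def prepare_input(image, size=128):
    """Crop the central size x size region and reshape to (1, size, size, 1)."""
    image = np.asarray(image, dtype=float)
    if image.ndim != 2:
        raise ValueError("expected a 2D image")
    height, width = image.shape
    if height < size or width < size:
        raise ValueError(f"image must be at least {size}x{size} pixels")

    # Top-left corner of the central crop
    top = (height - size) // 2
    left = (width - size) // 2
    crop = image[top:top + size, left:left + size]

    # Add the batch and channel axes expected by Keras
    return crop.reshape(1, size, size, 1)
```

Several prepared images can be combined into one batch with `np.concatenate([...], axis=0)`, which yields an array of shape `(N, 128, 128, 1)`.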
## How to cite
If you use the CADET pipeline in your research, please cite the following paper [Plšek et al. 2023](https://academic.oup.com/mnras/article/527/2/3315/7339785) ([arXiv](https://arxiv.org/abs/2304.05457)):
```
@ARTICLE{2023MNRAS.tmp.3233P,
author = {{Pl{\v{s}}ek}, T. and {Werner}, N. and {Topinka}, M. and {Simionescu}, A.},
title = "{CAvity DEtection Tool (CADET): Pipeline for detection of X-ray cavities in hot galactic and cluster atmospheres}",
journal = {\mnras},
year = 2023,
month = nov,
doi = {10.1093/mnras/stad3371},
}
```
## Todo
The following improvements to the data generation and training process are currently planned:
- [ ] add other features (cold fronts, complex sloshing, point sources, jets)
- [ ] use more complex cavity shapes (e.g. [Guo et al. 2015](https://arxiv.org/abs/1408.5018))
- [ ] train on multiband images simulated using PyXsim/SOXS
- [ ] replace DBSCAN with instance segmentation
- [ ] restrict the cavity number and shape using regularization?
- [ ] systematic cavity size uncertainty estimation using MC Dropout

Raw data
{
"_id": null,
"home_page": "https://github.com/tomasplsek/CADET",
"name": "pycadet",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Tomas Plsek",
"author_email": "plsek@physics.muni.cz",
"download_url": "https://files.pythonhosted.org/packages/40/6b/9cf4484f7fe0f25dca31e9e0cf497f8e1a1f7d7d2a32535f3b4a50c6f9fd/pycadet-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Cavity Detection Tool",
"version": "0.2.0",
"project_urls": {
"Homepage": "https://github.com/tomasplsek/CADET"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "406b9cf4484f7fe0f25dca31e9e0cf497f8e1a1f7d7d2a32535f3b4a50c6f9fd",
"md5": "5a6407f07846570386ab86403e0eda88",
"sha256": "9d0e1bcd1af023ae7215971ad4d480093912fb815cc554cc9a209b73acd1eac5"
},
"downloads": -1,
"filename": "pycadet-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "5a6407f07846570386ab86403e0eda88",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 6283836,
"upload_time": "2024-08-21T19:22:27",
"upload_time_iso_8601": "2024-08-21T19:22:27.286601Z",
"url": "https://files.pythonhosted.org/packages/40/6b/9cf4484f7fe0f25dca31e9e0cf497f8e1a1f7d7d2a32535f3b4a50c6f9fd/pycadet-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-21 19:22:27",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "tomasplsek",
"github_project": "CADET",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pycadet"
}