| Field | Value |
| --- | --- |
| Name | chemfunc |
| Version | 1.0.5 |
| Home page | https://github.com/swansonk14/chemfunc |
| Summary | Chem Func |
| Upload time | 2023-12-29 16:37:36 |
| Author | Kyle Swanson |
| Requires Python | >=3.10 |
| License | MIT |
| Keywords | computational chemistry |
| Requirements | No requirements were recorded. |

# Chem Func

Useful functions and scripts for working with small molecules.

## Installation

Optionally, create a conda environment.
```bash
conda create -y -n chemfunc python=3.10
conda activate chemfunc
```

Install the latest version of Chem Func using pip.
```bash
pip install chemfunc
```

Alternatively, clone the repository and install the local version of the package.
```bash
git clone https://github.com/swansonk14/chemfunc.git
cd chemfunc
pip install -e .
```

If there are version issues with the required packages, install the specific working versions listed in `requirements.txt` from within the cloned repository (ideally inside a fresh conda environment) as follows.
```bash
pip install -r requirements.txt
pip install -e .
```

**Note:** If you encounter the error `ImportError: libXrender.so.1: cannot open shared object file: No such file or directory`, run `conda install -c conda-forge xorg-libxrender`.


## Features

Chem Func contains a variety of useful functions and scripts for working with small molecules.

Functions can be imported from the `chemfunc` package. For example:
```python
from pathlib import Path
from chemfunc.sdf_to_smiles import sdf_to_smiles

sdf_to_smiles(
    data_path=Path('molecules.sdf'),
    save_path=Path('molecules.csv')
)
```

Most modules can also be run as scripts from the command line using the `chemfunc` command along with the appropriate function name. For example:
```bash
chemfunc sdf_to_smiles \
    --data_path molecules.sdf \
    --save_path molecules.csv
```

To see a list of available scripts, run `chemfunc -h`.

For each script, run `chemfunc <script_name> -h` to see a description of the arguments for that script.


## Contents

Below is a list of the contents of the package.

[`canonicalize_smiles.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/canonicalize_smiles.py) (function, script)

Canonicalizes SMILES using RDKit canonicalization and optionally strips salts.

[`chemical_diversity.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/chemical_diversity.py) (function, script)

Computes the chemical diversity of a set of molecules in terms of Tanimoto distances.
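
As a rough illustration of what a Tanimoto-based diversity measure looks like, here is a minimal standard-library sketch. It is not chemfunc's implementation: the function names are hypothetical, fingerprints are represented as plain sets of "on" bit indices, and averaging over all pairs is only one possible way to summarize diversity.

```python
from itertools import combinations


def tanimoto_distance(fp_a: set[int], fp_b: set[int]) -> float:
    """Return 1 - (intersection size / union size) for two sets of 'on' bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # Convention chosen here: two empty fingerprints are identical.
    return 1.0 - len(fp_a & fp_b) / union


def average_pairwise_distance(fingerprints: list[set[int]]) -> float:
    """Average Tanimoto distance over all unordered pairs of fingerprints."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(tanimoto_distance(a, b) for a, b in pairs) / len(pairs)


# Toy fingerprints: each set holds the indices of bits that are set.
toy_fingerprints = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
diversity = average_pairwise_distance(toy_fingerprints)
```

A value near 1 means the molecules share few fingerprint bits, while a value near 0 means they are nearly identical.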

[`cluster_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/cluster_molecules.py) (function, script)

Performs k-means clustering to cluster molecules based on Morgan fingerprints.
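
For readers unfamiliar with k-means, the following is a minimal sketch of Lloyd's algorithm on toy bit vectors standing in for fingerprints. It is an illustration under assumptions, not the code in `cluster_molecules.py`: the function names, the random initialization, and the fixed iteration count are all choices made for this example.

```python
import random


def squared_distance(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: list[list[float]], k: int, num_iters: int = 20, seed: int = 0) -> list[int]:
    """Lloyd's algorithm: return a cluster label for each point."""
    rng = random.Random(seed)
    # Initialize centers as k distinct points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(num_iters):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # Keep the old center if a cluster ends up empty.
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy bit vectors standing in for fingerprints: two clearly separated groups.
toy_vectors = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
labels = k_means(toy_vectors, k=2)
```

The label numbers themselves are arbitrary; only which points share a label is meaningful.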

[`compute_property_distribution.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/compute_property_distribution.py) (function, script)

Computes one or more molecular properties for a set of molecules.

[`deduplicate_smiles.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/deduplicate_smiles.py) (function, script)

Deduplicates a CSV file by SMILES.
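
The idea of deduplicating by SMILES can be sketched with the standard library alone. This is an assumption-laden illustration, not the package's code: the column name `smiles`, the helper `deduplicate_rows`, and the keep-first policy are all hypothetical.

```python
import csv
import io


def deduplicate_rows(rows: list[dict[str, str]], smiles_column: str = "smiles") -> list[dict[str, str]]:
    """Keep the first row seen for each distinct SMILES string."""
    seen: set[str] = set()
    unique_rows = []
    for row in rows:
        if row[smiles_column] not in seen:
            seen.add(row[smiles_column])
            unique_rows.append(row)
    return unique_rows


# In-memory stand-in for a CSV file; the column names are assumptions.
csv_text = "smiles,activity\nCCO,1.2\nc1ccccc1,0.4\nCCO,1.3\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
unique_rows = deduplicate_rows(rows)
```

Note that plain string comparison treats `CCO` and `OCC` as different molecules, so canonicalizing the SMILES first is usually advisable.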

[`filter_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/filter_molecules.py) (function, script)

Filters molecules to those with values in a certain range.
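
A range filter of this kind might look like the sketch below. The function name, the inclusive bounds, and the column name `logp` are assumptions made for illustration and may differ from the actual script's arguments.

```python
from typing import Optional


def filter_by_range(
    rows: list[dict[str, float]],
    value_column: str,
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
) -> list[dict[str, float]]:
    """Keep rows whose value lies in [min_value, max_value]; None means unbounded."""
    kept = []
    for row in rows:
        value = row[value_column]
        if min_value is not None and value < min_value:
            continue
        if max_value is not None and value > max_value:
            continue
        kept.append(row)
    return kept


# Toy data; the column name "logp" is a hypothetical example.
rows = [{"logp": -1.0}, {"logp": 2.5}, {"logp": 6.0}]
filtered = filter_by_range(rows, "logp", min_value=0.0, max_value=5.0)
```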

[`measure_experimental_reproducibility.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/measure_experimental_reproducibility.py) (function, script)

Measures the experimental reproducibility of two biological replicates by using one replicate to predict the other.
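
One way to read "using one replicate to predict the other" is to treat replicate 1 as the prediction and replicate 2 as the ground truth, then score the agreement with ordinary regression metrics. The sketch below does this with mean absolute error and R²; which metrics chemfunc actually reports is not stated here, so both the metric choice and the function names are assumptions.

```python
def mean_absolute_error(predicted: list[float], actual: list[float]) -> float:
    """Average absolute difference between paired measurements."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def r_squared(predicted: list[float], actual: list[float]) -> float:
    """Coefficient of determination, treating `predicted` as a model of `actual`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy measurements for the same four molecules in two replicates.
replicate_1 = [1.0, 2.0, 3.0, 4.0]
replicate_2 = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(replicate_1, replicate_2)
r2 = r_squared(replicate_1, replicate_2)
```

Scores like these give a rough ceiling for how well any model trained on such data could be expected to perform.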

[`molecular_fingerprints.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/molecular_fingerprints.py) (functions, script)

Contains functions to compute fingerprints for molecules. Parallelized for speed. The function `save_fingerprints` can be used as a script to compute fingerprints from a CSV file and save them as an NPZ file.

[`molecular_properties.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/molecular_properties.py) (functions)

Contains functions to compute molecular properties. Parallelized for speed.

[`molecular_similarities.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/molecular_similarities.py) (functions)

Contains functions to compute similarities between molecules. Parallelized for speed.

[`nearest_neighbor.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/nearest_neighbor.py) (function, script)

Given a dataset of molecules, computes the nearest neighbor molecule in a second dataset using one of several similarity metrics.
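
The nearest-neighbor idea can be sketched as follows, using Tanimoto similarity on sets of "on" fingerprint bits as the metric. This is a brute-force illustration with hypothetical names and toy data, not the implementation in `nearest_neighbor.py`, which may support other metrics and be much faster.

```python
def tanimoto_similarity(fp_a: set[int], fp_b: set[int]) -> float:
    """Return intersection size divided by union size for two sets of 'on' bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0


def nearest_neighbors(
    queries: dict[str, set[int]], references: dict[str, set[int]]
) -> dict[str, tuple[str, float]]:
    """Map each query name to its most similar reference name and that similarity."""
    result = {}
    for query_name, query_fp in queries.items():
        best_name = max(
            references,
            key=lambda name: tanimoto_similarity(query_fp, references[name]),
        )
        result[query_name] = (best_name, tanimoto_similarity(query_fp, references[best_name]))
    return result


# Hypothetical molecule names and toy fingerprints (sets of "on" bit indices).
queries = {"query_1": {1, 2, 3}, "query_2": {8, 9}}
references = {"ref_a": {1, 2, 4}, "ref_b": {8, 9, 10}, "ref_c": {5}}
neighbors = nearest_neighbors(queries, references)
```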

[`plot_property_distribution.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/plot_property_distribution.py) (function, script)

Plots the distribution of molecular properties of a set of molecules.

[`plot_tsne.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/plot_tsne.py) (function, script)

Runs t-SNE on molecular fingerprints from one or more chemical libraries.

[`regression_to_classification.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/regression_to_classification.py) (function, script)

Converts regression data to classification data using given thresholds.
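
Thresholding continuous values into classes can be sketched with `bisect` from the standard library. The function name and the convention that a value equal to a threshold falls into the higher class are assumptions for this example; the actual script may use a different convention.

```python
from bisect import bisect_right


def to_class(value: float, thresholds: list[float]) -> int:
    """Return the number of thresholds that are <= value (thresholds must be sorted)."""
    return bisect_right(thresholds, value)


# One threshold gives binary labels; several thresholds give ordinal classes.
thresholds = [0.5, 1.0]
values = [0.2, 0.5, 0.7, 1.5]
classes = [to_class(v, thresholds) for v in values]
```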

[`sample_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/sample_molecules.py) (function, script)

Samples molecules from a CSV file, either uniformly at random across the entire dataset or uniformly at random from each cluster within the data.
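
The two sampling modes can be contrasted in a short sketch. The function names, the `cluster` key, and the fixed seed are assumptions for illustration, not the script's actual interface.

```python
import random
from collections import defaultdict


def sample_uniform(rows: list[dict], num_samples: int, seed: int = 0) -> list[dict]:
    """Sample rows uniformly at random (without replacement) from the whole dataset."""
    return random.Random(seed).sample(rows, num_samples)


def sample_per_cluster(
    rows: list[dict], num_per_cluster: int, cluster_key: str = "cluster", seed: int = 0
) -> list[dict]:
    """Sample up to num_per_cluster rows uniformly at random from each cluster."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for row in rows:
        clusters[row[cluster_key]].append(row)
    sampled = []
    for members in clusters.values():
        sampled.extend(rng.sample(members, min(num_per_cluster, len(members))))
    return sampled


# Toy rows: five molecules in cluster 0 and one molecule in cluster 1.
rows = [{"smiles": f"mol_{i}", "cluster": 0} for i in range(5)]
rows.append({"smiles": "mol_5", "cluster": 1})

uniform_sample = sample_uniform(rows, num_samples=2)
cluster_sample = sample_per_cluster(rows, num_per_cluster=1)
```

Per-cluster sampling guarantees that small clusters are represented, whereas uniform sampling tends to favor large clusters.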

[`sdf_to_smiles.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/sdf_to_smiles.py) (function, script)

Converts an SDF file to a CSV file with SMILES.

[`select_from_clusters.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/select_from_clusters.py) (function, script)

Selects the best molecule from each cluster.
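
Selecting one representative per cluster can be sketched as below. What counts as "best" is an assumption here (the highest value in a hypothetical `score` column), as are the function and key names.

```python
def select_best_per_cluster(
    rows: list[dict], cluster_key: str = "cluster", score_key: str = "score"
) -> list[dict]:
    """Return the highest-scoring row from each cluster, in first-seen cluster order."""
    best: dict = {}
    for row in rows:
        cluster = row[cluster_key]
        # Replace the current pick only if this row scores strictly higher.
        if cluster not in best or row[score_key] > best[cluster][score_key]:
            best[cluster] = row
    return list(best.values())


# Toy rows with hypothetical column names.
rows = [
    {"smiles": "CCO", "cluster": 0, "score": 0.3},
    {"smiles": "CCN", "cluster": 0, "score": 0.9},
    {"smiles": "c1ccccc1", "cluster": 1, "score": 0.5},
]
selected = select_best_per_cluster(rows)
```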

[`smiles_to_svg.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/smiles_to_svg.py) (function, script)

Converts a SMILES string to an SVG image of the molecule.

[`visualize_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/visualize_molecules.py) (function, script)

Converts a file of SMILES to images of molecular structures.

[`visualize_reactions.py`](https://github.com/swansonk14/chemfunc/blob/main/chemfunc/visualize_reactions.py) (function, script)

Converts a file of reaction SMARTS to images of chemical reactions.

            
