# EnzymeStructuralFiltering
Structural filtering pipeline that uses docking and active-site heuristics to prioritize ML-predicted enzyme variants for experimental validation.
This tool processes superimposed ligand poses and filters them using geometric criteria such as distances and angles, with optional esterase-specific filters and nucleophile-proximity checks.

---
## 🚀 Features
- Parse and apply SMARTS patterns to ligand structures.
- Filter poses based on geometric constraints.
- Optional esterase or nucleophile-focused analysis.
- Supports CSV and pickle-based data pipelines.
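
To illustrate what a geometric constraint on a pose might look like, here is a minimal sketch in plain Python. It is not the package's implementation: the function names (`distance`, `angle_deg`, `passes_geometric_filter`), the choice of atoms (a nucleophile atom and a ligand carbonyl C=O), and the thresholds are all assumptions made for this example.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.dist(a, b)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def passes_geometric_filter(nuc, carbon, oxygen,
                            max_dist=4.0, angle_range=(90.0, 120.0)):
    """Hypothetical filter: keep a pose if the nucleophile is close to the
    carbonyl carbon and the nucleophile-C=O angle lies in the given range.
    The default thresholds are illustrative, not values from the package."""
    d = distance(nuc, carbon)
    theta = angle_deg(nuc, carbon, oxygen)
    return d <= max_dist and angle_range[0] <= theta <= angle_range[1]


# Toy coordinates in angstroms (made up for this example).
carbon = (0.0, 0.0, 0.0)
oxygen = (1.2, 0.0, 0.0)
good_nuc = (-0.9, 3.0, 0.0)  # close to the carbon, at an oblique angle
far_nuc = (0.0, 8.0, 0.0)    # too far away to pass the distance check

print(passes_geometric_filter(good_nuc, carbon, oxygen))
print(passes_geometric_filter(far_nuc, carbon, oxygen))
```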
---
## 📦 Installation
### Option 1: Install via pip
```bash
pip install enzyme-filtering-pipeline
```
### Option 2: Clone the repository
```bash
git clone https://github.com/HelenSchmid/EnzymeStructuralFiltering.git
cd EnzymeStructuralFiltering
pip install .
```
## 🌱 Environment Setup
### Using conda
```bash
conda env create -f environment.yml
conda activate filterpipeline
```
## 🔧 Usage Example
```python
import pandas as pd

from filtering_pipeline.pipeline import Pipeline

# Load the input DataFrame from a pickle file
df = pd.read_pickle("DEHP-MEHP.pkl")

pipeline = Pipeline(
    df=df,
    ligand_name="TPP",
    ligand_smiles="CCCCC(CC)COC(=O)C1=CC=CC=C1C(=O)OCC(CC)CCCC",  # SMILES string of the ligand
    smarts_pattern='[$([CX3](=O)[OX2H0][#6])]',  # SMARTS pattern for the ligand moiety of interest
    max_matches=1000,
    esterase=1,  # enable the optional esterase-specific filters
    find_closest_nuc=1,  # enable the optional nucleophile-proximity analysis
    num_threads=1,
    squidly_dir='/nvme2/ariane/home/data/models/squidly_final_models/',  # machine-specific path; adjust for your system
    base_output_dir="pipeline_output",
)

pipeline.run()
```

## 📄 Package metadata
- Name: `enzyme-filtering-pipeline`
- Version: 0.0.41 (uploaded 2025-08-03)
- Author: Helen Schmid (schmid.helen2@gmail.com)
- License: GPL3
- Requires Python: >=3.6
- Homepage, documentation, and source code: https://github.com/HelenSchmid/EnzymeStructuralFiltering
- Bug tracker: https://github.com/HelenSchmid/EnzymeStructuralFiltering/issues
- Release files: `enzyme_filtering_pipeline-0.0.41-py3-none-any.whl` (wheel, 55551 bytes) and `enzyme_filtering_pipeline-0.0.41.tar.gz` (sdist, 41126 bytes)