# miRNA target site prediction Benchmarks
## Installation
The miRBench package can be installed using pip:
```bash
pip install miRBench
```
The default installation gives access to the datasets only. To use the predictors and encoders, you need to install additional dependencies.
### Dependencies for predictors and encoders
To use miRBench with predictors and encoders, install the following dependencies:
- numpy
- biopython
- viennarna
- torch
- tensorflow
- typing-extensions
To install the miRBench package with all dependencies into a virtual environment, you can use the following commands:
```bash
python3.8 -m venv mirbench_venv
source mirbench_venv/bin/activate
pip install miRBench
pip install numpy==1.24.3 biopython==1.83 viennarna==2.7.0 torch==1.9.0 tensorflow==2.13.1 typing-extensions==4.5.0
```
Note: This installation runs the predictors on the CPU. To use a GPU, install versions of torch and tensorflow built with GPU support.
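Before calling a predictor or encoder, it can help to check which of the optional dependencies are importable in the current environment. The following is a minimal sketch using only the standard library. It is not part of miRBench; the helper name `find_missing` and the mapping are illustrative, and the import names assume the usual conventions (`Bio` for biopython, `RNA` for the ViennaRNA bindings).

```python
import importlib.util

# pip package name -> import name (the two can differ).
# Assumed mapping; not taken from miRBench itself.
OPTIONAL_MODULES = {
    "numpy": "numpy",
    "biopython": "Bio",
    "viennarna": "RNA",
    "torch": "torch",
    "tensorflow": "tensorflow",
    "typing-extensions": "typing_extensions",
}


def find_missing(modules=OPTIONAL_MODULES):
    """Return the pip names of packages whose import name cannot be found."""
    missing = []
    for pip_name, import_name in modules.items():
        try:
            # find_spec locates the module without importing it.
            found = importlib.util.find_spec(import_name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(pip_name)
    return missing


missing = find_missing()
print("Missing optional dependencies:", missing or "none")
```

An empty list means all six packages were found. It does not verify the pinned versions.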
## Examples
### Get all available datasets
```python
from miRBench.dataset import list_datasets
list_datasets()
```
```python
['AGO2_CLASH_Hejret2023',
'AGO2_eCLIP_Klimentova2022',
'AGO2_eCLIP_Manakov2022']
```
Not all datasets are available with all splits and ratios. To get available splits and ratios, use the `full` option.
```python
list_datasets(full=True)
```
```python
{'AGO2_CLASH_Hejret2023': {'splits': {
'train': {'ratios': ['10']},
'test': {'ratios': ['1', '10', '100']}}},
'AGO2_eCLIP_Klimentova2022': {'splits': {
'test': {'ratios': ['1', '10', '100']}}},
'AGO2_eCLIP_Manakov2022': {'splits': {
'train': {'ratios': ['1', '10', '100']},
'test': {'ratios': ['1', '10', '100']}}}
}
```
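Because the available combinations differ per dataset, it may be worth checking a split and ratio before requesting them. The sketch below works on the dictionary shape shown above. The helper `is_available` is hypothetical and not part of miRBench; in real use, `available` would be the return value of `list_datasets(full=True)` rather than a hard-coded copy.

```python
# Copied from the `list_datasets(full=True)` output shown above.
available = {
    "AGO2_CLASH_Hejret2023": {"splits": {
        "train": {"ratios": ["10"]},
        "test": {"ratios": ["1", "10", "100"]}}},
    "AGO2_eCLIP_Klimentova2022": {"splits": {
        "test": {"ratios": ["1", "10", "100"]}}},
    "AGO2_eCLIP_Manakov2022": {"splits": {
        "train": {"ratios": ["1", "10", "100"]},
        "test": {"ratios": ["1", "10", "100"]}}},
}


def is_available(catalog, name, split, ratio):
    """Return True if `name` offers the given split and ratio (hypothetical helper)."""
    splits = catalog.get(name, {}).get("splits", {})
    return ratio in splits.get(split, {}).get("ratios", [])


print(is_available(available, "AGO2_CLASH_Hejret2023", "train", "10"))     # True
print(is_available(available, "AGO2_CLASH_Hejret2023", "train", "1"))      # False
print(is_available(available, "AGO2_eCLIP_Klimentova2022", "train", "1"))  # False
```

Note that ratios are strings (`"1"`, not `1`), matching the output above.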
### Get dataset
```python
from miRBench.dataset import get_dataset_df
dataset_name = "AGO2_CLASH_Hejret2023"
df = get_dataset_df(dataset_name, split="test", ratio="1")
df.head()
```
| | noncodingRNA | gene | label |
| -------- | ------- | ------- | ------- |
| 0 | TCCGAGCCTGGGTCTCCCTCTT | GGGTTTAGGGAAGGAGGTTCGGAGACAGGGAGCCAAGGCCTCTGTC... | 1 |
| 1 | TGCGGGGCTAGGGCTAACAGCA | GCTTCCCAAGTTAGGTTAGTGATGTGAAATGCTCCTGTCCCTGGCC... | 1 |
| 2 | CCCACTGCCCCAGGTGCTGCTGG | TCTTTCCAAAATTGTCCAGCAGCTTGAATGAGGCAGTGACAATTCT... | 1 |
| 3 | TGAGGGGCAGAGAGCGAGACTTT | CAGAACTGGGATTCAAGCGAGGTCTGGCCCCTCAGTCTGTGGCTTT... | 1 |
| 4 | CAAAGTGCTGTTCGTGCAGGTAG | TTTTTTCCCTTAGGACTCTGCACTTTATAGAATGTTGTAAAACAGA... | 1 |
If you only need the path to the dataset file, use the `get_dataset_path` function:
```python
from miRBench.dataset import get_dataset_path
dataset_path = get_dataset_path(dataset_name, split="test", ratio="1")
dataset_path
```
```python
/home/user/.miRBench/datasets/13909173/AGO2_CLASH_Hejret2023/1/test/dataset.tsv
```
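With only the path, the file can be read without pandas. The sketch below assumes the file is tab-separated with a header row whose column names match the DataFrame above (`noncodingRNA`, `gene`, `label`). The `.tsv` extension suggests this, but it is not confirmed here. To stay runnable without a download, it writes a small toy file first; the sequences in it are made up, and `read_pairs` is a hypothetical helper.

```python
import csv
import os
import tempfile

# Toy stand-in for the downloaded dataset file (made-up sequences).
toy_rows = [
    ("noncodingRNA", "gene", "label"),
    ("TGAGGTAGTAGGTTGTATAGTT", "ACGTACGTACGTACGTACGT", "1"),
    ("TGAGGTAGTAGGTTGTATAGTT", "TTTTGGGGCCCCAAAATTTT", "0"),
]
with tempfile.NamedTemporaryFile(
    "w", suffix=".tsv", delete=False, newline=""
) as handle:
    csv.writer(handle, delimiter="\t").writerows(toy_rows)
    toy_path = handle.name


def read_pairs(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


rows = read_pairs(toy_path)  # in real use: read_pairs(dataset_path)
os.remove(toy_path)

print(len(rows), rows[0]["noncodingRNA"], rows[0]["label"])
```

All values come back as strings, so `label` needs an explicit `int(...)` conversion before any numeric use.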
### Get all available tools
```python
from miRBench.predictor import list_predictors
list_predictors()
```
```python
['CnnMirTarget_Zheng2020',
'RNACofold',
'miRNA_CNN_Hejret2023',
'miRBind_Klimentova2022',
'TargetNet_Min2021',
'Seed8mer',
'Seed7mer',
'Seed6mer',
'Seed6merBulgeOrMismatch',
'TargetScanCnn_McGeary2019',
'InteractionAwareModel_Yang2024']
```
### Encode dataset
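```python
from miRBench.encoder import get_encoder
tool = 'miRBind_Klimentova2022'
encoder = get_encoder(tool)
# Encode the dataset into the input format expected by the chosen tool.
# Named `encoded_input` to avoid shadowing the built-in `input`.
encoded_input = encoder(df)
```
### Get predictions
```python
from miRBench.predictor import get_predictor
predictor = get_predictor(tool)
predictions = predictor(encoded_input)
predictions[:10]
```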
```python
array([0.6899161 , 0.15220629, 0.07301956, 0.43757868, 0.34360734,
0.20519172, 0.0955029 , 0.79298246, 0.14150576, 0.05329492],
dtype=float32)
```
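The predictor returns one floating-point score per row. To compare scores against the binary `label` column, one option is to apply a cutoff. The sketch below uses the ten scores shown above as plain Python floats. The threshold of 0.5 and the reading that a higher score means a more likely target site are assumptions, not miRBench defaults, and `to_calls` is a hypothetical helper.

```python
# First ten scores from the output above.
scores = [
    0.6899161, 0.15220629, 0.07301956, 0.43757868, 0.34360734,
    0.20519172, 0.0955029, 0.79298246, 0.14150576, 0.05329492,
]

THRESHOLD = 0.5  # assumed cutoff; choose one appropriate for each predictor


def to_calls(values, threshold=THRESHOLD):
    """Map each score to 1 (at or above the threshold) or 0 (below it)."""
    return [1 if value >= threshold else 0 for value in values]


calls = to_calls(scores)
print(calls)       # [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
print(sum(calls))  # 2
```

Score ranges may differ between predictors, so a single fixed cutoff is unlikely to suit all of them.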
Raw data
{
"_id": null,
"home_page": "https://github.com/katarinagresova/miRBench",
"name": "miRBench",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "miRNA, target site prediction, benchmarking",
"author": "Katarina Gresova",
"author_email": "gresova11@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/c8/9e/0fa1325616f6f9a6655e8d5c75ee5234d0309bf6ae7208093ea3bc01dcf7/mirbench-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A collection of datasets and predictors for benchmarking miRNA target site prediction algorithms",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/katarinagresova/miRBench"
},
"split_keywords": [
"mirna",
" target site prediction",
" benchmarking"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c89e0fa1325616f6f9a6655e8d5c75ee5234d0309bf6ae7208093ea3bc01dcf7",
"md5": "de923b795f7d9b3b0d6342664e997dc1",
"sha256": "2484e5b1dd86dcc39bfae1736125aca7452e659a4d8d3cfacb6f0091ab758d7d"
},
"downloads": -1,
"filename": "mirbench-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "de923b795f7d9b3b0d6342664e997dc1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 15995,
"upload_time": "2024-10-15T11:37:58",
"upload_time_iso_8601": "2024-10-15T11:37:58.393538Z",
"url": "https://files.pythonhosted.org/packages/c8/9e/0fa1325616f6f9a6655e8d5c75ee5234d0309bf6ae7208093ea3bc01dcf7/mirbench-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-15 11:37:58",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "katarinagresova",
"github_project": "miRBench",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "mirbench"
}