miRBench


| Field | Value |
| -------- | ------- |
| Name | miRBench |
| Version | 0.1.1 |
| Home page | https://github.com/katarinagresova/miRBench |
| Summary | A collection of datasets and predictors for benchmarking miRNA target site prediction algorithms |
| Upload time | 2024-09-27 09:24:51 |
| Author | Katarina Gresova |
| License | MIT |
| Keywords | miRNA, target site prediction, benchmarking |

# miRNA target site prediction Benchmarks

## Installation

```bash
pip install miRBench
```

## Examples

### Get all available datasets

```python
import miRBench

miRBench.dataset.list_datasets()
```

```python
['AGO2_CLASH_Hejret2023',
 'AGO2_eCLIP_Klimentova2022',
 'AGO2_eCLIP_Manakov2022']
```

Not every dataset provides every split and ratio. To list the splits and ratios available for each dataset, pass `full=True`.

```python
miRBench.dataset.list_datasets(full=True)
```

```python
{'AGO2_CLASH_Hejret2023': {'splits': {
      'train': {'ratios': ['10']},
      'test': {'ratios': ['1', '10', '100']}}},
 'AGO2_eCLIP_Klimentova2022': {'splits': {
      'test': {'ratios': ['1', '10', '100']}}},
 'AGO2_eCLIP_Manakov2022': {'splits': {
      'train': {'ratios': ['1', '10', '100']},
      'test': {'ratios': ['1', '10', '100']}}}
}
```
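
The nested listing can be flattened into `(dataset, split, ratio)` triples, which is convenient when looping over everything that is available. The following is a minimal sketch: the dictionary is copied from the output above so that the snippet runs on its own, and `iter_combinations` is a hypothetical helper, not part of miRBench. In real use you would pass in the return value of `miRBench.dataset.list_datasets(full=True)` instead.

```python
# Copied from the list_datasets(full=True) output shown above.
available = {
    "AGO2_CLASH_Hejret2023": {"splits": {
        "train": {"ratios": ["10"]},
        "test": {"ratios": ["1", "10", "100"]}}},
    "AGO2_eCLIP_Klimentova2022": {"splits": {
        "test": {"ratios": ["1", "10", "100"]}}},
    "AGO2_eCLIP_Manakov2022": {"splits": {
        "train": {"ratios": ["1", "10", "100"]},
        "test": {"ratios": ["1", "10", "100"]}}},
}


def iter_combinations(listing):
    """Yield every (dataset, split, ratio) triple in the listing.

    Hypothetical helper for illustration; not part of miRBench.
    """
    for name, info in listing.items():
        for split, split_info in info["splits"].items():
            for ratio in split_info["ratios"]:
                yield name, split, ratio


combinations = list(iter_combinations(available))
for name, split, ratio in combinations:
    print(name, split, ratio)
```

Each triple corresponds to the three arguments that `get_dataset_df` takes in the next section.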

### Get dataset

```python
dataset_name = "AGO2_CLASH_Hejret2023"
df = miRBench.dataset.get_dataset_df(dataset_name, split="test", ratio="1")
df.head()
```

| | noncodingRNA | gene | label |
| -------- | ------- | ------- | ------- |
| 0 | TCCGAGCCTGGGTCTCCCTCTT | GGGTTTAGGGAAGGAGGTTCGGAGACAGGGAGCCAAGGCCTCTGTC... | 1 |
| 1 | TGCGGGGCTAGGGCTAACAGCA | GCTTCCCAAGTTAGGTTAGTGATGTGAAATGCTCCTGTCCCTGGCC... | 1 |
| 2 | CCCACTGCCCCAGGTGCTGCTGG | TCTTTCCAAAATTGTCCAGCAGCTTGAATGAGGCAGTGACAATTCT... | 1 |
| 3 | TGAGGGGCAGAGAGCGAGACTTT | CAGAACTGGGATTCAAGCGAGGTCTGGCCCCTCAGTCTGTGGCTTT... | 1 |
| 4 | CAAAGTGCTGTTCGTGCAGGTAG | TTTTTTCCCTTAGGACTCTGCACTTTATAGAATGTTGTAAAACAGA... | 1 |

Data is downloaded to the `$HOME/.miRBench/datasets` directory, with a separate subdirectory for each dataset.
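
To check what has already been downloaded, you can inspect that directory yourself. The sketch below assumes the cache location is exactly the one described above; `cached_datasets` is a hypothetical helper, not a miRBench function, and it returns an empty list if nothing has been downloaded yet.

```python
from pathlib import Path

# Assumed cache location, taken from the description above.
cache_dir = Path.home() / ".miRBench" / "datasets"


def cached_datasets(directory):
    """Return the names of the subdirectories in `directory`, or [] if it is absent.

    Hypothetical helper for illustration; not part of miRBench.
    """
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_dir())


print(cache_dir)
print(cached_datasets(cache_dir))
```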

### Get all available tools

```python
miRBench.predictor.list_predictors()
```

```python
['CnnMirTarget_Zheng2020',
 'RNACofold',
 'miRNA_CNN_Hejret2023',
 'miRBind_Klimentova2022',
 'TargetNet_Min2021',
 'Seed8mer',
 'Seed7mer',
 'Seed6mer',
 'Seed6merBulgeOrMismatch',
 'TargetScanCnn_McGeary2019',
 'InteractionAwareModel_Yang2024']
```

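### Encode dataset

```python
tool = 'miRBind_Klimentova2022'
# Get the encoder that matches the chosen tool.
encoder = miRBench.encoder.get_encoder(tool)

# Named `encoded_input` to avoid shadowing Python's built-in `input`.
encoded_input = encoder(df)
```

### Get predictions

```python
predictor = miRBench.predictor.get_predictor(tool)

predictions = predictor(encoded_input)
predictions[:10]  # show the first ten predictions
```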

```python
array([0.6899161 , 0.15220629, 0.07301956, 0.43757868, 0.34360734,
       0.20519172, 0.0955029 , 0.79298246, 0.14150576, 0.05329492],
      dtype=float32)
```
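
The scores above can be turned into binary labels by thresholding, for example to compare them with the `label` column. The sketch below copies the ten scores shown above and uses a cutoff of 0.5, which is an arbitrary assumption for illustration. This document does not state a recommended threshold, nor whether all predictors score on the same scale.

```python
# First ten scores, copied from the output above.
scores = [0.6899161, 0.15220629, 0.07301956, 0.43757868, 0.34360734,
          0.20519172, 0.0955029, 0.79298246, 0.14150576, 0.05329492]

THRESHOLD = 0.5  # assumed cutoff, chosen only for illustration

# 1 if the score reaches the threshold, otherwise 0.
predicted_labels = [int(score >= THRESHOLD) for score in scores]
print(predicted_labels)
```

Threshold-free metrics computed on the raw scores avoid having to pick a cutoff at all.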

## Benchmark all tools on all datasets

```bash
python benchmark_all.py OUTPUT_FOLDER_PATH
```

The script runs every tool on every dataset and writes one file with the suffix `_predictions.tsv` per dataset. Each tool's predictions are saved in a separate column.
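
After a run, the result files can be collected by their suffix. The following is a minimal sketch: `find_prediction_files` is a hypothetical helper, and the file names in the demonstration are invented so that the snippet runs without the benchmark script. Only the `_predictions.tsv` suffix comes from the description above; the actual file name prefixes and column names are not specified here.

```python
import tempfile
from pathlib import Path


def find_prediction_files(output_folder):
    """Return all *_predictions.tsv files in `output_folder`, sorted by name.

    Hypothetical helper for illustration; not part of miRBench.
    """
    folder = Path(output_folder)
    if not folder.is_dir():
        return []
    return sorted(folder.glob("*_predictions.tsv"))


# Demonstration on a temporary folder with invented file names.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example_dataset_predictions.tsv").touch()
    (Path(tmp) / "notes.txt").touch()
    found_names = [path.name for path in find_prediction_files(tmp)]

print(found_names)
```

In real use, pass the same path that was given to `benchmark_all.py` as `OUTPUT_FOLDER_PATH`.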

            
