Name | rouskinhf |
Version | 0.4.8 |
home_page | None |
Summary | A library to manipulate data for our DMS prediction models. |
upload_time | 2024-05-14 17:07:07 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.10 |
license | MIT License Copyright (c) Microsoft Corporation. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
![PyPI](https://img.shields.io/pypi/v/rouskinhf)
![GitHub tag (with filter)](https://img.shields.io/github/v/tag/rouskinlab/rouskinhf)
# Download your RNA data from HuggingFace with rouskinhf!
A wrapper around HuggingFace to load data for eFold. You can:
- pull datasets from the Rouskinlab HuggingFace page
- create datasets from local files
# Installation
### To download data
```bash
pip install rouskinhf
```
### To push data to HuggingFace (optional)
- get an access token from the Rouskinlab HuggingFace page
- add this token to your environment
```bash
export HUGGINGFACE_TOKEN="hf_yourtokenhere"
```
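Before pushing, it can help to confirm from Python that the variable is actually visible to your process. The sketch below only reads the environment with the standard library; the helper name `get_hf_token` is hypothetical and is not part of rouskinhf, and how rouskinhf itself reads the token is not documented here.

```python
import os


def get_hf_token(env_var: str = "HUGGINGFACE_TOKEN") -> str:
    """Return the token stored in the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the rouskinhf API.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set; export it before pushing data.")
    return token


if __name__ == "__main__":
    try:
        get_hf_token()
        print("HUGGINGFACE_TOKEN is set.")
    except RuntimeError as err:
        print(err)
```

Raising early with an explicit message tends to be easier to debug than an authentication failure in the middle of an upload.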
### To predict structures with rouskinhf (optional)
You'll need to install D. Mathews' [RNAstructure Fold](https://rna.urmc.rochester.edu/RNAstructure.html) (also available on [Rouskinlab GitHub](https://github.com/rouskinlab/RNAstructure)).
Check your RNAstructure Fold installation in a terminal:
```bash
Fold --version
```
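The same check can be done from Python before asking for structure prediction. This is a minimal sketch that only looks for an executable named `Fold` on your `PATH` using the standard library; the helper name `find_fold` is hypothetical, and the sketch does not verify the RNAstructure version or its data tables.

```python
import shutil
from typing import Optional


def find_fold(executable: str = "Fold") -> Optional[str]:
    """Return the full path of the Fold executable, or None if it is not on PATH.

    Hypothetical helper for illustration; not part of the rouskinhf API.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_fold()
    if path is None:
        print("Fold was not found on PATH; install RNAstructure first.")
    else:
        print(f"Fold found at {path}")
```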
# How to use
### Download a dataset
```python
import rouskinhf

rouskinhf.get_dataset(
    name='bpRNA-1m',  # the name of a dataset from huggingface/rouskinlab
    force_download=False,  # use a local copy of the data if it exists
)
```
### Convert a supported format to the rouskinhf format
```python
import rouskinhf

rouskinhf.convert(
    format='ct',  # can be ct, seismic, bpseq, fasta or json (rouskinhf output data structure)
    file_or_folder='path/to/my/ct/folder',
    predict_structure=False,  # add structure from RNAstructure
    filter=True,  # removes duplicates, non-regular characters and low AUROC
    min_AUROC=0.8,
)
```
> Note: Sequences containing characters other than `A`, `C`, `G`, `T`, `U`, `N`, `a`, `c`, `g`, `t`, `u`, `n` are not supported and will be filtered out.
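
If you want to see which of your sequences would be affected before converting, a check along these lines can be run on your own data. It is a sketch of the rule stated in the note, not rouskinhf's actual filter; the names `SUPPORTED_BASES` and `has_only_supported_bases` are hypothetical, and treating an empty sequence as unsupported is an assumption.

```python
# Alphabet taken from the note above: A, C, G, T, U, N in either case.
SUPPORTED_BASES = frozenset("ACGTUNacgtun")


def has_only_supported_bases(sequence: str) -> bool:
    """Return True if every character of the sequence is a supported base.

    Hypothetical helper for illustration; an empty sequence is treated as
    unsupported, which is an assumption and not documented behavior.
    """
    return bool(sequence) and set(sequence) <= SUPPORTED_BASES


if __name__ == "__main__":
    for seq in ["CACGCUAUG", "acgtun", "CACGX"]:
        print(seq, has_only_supported_bases(seq))
```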
### Rouskinhf structure format
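```json
{
  "reference_name": {
    "sequence": "CACGCUAUG",
    "structure": [[0, 8], [1, 7]]
  }
}
```
In the output file (for example `rouskinhf_output_file.json`), each reference name maps to its `sequence` and its `structure`, a list of base pairs given as index pairs. A record can also hold whatever other info you need.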
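To make the base-pair representation concrete, the sketch below reads one record with the standard `json` module and renders its pairs as dot-bracket notation. It assumes that pairs are stored as two-element lists once serialized (JSON has no tuple type), that indices are 0-based with the first index smaller than the second, and that the structure has no pseudoknots. The function name `pairs_to_dot_bracket` is hypothetical and not part of rouskinhf.

```python
import json


def pairs_to_dot_bracket(length, pairs):
    """Render a list of (i, j) base pairs as a dot-bracket string.

    Assumes 0-based indices with i < j and a pseudoknot-free structure.
    """
    chars = ["."] * length
    for i, j in pairs:
        if not (0 <= i < j < length):
            raise ValueError(f"invalid base pair ({i}, {j}) for length {length}")
        chars[i] = "("
        chars[j] = ")"
    return "".join(chars)


# The example record from this section, written as valid JSON.
record_text = """
{
  "reference_name": {
    "sequence": "CACGCUAUG",
    "structure": [[0, 8], [1, 7]]
  }
}
"""

data = json.loads(record_text)
for name, record in data.items():
    sequence = record["sequence"]
    print(name, sequence, pairs_to_dot_bracket(len(sequence), record["structure"]))
# prints: reference_name CACGCUAUG ((.....))
```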
Raw data
{
"_id": null,
"home_page": null,
"name": "rouskinhf",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Yves Martin <yves@martin.yt>, Alberic de Lajarte <albericlajarte@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/ed/44/cadd00eb40814ee959875deda43a36c11cbe91f5b76e4e83bd2fa6733d4d/rouskinhf-0.4.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"summary": "A library to manipulate data for our DMS prediction models.",
"version": "0.4.8",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2e7f68789fba2f5fea1ef590f004f48be92ec856758b73241c92735eaaf2ddb9",
"md5": "92f2a5e63b8872cd511db0fd9d344661",
"sha256": "04576cb878ca6c049c22258083381d34a46eb7a583aeaefe0454cd3460a4c3ad"
},
"downloads": -1,
"filename": "rouskinhf-0.4.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "92f2a5e63b8872cd511db0fd9d344661",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 19308,
"upload_time": "2024-05-14T17:07:06",
"upload_time_iso_8601": "2024-05-14T17:07:06.474700Z",
"url": "https://files.pythonhosted.org/packages/2e/7f/68789fba2f5fea1ef590f004f48be92ec856758b73241c92735eaaf2ddb9/rouskinhf-0.4.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ed44cadd00eb40814ee959875deda43a36c11cbe91f5b76e4e83bd2fa6733d4d",
"md5": "ff1f4dc2611970a8f1d11ca069ed0677",
"sha256": "61eaf0316e28fa4e8ba82f94d6bbd14a63e174b604f5980f6a27bdab2fd5e436"
},
"downloads": -1,
"filename": "rouskinhf-0.4.8.tar.gz",
"has_sig": false,
"md5_digest": "ff1f4dc2611970a8f1d11ca069ed0677",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 17283,
"upload_time": "2024-05-14T17:07:07",
"upload_time_iso_8601": "2024-05-14T17:07:07.769856Z",
"url": "https://files.pythonhosted.org/packages/ed/44/cadd00eb40814ee959875deda43a36c11cbe91f5b76e4e83bd2fa6733d4d/rouskinhf-0.4.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-14 17:07:07",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "rouskinhf"
}