utilsovs-pkg


- Name: utilsovs-pkg
- Version: 0.9.5
- Home page: https://github.com/synthaze/utilsovs
- Summary: Utils derived from the O-GlcNAc Database source code
- Upload time: 2022-02-02 14:53:53
- Author: Florian Malard, PhD
- Requires Python: >=3.7
- Docs URL: None
- Requirements: No requirements were recorded.
- Travis CI: none
- Coveralls test coverage: none

# Utilsovs - 0.9

Utilities derived from the [O-GlcNAc Database](https://www.oglcnac.mcw.edu/) source code.

Please report any bugs or incompatibilities on the [issue tracker](https://github.com/synthaze/utilsovs/issues).

If you use *utilsovs* in your academic work, please cite:

Malard F, Wulff-Fuentes E, Berendt R, Didier G and Olivier-Van Stichelen S. **Automatization and self-maintenance of the O-GlcNAcome catalogue: A Smart Scientific Database**. *Database*, Volume 2021 (2021).

## Install

```shell
pip3 install utilsovs-pkg
```

Test the installation by running `pytest` from the package root directory.

## Content

The package utilsovs contains:

- API wrappers - Proteins from UniProtKB ID ([UniProtKB](https://www.uniprot.org/), [GlyGen](https://www.glygen.org/), [The *O*-GlcNAc Database](https://www.oglcnac.mcw.edu/))
- API wrappers - Literature from PMID ([MedLine/PubMed](https://pubmed.ncbi.nlm.nih.gov/), [Semantic Scholar](https://www.semanticscholar.org/), [ProteomeXchange](http://www.proteomexchange.org/))
- Protein digestion tool: full and partial digestion and MW calculation (monoisotopic, average mass)
- Calculation of log2(odds) from alignment file and generation of sequence logo
- Match residuePosition on sequence fetched from UniProtKB to validate datasets
- Convert PDF to Text using wrappers and repair/clean
- Miscellaneous functions

### API wrappers - Proteins from UniProtKB ID

```python
from utilsovs import *

# Fetch UniProtKB Proteins REST API (@data.url)
data = fetch_one_UniProtKB('P08047',filepath='out.json',pprint=False)

# Fetch The O-GlcNAc Database Proteins REST API (@data.url)
data = fetch_one_oglcnacDB('P08047',filepath='out.json',pprint=False)

# Fetch RESTful Glygen webservice-based APIs (@data.url)
data = fetch_one_GlyGen('P08047',filepath='out.json',pprint=False)

# data is a class instance. To print the data of interest:
print(data.data)

```

### API wrappers - Literature from PubMed IDentifier (PMID)

```python
from utilsovs import *

# Fetch MedLine/PubMed API using Entrez.efetch (@data.url)
data = fetch_one_PubMed('33479245',db="pubmed",filepath='out.json',pprint=False)

# Fetch Semantic Scholar API (@data.url)
data = fetch_one_SemanticScholar('33479245',filepath='out.json',pprint=False)

# Fetch proteomeXchange using GET search request (@data.url)
data = fetch_one_proteomeXchange('29351928',filepath='out.json',pprint=False)

# data is a class instance. To print the data of interest:
print(data.data)

```

### Compute - Digest protein, match residuePosition on sequence or calculate log2(odds) from alignment file and draw consensus sequence logo

```python
from utilsovs import *

# Full digestion of a UniProtKB ID protein sequence: [ ['PEPTIDE',(start,end),mw_monoisotopic,mw_average], ... ]
data = compute_one_fullDigest('P13693','Trypsin',filepath='out.json')

# Partial digestion of a UniProtKB ID protein sequence: [ ['PEPTIDE',(start,end),mw_monoisotopic,mw_average], ... ]
# All possible combinations of adjacent fragments are generated
data = compute_one_partialDigest('P13693','Trypsin',filepath='out.json')

# Match residuePosition with UniProtKB ID protein sequence
data = compute_match_aaSeq('P13693','D6',filepath='out.json')

# Compute log2odds from alignment file - Input for draw_one_seqLogo()
data = compute_aln_log2odds('align.aln',organism='HUMAN',filepath='out.json')

# Draw sequence logo from compute_aln_log2odds output file
# See https://logomaker.readthedocs.io/en/latest/implementation.html
# Edit logomaker config in src/utilsovs_draw.py
draw_one_seqLogo('compute_aln_log2odds.json',filepath='out.png',showplot=False,center_values=False)

# data is a class instance. To print the data of interest:
print(data.data)

```
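
To make the log2(odds) step more concrete, here is a minimal standalone sketch. It assumes that log2(odds) means the log2 ratio between the observed residue frequency in an alignment column and a background frequency; this is not taken from the utilsovs source, and the function name `column_log2_odds`, the `pseudocount` parameter, and the toy background are all hypothetical.

```python
import math
from collections import Counter


def column_log2_odds(sequences, background, pseudocount=1.0):
    """Return one {residue: log2(odds)} dict per alignment column.

    sequences   -- equal-length aligned sequences (gaps as '-')
    background  -- {residue: background frequency}
    pseudocount -- hypothetical smoothing term so unseen residues stay finite
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    alphabet = sorted(background)
    result = []
    for column in zip(*sequences):
        # Count only residues that have a background frequency (skips gaps)
        counts = Counter(res for res in column if res in background)
        total = sum(counts.values()) + pseudocount * len(alphabet)
        scores = {}
        for res in alphabet:
            observed = (counts[res] + pseudocount) / total
            scores[res] = math.log2(observed / background[res])
        result.append(scores)
    return result


# Toy alignment and a made-up two-letter background (not real proteome data)
aln = ["SA", "SA", "SS", "SA"]
bg = {"S": 0.5, "A": 0.5}
odds = column_log2_odds(aln, bg)
```

Positive scores mark residues that are enriched relative to the background, negative scores mark depleted ones. A per-position table of this shape is the kind of matrix a sequence logo is drawn from.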

### Text Processing

```python
from utilsovs import *

# PDF to Text conversion using GNU pdftotext (Linux/Mac) or Tika (Windows) and text repair + cleaning.
data = pdf_one_pdf2text('test.pdf',filepath='out.dat',clean=True)

# data is a class instance. To print the data of interest:
print(data.data)

```
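
As an illustration of what text repair and cleaning can involve after PDF extraction, here is a generic sketch. It is an assumption about typical cleanup steps, not the logic used by `pdf_one_pdf2text`; the function name `clean_extracted_text` is hypothetical.

```python
import re


def clean_extracted_text(text):
    """Apply three common repairs to text extracted from a PDF."""
    # Re-join words hyphenated across a line break
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single newlines into spaces but keep blank lines as paragraph breaks
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


raw = "O-GlcNAc glyco-\nsylation  of\nproteins.\n\nNext paragraph"
cleaned = clean_extracted_text(raw)
```

Note that the first rule also joins compounds that were genuinely hyphenated at a line end, so it trades a few false joins for far fewer broken words.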

### Miscellaneous standalone functions

Functions below return Python objects or variables.

```python
from utilsovs import *

# Show list of proteases for digest utils
show_proteases()

# Return protein sequence from UniProtKB ID
get_one_sequence('P13693',filepath='out.dat')

# Compute MW of a peptide and return [string,mw_monoisotopic,mw_average]
compute_one_MW('EWENMR',filepath='out.json')

# Compute the amino acid frequency table for a given organism from uniprot_sprot.fasta.gz
get_one_freqAAdict(organism='HUMAN',filepath='out.json')

# Clear all data in the utilsovs cache
clearCache()

```
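
The following standalone sketch shows how a protease rule and per-peptide masses can combine into records shaped like `['PEPTIDE',(start,end),mw_monoisotopic,mw_average]`. It does not reproduce the utilsovs implementation: the function names, the `missed_cleavages` parameter, the 1-based inclusive positions, and the example sequence are assumptions, and the trypsin rule used (cut after K or R unless followed by P) is the common textbook one.

```python
import re

# Residue masses in Da as (monoisotopic, average); one water is added per peptide
WATER = (18.01056, 18.01528)
RESIDUE_MASS = {
    "G": (57.02146, 57.0519), "A": (71.03711, 71.0788),
    "S": (87.03203, 87.0782), "P": (97.05276, 97.1167),
    "V": (99.06841, 99.1326), "T": (101.04768, 101.1051),
    "C": (103.00919, 103.1388), "L": (113.08406, 113.1594),
    "I": (113.08406, 113.1594), "N": (114.04293, 114.1038),
    "D": (115.02694, 115.0886), "Q": (128.05858, 128.1307),
    "K": (128.09496, 128.1741), "E": (129.04259, 129.1155),
    "M": (131.04049, 131.1926), "H": (137.05891, 137.1411),
    "F": (147.06841, 147.1766), "R": (156.10111, 156.1875),
    "Y": (163.06333, 163.1760), "W": (186.07931, 186.2132),
}


def peptide_mw(peptide):
    """Return [peptide, mw_monoisotopic, mw_average] for an unmodified peptide."""
    mono = sum(RESIDUE_MASS[aa][0] for aa in peptide) + WATER[0]
    avg = sum(RESIDUE_MASS[aa][1] for aa in peptide) + WATER[1]
    return [peptide, round(mono, 4), round(avg, 4)]


def tryptic_digest(sequence, missed_cleavages=0):
    """Return [peptide, (start, end), mw_mono, mw_avg] records.

    Positions are 1-based and inclusive (an assumption of this sketch).
    """
    # Cleavage points: after K or R, unless the next residue is P
    cuts = [0] + [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    if cuts[-1] != len(sequence):
        cuts.append(len(sequence))
    records = []
    for i in range(len(cuts) - 1):
        # Join up to `missed_cleavages` additional adjacent fragments
        for j in range(i + 1, min(i + 2 + missed_cleavages, len(cuts))):
            start, end = cuts[i], cuts[j]
            _, mono, avg = peptide_mw(sequence[start:end])
            records.append([sequence[start:end], (start + 1, end), mono, avg])
    return records


# Made-up sequence for illustration only
full = tryptic_digest("AKRPGKEWENMR")
partial = tryptic_digest("AKRPGKEWENMR", missed_cleavages=1)
```

With `missed_cleavages=0` the sketch yields a full digest. Raising it to the number of cleavage sites yields every run of adjacent fragments, which corresponds to the partial digestion described above.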



            
