- Name: bio
- Version: 0.1.4
- Home page: https://github.com/ialbert/bio
- Summary: bio
- Upload time: 2020-11-21 14:03:00
- Author: Istvan Albert
- Requires Python: >=3.6
- Requirements: none recorded

# bio: making bioinformatics fun again

> The software is currently under development. It is operational but not fully vetted.

`bio` - command-line utilities to make bioinformatics explorations more enjoyable.

Full documentation: 

* https://ialbert.github.io/bio/

[docs]: https://ialbert.github.io/bio/

## Why do we need this software?

If you've ever done bioinformatics, you know how even seemingly straightforward tasks require multiple steps, arcane incantations, reading documentation, and numerous other preparations that slow down your progress.

Time and again I found myself not pursuing an idea because getting to the fun part was too tedious. The `bio` package is meant to solve that tedium.  With `bio` you can write things like this:

    # Fetch the data from NCBI.
    bio NC_045512 --fetch --rename ncov
    bio MN996532  --fetch --rename ratg13

    # Align the DNA for the S protein.
    bio ncov:S ratg13:S --end 90 --align

to align the first 90 base pairs of the DNA sequence encoding the `S` protein of the SARS-CoV-2 novel coronavirus to its closest (known) relative, the bat coronavirus RaTG13. The last command prints:

```
### 1: YP_009724390 vs QHR63300.2 ###

Length: 90 (semiglobal)
Query:  90 [1, 90]
Target: 90 [1, 90]
Score:  387
Ident:  83/90 (92.2%)
Simil:  83/90 (92.2%)
Gaps:   0/90 (0.0%)
Matrix: nuc44(-11, -1)

YP_009724390 ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAGTGTGTTAATCTTACAACCAGAACTCAATTACCCCCTGCATACACTAAT
           1 ||||||||||||||||||||||||||||||||.||||||||||||||||||||.|||||.||||||||.|||||.|||||||||||.||. 90
QHR63300.2   ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTTTCTAGTCAGTGTGTTAATCTAACAACTAGAACTCAGTTACCTCCTGCATACACCAAC
```
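
To make the `Ident` line concrete, here is a minimal Python sketch that recomputes it from the two aligned sequences printed above. It is only an illustration of what the number means for a gap-free alignment, not how `bio` computes it; the helper name `identity` is hypothetical.

```python
# Aligned sequences copied from the alignment output above (no gaps).
query = "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAGTGTGTTAATCTTACAACCAGAACTCAATTACCCCCTGCATACACTAAT"
target = "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTTTCTAGTCAGTGTGTTAATCTAACAACTAGAACTCAGTTACCTCCTGCATACACCAAC"


def identity(a, b):
    """Return (matches, length) for two equal-length, gap-free aligned strings.

    Hypothetical helper for illustration; not part of the bio package.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, len(a)


matches, length = identity(query, target)

# 1-based positions where the two sequences differ (the dots in the match line).
mismatches = [i for i, (x, y) in enumerate(zip(query, target), start=1) if x != y]

print(f"Ident: {matches}/{length} ({100 * matches / length:.1f}%)")
print("Mismatch positions:", mismatches)
```

The first printed line should agree with the `Ident` value reported in the alignment header.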

If you wanted to align the same sequences as translated proteins, `bio` lets you write:

    bio ncov:S ratg13:S --end 90 --translate --align

to generate:

```
### 1: YP_009724390 vs QHR63300.2 ###

Length: 30 (semiglobal)
Query:  30 [1, 30]
Target: 30 [1, 30]
Score:  153
Ident:  30/30 (100.0%)
Simil:  30/30 (100.0%)
Gaps:   0/30 (0.0%)
Matrix: blosum62(-11, -1)

YP_009724390 MFVFLVLLPLVSSQCVNLTTRTQLPPAYTN
           1 |||||||||||||||||||||||||||||| 30
QHR63300.2   MFVFLVLLPLVSSQCVNLTTRTQLPPAYTN
```
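
The protein alignment is 100% identical even though the DNA alignment is not, because every nucleotide difference in this region is synonymous. The sketch below illustrates that by translating the two 90-base sequences with the standard genetic code. It is a self-contained approximation of what `--translate` does conceptually, not the package's own implementation; `translate` and `CODON_TABLE` are names introduced here for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 90 bases of the S gene, copied from the DNA alignment shown earlier.
ncov = "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAGTGTGTTAATCTTACAACCAGAACTCAATTACCCCCTGCATACACTAAT"
ratg13 = "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTTTCTAGTCAGTGTGTTAATCTAACAACTAGAACTCAGTTACCTCCTGCATACACCAAC"

print(translate(ncov))
print(translate(ratg13))
print("Identical proteins:", translate(ncov) == translate(ratg13))
```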

Beyond alignments, there is a lot more to `bio`, and we recommend looking at the [documentation][docs].

## Who is `bio` designed for?

The software was written to teach bioinformatics and is the companion software to the [Biostar Handbook][handbook]. The target audience comprises:

- Students learning about bioinformatics.
- Bioinformatics educators who need a platform to demonstrate bioinformatics concepts.
- Scientists working with large numbers of similar genomes (bacterial/viral strains).
- Scientists who need to closely investigate and understand particular details of a genomic region.

The ideas and motivations fueling the creation of `bio` came to us while educating the many cohorts of students who used the handbook in the classroom.

You see, in bioinformatics, many tasks that should be straightforward are, instead, needlessly complicated. `bio` is an opinionated take on how bioinformatics, particularly data presentation and access, should be simplified. 

[handbook]: https://www.biostarhandbook.com/

## Documentation

The documentation is maintained at

* https://ialbert.github.io/bio/


## Quick install

`bio` works on Linux and Mac computers, and on Windows when using the Windows Subsystem for Linux (WSL). Install the package with:

    # We recommend installing prerequisites with conda.
    conda install -c bioconda biopython parasail-python

    # Install the bio package.
    pip install bio --upgrade

See more details in the [documentation][docs].
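
After installing, you may want to confirm that the prerequisites are importable. The following is a small sketch, assuming the usual import names for these packages (`Bio` for biopython and `parasail` for parasail-python); the function `check_prerequisites` is hypothetical and not part of `bio`.

```python
import importlib.util

# Assumed import names: biopython installs the `Bio` module and
# parasail-python installs the `parasail` module.
PREREQUISITES = {"biopython": "Bio", "parasail-python": "parasail"}


def check_prerequisites(modules=PREREQUISITES):
    """Map each package name to True if its module can be found, else False."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in modules.items()
    }


for package, found in check_prerequisites().items():
    print(f"{package}: {'ok' if found else 'MISSING'}")
```

A package reported as `MISSING` can be installed with the conda command shown above.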

## Development

If you clone the repository, we recommend installing it as a development package with:

    python setup.py develop

## Testing

Testing uses the pytest framework:

    pip install pytest

To run all tests use:

    make test

Tests are automatically built from a test script that mimics real-life usage scenarios.

* https://github.com/ialbert/bio/blob/master/test/test_bio_data.sh

## New tests

To add a new test, first run the command you wish to test, for example:

    bio foo --gff > output.gff

in the `test/data` directory. Then add the same command to the master script:

* https://github.com/ialbert/bio/blob/master/test/test_bio_data.sh

followed by:

    make build_tests

The latter command will automatically generate a Python test for each line in the master script.

The automatically generated test will verify that the command is operational and that the output matches the expectations.



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ialbert/bio",
    "name": "bio",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "",
    "author": "Istvan Albert",
    "author_email": "istvan.albert@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/0d/0f/91faac3b8b0f42db4d0427446ccb85622e9833aa40a56fa3ca682cb2105a/bio-0.1.4.tar.gz",
    "platform": "",
    "bugtrack_url": null,
    "license": "",
    "summary": "bio",
    "version": "0.1.4",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "d440d9f8bc284befe75b15a66a5072a4",
                "sha256": "690039e2dc8019230b63ba66b70a1614a38d5f2e2832350d04876ace06088903"
            },
            "downloads": -1,
            "filename": "bio-0.1.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d440d9f8bc284befe75b15a66a5072a4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 44098,
            "upload_time": "2020-11-21T14:02:59",
            "upload_time_iso_8601": "2020-11-21T14:02:59.121529Z",
            "url": "https://files.pythonhosted.org/packages/da/50/125c8866cdc9349690f016eb76ddc4da0bba2ed432b8d27ea852102960f6/bio-0.1.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "f7eea84a06bb514b69d360e04909b5d7",
                "sha256": "9a42e6eb9d2e60b1a1e0c005e41381e3da7dea7fd21608734caba1af9affd47c"
            },
            "downloads": -1,
            "filename": "bio-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "f7eea84a06bb514b69d360e04909b5d7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 35504,
            "upload_time": "2020-11-21T14:03:00",
            "upload_time_iso_8601": "2020-11-21T14:03:00.511521Z",
            "url": "https://files.pythonhosted.org/packages/0d/0f/91faac3b8b0f42db4d0427446ccb85622e9833aa40a56fa3ca682cb2105a/bio-0.1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2020-11-21 14:03:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": null,
    "github_project": "ialbert",
    "error": "Could not fetch GitHub repository",
    "lcname": "bio"
}
        