![Example](docs/img/logo.png)
# FASTQ with Emoji = FASTQE 🤔
Given one or more FASTQ files, [fastqe](https://fastqe.com/) computes quality stats for each file and prints those stats as emoji... for some reason.
Given a fastq file in Illumina 1.8+/Sanger format, calculate the mean (rounded) score for each position and print a corresponding emoji!
![Example](docs/img/fastqe_binned.png)
https://fastqe.com/
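The per-position mean described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the `fastqe` source: the function name `mean_quality_per_position` is hypothetical, scores are decoded with the Phred+33 offset used by the Illumina 1.8+/Sanger format, and Python's built-in `round` stands in for whatever rounding `fastqe` applies.

```python
from statistics import mean

PHRED_OFFSET = 33  # Illumina 1.8+/Sanger quality encoding


def mean_quality_per_position(quality_strings):
    """Return the rounded mean Phred score at each read position.

    Hypothetical helper for illustration; not part of the fastqe API.
    """
    longest = max((len(q) for q in quality_strings), default=0)
    means = []
    for pos in range(longest):
        # Only reads long enough to cover this position contribute.
        scores = [ord(q[pos]) - PHRED_OFFSET for q in quality_strings if len(q) > pos]
        means.append(round(mean(scores)))
    return means


# Three quality strings of different lengths ('I' = 40, '5' = 20, '!' = 0).
print(mean_quality_per_position(["IIII", "II5!", "I5"]))  # [40, 33, 30, 20]
```

Each rounded mean would then be looked up in the emoji scale shown further down.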
# Install
Latest release versions of `fastqe` are available via `pip` or BioConda:
`pip install fastqe`
`conda install -c bioconda fastqe`
## Development
The development version can be installed from the `master` branch of this repository.
# Usage
`fastqe` can display usage information on the command line via the `-h` or `--help` argument:
```
usage: fastqe [-h] [--minlen N] [--scale] [--version] [--mean]
              [--custom CUSTOM_DICT] [--bin] [--noemoji] [--min] [--max]
              [--output OUTPUT_FILE] [--long READ_LENGTH] [--log LOG_FILE]
              [FASTQ_FILE [FASTQ_FILE ...]]

Read one or more FASTQ files, compute quality stats for each file, print as
emoji... for some reason.😄

positional arguments:
  FASTQ_FILE            Input FASTQ files

optional arguments:
  -h, --help            show this help message and exit
  --minlen N            Minimum length sequence to include in stats (default
                        0)
  --scale               show relevant scale in output
  --version             show program's version number and exit
  --mean                show mean quality per position (DEFAULT)
  --custom CUSTOM_DICT  use a mapping of custom emoji to quality in
                        CUSTOM_DICT (🐍🌴)
  --bin                 use binned scores (🚫💀💩⚠️😄😆😎😍)
  --noemoji             use mapping without emoji (▁▂▃▄▅▆▇█)
  --min                 show minimum quality per position
  --max                 show maximum quality per position
  --output OUTPUT_FILE  write output to OUTPUT_FILE instead of stdout
  --long READ_LENGTH    enable long reads up to READ_LENGTH bp long
  --log LOG_FILE        record program progress in LOG_FILE
```
## Convert
`fastqe` summarises FASTQ files, displaying the maximum, mean and minimum quality as emoji. To convert a file into emoji rather than summarise it, use the companion program `biomojify`, which converts both sequence and quality information to emoji:
```
$ cat test.fq
@ Sequence
GTGCCAGCCGCCGCGGTAGTCCGACGTGGC
+
GGGGGGGGGGGGGGGGGGGGGG!@#$%&%(
```
```
$ biomojify fastq test.fq
▶️ Sequence
🍇🍅🍇🌽🌽🥑🍇🌽🌽🍇🌽🌽🍇🌽🍇🍇🍅🥑🍇🍅🌽🌽🍇🥑🌽🍇🍅🍇🍇🌽
😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁🚫😄👺💔🙅👾🙅💀
```
Install with `pip install biomojify`, and see the `biomojify` page for more information: https://github.com/fastqe/biomojify/
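To make the idea of sequence conversion concrete, here is a minimal Python sketch. It is not `biomojify` code: the function `sequence_to_emoji` and the `unknown` placeholder are hypothetical, and the nucleotide mapping is inferred from the single example output above rather than taken from the `biomojify` source.

```python
# Mapping inferred from the example output above (G, T, C, A only);
# an assumption for illustration, not biomojify's actual table.
NUCLEOTIDE_EMOJI = {"A": "🥑", "C": "🌽", "G": "🍇", "T": "🍅"}


def sequence_to_emoji(sequence, unknown="❓"):
    """Convert a nucleotide string to emoji, one symbol per base."""
    return "".join(NUCLEOTIDE_EMOJI.get(base, unknown) for base in sequence.upper())


print(sequence_to_emoji("GTGCCA"))  # 🍇🍅🍇🌽🌽🥑
```

For real data, use `biomojify` itself, which also converts the quality line.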
# Quickstart
`fastqe test.fastq`
`fastqe --min test.fastq`
`fastqe --max test.fastq`
`fastqe --max --min --bin test.fastq`
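If you want to drive these commands from a script, one option is to assemble the argument list in Python and hand it to `subprocess`. The sketch below is an assumption-laden convenience wrapper, not part of `fastqe`: `build_fastqe_command` is a hypothetical helper, and it only uses flags that appear in the usage text above.

```python
import os
import shutil
import subprocess


def build_fastqe_command(fastq_files, show_min=False, show_max=False,
                         binned=False, output=None):
    """Assemble a fastqe argument list (hypothetical helper)."""
    command = ["fastqe"]
    if show_max:
        command.append("--max")
    if show_min:
        command.append("--min")
    if binned:
        command.append("--bin")
    if output is not None:
        command.extend(["--output", output])
    command.extend(fastq_files)
    return command


command = build_fastqe_command(["test.fastq"], show_min=True, show_max=True, binned=True)
print(" ".join(command))  # fastqe --max --min --bin test.fastq

# Run the tool only if it is installed and the input file exists.
if shutil.which("fastqe") and os.path.exists("test.fastq"):
    subprocess.run(command, check=False)
```

Passing a list rather than a shell string avoids quoting problems with unusual file names.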
# Teaching Materials
## Command line and NGS Introduction
This lesson introduces NGS analysis on the command line by comparing FASTQE results before and after quality filtering with `fastp`:
[https://qubeshub.org/publications/1092/2](https://qubeshub.org/publications/1092/2)
```
Rachael St. Jacques, Max Maza, Sabrina Robertson, Guoqing Lu, Andrew Lonsdale, Ray A Enke (2019).
A Fun Introductory Command Line Exercise: Next Generation Sequencing Quality Analysis with Emoji!.
NIBLSE Incubator: Intro to Command Line Coding Genomics Analysis, (Version 2.0).
QUBES Educational Resources. doi:10.25334/Q4D172
```
## Galaxy
A Galaxy wrapper is available from the [IUC toolshed](https://toolshed.g2.bx.psu.edu/repository?repository_id=13576f42f394cfb6). Contact your Galaxy Admin
if you would like to have it installed. A Galaxy Tutorial using FASTQE is in development.
![FASTQE in Galaxy](docs/img/galaxy_full.png)
# History
FASTQE started out as part of PyCon Au presentations:
- PyCon Au 2016 - [Python for science, side projects and stuff!](https://www.youtube.com/watch?v=PCZS9wqBUuE)
- PyCon Au 2017 - [Lightning Talk](https://youtu.be/WywQ6a3uQ5I?t=33m18s)
- BCC 2020 - Short Presentation
<img src="docs/img/fastqe.png" class="img-fluid" alt="Responsive image">
### Versions
- version 0.0.1 at PyCon Au 2016:
  - Mean position per read
- version 0.0.2 at PyconAu 2017:
  - update emoji map
  - Max and minimum scores per position added
  - Wrapper code based on early version of [Bionitio](https://github.com/bionitio-team/bionitio) added
  - prepare for PyPi
- version 0.1.0 July 2018
  - clean up code
  - add binning
- version 0.2.6 July 2020
  - refactor code
  - add long read support with --long
  - add --noemoji for block-based output on systems that don't support emoji
  - add --custom for user-defined mapping to emoji
  - add --output to redirect to file instead of stdout
  - add gzip support
  - add redirect from stdin support
  - fix bug of dropping position if some sequences are only 0 quality
- Galaxy Wrapper created July 2020
- `biomojify` created July 2020
- version 0.2.7 2021
  - bugfix
- version 0.3.1 2023
  - HTML reporting for Galaxy
- version 0.3.3 2024
  - Update emoji that render in default fonts with alternatives
# Limitations
- ~Reads up to 500bp only~ Reads longer than 500 bp are allowed, but the maximum length must be set by the user with `--long READ_LENGTH`
- Same emoji for all scores above 41
## Licence
This program is released as open source software under the terms of the [BSD License](https://raw.githubusercontent.com/fastqe/fastqe/master/LICENSE)
## Dependencies
- pyemojify
- BioPython
- NumPy
## Roadmap
- [x] Rearrange emoji to use more realistic ranges (i.e. scores > 60 use uncommon emoji) and remove inconsistencies
- [x] ~Add conversion to emoji sequence format, with/without binning, for compressed fastq data~ fits into https://github.com/fastqe/biomojify/
- [ ] Rewrite conversion to standalone function for use in iPython etc.
- [ ] Teaching resources
- [ ] Test data and unit tests
- [x] ~Add FASTA mode for nucleotide and proteins emoji~ see https://github.com/fastqe/biomojify/
- [ ] MultiQC plugin
- [x] ~Galaxy Wrapper~: available from the [IUC toolshed](https://toolshed.g2.bx.psu.edu/repository?repository_id=13576f42f394cfb6)
Rather convert to emoji than summarise? We've just started `biomojify` for that: https://github.com/fastqe/biomojify/
# Contributors
- Andrew Lonsdale
- Björn Grüning
- Catherine Bromhead
- Clare Sloggett
- Clarissa Womack
- Helena Rasche
- Maria Doyle
- Michael Franklin
- Nicola Soranzo
- Phil Ewels
## Scale
Use the `--scale` option to include the relevant scale in the output.
```
0 ! 🚫
1 " ❌
2 # 👺
3 $ 💔
4 % 🙅
5 & 👾
6 ' 👿
7 ( 💀
8 ) 👻
9 * 🙈
10 + 🙉
11 , 🙊
12 - 🐵
13 . 😿
14 / 😾
15 0 🙀
16 1 💣
17 2 🔥
18 3 😡
19 4 💩
20 5 🚨
21 6 😀
22 7 😅
23 8 😏
24 9 😊
25 : 😙
26 ; 😗
27 < 😚
28 = 😃
29 > 😘
30 ? 😆
31 @ 😄
32 A 😋
33 B 😄
34 C 😝
35 D 😛
36 E 😜
37 F 😉
38 G 😁
39 H 😄
40 I 😎
41 J 😍
```
Binned scale:
```
0 ! 🚫
1 " 🚫
2 # 💀
3 $ 💀
4 % 💀
5 & 💀
6 ' 💀
7 ( 💀
8 ) 💀
9 * 💀
10 + 💩
11 , 💩
12 - 💩
13 . 💩
14 / 💩
15 0 💩
16 1 💩
17 2 💩
18 3 💩
19 4 💩
20 5 🚨
21 6 🚨
22 7 🚨
23 8 🚨
24 9 🚨
25 : 😄
26 ; 😄
27 < 😄
28 = 😄
29 > 😄
30 ? 😆
31 @ 😆
32 A 😆
33 B 😆
34 C 😆
35 D 😎
36 E 😎
37 F 😎
38 G 😎
39 H 😎
40 I 😍
41 J 😍
```
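The binned scale amounts to a threshold lookup, which can be sketched as follows. This is an illustration, not the `fastqe` implementation: `BINS` and `binned_emoji` are hypothetical names, the bin boundaries are read off the table above, and treating every score above 41 like 41 follows the Limitations section.

```python
# (lowest score in bin, emoji), read off the binned scale above.
BINS = [
    (0, "🚫"), (2, "💀"), (10, "💩"), (20, "🚨"),
    (25, "😄"), (30, "😆"), (35, "😎"), (40, "😍"),
]


def binned_emoji(score):
    """Return the emoji for the bin containing a Phred score (hypothetical helper)."""
    if score < 0:
        raise ValueError("Phred scores cannot be negative")
    emoji = BINS[0][1]
    for lower_bound, symbol in BINS:
        # Keep the symbol of the last bin whose lower bound we have reached.
        if score >= lower_bound:
            emoji = symbol
    return emoji


# 'G' encodes Phred 38 with the +33 offset, which falls in the 35-39 bin.
print(binned_emoji(ord("G") - 33))  # 😎
```

Keeping the bins as a sorted list of lower bounds makes it easy to adjust the thresholds without touching the lookup logic.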
## Custom
Use a dictionary of [Pyemojify mappings](https://github.com/lord63/pyemojify/blob/master/pyemojify/emoji.py) in a text file instead of the built-in emoji choices:
```
{
'#': ':no_entry_sign:',
'\"': ':x:',
'!': ':japanese_goblin:',
'$': ':broken_heart:'
}
```
Emoji characters can also be used directly instead (experimental):
```
{
'#': ':no_entry_sign:',
'\"': ':x:',
'!': '👿',
'$': ':broken_heart:'
}
```
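To check a custom mapping before passing it to `--custom`, you could parse it yourself. The sketch below assumes the file is a Python dictionary literal, as the examples suggest, and uses `ast.literal_eval` to read it; whether `fastqe` parses the file the same way is an assumption, and the variable names are hypothetical.

```python
import ast

# Contents of a custom mapping file, copied from the first example above.
custom_text = r"""
{
'#': ':no_entry_sign:',
'\"': ':x:',
'!': ':japanese_goblin:',
'$': ':broken_heart:'
}
"""

# literal_eval only accepts plain literals, so no code in the file is executed.
mapping = ast.literal_eval(custom_text.strip())

# Translate each quality character to its Phred score (Phred+33 encoding).
by_score = {ord(char) - 33: name for char, name in mapping.items()}

for score in sorted(by_score):
    print(score, by_score[score])
```

Listing the entries by score makes it easy to spot quality characters that were left unmapped.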