# DiMA - Diversity Motif Analyser
![PyPI - Downloads](https://img.shields.io/pypi/dm/dima-cli)
![GitHub closed issues](https://img.shields.io/github/issues-closed-raw/PU-SDS/DiMA)
![GitHub issues](https://img.shields.io/github/issues-raw/PU-SDS/DiMA)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/dima-cli)
![PyPI](https://img.shields.io/pypi/v/dima-cli)
![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/PU-SDS/DiMA)
## Table of Contents
- [What is DiMA?](#what-is-dima)
- [Publications](#publications)
- [Installation](#installation)
- [Basic Usage](#basic-usage)
  - [Shell Command](#shell-command)
  - [Python](#python)
  - [Results](#results)
- [Advance Usage](#advance-usage)
  - [Shell Command](#shell-command-1)
  - [Python](#python-1)
  - [Results](#results-1)
- [Command-Line Arguments](#command-line-arguments)
- [Module Parameters](#module-parameters)
## What is DiMA?
Protein sequence diversity is one of the major challenges in the design of diagnostic, prophylactic and therapeutic
interventions against viruses. DiMA is a tool designed to facilitate the dissection of protein sequence diversity
dynamics for viruses. DiMA provides a quantitative measure of sequence diversity by use of Shannon’s entropy,
applied via a user-defined k-mer sliding window. Further, the entropy value is corrected for sample size bias by
applying a statistical adjustment.
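As a rough illustration of the entropy measure described above, the sketch below computes plain Shannon entropy (in bits) over the k-mers found at each sliding-window position of a toy alignment. The helper names `kmers_at` and `shannon_entropy` are hypothetical and not part of DiMA, and the sample-size correction that DiMA applies is not reproduced here.

```python
import math
from collections import Counter


def kmers_at(sequences, position, k):
    """Return the k-mers that start at a 0-based position in each aligned sequence."""
    return [seq[position:position + k] for seq in sequences]


def shannon_entropy(kmers):
    """Plain Shannon entropy (bits) of the k-mer distribution at one position."""
    counts = Counter(kmers)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy alignment (illustrative only): four identical sequences and one variant.
alignment = ["MSASKEIKSF"] * 4 + ["SAGVYMGNLS"]
k = 9

# Slide the window one residue at a time and report 1-based positions.
for position in range(len(alignment[0]) - k + 1):
    entropy = shannon_entropy(kmers_at(alignment, position, k))
    print(position + 1, round(entropy, 4))
```

A 4:1 split between two k-mers yields roughly 0.7219 bits, and a 3:1:1 split roughly 1.3710 bits.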
DiMA further interrogates the diversity by dissecting the entropy value at each k-mer position into distinct
diversity motifs. The distinct k-mer sequences at each position are classified into the following motifs based on
their incidence:
- **Index**: The predominant sequence.
- **Major**: The sequence with the second-highest incidence after the Index.
- **Minor**: K-mers with an incidence between those of the Major and Unique motifs.
- **Unique**: K-mers seen only once at a particular k-mer position.
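The classification above can be sketched in a few lines of Python. This is a minimal illustration, not DiMA's implementation: the function name `classify_motifs` is hypothetical, ties are broken by first appearance, and a k-mer seen only once is labelled Unique even when it is the second most frequent, which are all assumptions.

```python
from collections import Counter


def classify_motifs(kmers):
    """Map each distinct k-mer at one position to a diversity motif label."""
    ranked = Counter(kmers).most_common()  # highest count first, ties in input order
    labels = {}
    major_assigned = False
    for rank, (kmer, count) in enumerate(ranked):
        if rank == 0:
            labels[kmer] = "Index"    # the predominant sequence
        elif count == 1:
            labels[kmer] = "Unique"   # seen only once at this position
        elif not major_assigned:
            labels[kmer] = "Major"    # most frequent of the remaining k-mers
            major_assigned = True
        else:
            labels[kmer] = "Minor"    # everything between Major and Unique
    return labels


# Toy k-mers for a single position (illustrative only).
position_kmers = ["AAA"] * 5 + ["AAC"] * 3 + ["AAG"] * 2 + ["AAT"]
print(classify_motifs(position_kmers))
```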
The description line of each sequence in the alignment can also be formatted to carry metadata, which DiMA
tags to the diversity motifs. DiMA enables comparative diversity
dynamics analysis, within and between proteins of a virus species, and proteomes of different viral species.
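To make the metadata tagging more concrete, here is a small sketch of how a `|`-delimited description line could be split into named fields and tallied for the sequences that share a k-mer. The helpers `parse_header` and `tally_metadata`, the accession values, and the padding behaviour are assumptions for illustration; DiMA's own parser may behave differently.

```python
from collections import Counter, defaultdict


def parse_header(header, header_format, fillna="Unknown"):
    """Split a '|'-delimited FASTA description line into named fields."""
    names = header_format.split("|")
    values = header.lstrip(">").split("|")
    # Pad short headers so that every field name receives a value.
    values += [""] * (len(names) - len(values))
    # Empty fields are replaced with the fill value.
    return {name: (value.strip() or fillna) for name, value in zip(names, values)}


def tally_metadata(records):
    """Count metadata values per field across a group of parsed headers."""
    tally = defaultdict(Counter)
    for record in records:
        for field, value in record.items():
            tally[field][value] += 1
    return {field: dict(counter) for field, counter in tally.items()}


# Hypothetical headers; the second one has no date.
headers = [
    ">ABC001|strainA|Sierra Leone|1977",
    ">ABC002|strainB|Sierra Leone|",
]
header_format = "accession|strain|country|date"

records = [parse_header(h, header_format) for h in headers]
print(tally_metadata(records))
```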
## Publications
- https://arxiv.org/abs/2205.13915
## Installation
`pip install dima-cli`
## Basic Usage
### Shell Command
```shell
dima-cli -i aligned_sequences.afa -o results.json
```
### Python
```python
from dima import Dima
results = Dima(sequences="aligned_sequences.afa").run()
```
### Results
<details>
<summary>Click to view basic results</summary>
```json
{
"sequence_count": 5,
"support_threshold": 30,
"low_support_count": 20,
"query_name": "Unknown Query",
"kmer_length": 9,
"average_entropy": 0.06854034285524647,
"highest_entropy": {
"position": 186,
"entropy": 1.3921472236645345
},
"results": [
{
"position": 1,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MSASKEIKS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "SAGVYMGNL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 2,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "AGVYMGNLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "SASKEIKSF",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 3,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "GVYMGNLSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "ASKEIKSFL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 4,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "VYMGNLSSQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "SKEIKSFLW",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 5,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "KEIKSFLWT",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "YMGNLSSQQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 6,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MGNLSSQQL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "EIKSFLWTQ",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 7,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "IKSFLWTQS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "GNLSSQQLD",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 8,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "KSFLWTQSL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "NLSSQQLDQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 9,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SFLWTQSLR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "LSSQQLDQR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 10,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SSQQLDQRR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "FLWTQSLRR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 11,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "LWTQSLRRE",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "SQQLDQRRA",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 12,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "QQLDQRRAL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "WTQSLRREL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 13,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "TQSLRRELS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "QLDQRRALL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 14,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "QSLRRELSG",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "LDQRRALLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "QSLRRELSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 15,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "DQRRALLSM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "SLRRELSGY",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "SLRRELSSY",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 16,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "QRRALLSMI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "LRRELSSYC",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "LRRELSGYC",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 17,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RRELSGYCS",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "RRALLSMIG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "RRELSSYCS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 18,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RELSGYCSN",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "RALLSMIGM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "RELSSYCSN",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 19,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "ALLSMIGMS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "ELSSYCSNI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "ELSGYCSNI",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 20,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "LSGYCSNIK",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "LLSMIGMSG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "LSSYCSNIK",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
}
]
}
```
</details>
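The per-position summary fields in the output above appear to follow directly from the motif counts. The sketch below recomputes them for one position; the formulas are inferred from the sample values rather than taken from DiMA's source, the function name `summarise` is hypothetical, and the value used when support is adequate (`None`) is an assumption.

```python
import json

# One position trimmed from the sample output above (metadata omitted).
sample = json.loads("""
{
  "support_threshold": 30,
  "results": [
    {
      "position": 14,
      "support": 5,
      "diversity_motifs": [
        {"sequence": "QSLRRELSG", "count": 3, "motif_long": "Index"},
        {"sequence": "LDQRRALLS", "count": 1, "motif_long": "Unique"},
        {"sequence": "QSLRRELSS", "count": 1, "motif_long": "Unique"}
      ]
    }
  ]
}
""")


def summarise(position, support_threshold):
    """Recompute the per-position summary fields from the motif counts."""
    support = position["support"]
    # Variants are taken to be every distinct k-mer other than the Index.
    variants = [m for m in position["diversity_motifs"] if m["motif_long"] != "Index"]
    variant_total = sum(m["count"] for m in variants)
    return {
        # Assumption: positions with adequate support carry no flag (None here).
        "low_support": "LS" if support < support_threshold else None,
        "distinct_variants_count": len(variants),
        "total_variants_incidence": 100.0 * variant_total / support,
        "distinct_variants_incidence": (
            100.0 * len(variants) / variant_total if variant_total else 0.0
        ),
    }


for position in sample["results"]:
    print(position["position"], summarise(position, sample["support_threshold"]))
```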
## Advance Usage
### Shell Command
```shell
dima-cli -i aligned_sequences.afa -o results.json -f "accession|strain|country|date"
```
### Python
```python
from dima import Dima
results = Dima(sequences="aligned_sequences.afa", header_format="accession|strain|country|date").run()
```
### Results
<details>
<summary>Click to view advanced results</summary>
```json
{
"sequence_count": 5,
"support_threshold": 30,
"low_support_count": 20,
"query_name": "Unknown Query",
"kmer_length": 9,
"average_entropy": 0.06854034285524647,
"highest_entropy": {
"position": 186,
"entropy": 1.3921472236645345
},
"results": [
{
"position": 1,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MSASKEIKS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75365.1": 1,
"QEP52131.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
},
"Year": {
"1977": 1,
"2012": 1,
"1980": 1,
"1979": 1
}
}
},
{
"sequence": "SAGVYMGNL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
}
}
}
]
},
{
"position": 2,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SASKEIKSF",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Species": {
"Homo sapiens": 3,
"Unknown": 1
},
"Year": {
"1977": 1,
"1980": 1,
"1979": 1,
"2012": 1
},
"Country": {
"Sierra Leone": 4
},
"Accession": {
"AYD75325.1": 1,
"QEP52131.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
}
}
},
{
"sequence": "AGVYMGNLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 3,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "ASKEIKSFL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"1980": 1,
"1977": 1,
"1979": 1,
"2012": 1
},
"Accession": {
"AYD75321.1": 1,
"AYD75365.1": 1,
"QEP52131.1": 1,
"AYD75325.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
}
}
},
{
"sequence": "GVYMGNLSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 4,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SKEIKSFLW",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"QEP52131.1": 1,
"AYD75321.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Year": {
"2012": 1,
"1979": 1,
"1980": 1,
"1977": 1
}
}
},
{
"sequence": "VYMGNLSSQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 5,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "KEIKSFLWT",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"QEP52131.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1,
"AYD75365.1": 1
},
"Year": {
"1979": 1,
"1980": 1,
"1977": 1,
"2012": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
}
}
},
{
"sequence": "YMGNLSSQQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
}
}
}
]
},
{
"position": 6,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MGNLSSQQL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "EIKSFLWTQ",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75321.1": 1,
"QEP52131.1": 1,
"AYD75365.1": 1,
"AYD75325.1": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Country": {
"Sierra Leone": 4
},
"Year": {
"1977": 1,
"1980": 1,
"2012": 1,
"1979": 1
}
}
}
]
},
{
"position": 7,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "GNLSSQQLD",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "IKSFLWTQS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75365.1": 1,
"AYD75325.1": 1,
"QEP52131.1": 1,
"AYD75321.1": 1
},
"Year": {
"1979": 1,
"1980": 1,
"1977": 1,
"2012": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Country": {
"Sierra Leone": 4
}
}
}
]
},
{
"position": 8,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "NLSSQQLDQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
}
}
},
{
"sequence": "KSFLWTQSL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 4
},
"Year": {
"1979": 1,
"2012": 1,
"1977": 1,
"1980": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1,
"QEP52131.1": 1
}
}
}
]
},
{
"position": 9,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SFLWTQSLR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"2012": 1,
"1979": 1,
"1980": 1,
"1977": 1
},
"Accession": {
"QEP52131.1": 1,
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
}
}
},
{
"sequence": "LSSQQLDQR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 10,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "FLWTQSLRR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"1977": 1,
"2012": 1,
"1980": 1,
"1979": 1
},
"Accession": {
"AYD75321.1": 1,
"AYD75325.1": 1,
"AYD75365.1": 1,
"QEP52131.1": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Country": {
"Sierra Leone": 4
}
}
},
{
"sequence": "SSQQLDQRR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 11,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "LWTQSLRRE",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 4
},
"Accession": {
"QEP52131.1": 1,
"AYD75365.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
},
"Year": {
"1979": 1,
"1980": 1,
"2012": 1,
"1977": 1
}
}
},
{
"sequence": "SQQLDQRRA",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 12,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "QQLDQRRAL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
}
}
},
{
"sequence": "WTQSLRREL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 4
},
"Year": {
"1980": 1,
"2012": 1,
"1979": 1,
"1977": 1
},
"Accession": {
"QEP52131.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1,
"AYD75365.1": 1
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
}
}
}
]
},
{
"position": 13,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "TQSLRRELS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75365.1": 1,
"AYD75321.1": 1,
"QEP52131.1": 1,
"AYD75325.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Year": {
"1977": 1,
"1979": 1,
"2012": 1,
"1980": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
}
}
},
{
"sequence": "QLDQRRALL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
}
}
}
]
},
{
"position": 14,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "QSLRRELSG",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 3
},
"Accession": {
"AYD75325.1": 1,
"AYD75321.1": 1,
"AYD75365.1": 1
},
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Year": {
"1979": 1,
"1980": 1,
"1977": 1
}
}
},
{
"sequence": "QSLRRELSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Year": {
"2012": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"QEP52131.1": 1
}
}
},
{
"sequence": "LDQRRALLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 15,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "DQRRALLSM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "SLRRELSSY",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"2012": 1
}
}
},
{
"sequence": "SLRRELSGY",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Species": {
"Unknown": 1,
"Homo sapiens": 2
},
"Year": {
"1977": 1,
"1980": 1,
"1979": 1
},
"Country": {
"Sierra Leone": 3
},
"Accession": {
"AYD75365.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1
}
}
}
]
},
{
"position": 16,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "LRRELSSYC",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Accession": {
"QEP52131.1": 1
},
"Year": {
"2012": 1
},
"Species": {
"Homo sapiens": 1
}
}
},
{
"sequence": "LRRELSGYC",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"1979": 1,
"1977": 1,
"1980": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 2
},
"Country": {
"Sierra Leone": 3
},
"Accession": {
"AYD75365.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1
}
}
},
{
"sequence": "QRRALLSMI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 17,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RRELSSYCS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"2012": 1
},
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
}
}
},
{
"sequence": "RRELSGYCS",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Year": {
"1980": 1,
"1977": 1,
"1979": 1
},
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
},
"Country": {
"Sierra Leone": 3
}
}
},
{
"sequence": "RRALLSMIG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 18,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RALLSMIGM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
}
}
},
{
"sequence": "RELSSYCSN",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"2012": 1
}
}
},
{
"sequence": "RELSGYCSN",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
},
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Year": {
"1980": 1,
"1977": 1,
"1979": 1
},
"Country": {
"Sierra Leone": 3
}
}
}
]
},
{
"position": 19,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "ELSGYCSNI",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 3
},
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Accession": {
"AYD75365.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1
},
"Year": {
"1977": 1,
"1979": 1,
"1980": 1
}
}
},
{
"sequence": "ELSSYCSNI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"QEP52131.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"2012": 1
}
}
},
{
"sequence": "ALLSMIGMS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 20,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "LLSMIGMSG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "LSGYCSNIK",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 3
},
"Year": {
"1979": 1,
"1977": 1,
"1980": 1
},
"Accession": {
"AYD75365.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 2
}
}
},
{
"sequence": "LSSYCSNIK",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"2012": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
}
]
}
```
</details>
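Once metadata is attached, a common next step is to compare where each class of motif came from. The following sketch merges the tallies of one metadata field per motif class for a single position. It uses only field names that appear in the output above; the helper `metadata_by_motif` is hypothetical.

```python
from collections import Counter

# Position 14 from the sample output above, reduced to the "Year" metadata.
position = {
    "position": 14,
    "diversity_motifs": [
        {"sequence": "QSLRRELSG", "motif_long": "Index",
         "metadata": {"Year": {"1979": 1, "1980": 1, "1977": 1}}},
        {"sequence": "QSLRRELSS", "motif_long": "Unique",
         "metadata": {"Year": {"2012": 1}}},
        {"sequence": "LDQRRALLS", "motif_long": "Unique",
         "metadata": {"Year": {"1975": 1}}},
    ],
}


def metadata_by_motif(position, field):
    """Merge the tallies of one metadata field across motifs of the same class."""
    merged = {}
    for motif in position["diversity_motifs"]:
        # "metadata" is null when no header format was supplied.
        tally = (motif.get("metadata") or {}).get(field, {})
        merged.setdefault(motif["motif_long"], Counter()).update(tally)
    return {label: dict(counter) for label, counter in merged.items()}


print(metadata_by_motif(position, "Year"))
```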
## Command-Line Arguments
| **Argument** | **Type** | **Required** | **Default** | **Example** | **Description** |
|--------------|--------------------|--------------|---------------|--------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|
| -h | N/A | False | N/A | `dima-cli -h` | Prints a summary of all available command-line arguments. |
| -n | String | False | Unknown | `dima-cli -i sequences.afa -o results.json -f "accession\|strain\|country" -n "NA"` | Silently fills missing values in the FASTA header with the given value. |
| -v | N/A | False | N/A | `dima-cli -v` | Prints the version of dima-cli that is currently installed. |
| -q | String | False | Unknown Query | `dima-cli -q "Coronavirus Surface Protein" -i sequences.afa -o results.json` | The name of the sample that will appear on the results. |
| -i | String | True | N/A | `dima-cli -i sequences.afa -o results.json` | The path to the FASTA Multiple Sequence Alignment file. |
| -o           | String             | True         | N/A           | `dima-cli -i sequences.afa -o results.json`                                                      | The path where the results will be saved.                                                     |
| -l | Integer | False | 9 | `dima-cli -i sequences.afa -l 12 -o results.json` | The length of the kmers generated. |
| -f           | String             | False        | N/A           | `dima-cli -i sequences.afa -f "accession\|strain\|country" -o results.json`                      | The format of the FASTA header, used to label where each variant at a kmer position originated. |
| -s | Integer | False | 30 | `dima-cli -i sequences.afa -l 12 -s 40 -o results.json` | The minimum required support for each kmer position. |
| -a           | nucleotide/protein | False        | protein       | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json`                                    | The alphabet of the sequences: `protein` or `nucleotide`.                                     |
| -t           | json/xlsx          | False        | json          | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -t xlsx`                            | The output format: `json` or `xlsx`.                                                          |
| -c | String | False | N/A | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json` | Path to save Highly Conserved Sequences (HCS) in JSON format. |
| -e | Float | False | 100 | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json -e 90.5` | Minimum incidence (%) threshold for HCS concatenation. |
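When `dima-cli` is driven from a script, the flags in the table can be assembled as an argument list, which avoids shell-quoting problems with the `|` separators in the header format. The following is a minimal sketch that uses only flags listed in the table; `build_dima_command` is a hypothetical helper, not something shipped with DiMA.

```python
import shlex
import subprocess


def build_dima_command(input_path, output_path, kmer_length=None, support=None,
                       header_format=None, fill_value=None, alphabet=None,
                       query_name=None):
    """Build a dima-cli argument list from the documented flags."""
    cmd = ["dima-cli", "-i", input_path, "-o", output_path]
    optional = [
        ("-l", kmer_length),    # kmer length
        ("-s", support),        # minimum support per kmer position
        ("-f", header_format),  # FASTA header format
        ("-n", fill_value),     # replacement for missing header values
        ("-a", alphabet),       # protein or nucleotide
        ("-q", query_name),     # sample name shown in the results
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_dima_command("sequences.afa", "results.json",
                         header_format="accession|strain|country",
                         fill_value="NA")
# Show the equivalent shell command with safe quoting.
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run it (requires dima-cli to be installed):
# subprocess.run(cmd, check=True)
```

Passing the list straight to `subprocess.run` means the header format is never interpreted by a shell, so the pipe characters need no escaping.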
## Module Parameters
| **Parameter** | **Type** | **Required** | **Default** | **Description** |
|-------------------|-----------------|--------------|-----------------|-----------------------------------------------------------------------------------------------------------------|
| sequences         | String/StringIO | True         | N/A             | The path to a FASTA Multiple Sequence Alignment (MSA) file, or a StringIO object containing a FASTA MSA.        |
| kmer_length | Integer | False | 9 | The length of the kmers generated. |
| header_fillna     | String          | False        | Unknown         | Silently replaces missing values in the FASTA header with the given value (only used when `header_format` is given). |
| header_format     | String          | False        | N/A             | The format of the FASTA header, used to label where each variant at a kmer position originated.                |
| support_threshold | Integer | False | 30 | The minimum required support for each kmer position. |
| query_name | String | False | Unknown Query | The name of the sample that will appear on the results. |
| alphabet          | String          | False        | protein         | The alphabet of the sequences: `protein` or `nucleotide`.                                                       |
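As a rough illustration of how these parameters fit together, the sketch below wraps an in-memory alignment in a `StringIO` object and collects optional overrides before handing them to `Dima`. It assumes the constructor accepts the names in the table as keyword arguments, as the usage examples do for `sequences` and `header_format`. `build_dima_kwargs` and `run_dima` are hypothetical helpers, and the two-sequence alignment is made-up placeholder data.

```python
from io import StringIO

# Optional parameter names copied from the Module Parameters table.
ALLOWED = {
    "kmer_length", "header_fillna", "header_format",
    "support_threshold", "query_name", "alphabet",
}


def build_dima_kwargs(fasta_text, **overrides):
    """Validate overrides and wrap the FASTA text for the `sequences` parameter."""
    unknown = set(overrides) - ALLOWED
    if unknown:
        raise TypeError(f"unknown Dima parameter(s): {sorted(unknown)}")
    if overrides.get("alphabet", "protein") not in ("protein", "nucleotide"):
        raise ValueError("alphabet must be 'protein' or 'nucleotide'")
    return {"sequences": StringIO(fasta_text), **overrides}


def run_dima(fasta_text, **overrides):
    """Run DiMA on in-memory FASTA text (requires dima-cli to be installed)."""
    from dima import Dima  # imported lazily so the helper above works without it
    return Dima(**build_dima_kwargs(fasta_text, **overrides)).run()


# Placeholder alignment, for illustration only.
example_fasta = ">seq1|Sierra Leone\nMSASKEIKSFLW\n>seq2|Sierra Leone\nMSASKEIKSFLW\n"
kwargs = build_dima_kwargs(example_fasta, kmer_length=9, support_threshold=30)
print(sorted(kwargs))
```

Only the overrides that are passed explicitly end up in the call, so the defaults listed in the table stay under the library's control.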
Raw data
{
"_id": null,
"home_page": null,
"name": "dima-cli",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7,<3.11",
"maintainer_email": null,
"keywords": "bioinformatics,biology,protein,virus,diversity,dna,rna",
"author": "Shan Tharanga <stwm2@student.london.ac.uk>",
"author_email": "Shan Tharanga <stwm2@student.london.ac.uk>",
"download_url": null,
"platform": null,
\"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": {\n \"Species\": {\n \"Homo sapiens\": 2,\n \"Unknown\": 1\n },\n \"Year\": {\n \"1980\": 1,\n \"1977\": 1,\n \"1979\": 1\n },\n \"Accession\": {\n \"AYD75325.1\": 1,\n \"AYD75365.1\": 1,\n \"AYD75321.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 3\n }\n }\n },\n {\n \"sequence\": \"RRALLSMIG\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Year\": {\n \"1975\": 1\n }\n }\n }\n ]\n },\n {\n \"position\": 18,\n \"low_support\": \"LS\",\n \"entropy\": 1.3709505944546687,\n \"support\": 5,\n \"distinct_variants_count\": 2,\n \"distinct_variants_incidence\": 100.0,\n \"total_variants_incidence\": 40.0,\n \"diversity_motifs\": [\n {\n \"sequence\": \"RALLSMIGM\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Year\": {\n \"1975\": 1\n },\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n }\n }\n },\n {\n \"sequence\": \"RELSSYCSN\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Accession\": {\n \"QEP52131.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Year\": {\n \"2012\": 1\n }\n }\n },\n {\n \"sequence\": \"RELSGYCSN\",\n \"count\": 3,\n \"incidence\": 60.000004,\n \"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": {\n \"Accession\": {\n \"AYD75325.1\": 1,\n \"AYD75365.1\": 1,\n \"AYD75321.1\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 2,\n \"Unknown\": 1\n },\n \"Year\": {\n \"1980\": 1,\n \"1977\": 1,\n \"1979\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 3\n }\n }\n }\n ]\n },\n {\n \"position\": 19,\n 
\"low_support\": \"LS\",\n \"entropy\": 1.3709505944546687,\n \"support\": 5,\n \"distinct_variants_count\": 2,\n \"distinct_variants_incidence\": 100.0,\n \"total_variants_incidence\": 40.0,\n \"diversity_motifs\": [\n {\n \"sequence\": \"ELSGYCSNI\",\n \"count\": 3,\n \"incidence\": 60.000004,\n \"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": {\n \"Country\": {\n \"Sierra Leone\": 3\n },\n \"Species\": {\n \"Homo sapiens\": 2,\n \"Unknown\": 1\n },\n \"Accession\": {\n \"AYD75365.1\": 1,\n \"AYD75325.1\": 1,\n \"AYD75321.1\": 1\n },\n \"Year\": {\n \"1977\": 1,\n \"1979\": 1,\n \"1980\": 1\n }\n }\n },\n {\n \"sequence\": \"ELSSYCSNI\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Accession\": {\n \"QEP52131.1\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Year\": {\n \"2012\": 1\n }\n }\n },\n {\n \"sequence\": \"ALLSMIGMS\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Year\": {\n \"1975\": 1\n }\n }\n }\n ]\n },\n {\n \"position\": 20,\n \"low_support\": \"LS\",\n \"entropy\": 1.3709505944546687,\n \"support\": 5,\n \"distinct_variants_count\": 2,\n \"distinct_variants_incidence\": 100.0,\n \"total_variants_incidence\": 40.0,\n \"diversity_motifs\": [\n {\n \"sequence\": \"LLSMIGMSG\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Year\": {\n \"1975\": 1\n }\n }\n },\n {\n \"sequence\": \"LSGYCSNIK\",\n \"count\": 3,\n \"incidence\": 60.000004,\n \"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": 
{\n \"Country\": {\n \"Sierra Leone\": 3\n },\n \"Year\": {\n \"1979\": 1,\n \"1977\": 1,\n \"1980\": 1\n },\n \"Accession\": {\n \"AYD75365.1\": 1,\n \"AYD75321.1\": 1,\n \"AYD75325.1\": 1\n },\n \"Species\": {\n \"Unknown\": 1,\n \"Homo sapiens\": 2\n }\n }\n },\n {\n \"sequence\": \"LSSYCSNIK\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Year\": {\n \"2012\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Accession\": {\n \"QEP52131.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n }\n }\n }\n ]\n }\n ]\n}\n```\n</details>\n\n## Command-Line Arguments\n| **Argument** | **Type** | **Required** | **Default** | **Example** | **Description** |\n|--------------|--------------------|--------------|---------------|--------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|\n| -h | N/A | False | N/A | `dima-cli -h` | Prints a summary of all available command-line arguments. |\n| -n | String | False | Unknown | `dima-cli -i sequences.afa -o results.json -f \"accession\\|strain\\|country\" -n \"NA\"` | Silently fix missing values in the FASTA header with the given value. |\n| -v | N/A | False | N/A | `dima-cli -v` | Prints the version of dima-cli that is currently installed. |\n| -q | String | False | Unknown Query | `dima-cli -q \"Coronavirus Surface Protein\" -i sequences.afa -o results.json` | The name of the sample that will appear in the results. |\n| -i | String | True | N/A | `dima-cli -i sequences.afa -o results.json` | The path to the FASTA Multiple Sequence Alignment file. |\n| -o | String | True | N/A | `dima-cli -i sequences.afa -o results.json` | The location where the results shall be saved. |\n| -l | Integer | False | 9 | `dima-cli -i sequences.afa -l 12 -o results.json` | The length of the kmers generated. 
|\n| -f | String | False | N/A | `dima-cli -i sequences.afa -f \"accession\\|strain\\|country\" -o results.json` | The format of the FASTA header. Labels where each variant of a kmer position originated from. |\n| -s | Integer | False | 30 | `dima-cli -i sequences.afa -l 12 -s 40 -o results.json` | The minimum required support for each kmer position. |\n| -a | nucleotide/protein | False | protein | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json` | The alphabet of the sequences (i.e. `protein`/`nucleotide`, default: protein) |\n| -t | json/xlsx | False | json | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -t xlsx` | The output format (i.e. `json`/`xlsx`, default: json) |\n| -c | String | False | N/A | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json` | Path to save Highly Conserved Sequences (HCS) in JSON format. |\n| -e | Float | False | 100 | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json -e 90.5` | Minimum incidence (%) threshold for HCS concatenation. |\n\n\n## Module Parameters\n| **Parameter** | **Type** | **Required** | **Default** | **Description** |\n|-------------------|-----------------|--------------|-----------------|-----------------------------------------------------------------------------------------------------------------|\n| sequences | String/StringIO | True | N/A | The path to a FASTA Multiple Sequence Alignment file (MSA), or a StringIO object containing FASTA MSA. |\n| kmer_length | Integer | False | 9 | The length of the kmers generated. |\n| header_fillna | String | False | Unknown | Silently fix missing values in the FASTA header with the given value (only required when `header_format` is given). |\n| header_format | String | False | N/A | The format of the FASTA header. Labels where each variant of a kmer position originated from. |\n| support_threshold | Integer | False | 30 | The minimum required support for each kmer position. 
|\n| query_name | String | False | Unknown Query | The name of the sample that will appear in the results. |\n| alphabet | String | False | protein | The alphabet of the sequences (i.e. protein/nucleotide, default: protein) |\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A command-line tool that analyses the diversity and motifs of biological sequences",
"version": "4.1.3",
"split_keywords": [
"bioinformatics",
"biology",
"protein",
"virus",
"diversity",
"dna",
"rna"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "4398ff8214acd1370e44157278c33b757c785754daeabcad7561e99143c8c93c",
"md5": "47bb0cd4ceae3101bcb6b672b15bb95d",
"sha256": "ae1641148b3085e3af3f13da890ff277535015963abf0f75f86fa9554ba8b781"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp310-cp310-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl",
"has_sig": false,
"md5_digest": "47bb0cd4ceae3101bcb6b672b15bb95d",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.7,<3.11",
"size": 947314,
"upload_time": "2023-01-10T12:08:51",
"upload_time_iso_8601": "2023-01-10T12:08:51.759476Z",
"url": "https://files.pythonhosted.org/packages/43/98/ff8214acd1370e44157278c33b757c785754daeabcad7561e99143c8c93c/dima_cli-4.1.3-cp310-cp310-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "7cb867e404555fa1a56031c137d6cfa604e15bdb652560c87e7a427294860a18",
"md5": "f91b21c8007f1a2be7f98c4be53f7907",
"sha256": "d2036b4bb5af4d1811f621f8e86d36777f99b769e32f3c934f5ba82e354f0119"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp310-cp310-manylinux_2_24_x86_64.whl",
"has_sig": false,
"md5_digest": "f91b21c8007f1a2be7f98c4be53f7907",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.7,<3.11",
"size": 518669,
"upload_time": "2023-01-10T12:06:46",
"upload_time_iso_8601": "2023-01-10T12:06:46.043998Z",
"url": "https://files.pythonhosted.org/packages/7c/b8/67e404555fa1a56031c137d6cfa604e15bdb652560c87e7a427294860a18/dima_cli-4.1.3-cp310-cp310-manylinux_2_24_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "7d6c9891f40ce6f7d4a78cc7a2c05c102f236d85d3baa95a56a4fa0a333ae349",
"md5": "c2b1f8de766336182cff4b27645bca61",
"sha256": "186640ad97aa9a6b315fe69741b94f1325280e7dee9fc2b6c00e4d106470bfce"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp310-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "c2b1f8de766336182cff4b27645bca61",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.7,<3.11",
"size": 435834,
"upload_time": "2023-01-10T12:10:17",
"upload_time_iso_8601": "2023-01-10T12:10:17.393656Z",
"url": "https://files.pythonhosted.org/packages/7d/6c/9891f40ce6f7d4a78cc7a2c05c102f236d85d3baa95a56a4fa0a333ae349/dima_cli-4.1.3-cp310-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8b4fe382f9960be63d89862d420f8b11b90186aaa79bcd1933a248814968dd9e",
"md5": "7a55111708072546823f2a7c175f1a5c",
"sha256": "2b2b5ff59caba5ecc988637ef00574b32a98cc51d9ddb93ba4c9aa361c20c23c"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp37-cp37m-manylinux_2_24_x86_64.whl",
"has_sig": false,
"md5_digest": "7a55111708072546823f2a7c175f1a5c",
"packagetype": "bdist_wheel",
"python_version": "cp37",
"requires_python": ">=3.7,<3.11",
"size": 519067,
"upload_time": "2023-01-10T12:06:36",
"upload_time_iso_8601": "2023-01-10T12:06:36.035958Z",
"url": "https://files.pythonhosted.org/packages/8b/4f/e382f9960be63d89862d420f8b11b90186aaa79bcd1933a248814968dd9e/dima_cli-4.1.3-cp37-cp37m-manylinux_2_24_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "1b0bab32cd4680bdf30e2b4707dbd69a618b6959ee6cae01be595deed5ab8ab8",
"md5": "4865259da85402716d36d177723d22c6",
"sha256": "453c42a669c7d7d6b2c280a4b39fa98107686ad22742d3609f24a8705dd02fcf"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp37-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "4865259da85402716d36d177723d22c6",
"packagetype": "bdist_wheel",
"python_version": "cp37",
"requires_python": ">=3.7,<3.11",
"size": 436507,
"upload_time": "2023-01-10T12:08:26",
"upload_time_iso_8601": "2023-01-10T12:08:26.079469Z",
"url": "https://files.pythonhosted.org/packages/1b/0b/ab32cd4680bdf30e2b4707dbd69a618b6959ee6cae01be595deed5ab8ab8/dima_cli-4.1.3-cp37-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "4211c8dc95d348465efaa52198dff2a75e78b43d304a25bf5b07bc62c2b7f3c9",
"md5": "38e40ede14118cd42b80df6ef8d0d430",
"sha256": "7ee04ba4d1ccc49eed876081efb9272abdc12f8f289581f78d822c2a74508f6d"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp38-cp38-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl",
"has_sig": false,
"md5_digest": "38e40ede14118cd42b80df6ef8d0d430",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.7,<3.11",
"size": 947374,
"upload_time": "2023-01-10T12:11:32",
"upload_time_iso_8601": "2023-01-10T12:11:32.564089Z",
"url": "https://files.pythonhosted.org/packages/42/11/c8dc95d348465efaa52198dff2a75e78b43d304a25bf5b07bc62c2b7f3c9/dima_cli-4.1.3-cp38-cp38-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d0c695b3879dc095e4d0120d760d1e612f316647d1a3b68e4907693cc547c5dc",
"md5": "eb4d58b1038e6d4de000427726bbafae",
"sha256": "27b46b300d3b03e3b3222c74d72346491c063e625d07b7d33740ed6fac6b9064"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp38-cp38-manylinux_2_24_x86_64.whl",
"has_sig": false,
"md5_digest": "eb4d58b1038e6d4de000427726bbafae",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.7,<3.11",
"size": 519262,
"upload_time": "2023-01-10T12:06:36",
"upload_time_iso_8601": "2023-01-10T12:06:36.012144Z",
"url": "https://files.pythonhosted.org/packages/d0/c6/95b3879dc095e4d0120d760d1e612f316647d1a3b68e4907693cc547c5dc/dima_cli-4.1.3-cp38-cp38-manylinux_2_24_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "afff3609949d60ecda7aa9ceb6b2552086ecdb92af1e2471dc1bbb7fefa6ab3a",
"md5": "c0e589c1acc6e9a3b64620bf2d1a7df8",
"sha256": "f901fbc42c3cc63f900a4d9e2e78f56870659cbf5d9009b5b001b2f8f6a95d90"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp38-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "c0e589c1acc6e9a3b64620bf2d1a7df8",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.7,<3.11",
"size": 436629,
"upload_time": "2023-01-10T12:08:31",
"upload_time_iso_8601": "2023-01-10T12:08:31.633281Z",
"url": "https://files.pythonhosted.org/packages/af/ff/3609949d60ecda7aa9ceb6b2552086ecdb92af1e2471dc1bbb7fefa6ab3a/dima_cli-4.1.3-cp38-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "2f090ed1abf97c1fb5d0fcc79d3b4d633bcba6ddc977fe9016c61e4461b99293",
"md5": "de38db50d9fe546943f7be672598d9e2",
"sha256": "c4784c2110464f95bbf41f41df0ce1f981b35c28f9fed12c1a4050071082998d"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp39-cp39-manylinux_2_24_x86_64.whl",
"has_sig": false,
"md5_digest": "de38db50d9fe546943f7be672598d9e2",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.7,<3.11",
"size": 518973,
"upload_time": "2023-01-10T12:07:09",
"upload_time_iso_8601": "2023-01-10T12:07:09.816481Z",
"url": "https://files.pythonhosted.org/packages/2f/09/0ed1abf97c1fb5d0fcc79d3b4d633bcba6ddc977fe9016c61e4461b99293/dima_cli-4.1.3-cp39-cp39-manylinux_2_24_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "b089aa3b8f43795464b244d212daafedb27b42ffd0bcdcc146275ed2e22ac4bf",
"md5": "69c549aa66fcf9046edc45c69f45a68a",
"sha256": "85007d902ebec25e98c10c74bf6da2432b78ecc6baf00120cb9985cb2a00ea5a"
},
"downloads": -1,
"filename": "dima_cli-4.1.3-cp39-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "69c549aa66fcf9046edc45c69f45a68a",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.7,<3.11",
"size": 436154,
"upload_time": "2023-01-10T12:07:52",
"upload_time_iso_8601": "2023-01-10T12:07:52.578862Z",
"url": "https://files.pythonhosted.org/packages/b0/89/aa3b8f43795464b244d212daafedb27b42ffd0bcdcc146275ed2e22ac4bf/dima_cli-4.1.3-cp39-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-10 12:08:51",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "dima-cli"
}