# DiMA - Diversity Motif Analyser
![PyPI - Downloads](https://img.shields.io/pypi/dm/dima-cli)
![GitHub closed issues](https://img.shields.io/github/issues-closed-raw/BVU-BILSAB/DiMA)
![GitHub issues](https://img.shields.io/github/issues-raw/BVU-BILSAB/DiMA)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/dima-cli)
![PyPI](https://img.shields.io/pypi/v/dima-cli)
![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/BVU-BILSAB/DiMA)
## Table of Contents
- [What is DiMA?](#what-is-dima)
- [Publications](#publications)
- [Installation](#installation)
- [Basic Usage](#basic-usage)
  - [Shell Command](#shell-command)
  - [Python](#python)
  - [Results](#results)
- [Advance Usage](#advance-usage)
  - [Shell Command](#shell-command-1)
  - [Python](#python-1)
  - [Results](#results-1)
- [Command-Line Arguments](#command-line-arguments)
- [Module Parameters](#module-parameters)
## What is DiMA?
Protein sequence diversity is one of the major challenges in the design of diagnostic, prophylactic and therapeutic
interventions against viruses. DiMA is a tool designed to facilitate the dissection of protein sequence diversity
dynamics for viruses. DiMA provides a quantitative measure of sequence diversity by use of Shannon’s entropy,
applied via a user-defined k-mer sliding window. Further, the entropy value is corrected for sample size bias by
applying a statistical adjustment.
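The sliding-window entropy described above can be sketched in a few lines of Python. This is a minimal illustration rather than DiMA's own implementation: `kmer_entropies` is a hypothetical helper, it leaves out the sample-size correction, and it gives gaps no special treatment.

```python
import math
from collections import Counter


def kmer_entropies(sequences, kmer_length=9):
    """Shannon entropy (in bits) at every k-mer start position of an alignment.

    Hypothetical helper for illustration only; assumes all sequences are
    aligned and therefore have the same length.
    """
    alignment_length = len(sequences[0])
    entropies = []
    for start in range(alignment_length - kmer_length + 1):
        # Count the distinct k-mers seen in this window across all sequences.
        counts = Counter(seq[start:start + kmer_length] for seq in sequences)
        total = sum(counts.values())
        # H = sum over k-mers of -p * log2(p), where p is the k-mer frequency.
        entropy = sum(-(n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(entropy)
    return entropies


# Four identical 9-mers and one variant at a single k-mer position.
window = ["MSASKEIKS"] * 4 + ["SAGVYMGNL"]
entropies = kmer_entropies(window, kmer_length=9)
print(entropies)
```

With four copies of one 9-mer and a single variant, the frequencies are 0.8 and 0.2, which gives roughly 0.72 bits.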
DiMA further interrogates the diversity by dissecting the entropy value at each k-mer position into
diversity motifs. The distinct k-mer sequences at each position are classified into the following motifs based on
their incidence:
- **Index**: The predominant sequence.
- **Major**: The sequence with the second highest incidence after the Index.
- **Minor**: K-mers with an incidence between that of the Major motif and the Unique motifs.
- **Unique**: K-mers that are seen only once at a particular k-mer position.
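The classification can be sketched as below. This is an illustration under stated assumptions, not DiMA's code: `classify_motifs` is a hypothetical helper, the most frequent k-mer is always treated as the Index, a k-mer seen once is labelled Unique even when it is the second most frequent, and ties are broken by order of first appearance.

```python
from collections import Counter


def classify_motifs(kmers):
    """Label the distinct k-mers observed at one position (hypothetical helper)."""
    counts = Counter(kmers)
    total = sum(counts.values())
    motifs = []
    major_assigned = False
    # most_common() orders by descending count; ties keep first-seen order.
    for rank, (sequence, count) in enumerate(counts.most_common()):
        if rank == 0:
            motif = "Index"      # the predominant sequence
        elif count == 1:
            motif = "Unique"     # seen only once at this position
        elif not major_assigned:
            motif = "Major"      # highest incidence after the Index
            major_assigned = True
        else:
            motif = "Minor"      # between Major and Unique
        motifs.append({
            "sequence": sequence,
            "count": count,
            "incidence": 100.0 * count / total,
            "motif_long": motif,
        })
    return motifs


# Made-up k-mers: five, three, two and one occurrence respectively.
position = ["AAA"] * 5 + ["AAC"] * 3 + ["AAG"] * 2 + ["AAT"]
labels = [m["motif_long"] for m in classify_motifs(position)]
print(labels)
```

A position does not need to contain every motif: with counts of 3, 1 and 1, this sketch yields one Index and two Unique k-mers and no Major or Minor.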
The description line (FASTA header) of each sequence in the alignment can also be
formatted to carry metadata, which DiMA tags to the diversity motifs. DiMA enables comparative analysis of diversity
dynamics within and between proteins of a virus species, and between proteomes of different viral species.
## Publications
- https://arxiv.org/abs/2205.13915
## Installation
`pip install dima-cli`
## Basic Usage
### Shell Command
```shell
dima-cli -i aligned_sequences.afa -o results.json
```
### Python
```python
from dima import Dima
results = Dima(sequences="aligned_sequences.afa").run()
```
### Results
<details>
<summary>Click to view basic results</summary>
```json
{
"sequence_count": 5,
"support_threshold": 30,
"low_support_count": 20,
"query_name": "Unknown Query",
"kmer_length": 9,
"average_entropy": 0.06854034285524647,
"highest_entropy": {
"position": 186,
"entropy": 1.3921472236645345
},
"results": [
{
"position": 1,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MSASKEIKS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "SAGVYMGNL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 2,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "AGVYMGNLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "SASKEIKSF",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 3,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "GVYMGNLSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "ASKEIKSFL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 4,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "VYMGNLSSQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "SKEIKSFLW",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 5,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "KEIKSFLWT",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "YMGNLSSQQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 6,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MGNLSSQQL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "EIKSFLWTQ",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 7,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "IKSFLWTQS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "GNLSSQQLD",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 8,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "KSFLWTQSL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "NLSSQQLDQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 9,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SFLWTQSLR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "LSSQQLDQR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 10,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SSQQLDQRR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "FLWTQSLRR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 11,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "LWTQSLRRE",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "SQQLDQRRA",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 12,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "QQLDQRRAL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "WTQSLRREL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 13,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "TQSLRRELS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "QLDQRRALL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 14,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "QSLRRELSG",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "LDQRRALLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "QSLRRELSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 15,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "DQRRALLSM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "SLRRELSGY",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "SLRRELSSY",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 16,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "QRRALLSMI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "LRRELSSYC",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "LRRELSGYC",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 17,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RRELSGYCS",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "RRALLSMIG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "RRELSSYCS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 18,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RELSGYCSN",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "RALLSMIGM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "RELSSYCSN",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
},
{
"position": 19,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "ALLSMIGMS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "ELSSYCSNI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "ELSGYCSNI",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
}
]
},
{
"position": 20,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "LSGYCSNIK",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": null
},
{
"sequence": "LLSMIGMSG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
},
{
"sequence": "LSSYCSNIK",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": null
}
]
}
]
}
```
</details>
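Once the JSON output exists, it can be inspected with the standard library. The sketch below uses only field names that appear in the sample output above; `summarise` is a hypothetical helper, and the embedded report is a trimmed excerpt used so that the snippet runs without a file.

```python
import json

# A trimmed excerpt of the output above. In practice, load the file instead:
#     with open("results.json") as handle:
#         report = json.load(handle)
report = json.loads("""
{
  "support_threshold": 30,
  "results": [
    {"position": 13, "low_support": "LS", "entropy": 0.7219280948873623, "support": 5},
    {"position": 14, "low_support": "LS", "entropy": 1.3709505944546687, "support": 5}
  ]
}
""")


def summarise(report):
    """Find the most variable position and list positions flagged as low support."""
    positions = report["results"]
    peak = max(positions, key=lambda p: p["entropy"])
    flagged = [p["position"] for p in positions if p["low_support"] == "LS"]
    return {
        "peak_position": peak["position"],
        "peak_entropy": peak["entropy"],
        "low_support_positions": flagged,
    }


summary = summarise(report)
print(summary)
```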
## Advance Usage
### Shell Command
```shell
dima-cli -i aligned_sequences.afa -o results.json -f "accession|strain|country|date"
```
### Python
```python
from dima import Dima
results = Dima(sequences="aligned_sequences.afa", header_format="accession|strain|country|date").run()
```
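The idea behind a header format can be illustrated with a small sketch: each `|`-separated label in the format names the field at the same position in the FASTA header. This is not DiMA's parser. `parse_header` is a hypothetical helper, the strain value is made up, and filling empty fields with `Unknown` mirrors the documented default of `header_fillna`.

```python
def parse_header(header, header_format, fillna="Unknown"):
    """Map the fields of one FASTA header onto the labels of a header format.

    Hypothetical helper for illustration only.
    """
    labels = header_format.split("|")
    values = header.lstrip(">").split("|")
    # Pad short headers so every label receives a value.
    values += [""] * (len(labels) - len(values))
    # Empty or missing fields fall back to the fill value.
    return {label: (value.strip() or fillna) for label, value in zip(labels, values)}


fields = parse_header(
    ">AYD75329.1|ExampleStrain|Sierra Leone|1975",
    "accession|strain|country|date",
)
print(fields)

# A header with an empty strain and no date at all.
partial = parse_header(">AYD75329.1||Sierra Leone", "accession|strain|country|date")
print(partial)
```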
### Results
<details>
<summary>Click to view advanced results</summary>
```json
{
"sequence_count": 5,
"support_threshold": 30,
"low_support_count": 20,
"query_name": "Unknown Query",
"kmer_length": 9,
"average_entropy": 0.06854034285524647,
"highest_entropy": {
"position": 186,
"entropy": 1.3921472236645345
},
"results": [
{
"position": 1,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MSASKEIKS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75365.1": 1,
"QEP52131.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
},
"Year": {
"1977": 1,
"2012": 1,
"1980": 1,
"1979": 1
}
}
},
{
"sequence": "SAGVYMGNL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
}
}
}
]
},
{
"position": 2,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SASKEIKSF",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Species": {
"Homo sapiens": 3,
"Unknown": 1
},
"Year": {
"1977": 1,
"1980": 1,
"1979": 1,
"2012": 1
},
"Country": {
"Sierra Leone": 4
},
"Accession": {
"AYD75325.1": 1,
"QEP52131.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
}
}
},
{
"sequence": "AGVYMGNLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 3,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "ASKEIKSFL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"1980": 1,
"1977": 1,
"1979": 1,
"2012": 1
},
"Accession": {
"AYD75321.1": 1,
"AYD75365.1": 1,
"QEP52131.1": 1,
"AYD75325.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
}
}
},
{
"sequence": "GVYMGNLSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 4,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SKEIKSFLW",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"QEP52131.1": 1,
"AYD75321.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Year": {
"2012": 1,
"1979": 1,
"1980": 1,
"1977": 1
}
}
},
{
"sequence": "VYMGNLSSQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 5,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "KEIKSFLWT",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"QEP52131.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1,
"AYD75365.1": 1
},
"Year": {
"1979": 1,
"1980": 1,
"1977": 1,
"2012": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
}
}
},
{
"sequence": "YMGNLSSQQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
}
}
}
]
},
{
"position": 6,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "MGNLSSQQL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "EIKSFLWTQ",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75321.1": 1,
"QEP52131.1": 1,
"AYD75365.1": 1,
"AYD75325.1": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Country": {
"Sierra Leone": 4
},
"Year": {
"1977": 1,
"1980": 1,
"2012": 1,
"1979": 1
}
}
}
]
},
{
"position": 7,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "GNLSSQQLD",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "IKSFLWTQS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75365.1": 1,
"AYD75325.1": 1,
"QEP52131.1": 1,
"AYD75321.1": 1
},
"Year": {
"1979": 1,
"1980": 1,
"1977": 1,
"2012": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Country": {
"Sierra Leone": 4
}
}
}
]
},
{
"position": 8,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "NLSSQQLDQ",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
}
}
},
{
"sequence": "KSFLWTQSL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 4
},
"Year": {
"1979": 1,
"2012": 1,
"1977": 1,
"1980": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1,
"QEP52131.1": 1
}
}
}
]
},
{
"position": 9,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "SFLWTQSLR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"2012": 1,
"1979": 1,
"1980": 1,
"1977": 1
},
"Accession": {
"QEP52131.1": 1,
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
}
}
},
{
"sequence": "LSSQQLDQR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 10,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "FLWTQSLRR",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"1977": 1,
"2012": 1,
"1980": 1,
"1979": 1
},
"Accession": {
"AYD75321.1": 1,
"AYD75325.1": 1,
"AYD75365.1": 1,
"QEP52131.1": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
},
"Country": {
"Sierra Leone": 4
}
}
},
{
"sequence": "SSQQLDQRR",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
},
{
"position": 11,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "LWTQSLRRE",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 4
},
"Accession": {
"QEP52131.1": 1,
"AYD75365.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
},
"Year": {
"1979": 1,
"1980": 1,
"2012": 1,
"1977": 1
}
}
},
{
"sequence": "SQQLDQRRA",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 12,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "QQLDQRRAL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
}
}
},
{
"sequence": "WTQSLRREL",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 4
},
"Year": {
"1980": 1,
"2012": 1,
"1979": 1,
"1977": 1
},
"Accession": {
"QEP52131.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1,
"AYD75365.1": 1
},
"Species": {
"Homo sapiens": 3,
"Unknown": 1
}
}
}
]
},
{
"position": 13,
"low_support": "LS",
"entropy": 0.7219280948873623,
"support": 5,
"distinct_variants_count": 1,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 20.0,
"diversity_motifs": [
{
"sequence": "TQSLRRELS",
"count": 4,
"incidence": 80.0,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75365.1": 1,
"AYD75321.1": 1,
"QEP52131.1": 1,
"AYD75325.1": 1
},
"Country": {
"Sierra Leone": 4
},
"Year": {
"1977": 1,
"1979": 1,
"2012": 1,
"1980": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 3
}
}
},
{
"sequence": "QLDQRRALL",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
}
}
}
]
},
{
"position": 14,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "QSLRRELSG",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 3
},
"Accession": {
"AYD75325.1": 1,
"AYD75321.1": 1,
"AYD75365.1": 1
},
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Year": {
"1979": 1,
"1980": 1,
"1977": 1
}
}
},
{
"sequence": "QSLRRELSS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Year": {
"2012": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"QEP52131.1": 1
}
}
},
{
"sequence": "LDQRRALLS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 15,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "DQRRALLSM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "SLRRELSSY",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"2012": 1
}
}
},
{
"sequence": "SLRRELSGY",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Species": {
"Unknown": 1,
"Homo sapiens": 2
},
"Year": {
"1977": 1,
"1980": 1,
"1979": 1
},
"Country": {
"Sierra Leone": 3
},
"Accession": {
"AYD75365.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1
}
}
}
]
},
{
"position": 16,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "LRRELSSYC",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Accession": {
"QEP52131.1": 1
},
"Year": {
"2012": 1
},
"Species": {
"Homo sapiens": 1
}
}
},
{
"sequence": "LRRELSGYC",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Year": {
"1979": 1,
"1977": 1,
"1980": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 2
},
"Country": {
"Sierra Leone": 3
},
"Accession": {
"AYD75365.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1
}
}
},
{
"sequence": "QRRALLSMI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 17,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RRELSSYCS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"2012": 1
},
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
}
}
},
{
"sequence": "RRELSGYCS",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Year": {
"1980": 1,
"1977": 1,
"1979": 1
},
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
},
"Country": {
"Sierra Leone": 3
}
}
},
{
"sequence": "RRALLSMIG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"AYD75329.1": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 18,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "RALLSMIGM",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"1975": 1
},
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
}
}
},
{
"sequence": "RELSSYCSN",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Species": {
"Homo sapiens": 1
},
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"2012": 1
}
}
},
{
"sequence": "RELSGYCSN",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Accession": {
"AYD75325.1": 1,
"AYD75365.1": 1,
"AYD75321.1": 1
},
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Year": {
"1980": 1,
"1977": 1,
"1979": 1
},
"Country": {
"Sierra Leone": 3
}
}
}
]
},
{
"position": 19,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "ELSGYCSNI",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 3
},
"Species": {
"Homo sapiens": 2,
"Unknown": 1
},
"Accession": {
"AYD75365.1": 1,
"AYD75325.1": 1,
"AYD75321.1": 1
},
"Year": {
"1977": 1,
"1979": 1,
"1980": 1
}
}
},
{
"sequence": "ELSSYCSNI",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"QEP52131.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Country": {
"Sierra Leone": 1
},
"Year": {
"2012": 1
}
}
},
{
"sequence": "ALLSMIGMS",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Country": {
"Sierra Leone": 1
},
"Accession": {
"AYD75329.1": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
}
]
},
{
"position": 20,
"low_support": "LS",
"entropy": 1.3709505944546687,
"support": 5,
"distinct_variants_count": 2,
"distinct_variants_incidence": 100.0,
"total_variants_incidence": 40.0,
"diversity_motifs": [
{
"sequence": "LLSMIGMSG",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Accession": {
"AYD75329.1": 1
},
"Country": {
"Sierra Leone": 1
},
"Species": {
"Homo sapiens": 1
},
"Year": {
"1975": 1
}
}
},
{
"sequence": "LSGYCSNIK",
"count": 3,
"incidence": 60.000004,
"motif_short": "I",
"motif_long": "Index",
"metadata": {
"Country": {
"Sierra Leone": 3
},
"Year": {
"1979": 1,
"1977": 1,
"1980": 1
},
"Accession": {
"AYD75365.1": 1,
"AYD75321.1": 1,
"AYD75325.1": 1
},
"Species": {
"Unknown": 1,
"Homo sapiens": 2
}
}
},
{
"sequence": "LSSYCSNIK",
"count": 1,
"incidence": 20.0,
"motif_short": "U",
"motif_long": "Unique",
"metadata": {
"Year": {
"2012": 1
},
"Species": {
"Homo sapiens": 1
},
"Accession": {
"QEP52131.1": 1
},
"Country": {
"Sierra Leone": 1
}
}
}
]
}
]
}
```
</details>
## Command-Line Arguments
| **Argument** | **Type** | **Required** | **Default** | **Example** | **Description** |
|--------------|--------------------|--------------|---------------|--------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|
| -h | N/A | False | N/A | `dima-cli -h` | Prints a summary of all available command-line arguments. |
| -n           | String             | False        | Unknown       | `dima-cli -i sequences.afa -o results.json -f "accession\|strain\|country" -n "NA"`              | Silently replaces missing values in the FASTA header with the given value.                    |
| -v | N/A | False | N/A | `dima-cli -v` | Prints the version of dima-cli that is currently installed. |
| -q | String | False | Unknown Query | `dima-cli -q "Coronavirus Surface Protein" -i sequences.afa -o results.json` | The name of the sample that will appear on the results. |
| -i | String | True | N/A | `dima-cli -i sequences.afa -o results.json` | The path to the FASTA Multiple Sequence Alignment file. |
| -o | String | True | N/A | `dima-cli -i sequences.afa -o results.json` | The location where the results shall be saved. |
| -l | Integer | False | 9 | `dima-cli -i sequences.afa -l 12 -o results.json` | The length of the kmers generated. |
| -f | String | False | N/A | `dima-cli -i sequences.afa -f "accession\|strain\|country" -o results.json` | The format of the FASTA header, used to label where each variant at a kmer position originated. |
| -s | Integer | False | 30 | `dima-cli -i sequences.afa -l 12 -s 40 -o results.json` | The minimum required support for each kmer position. |
| -a | nucleotide/protein | False | protein | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json` | The alphabet of the sequences (`protein` or `nucleotide`). |
| -t | json/xlsx | False | json | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -t xlsx` | The output format (`json` or `xlsx`). |
| -c | String | False | N/A | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json` | Path to save Highly Conserved Sequences (HCS) in JSON format. |
| -e | Float | False | 100 | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json -e 90.5` | Minimum incidence (%) threshold for HCS concatenation. |
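
When `dima-cli` is driven from a script, the flags in the table above can be assembled into an argument list. This is a minimal sketch under stated assumptions: `build_dima_command` and `run_dima_cli` are hypothetical helper names, not part of the package, and only flags listed in the table are used.

```python
import subprocess


def build_dima_command(input_path, output_path, kmer_length=None,
                       header_format=None, fill_na=None, support=None,
                       query_name=None, alphabet=None, output_format=None,
                       hcs_path=None, hcs_threshold=None):
    """Assemble an argument list for dima-cli from the flags in the table."""
    command = ["dima-cli", "-i", input_path, "-o", output_path]
    optional = [
        ("-l", kmer_length),
        ("-f", header_format),
        ("-n", fill_na),
        ("-s", support),
        ("-q", query_name),
        ("-a", alphabet),
        ("-t", output_format),
        ("-c", hcs_path),
        ("-e", hcs_threshold),
    ]
    for flag, value in optional:
        # Flags left as None are omitted so dima-cli falls back to its defaults.
        if value is not None:
            command += [flag, str(value)]
    return command


def run_dima_cli(*args, **options):
    """Run dima-cli; subprocess raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_dima_command(*args, **options), check=True)


print(build_dima_command("sequences.afa", "results.json",
                         kmer_length=12, support=40))
```

Because the arguments are passed as a list rather than through a shell, the `|` separators in a header format such as `accession|strain|country` need no quoting or escaping.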
## Module Parameters
| **Parameter** | **Type** | **Required** | **Default** | **Description** |
|-------------------|-----------------|--------------|-----------------|-----------------------------------------------------------------------------------------------------------------|
| sequences | String/StringIO | True | N/A | The path to a FASTA Multiple Sequence Alignment file (MSA), or a StringIO object containing FASTA MSA. |
| kmer_length | Integer | False | 9 | The length of the kmers generated. |
| header_fillna | String | False | Unknown | Silently fills missing values in the FASTA header with the given value (only used when `header_format` is given). |
| header_format | String | False | N/A | The format of the FASTA header, used to label where each variant at a kmer position originated. |
| support_threshold | Integer | False | 30 | The minimum required support for each kmer position. |
| query_name | String | False | Unknown Query | The name of the sample that will appear on the results. |
| alphabet | String | False | protein | The alphabet of the sequences (`protein` or `nucleotide`). |
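
The parameters above can be collected in one place before calling `Dima`. The sketch below is an illustration only: `dima_kwargs` and `run_dima` are hypothetical helpers, the defaults are copied from the table, and it assumes `Dima` accepts these parameters as keyword arguments in the same way as `sequences` and `header_format` in the usage examples.

```python
from io import StringIO

# Defaults copied from the Module Parameters table.
DEFAULTS = {
    "kmer_length": 9,
    "header_fillna": "Unknown",
    "support_threshold": 30,
    "query_name": "Unknown Query",
    "alphabet": "protein",
}


def dima_kwargs(sequences, header_format=None, **overrides):
    """Build keyword arguments for Dima(...) (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameter(s): {sorted(unknown)}")
    kwargs = {"sequences": sequences, **DEFAULTS, **overrides}
    if kwargs["alphabet"] not in ("protein", "nucleotide"):
        raise ValueError("alphabet must be 'protein' or 'nucleotide'")
    if header_format is None:
        # header_fillna only matters when a header format is supplied.
        kwargs.pop("header_fillna")
    else:
        kwargs["header_format"] = header_format
    return kwargs


def run_dima(**kwargs):
    """Run the analysis; dima is imported lazily so the helper above works without it."""
    from dima import Dima
    return Dima(**kwargs).run()


# `sequences` may be a file path or a StringIO object holding a FASTA MSA.
in_memory = StringIO(">AYD75329.1|1975\nMSASKEIKSF\n")
print(dima_kwargs(in_memory, header_format="accession|year", kmer_length=12))
```

With the package installed, `run_dima(**dima_kwargs("aligned_sequences.afa"))` would correspond to the basic Python usage shown earlier.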
Raw data
{
"_id": null,
"home_page": null,
"name": "dima-cli",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.8",
"maintainer_email": null,
"keywords": "bioinformatics, biology, protein, virus, diversity, dna, rna",
"author": "Shan Tharanga <stwm2@student.london.ac.uk>",
"author_email": "Shan Tharanga <stwm2@student.london.ac.uk>",
"download_url": null,
"platform": null,
\"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": {\n \"Species\": {\n \"Homo sapiens\": 2,\n \"Unknown\": 1\n },\n \"Year\": {\n \"1980\": 1,\n \"1977\": 1,\n \"1979\": 1\n },\n \"Accession\": {\n \"AYD75325.1\": 1,\n \"AYD75365.1\": 1,\n \"AYD75321.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 3\n }\n }\n },\n {\n \"sequence\": \"RRALLSMIG\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Year\": {\n \"1975\": 1\n }\n }\n }\n ]\n },\n {\n \"position\": 18,\n \"low_support\": \"LS\",\n \"entropy\": 1.3709505944546687,\n \"support\": 5,\n \"distinct_variants_count\": 2,\n \"distinct_variants_incidence\": 100.0,\n \"total_variants_incidence\": 40.0,\n \"diversity_motifs\": [\n {\n \"sequence\": \"RALLSMIGM\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Year\": {\n \"1975\": 1\n },\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n }\n }\n },\n {\n \"sequence\": \"RELSSYCSN\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Accession\": {\n \"QEP52131.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Year\": {\n \"2012\": 1\n }\n }\n },\n {\n \"sequence\": \"RELSGYCSN\",\n \"count\": 3,\n \"incidence\": 60.000004,\n \"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": {\n \"Accession\": {\n \"AYD75325.1\": 1,\n \"AYD75365.1\": 1,\n \"AYD75321.1\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 2,\n \"Unknown\": 1\n },\n \"Year\": {\n \"1980\": 1,\n \"1977\": 1,\n \"1979\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 3\n }\n }\n }\n ]\n },\n {\n \"position\": 19,\n 
\"low_support\": \"LS\",\n \"entropy\": 1.3709505944546687,\n \"support\": 5,\n \"distinct_variants_count\": 2,\n \"distinct_variants_incidence\": 100.0,\n \"total_variants_incidence\": 40.0,\n \"diversity_motifs\": [\n {\n \"sequence\": \"ELSGYCSNI\",\n \"count\": 3,\n \"incidence\": 60.000004,\n \"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": {\n \"Country\": {\n \"Sierra Leone\": 3\n },\n \"Species\": {\n \"Homo sapiens\": 2,\n \"Unknown\": 1\n },\n \"Accession\": {\n \"AYD75365.1\": 1,\n \"AYD75325.1\": 1,\n \"AYD75321.1\": 1\n },\n \"Year\": {\n \"1977\": 1,\n \"1979\": 1,\n \"1980\": 1\n }\n }\n },\n {\n \"sequence\": \"ELSSYCSNI\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Accession\": {\n \"QEP52131.1\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Year\": {\n \"2012\": 1\n }\n }\n },\n {\n \"sequence\": \"ALLSMIGMS\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Year\": {\n \"1975\": 1\n }\n }\n }\n ]\n },\n {\n \"position\": 20,\n \"low_support\": \"LS\",\n \"entropy\": 1.3709505944546687,\n \"support\": 5,\n \"distinct_variants_count\": 2,\n \"distinct_variants_incidence\": 100.0,\n \"total_variants_incidence\": 40.0,\n \"diversity_motifs\": [\n {\n \"sequence\": \"LLSMIGMSG\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Accession\": {\n \"AYD75329.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Year\": {\n \"1975\": 1\n }\n }\n },\n {\n \"sequence\": \"LSGYCSNIK\",\n \"count\": 3,\n \"incidence\": 60.000004,\n \"motif_short\": \"I\",\n \"motif_long\": \"Index\",\n \"metadata\": 
{\n \"Country\": {\n \"Sierra Leone\": 3\n },\n \"Year\": {\n \"1979\": 1,\n \"1977\": 1,\n \"1980\": 1\n },\n \"Accession\": {\n \"AYD75365.1\": 1,\n \"AYD75321.1\": 1,\n \"AYD75325.1\": 1\n },\n \"Species\": {\n \"Unknown\": 1,\n \"Homo sapiens\": 2\n }\n }\n },\n {\n \"sequence\": \"LSSYCSNIK\",\n \"count\": 1,\n \"incidence\": 20.0,\n \"motif_short\": \"U\",\n \"motif_long\": \"Unique\",\n \"metadata\": {\n \"Year\": {\n \"2012\": 1\n },\n \"Species\": {\n \"Homo sapiens\": 1\n },\n \"Accession\": {\n \"QEP52131.1\": 1\n },\n \"Country\": {\n \"Sierra Leone\": 1\n }\n }\n }\n ]\n }\n ]\n}\n```\n</details>\n\n## Command-Line Arguments\n| **Argument** | **Type** | **Required** | **Default** | **Example** | **Description** |\n|--------------|--------------------|--------------|---------------|--------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|\n| -h | N/A | False | N/A | `dima-cli -h` | Prints a summary of all available command-line arguments. |\n| -n | String | False | Unknown | `dima-cli -i sequences.afa -o results.json -f \"accession\\|strain\\|country\" -n \"NA\"` | Silently fills missing values in the FASTA header with the given value. |\n| -v | N/A | False | N/A | `dima-cli -v` | Prints the version of dima-cli that is currently installed. |\n| -q | String | False | Unknown Query | `dima-cli -q \"Coronavirus Surface Protein\" -i sequences.afa -o results.json` | The name of the sample that will appear on the results. |\n| -i | String | True | N/A | `dima-cli -i sequences.afa -o results.json` | The path to the FASTA Multiple Sequence Alignment file. |\n| -o | String | True | N/A | `dima-cli -i sequences.afa -o results.json` | The location where the results shall be saved. |\n| -l | Integer | False | 9 | `dima-cli -i sequences.afa -l 12 -o results.json` | The length of the kmers generated. 
|\n| -f | String | False | N/A | `dima-cli -i sequences.afa -f \"accession\\|strain\\|country\" -o results.json` | The format of the FASTA header, used to label where each variant at a kmer position originated. |\n| -s | Integer | False | 30 | `dima-cli -i sequences.afa -l 12 -s 40 -o results.json` | The minimum required support for each kmer position. |\n| -a | nucleotide/protein | False | protein | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json` | The alphabet of the sequences (i.e. `protein`/`nucleotide`, default: protein). |\n| -t | json/xlsx | False | json | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -t xlsx` | The output format (i.e. `json`/`xlsx`, default: json). |\n| -c | String | False | N/A | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json` | Path to save Highly Conserved Sequences (HCS) in JSON format. |\n| -e | Float | False | 100 | `dima-cli -i dna_sequences.afa -a nucleotide -o results.json -c hcs.json -e 90.5` | Minimum incidence (%) threshold for HCS concatenation. |\n\n\n## Module Parameters\n| **Parameter** | **Type** | **Required** | **Default** | **Description** |\n|-------------------|-----------------|--------------|-----------------|-----------------------------------------------------------------------------------------------------------------|\n| sequences | String/StringIO | True | N/A | The path to a FASTA Multiple Sequence Alignment (MSA) file, or a StringIO object containing a FASTA MSA. |\n| kmer_length | Integer | False | 9 | The length of the kmers generated. |\n| header_fillna | String | False | Unknown | Silently fills missing values in the FASTA header with the given value (only used when `header_format` is given). |\n| header_format | String | False | N/A | The format of the FASTA header, used to label where each variant at a kmer position originated. |\n| support_threshold | Integer | False | 30 | The minimum required support for each kmer position. 
|\n| query_name | String | False | Unknown Query | The name of the sample that will appear on the results. |\n| alphabet | String | False | protein | The alphabet of the sequences (i.e. `protein`/`nucleotide`, default: protein). |\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A command-line tool that analyses the diversity and motifs of biological sequences",
"version": "5.0.9",
"project_urls": {
"Source Code": "https://github.com/PU-SDS/DiMA"
},
"split_keywords": [
"bioinformatics",
" biology",
" protein",
" virus",
" diversity",
" dna",
" rna"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "bbbcc397af3883a4f69a15d404f4b9cd71b94eb7b7c6ce24fa82a8ea1516ef9a",
"md5": "39aea29b30e90e797d53b5e81d05a1af",
"sha256": "21b6201cc0abd6d3f841a39bf516d99015e1250706c52e1f8c03fbe3c1f237dc"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"has_sig": false,
"md5_digest": "39aea29b30e90e797d53b5e81d05a1af",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.12,>=3.8",
"size": 780813,
"upload_time": "2024-09-03T14:37:35",
"upload_time_iso_8601": "2024-09-03T14:37:35.731774Z",
"url": "https://files.pythonhosted.org/packages/bb/bc/c397af3883a4f69a15d404f4b9cd71b94eb7b7c6ce24fa82a8ea1516ef9a/dima_cli-5.0.9-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "dac591f9d1fdcf11e52bd3f24bd1f35eb1659b26bd312acc1f8948375493777a",
"md5": "a3b811e6b3d0de5b5442695389bfec95",
"sha256": "3311d949b9acf64fbe81c67babc8a01529bc4f81fca86f392b30da264bd48a32"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "a3b811e6b3d0de5b5442695389bfec95",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.12,>=3.8",
"size": 446610,
"upload_time": "2024-09-03T14:37:46",
"upload_time_iso_8601": "2024-09-03T14:37:46.216554Z",
"url": "https://files.pythonhosted.org/packages/da/c5/91f9d1fdcf11e52bd3f24bd1f35eb1659b26bd312acc1f8948375493777a/dima_cli-5.0.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "9f86ee9a72cbccf135541cee223e5615e2264dbac8e011e0d2b7297351d531c3",
"md5": "4f4f311df04789fc78e39547208ca628",
"sha256": "b04a116664e56f93953f77bef3a5b96c79c5b097e031cc0b8bafc76e0ba6d4b0"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp310-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "4f4f311df04789fc78e39547208ca628",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.12,>=3.8",
"size": 288054,
"upload_time": "2024-09-03T14:38:29",
"upload_time_iso_8601": "2024-09-03T14:38:29.929517Z",
"url": "https://files.pythonhosted.org/packages/9f/86/ee9a72cbccf135541cee223e5615e2264dbac8e011e0d2b7297351d531c3/dima_cli-5.0.9-cp310-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "52eb4ad3df89abd4c677f9a6601d14056135215a889aee09594775cd94dc357e",
"md5": "c8fa70c07d2ea1165c45768f53a105b1",
"sha256": "e41e5bc1f4959820fd3673a8f66f6dfa84d9b2aaab6f92dcda096fa45800071d"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"has_sig": false,
"md5_digest": "c8fa70c07d2ea1165c45768f53a105b1",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": "<3.12,>=3.8",
"size": 779664,
"upload_time": "2024-09-03T14:38:08",
"upload_time_iso_8601": "2024-09-03T14:38:08.940721Z",
"url": "https://files.pythonhosted.org/packages/52/eb/4ad3df89abd4c677f9a6601d14056135215a889aee09594775cd94dc357e/dima_cli-5.0.9-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "001d290c1252d5a5822f00ba9aa164f3214d04b2b45c97cbacaadf61bbd265df",
"md5": "962413787cc392dd8a9b88e3cfc8eb86",
"sha256": "a299e74150c2a377f475fc6c1c054b221674ac5cf88f011d05026d4d342c1d03"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "962413787cc392dd8a9b88e3cfc8eb86",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": "<3.12,>=3.8",
"size": 446212,
"upload_time": "2024-09-03T14:37:43",
"upload_time_iso_8601": "2024-09-03T14:37:43.727437Z",
"url": "https://files.pythonhosted.org/packages/00/1d/290c1252d5a5822f00ba9aa164f3214d04b2b45c97cbacaadf61bbd265df/dima_cli-5.0.9-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "b82bb3e77667450ef14bc32a04ab53447af426deace69936cbcf9487fe4cec02",
"md5": "1164ea58add761992f0a4fb605176d30",
"sha256": "9001c4c56b850b9644475127ea4bdf6b9933c6d8139cc3c031681acefe768f0e"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp311-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "1164ea58add761992f0a4fb605176d30",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": "<3.12,>=3.8",
"size": 288260,
"upload_time": "2024-09-03T14:39:42",
"upload_time_iso_8601": "2024-09-03T14:39:42.946936Z",
"url": "https://files.pythonhosted.org/packages/b8/2b/b3e77667450ef14bc32a04ab53447af426deace69936cbcf9487fe4cec02/dima_cli-5.0.9-cp311-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "36b21ae6759f0f82da8254fa22184ee2b48b79ead2f149cc28ebf66d991899d8",
"md5": "01f155675a18a194b9e52d9b6579f8e7",
"sha256": "5d33c3fdbe5cb5c9fca6fbf7b7f5e086ba4531240c5b76efd570c6ee00467ee3"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp38-cp38-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"has_sig": false,
"md5_digest": "01f155675a18a194b9e52d9b6579f8e7",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": "<3.12,>=3.8",
"size": 782303,
"upload_time": "2024-09-03T14:38:13",
"upload_time_iso_8601": "2024-09-03T14:38:13.735618Z",
"url": "https://files.pythonhosted.org/packages/36/b2/1ae6759f0f82da8254fa22184ee2b48b79ead2f149cc28ebf66d991899d8/dima_cli-5.0.9-cp38-cp38-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d44846b6a3c0bb46ed50bbe9bdcea4bed3d6c2431ae045a8804b3d411fc01ffa",
"md5": "a206c6068df9a029dd2851c398572b94",
"sha256": "27fb2f048716219afb1a5498bb99a81ccdb2104df11a13c665f336adc1d628f2"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "a206c6068df9a029dd2851c398572b94",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": "<3.12,>=3.8",
"size": 447112,
"upload_time": "2024-09-03T14:37:49",
"upload_time_iso_8601": "2024-09-03T14:37:49.667090Z",
"url": "https://files.pythonhosted.org/packages/d4/48/46b6a3c0bb46ed50bbe9bdcea4bed3d6c2431ae045a8804b3d411fc01ffa/dima_cli-5.0.9-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "001ce0ef1c8a83264fdea9191358eb3135c244a891b33e1fd551fb7c511a8861",
"md5": "746f8b75d492315188f8b969dcbbf6d0",
"sha256": "bf82bc636bef44619da8a78cbdcfa1bec277571a486ea5e26aa995a46cbd458a"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp38-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "746f8b75d492315188f8b969dcbbf6d0",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": "<3.12,>=3.8",
"size": 288368,
"upload_time": "2024-09-03T14:39:54",
"upload_time_iso_8601": "2024-09-03T14:39:54.007301Z",
"url": "https://files.pythonhosted.org/packages/00/1c/e0ef1c8a83264fdea9191358eb3135c244a891b33e1fd551fb7c511a8861/dima_cli-5.0.9-cp38-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "cf7ce3ba6883ec6df01833281f7a99e57d54a789733a386730fd9a943a938128",
"md5": "c6005903296b0f8720fe30001d057ddf",
"sha256": "be9812b60c6e1a104de87bad99d1451fb299950069cba5979d9a31d2717657b6"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"has_sig": false,
"md5_digest": "c6005903296b0f8720fe30001d057ddf",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": "<3.12,>=3.8",
"size": 782587,
"upload_time": "2024-09-03T14:37:59",
"upload_time_iso_8601": "2024-09-03T14:37:59.928848Z",
"url": "https://files.pythonhosted.org/packages/cf/7c/e3ba6883ec6df01833281f7a99e57d54a789733a386730fd9a943a938128/dima_cli-5.0.9-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "751768d0bf609b4ddd1071a5cbbebd4db3804bc6d1c7ffc8c139070fec7ad340",
"md5": "2a7e1256f7474c2a64d61bc244d21db8",
"sha256": "22087fe610fe8fbc561b44c1f6bd30e42f8adcaccf7c68204041608ded487b42"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "2a7e1256f7474c2a64d61bc244d21db8",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": "<3.12,>=3.8",
"size": 447237,
"upload_time": "2024-09-03T14:37:44",
"upload_time_iso_8601": "2024-09-03T14:37:44.206740Z",
"url": "https://files.pythonhosted.org/packages/75/17/68d0bf609b4ddd1071a5cbbebd4db3804bc6d1c7ffc8c139070fec7ad340/dima_cli-5.0.9-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "bb2450c1aa78ea84bb01824e67ad186d574d148f38da5d3093fb3581cd1c8b1a",
"md5": "b5b6dafddac2c924dfa96c40b3835a00",
"sha256": "16d1b2e696b319a14e56279a0a3e75a3c7a73d2208081d4b9776d741d48955ce"
},
"downloads": -1,
"filename": "dima_cli-5.0.9-cp39-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "b5b6dafddac2c924dfa96c40b3835a00",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": "<3.12,>=3.8",
"size": 288143,
"upload_time": "2024-09-03T14:39:49",
"upload_time_iso_8601": "2024-09-03T14:39:49.845989Z",
"url": "https://files.pythonhosted.org/packages/bb/24/50c1aa78ea84bb01824e67ad186d574d148f38da5d3093fb3581cd1c8b1a/dima_cli-5.0.9-cp39-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-03 14:37:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "PU-SDS",
"github_project": "DiMA",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "dima-cli"
}