Name | ikiss |
Version | 1.5.0 |
Summary | iKISS is a pipeline to detect kmers under selection. |
upload_time | 2023-09-05 10:56:51 |
requires_python | >=3.8 |
license | MIT License Copyright (c) 2022 DIADE IRD / Julie Orjuela, Yves Vigouroux Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | snakemake, kmers, selection, diversity |
.. image:: ./ikiss/logo_ikiss.png
    :width: 400
    :alt: ikiss Logo
    :align: center

|PythonVersions| |SnakemakeVersions| |Singularity|

.. contents:: Table of Contents
    :depth: 2

**Homepage:** https://forge.ird.fr/diade/iKISS
About iKISS
===============
**iKISS (Kmer Inference sSelection)** is a Snakemake pipeline that decomposes reads into kmers and extracts the kmers under selection.

iKISS uses KmersGWAS https://github.com/voichek/kmersGWAS, pcadapt https://cran.r-project.org/web/packages/pcadapt/readme/README.html and lfmm https://bcm-uga.github.io/lfmm/articles/lfmm to identify genomic regions under selection.

1. Install dependencies and clone iKISS
=============================================
iKISS depends on Python and Singularity.

Install Singularity and Python 3 on your local machine or, if you work on a cluster, use `module load` to add them to your environment:

.. code-block:: bash

    module load system/python/3.8.12
    module load system/singularity/3.6.0

iKISS is available as a PyPI package (recommended):

.. code-block:: bash

    python3 -m pip install ikiss

Alternatively, install iKISS from the git repository:

.. code-block:: bash

    python3 -m pip install ikiss@git+https://forge.ird.fr/diade/iKISS.git

    # OR

    git clone https://forge.ird.fr/diade/iKISS.git
    cd iKISS
    python3 -m pip install .

1.1 Installing in cluster mode
-------------------------------
Install iKISS in cluster mode using the **singularity** container from ikiss_utilities https://itrop.ird.fr/ikiss_utilities/ :

.. code-block:: bash

    ikiss install_cluster --help
    ikiss install_cluster --scheduler slurm --env singularity

1.2 Installing in local mode
----------------------------
.. code-block:: bash

    ikiss install_local --help
    ikiss install_local

2. Running a datatest
=============================================
Run a test with the test dataset from iKISS_utilities in a directory named TEST:

.. code-block:: bash

    ikiss test_install --help
    ikiss test_install -d TEST

2.1 In CLUSTER mode
-------------------
Launch the command line suggested by iKISS, in CLUSTER mode:

Before the first run, run `ikiss create_cluster_config` and adjust the threads, RAM, node and other computing resources.
iKISS copies the cluster_config.yaml file to "/home/$USER/.config/ikiss/cluster_config.yaml".

.. code-block:: bash

    ikiss run_cluster --help
    ikiss create_cluster_config

If singularity was selected when installing iKISS, you may need to pass the argument `--singularity-args "--bind $HOME"` to Snakemake:

.. code-block:: bash

    ikiss run_cluster --help
    ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args "--bind $HOME"
    # @IFB
    ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args "--bind /shared:/shared"
    # you can also pass Snakemake parameters such as --rerun-incomplete --nolock

**Important Note**: On the i-Trop cluster, run iKISS on a SINGLE node; the data must be in the "/scratch" directory of the chosen node. Set the `nodelist : nodeX` parameter in the cluster_config.yaml file.

2.2 In LOCAL mode
-----------------
Launch the command line suggested by iKISS, in LOCAL mode:

.. code-block:: bash

    ikiss run_local --help
    ikiss run_local -t 8 -c TEST/data_test_config.yaml --singularity-args "--bind $HOME"

In local mode, it is possible to allocate threads to specific rules using the `--set-threads` Snakemake argument, for example:

.. code-block:: bash

    ikiss run_local -t 8 -c TEST/data_test_config.yaml --set-threads kmers_gwas_per_sample=4 mapping_kmers=2 filter_bam=2 kmer_position_from_bam=4 pcadapt=2 extract_kmers_from_bed=2

3. Running your data
========================
3.1. Adapt config.yaml
------------------------
Before running iKISS, create a `config.yaml` file with:

.. code-block:: bash

    ikiss create_config

Adapt the `config.yaml` file with the path to the fastq files (FASTQ) and to the output directory (OUTPUT) in the `DATA` section.

.. code-block:: yaml

    DATA:
        FASTQ: './DATATEST/fastq'
        OUTPUT: './OUTPUT-KISS/'

**Warning**: if your reads are Illumina paired-end, you need to rename them SAMPLE_R1.fastq.gz and SAMPLE_R2.fastq.gz. For single-end reads, use SAMPLE_R1.fastq.gz.

iKISS accepts both compressed and uncompressed fastq files.

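The naming convention above can be checked before launching a run. The following is a minimal sketch and not part of iKISS: the helper `group_fastq_names` and its regular expression are hypothetical, and it assumes that uncompressed files use the `.fastq` extension.

```python
import re
from collections import defaultdict

# Hypothetical helper, not part of iKISS: group fastq file names by sample
# and report which read files (R1, R2) are present for each sample.
# Assumption: files are named SAMPLE_R1.fastq(.gz) / SAMPLE_R2.fastq(.gz).
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<read>R[12])\.fastq(\.gz)?$")


def group_fastq_names(file_names):
    """Return ({sample: sorted read labels}, [names that break the convention])."""
    samples = defaultdict(set)
    rejected = []
    for name in file_names:
        match = FASTQ_PATTERN.match(name)
        if match is None:
            rejected.append(name)
        else:
            samples[match.group("sample")].add(match.group("read"))
    return {sample: sorted(reads) for sample, reads in samples.items()}, rejected


names = ["Clone2_R1.fastq.gz", "Clone2_R2.fastq.gz", "Clone4_R1.fastq.gz", "Clone8.fq.gz"]
grouped, rejected = group_fastq_names(names)
print(grouped)   # samples with the read files found for each
print(rejected)  # file names that would need renaming
```

A sample listed with both `R1` and `R2` is paired, a sample with only `R1` is single-end, and every rejected name should be renamed before it is placed in the FASTQ directory.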
3.1.1 WORKFLOW section
-----------------------
Select the iKISS steps to run in the WORKFLOW section and set their parameters in the PARAMS section.

In the WORKFLOW section:

    The KMERS_GWAS step (key KMERS_MODULE in the configuration) must be activated; it is by default.

    PCADAPT, LFMM, MAPPING and ASSEMBLY are optional. Activate or deactivate these steps using true or false.

**KMERS_GWAS** converts reads into kmers, filters them and creates a format ready to use in population genomics.

**PCADAPT** detects genetic markers (kmers here) involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA).

**LFMM** is used by iKISS for testing correlations between kmers and environmental data.

**MAPPING_KMERS** can optionally be used to align kmers to a genomic reference (if one is available).

**ASSEMBLY_KMERS** can optionally assemble the significant kmers obtained by pcadapt or lfmm.

**INTERSECT** can optionally calculate how many kmers (if MAPPING_KMERS is activated) or contigs (if ASSEMBLY_KMERS is activated) are found in FEATURES (gene by default).

.. code-block:: yaml

    WORKFLOW:
        KMERS_MODULE : true
        PCADAPT : true
        LFMM : true
        MAPPING_KMERS: true
        ASSEMBLY_KMERS: true
        INTERSECT: true

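The dependencies between steps described above can be written down as a small consistency check. This is only a sketch of one reading of the text: the function `check_workflow` is hypothetical, and the validation iKISS itself performs on the configuration may differ.

```python
def check_workflow(workflow):
    """Return a list of problems found in a WORKFLOW mapping (hypothetical check).

    The rules encode the step descriptions of this section, not iKISS's own code.
    """
    problems = []
    # The kmers step is the only mandatory one.
    if not workflow.get("KMERS_MODULE", False):
        problems.append("KMERS_MODULE must be true")
    # INTERSECT counts mapped kmers or mapped contigs, so it needs one of them.
    if workflow.get("INTERSECT", False) and not (
        workflow.get("MAPPING_KMERS", False) or workflow.get("ASSEMBLY_KMERS", False)
    ):
        problems.append("INTERSECT needs MAPPING_KMERS or ASSEMBLY_KMERS")
    # Assembly works on significant kmers, which come from PCADAPT or LFMM.
    if workflow.get("ASSEMBLY_KMERS", False) and not (
        workflow.get("PCADAPT", False) or workflow.get("LFMM", False)
    ):
        problems.append("ASSEMBLY_KMERS needs outliers from PCADAPT or LFMM")
    return problems


full = {
    "KMERS_MODULE": True, "PCADAPT": True, "LFMM": True,
    "MAPPING_KMERS": True, "ASSEMBLY_KMERS": True, "INTERSECT": True,
}
print(check_workflow(full))
print(check_workflow({"KMERS_MODULE": True, "INTERSECT": True}))
```

The first call reports no problem; the second reports that INTERSECT has nothing to work on.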
3.1.2 PARAMS section
--------------------
In the PARAMS section, the parameters of each tool can be adapted.
=> 1. KMERS_MODULE
-------------------
The KMERS_GWAS module decomposes reads into kmers and creates a binary presence/absence table of kmers. This table can be filtered to keep only the most informative kmers in the populations. This module produces output files in PLINK format.

.. code-block:: yaml

    PARAMS:
        KMERS_MODULE:
            KMER_SIZE : 31
            MAC : 2
            P : 0.2
            MAF : 0.05
            B : 1000000 # number of kmers in each bed file
            SPLIT_LIST_SIZE : 100000
            MIN_LIST_SIZE : 50000

**KMER_SIZE** is the length of the kmers (should be between 15 and 31).

**MAC** is the minor allele count (minimum allowed number of appearances of a kmer).

**P** is the minimum percentage of appearances in each strand form.

**MAF** is the minimum allele frequency.

**B** is the number of kmers in each bed file.

**SPLIT_LIST_SIZE** is the number of kmers per bed file.

**MIN_LIST_SIZE** is the minimal number of kmers allowed in the smallest bed file after splitting.

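To make the presence/absence table and the MAC and MAF filters concrete, here is a toy sketch in pure Python. It is not the kmersGWAS implementation: all function names are hypothetical, the filters are a simplified reading of the parameters above, and the strand criterion P is not modelled.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence made of A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def canonical_kmers(read, k):
    """Set of canonical kmers of a read (the smaller of kmer and its reverse complement)."""
    kmers = set()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def presence_table(reads_by_sample, k):
    """Return (sample names, {kmer: [0/1 per sample]})."""
    sample_names = sorted(reads_by_sample)
    per_sample = {
        name: set().union(*(canonical_kmers(read, k) for read in reads_by_sample[name]))
        for name in sample_names
    }
    all_kmers = sorted(set().union(*per_sample.values()))
    table = {kmer: [int(kmer in per_sample[name]) for name in sample_names] for kmer in all_kmers}
    return sample_names, table


def filter_kmers(table, mac, maf):
    """Keep kmers whose minor state (present or absent) is frequent enough.

    Assumption: MAC is compared with the minor count and MAF with minor count / samples.
    """
    kept = {}
    for kmer, row in table.items():
        present = sum(row)
        minor = min(present, len(row) - present)
        if minor >= mac and minor / len(row) >= maf:
            kept[kmer] = row
    return kept


reads = {"s1": ["AAA"], "s2": ["TTT"], "s3": ["CCC"], "s4": ["CCC", "AAC"]}
samples, table = presence_table(reads, k=3)
kept = filter_kmers(table, mac=2, maf=0.05)
print(samples)
print(table)
print(sorted(kept))
```

In this toy example `AAA` and `TTT` collapse to the same canonical kmer, and the kmer seen in a single sample is dropped by `mac=2`.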
=> 2. PCADAPT
--------------
PCADAPT detects kmers involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA).

.. code-block:: yaml

    PARAMS:
        PCADAPT:
            K : 2
            SAMPLES: "samples.txt"
            CORRECTION: 'FDR'
            ALPHA : 0.05

**K**: number K of principal components.

**SAMPLES**: you need to generate a *samples.txt* file. This file contains two tab-delimited columns: accession_id and phenotype_value (named `group` in the example below). It will be used by PCADAPT.

    **accession_id**: contains exactly the same sample names as in FASTQ.

    **phenotype_value** (int): contains the sample group (wild=1, cultivated=2 for example).

.. code-block:: bash

    accession_id group
    Clone12 2
    Clone14 2
    Clone16 2
    Clone20 2
    Clone2 1
    Clone4 1
    Clone8 1

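Since the accession identifiers must match the sample names of the fastq files exactly, it can help to cross-check the two before a run. A minimal sketch, assuming a tab-delimited file with a header line; `read_samples` and `missing_samples` are hypothetical helpers, not iKISS functions.

```python
import csv
import io


def read_samples(handle):
    """Parse a tab-delimited samples file into {accession_id: group}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if len(header) != 2:
        raise ValueError(f"expected 2 tab-separated columns, found {len(header)}")
    groups = {}
    for row in reader:
        if not row:  # skip empty lines
            continue
        accession_id, group = row
        groups[accession_id] = int(group)  # the group is expected to be an integer
    return groups


def missing_samples(groups, fastq_samples):
    """Sample names found in the fastq directory but absent from the samples file."""
    return sorted(set(fastq_samples) - set(groups))


# An in-memory file stands in for samples.txt in this example.
content = io.StringIO("accession_id\tgroup\nClone12\t2\nClone2\t1\n")
groups = read_samples(content)
print(groups)
print(missing_samples(groups, ["Clone2", "Clone4"]))
```

With a real file, replace the `io.StringIO` object by `open("samples.txt", newline="")`.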
**CORRECTION**: kmer outliers are identified after a multiple-testing correction using the BONFERONNI, BH or FDR method.

**ALPHA**: the alpha cutoff used for outlier detection.

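The effect of the correction method and of the alpha cutoff can be illustrated with textbook Bonferroni and Benjamini-Hochberg (BH) adjustments. This is a generic sketch, not the code run by iKISS, and the function names are hypothetical. In R's `p.adjust`, `fdr` is an alias of `BH`; whether iKISS treats its FDR and BH options identically is not stated here.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: p * n, capped at 1."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]


def benjamini_hochberg(pvalues):
    """BH-adjusted p-values, returned in the input order."""
    n = len(pvalues)
    # Walk from the largest to the smallest p-value, keeping a running minimum.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank of p-value i in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def outliers(pvalues, alpha, adjust):
    """Indices whose adjusted p-value is below alpha."""
    return [i for i, q in enumerate(adjust(pvalues)) if q < alpha]


pvalues = [0.001, 0.02, 0.03, 0.5]
print(outliers(pvalues, 0.05, bonferroni))
print(outliers(pvalues, 0.05, benjamini_hochberg))
```

On these four p-values Bonferroni keeps only the first kmer, while BH keeps the first three: the choice of CORRECTION trades false positives against missed outliers.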
=> 3. LFMM
----------
LFMM is used by iKISS for testing correlations between kmers and environmental data.

.. code-block:: yaml

    PARAMS:
        LFMM:
            K : 2
            PHENOTYPE_FILE: "pheno.txt"
            PHENOTYPE_PCA_ANALYSIS : false
            CORRECTION: 'BH'
            ALPHA : 0.05

**K** is the number of latent factors used in LFMM association analyses.

**PHENOTYPE_FILE**: a phenotype file is mandatory for the LFMM analysis. You can give iKISS PCA results, climate variables, etc.

A PCA can reveal some 'structure' in the genotype data and can help you choose the K parameter.

**PHENOTYPE_PCA_ANALYSIS**

    * If **PHENOTYPE_PCA_ANALYSIS** is true, iKISS automatically runs a PCA using the file given by the user in the PHENOTYPE_FILE key. This PHENOTYPE_FILE can be a PCA result for example.

    * If **PHENOTYPE_PCA_ANALYSIS** is false, iKISS directly uses the PHENOTYPE_FILE as 'phenotype' for the LFMM analysis. Kmers are used as 'genotype' data.

Here is an example of a phenotype file with climate variables:

.. code-block:: bash

    accession_id group b2.Mean_Diurnal_Range b3.Isothermality b4.Temp_Seasonality b5.Max_Temp_of_Warmest_Month b6.Min_Temp_of_Coldest_Month b7.Temp_Annual_Range b8.Mean_Temp_of_Wettest_Quarter b9.Mean_Temp_of_Driest_Quarter b10.Mean_Temp_of_Warmest_Quarter b11.Mean_Temp_of_Coldest_Quarter b12.Annual_Precipitation b13.Precipitation_of_Wettest_Month b14.Precipitation_of_Driest_Month b15.Precipitation_Seasonality b16.Precipitation_of_Wettest_Quarter b17.Precipitation_of_Driest_Quarter b18.Precipitation_of_Warmest_Quarter b19.Precipitation_of_Coldest_Quarter
    Clone12 2 99 68 1230 310 166 144 250 226 258 226 1462 249 3 68 573 17 549 17
    Clone14 2 100 68 1235 301 155 146 241 217 248 217 1525 259 3 67 603 18 575 18
    Clone16 2 93 65 1389 310 168 142 250 223 258 223 1416 264 0 73 579 8 544 8
    Clone20 2 154 55 3955 403 123 280 296 234 315 214 118 62 0 184 107 0 45 0
    Clone2 1 152 55 3617 403 128 275 287 242 316 220 173 80 0 167 153 0 18 0
    Clone4 1 168 51 5719 414 86 328 315 201 322 181 20 12 0 166 18 0 17 0
    Clone8 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

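The example file contains a sample (Clone8) whose climate variables are all NA. A short sketch for spotting such samples before the analysis, assuming a tab-delimited file whose first column is named `accession_id`; the helper is hypothetical, and how iKISS and LFMM handle missing values is not described here.

```python
import csv
import io


def samples_with_missing_values(handle, missing="NA"):
    """Return the accession_id of every row holding at least one missing value."""
    reader = csv.DictReader(handle, delimiter="\t")
    flagged = []
    for row in reader:
        if any(value == missing for value in row.values()):
            flagged.append(row["accession_id"])
    return flagged


# Two variables only, to keep the example short.
content = io.StringIO(
    "accession_id\tgroup\tb2.Mean_Diurnal_Range\tb3.Isothermality\n"
    "Clone4\t1\t168\t51\n"
    "Clone8\t1\tNA\tNA\n"
)
flagged = samples_with_missing_values(content)
print(flagged)
```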
**CORRECTION**: kmer outliers are identified after a multiple-testing correction using the BONFERONNI, BH or FDR method.

**ALPHA**: the alpha cutoff used for outlier detection.

=> 4. MAPPING_KMERS
-------------------
The MAPPING_KMERS section in PARAMS can optionally be used to align kmers to a genomic reference. This can give an idea of the selected regions in a genome.

.. code-block:: yaml

    PARAMS:
        MAPPING_KMERS:
            REF: "reference.fasta"
            MODE : bwa-aln
            INDEX_OPTIONS: ""
            OPTIONS : "-n 0.04"
            FILTER_FLAG : 4
            FILTER_QUAL : 10

Give the reference file in the **REF** key.

Set **MODE** to *bwa-aln* or *bwa-mem2*.

Set **INDEX_OPTIONS** according to the MODE you have chosen:

    For *bwa-mem2*, leave it empty.

    For *bwa-aln*, use "-a bwtsw" or "".

Set the options of the chosen mapper in the **OPTIONS** key:

    For *bwa-mem2*, the default parameters are -A 1 -B 4.

    For *bwa-aln*, use -n 0.04.

The resulting bam can be filtered using the **FILTER_FLAG** (-F 4 by default) and **FILTER_QUAL** (mapq>10 by default) parameters.

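The two filters can be read as a per-alignment predicate. The sketch below assumes they are passed to `samtools view` as `-F` and `-q`, which the text suggests but does not state; `keep_alignment` is a hypothetical function written only to show the semantics. Note that `samtools view -q 10` keeps alignments with a mapping quality of 10 or more.

```python
def keep_alignment(flag, mapq, filter_flag=4, filter_qual=10):
    """Mimic `samtools view -F filter_flag -q filter_qual` on one record.

    Assumption: FILTER_FLAG and FILTER_QUAL map onto the -F and -q options.
    """
    if flag & filter_flag:      # any excluded flag bit set (4 = read unmapped)
        return False
    return mapq >= filter_qual  # samtools -q keeps MAPQ >= threshold


# (flag, mapq) pairs: mapped forward, unmapped, reverse with low and sufficient MAPQ
records = [(0, 37), (4, 0), (16, 9), (16, 10)]
print([keep_alignment(flag, mapq) for flag, mapq in records])
```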
=> 5. ASSEMBLY_KMERS
--------------------
The ASSEMBLY_KMERS section in PARAMS can optionally be used to assemble the significant kmers obtained by pcadapt and/or lfmm.

Contigs are assembled by iKISS using mergeTags from the dekupl package https://github.com/Transipedia/dekupl-mergeTags.

Choose the minimal overlap size "OVERLAP_SIZE" allowed when assembling kmers.

Contigs can be filtered by size using "FILTER_CONTIG_SIZE".

Assembled contigs can be mapped by activating **MAPPING_CONTIGS**. This mapping is run against a **REF** reference file using bwa-mem2 by default.
The reference file used in this step can differ from the one given in the **MAPPING_KMERS** options. The mapping parameters can be changed using **MAPPING_OPTIONS**.

Assembled contigs can also be compared to a database with blastn, and you can try to annotate them.

.. code-block:: yaml

    PARAMS:
        ASSEMBLY:
            OVERLAP_SIZE : 15
            FILTER_CONTIG_SIZE : 100
            MAPPING_CONTIGS: True
            # if MAPPING_CONTIGS is activated, ikiss maps contigs vs REF using bwa-mem2
            REF: 'reference.fasta'
            MAPPING_OPTIONS : ""

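The role of a minimal overlap can be shown with a naive suffix-prefix merge of two sequences. This is an illustration of the idea only: mergeTags uses its own algorithm, and `merge_on_overlap` is a hypothetical function.

```python
def merge_on_overlap(left, right, min_overlap):
    """Merge right onto left if a suffix of left equals a prefix of right.

    Returns the merged sequence, or None when the longest exact overlap
    is shorter than min_overlap. Illustration only, not the mergeTags algorithm.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest = min(len(left), len(right))
    # Try the longest possible overlap first, down to the minimal size.
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


print(merge_on_overlap("ACGTAC", "GTACGG", 4))
print(merge_on_overlap("ACGTAC", "GTACGG", 5))
```

The two sequences share an exact overlap of 4 bases, so they merge when the minimal overlap is 4 and stay separate when it is 5. A larger OVERLAP_SIZE therefore gives fewer but more reliable merges.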
=> 6. INTERSECT
---------------
iKISS uses bedtools intersect to calculate how many kmers/contigs are mapped in **FEATURES** (gene by default).

These **FEATURES** are extracted from the annotation **GFF** file before running bedtools intersect.

iKISS filters kmers/contigs using **FILTER_MAPQ_STATS** and a minimal number of kmers/contigs per FEATURE, **FILTER_MIN_STATS**.

.. code-block:: yaml

    PARAMS:
        INTERSECT:
            GFF : 'reference.gff'
            FEATURE : 'gene'
            FILTER_MAPQ_STATS: '15'

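What the intersection counts can be sketched with plain interval arithmetic. The example works on a single sequence with half-open coordinates and assumes that hits with a mapping quality below the threshold are discarded; the function is hypothetical and does not reproduce the bedtools command line used by iKISS.

```python
def count_hits_per_feature(features, hits, min_mapq=15):
    """Count mapped kmers/contigs overlapping each feature.

    features: list of (name, start, end); hits: list of (start, end, mapq).
    Coordinates are half-open and on the same sequence (simplification).
    """
    counts = {name: 0 for name, _, _ in features}
    for hit_start, hit_end, mapq in hits:
        if mapq < min_mapq:  # assumption: the threshold is inclusive
            continue
        for name, start, end in features:
            if hit_start < end and start < hit_end:  # the two intervals overlap
                counts[name] += 1
    return counts


features = [("geneA", 100, 200), ("geneB", 300, 400)]
hits = [(90, 121, 60), (150, 181, 10), (199, 230, 20), (200, 231, 20), (350, 381, 15)]
counts = count_hits_per_feature(features, hits)
print(counts)
```

A further filter on the minimal count per feature, in the spirit of FILTER_MIN_STATS, would simply drop the entries of `counts` below a chosen value.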
3.2. Adapt cluster_config.yaml
-------------------------------
If you run iKISS on a cluster, adapt `cluster_config.yaml`:

.. code-block:: bash

    ikiss edit_cluster_config

Inside `cluster_config.yaml`, adapt the partition to your cluster and change the memory and the number of CPUs in the `__default__` key or in the rules that need it:

.. code-block:: yaml

    __default__:
        cpus-per-task : 4
        mem-per-cpu : 10G
        partition : "normal"
        nodelist: node19
        output : 'slurm_logs/stdout/{rule}/{wildcards}.o'
        error : 'slurm_logs/error/{rule}/{wildcards}.e'
        job-name : '{rule}.{wildcards}'

    kmers_gwas_per_sample:
        cpus-per-task : 4
        mem-per-cpu : 10G

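In a Snakemake cluster configuration, the `__default__` entry applies to every rule and a rule-specific entry overrides the keys it redefines. The sketch below shows that resolution with plain dictionaries; `resources_for` is a hypothetical helper and the values are illustrative, not recommended settings.

```python
def resources_for(rule, cluster_config):
    """Resolve the settings of a rule: defaults first, then rule-specific overrides."""
    resolved = dict(cluster_config.get("__default__", {}))
    resolved.update(cluster_config.get(rule, {}))
    return resolved


# Illustrative values only.
cluster_config = {
    "__default__": {"cpus-per-task": 4, "mem-per-cpu": "10G", "partition": "normal"},
    "kmers_gwas_per_sample": {"cpus-per-task": 8, "mem-per-cpu": "20G"},
}
print(resources_for("kmers_gwas_per_sample", cluster_config))
print(resources_for("pcadapt", cluster_config))
```

A rule without its own entry, such as `pcadapt` here, simply runs with the defaults.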
RULES
-----
Here is the list of iKISS Snakemake rules:
.. code-block:: bash
rule kmers_gwas_per_sample *
rule kmers_to_use
rule kmers_table
rule extract_kmers_from_bed
rule index_ref
rule index_ref_to_assembly
rule mapping_kmers
rule filter_bam
rule kmer_position_from_bam *
rule merge_kmer_position
rule samtools_merge
rule pcadapt *
rule merge_method
rule outliers_position
rule extracting_features_from_gff
rule kmers_bedtools_intersect
rule get_pca_from_phenotype
rule lfmm *
rule mergetags
rule mapping_contigs
rule contigs_bedtools_intersect
rule intersect_and_contigs
rule intersect_and_outliers
rule fastq_stats
rule report_ikiss
rule html_ikiss
Rules marked with a `*` can be parallelised.
4. Running iKISS
================
Run iKISS with `ikiss run_local` or `ikiss run_cluster` as explained in the "Running a datatest" section.
5. iKISS output
================
This is an overview of the iKISS output directory:
.. code-block:: bash
OUTPUT-KISS/
config_corrected.yaml
0.FASTQ_STATS
└── fastq_stats.txt
1.KMERS_MODULE
├── Clone12
├── Clone14
├── Clone16
├── Clone2
├── Clone20
├── Clone4
└── Clone8
2.KMERS_TABLE
├── kmers_list_paths.txt
├── kmers_table.names
├── kmers_table.table
├── kmers_to_use
├── kmers_to_use.no_pass_kmers
├── kmers_to_use.shareness
├── kmers_to_use.stats.both
├── kmers_to_use.stats.only_canonical
└── kmers_to_use.stats.only_non_canonical
3.TABLE2BED
├── log
├── output_file.0.bed
├── output_file.0.bim
├── output_file.0.fam
├── output_file.1.bed
├── output_file.1.bim
├── output_file.1.fam
├── output_file.2.bed
├── output_file.2.bim
├── output_file.2.fam
├── output_file.3.bed
├── output_file.3.bim
├── output_file.3.fam
├── output_file.4.bed
├── output_file.4.bim
└── output_file.4.fam
4.EXTRACT_FASTA
├── output_file.0.fasta.gz
├── output_file.1.fasta.gz
├── output_file.2.fasta.gz
├── output_file.3.fasta.gz
└── output_file.4.fasta.gz
5.RANGES
├── output_file.0
├── output_file.1
├── output_file.2
├── output_file.3
└── output_file.4
6.LFMM
├── output_file.0_10_LFMM_outliers.csv
├── output_file.0_10_LFMM_pvalues.csv
├── output_file.0_10_LFMM.rplot.pdf
...
6.LFMM_PHENO
├── PCA_from_phenotype.csv
├── PCA_from_phenotype.html
└── PCA_from_phenotype.ipynb
6.PCADAPT
├── output_file.0_10_PCADAPT_outliers.csv
├── output_file.0_10_PCADAPT_pvalues.csv
├── output_file.0_10_PCADAPT.rplot.pdf
├── output_file.0_10_PCADAPT_scores.csv
...
7.MERGED_LFMM
├── merged_LFMM_outliers.csv
└── merged_LFMM_pvalues.csv
7.MERGED_PCADAPT
├── merged_PCADAPT_outliers.csv
└── merged_PCADAPT_pvalues.csv
8.MAPPING_KMERS
├── bam_files.txt
├── output_file.0_vs_reference.bam
├── output_file.0_vs_reference_FMQ.bam
├── output_file.0_vs_reference.sai
├── output_file.0_vs_reference_sorted.bam
├── output_file.0_vs_reference_sorted.bam.bai
├── output_file.0_vs_reference_sorted.bam.idxstats
├── output_file.0_vs_reference_sorted.bam.stats
...
9.KMERPOSITION
├── output_file.0_vs_reference_KMERPOSITION.txt
├── output_file.1_vs_reference_KMERPOSITION.txt
├── output_file.2_vs_reference_KMERPOSITION.txt
├── output_file.3_vs_reference_KMERPOSITION.txt
└── output_file.4_vs_reference_KMERPOSITION.txt
10.MERGE_KMERPOSITION
├── kmer_position_merged.txt
└── kmer_position_samtools_merge.bam
11.OUTLIERS_LFMM_POSITION
└── outliers_with_position.csv
11.OUTLIERS_PCADAPT_POSITION
└── outliers_with_position.csv
12.ASSEMBLY_OUTLIERS_LFMM
├── contigs_LFMM_vs_reference.bam
├── contigs_LFMM_vs_reference.sorted.bam
├── contigs_LFMM_vs_reference.sorted.bam.bai
├── contigs_LFMM_vs_reference.sorted.bam.idxstats
├── contigs_LFMM_vs_reference.sorted.bam.stats
├── outliers_LFMM_mergetags.csv
└── outliers_LFMM_mergetags.fasta
12.ASSEMBLY_OUTLIERS_PCADAPT
├── contigs_PCADAPT_vs_reference.bam
├── contigs_PCADAPT_vs_reference.sorted.bam
├── contigs_PCADAPT_vs_reference.sorted.bam.bai
├── contigs_PCADAPT_vs_reference.sorted.bam.idxstats
├── contigs_PCADAPT_vs_reference.sorted.bam.stats
├── outliers_PCADAPT_mergetags.csv
└── outliers_PCADAPT_mergetags.fasta
13.GFF_FEATURES
└── extracted.gff
14.CONTIGS_INTERSECT_LFMM
└── contigs_intersect_annotation.bed
14.CONTIGS_INTERSECT_PCADAPT
└── contigs_intersect_annotation.bed
14.KMERS_INTERSECT
└── kmers_bedtools_intersect_annotation.bed
15.CONTIGS_LFMM_INTERSECT
└── global_intersect_stats
15.CONTIGS_PCADAPT_INTERSECT
└── global_intersect_stats
15.OUTLIERS_LFMM_INTERSECT
├── global_intersect_stats
└── outliers_intersect_stats
15.OUTLIERS_PCADAPT_INTERSECT
├── global_intersect_stats
└── outliers_intersect_stats
REF
├── reference2.fasta
├── reference2.fasta.0123
├── reference2.fasta.amb
├── reference2.fasta.ann
├── reference2.fasta.bwt.2bit.64
├── reference2.fasta.pac
├── reference.fasta
├── reference.fasta.amb
├── reference.fasta.ann
├── reference.fasta.bwt
├── reference.fasta.pac
└── reference.fasta.sa
REPORT
├── iKISS_report.csv
├── iKISS_report.html
├── iKISS_report.ipynb
├── PCA_from_phenotype.html
└── PCA_from_phenotype.ipynb
BENCHMARK
LOGS
Note: we recommend removing the 1.KMERS_MODULE directory after the analysis.
Authors
========
Julie Orjuela (IRD) develops iKISS
Yves Vigouroux (IRD) is the big boss with a lot of ideas and contributions!
Contributors
==============
Djamel Boubred (Bioinformatics Student at IRD) and Tram VI (Ph.D. student, IRD) have also contributed by debugging and testing with rice and Coffea datasets.
Sebastien Ravel has also contributed through the development of the snakecdysis python package.
Thanks
=======
Thanks to Ndomassi Tando (i-Trop IRD) for his administration support.
The authors acknowledge the IRD i-Trop HPC (South Green Platform) from IRD Montpellier for providing HPC resources that contributed to this work. https://bioinfo.ird.fr/ - http://www.southgreen.fr
License
=======
Licensed under MIT.
Intellectual property belongs to IRD and authors.
iKISS uses recycled code from the culebrONT project of SouthGreen platform https://culebront-pipeline.readthedocs.io/en/latest/.
iKISS uses SnakEcdysis package https://snakecdysis.readthedocs.io/en/latest/package.html to perform installation and execution in local and cluster mode.
.. |PythonVersions| image:: https://img.shields.io/badge/python-3.8%2B-blue
    :target: https://www.python.org/downloads
.. |SnakemakeVersions| image:: https://img.shields.io/badge/snakemake-≥5.10.0-brightgreen.svg?style=flat
    :target: https://snakemake.readthedocs.io
.. |Singularity| image:: https://img.shields.io/badge/singularity-≥3.3.0-7E4C74.svg
    :target: https://sylabs.io/docs/
.. |readthedocs| image:: https://pbs.twimg.com/media/E5oBxcRXoAEBSp1.png
    :target: https://culebront-pipeline.readthedocs.io/en/latest/
    :width: 400px
Raw data
{
"_id": null,
"home_page": "",
"name": "ikiss",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "snakemake,kmers,selection,diversity",
"author": "",
"author_email": "\"Julie Orjuela (DIADE-IRD)\" <julie.orjuela@ird.fr>",
"download_url": "https://files.pythonhosted.org/packages/5a/09/d19f21b90d6350bcef81a4cf59badbd7bb8677cdd1f5452431d068194f53/ikiss-1.5.0.tar.gz",
"platform": null,
"description": ".. image:: ./ikiss/logo_ikiss.png\n :width: 400\n :alt: ikiss Logo\n :align: center\n\n\n|PythonVersions| |SnakemakeVersions| |Singularity|\n\n.. contents:: Table of Contents\n :depth: 2\n\n\n**Homepage:** https://forge.ird.fr/diade/iKISS\n\nAbout iKISS \n===============\n\n**iKISS (Kmer Inference sSelection)** is a snakemake pipeline able to decompose reads into kmers and extract kmers under selection. \n\niKISS uses KmersGWAS https://github.com/voichek/kmersGWAS, pcadapt https://cran.r-project.org/web/packages/pcadapt/readme/README.html and lfmm https://bcm-uga.github.io/lfmm/articles/lfmm to select genomics regions under selection.\n\n1. Install dependencies and clone iKISS\n=============================================\n\nCheck dependencies for iKISS : python and singularity\n\nInstall singularity and python3 in your local machine OR use module load to add singularity and python3 in your environment if you are working in a cluster :\n\n.. code-block:: bash\n\n module load system/python/3.8.12\n module load system/singularity/3.6.0\n\n\niKISS is NOW available as a PyPI package (recommended)\n\n.. code-block:: bash\n\n python3 -m pip install ikiss\n\n\nOR you can also install iKISS from git repository\n\n.. code-block:: bash\n\n python3 -m pip install ikiss@git+https://forge.ird.fr/diade/iKISS.git \n \n #OR\n \n git clone https://forge.ird.fr/diade/iKISS.git \n cd iKISS\n python3 -m pip install .\n\n\n1.1 Installing in cluster mode\n-------------------------------\n\nInstall iKISS in cluster mode using **singularity** container from ikiss_utilities https://itrop.ird.fr/ikiss_utilities/\n\n.. code-block:: bash\n\n ikiss install_cluster --help\n ikiss install_cluster --scheduler slurm --env singularity\n \n\n1.2 Installing in local mode \n----------------------------\n\n.. code-block:: bash\n\n ikiss install_local --help\n ikiss install_local\n\n\n2. 
Running a datatest\n=============================================\n\nRunning test with a datatest from iKISS_utilities in a repertory TEST\n\n.. code-block:: bash\n\n ikiss test_install --help\n ikiss test_install -d TEST\n\n\n2.1 In CLUSTER mode\n-------------------\n\nLaunching suggested command line done by iKISS, in CLUSTER mode : \n\nPlease run command line 'ikiss create_cluster_config' before the first run and modify theads, ram, node and computer ressources. \niKISS do a copy of cluster_config.yaml file into your home \"/home/$USER/.config/ikiss/cluster_config.yaml\"\n\n \n.. code-block:: bash\n\n ikiss run_cluster --help\n ikiss create_cluster_config\n\nIf singularity was selected in installation of iKISS, it could be needed to give argument --singularity-args \\\"--bind $HOME\\\" to Snakemake, by using :\n\n.. code-block:: bash\n\n ikiss run_cluster --help\n ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args \"--bind $HOME\"\n # @IFB\n ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args \"--bind /shared:/shared\"\n #you can also use snakemake parametters as --rerun-incomplete --nolock\n\n\n**Important Note** : In i-Trop cluster, run iKISS using ONLY a node, data has to be in \"/scratch\" of chosen node. Use `nodelist : nodeX` parametter inside of cluster_config.yaml file.\n\n\n2.2 In LOCAL mode\n-----------------\n\nlaunching suggested command line done by iKISS, in LOCAL mode: \n\n.. code-block:: bash\n\n ikiss run_local --help\n ikiss run_local -t 8 -c TEST/data_test_config.yaml --singularity-args \"--bind $HOME\"\n\nIn local mode, its possible to allocate threads to some rules using `--set-threads` snakemake argument such as\n\n.. code-block:: bash\n\n ikiss run_local -t 8 -c TEST/data_test_config.yaml --set-threads kmers_gwas_per_sample=4 mapping_kmers=2 filter_bam=2 kmer_position_from_bam=4 pcadapt=2 extract_kmers_from_bed=2\n\n\n3. Running your data\n========================\n\n\n3.1. 
Adapt config.yaml\n------------------------\n\nBefore to run iKISS, adapt `config.yaml` by using : \n\n.. code-block:: bash\n\n ikiss create_config\n\n\nAdapt `config.yaml` file with path to fastq files (FASTQ) and outfile (OUTPUT) in the `DATA` section. \n\n.. code-block:: yaml\n\n DATA:\n FASTQ: './DATATEST/fastq'\n OUTPUT: './OUTPUT-KISS/'\n\n:warning if yours reads are ilumina paired, you need rename reads SAMPLE_R1.fastq.gz and SAMPLE_R2.fastq.gz. For single reads use SAMPLE_R1.fastq.gz\n\niKISS uses compressed ans decompressed fastq files.\n\n\n3.1.1 WORKFLOW section\n-----------------------\n\nParameter iKISS steps using the section WORKFLOW and parameter it with the PARAMS sections.\n\nIn WORKFLOW section:\n\n KMERS_GWAS step has to be activated by default. \n\n PCADAPT, LFMM, MAPPING or ASSEMBLY are optional. Active or deactivate these steps using true or false.\n\n\n**KMERS_GWAS** convert reads in kmers, filter them and create a format ready to use in population genomics!\n\n**PCADAPT** detects genetic markers (kmers here ^^) involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA).\n\n**LFMM** is used by iKISS for testing correlations between kmers and environmental data.\n\n**MAPPING_KMERS** can optionally be used to align kmers to a genomic reference (if it is available ! ).\n\n**ASSEMBLY_KMERS** can optionally assembly significant kmers obtained by pcadapt or lfmm\n\n**INTERSECT** can optionally calculate how many kmers (if MAPPING_KMERS is activated ) or contigs(if ASSEMBLY_KMERS is activated) are found in FEATURES (gene by default) \n\n.. code-block:: yaml\n\n WORKFLOW:\n KMERS_MODULE : true\n PCADAPT : true\n LFMM : true\n MAPPING_KMERS: true\n ASSEMBLY_KMERS: true\n INTERSECT: True\n\n3.1.2 PARAMS section\n--------------------\n\nIn the PARAMS section, tools parameters can be modified and adapted.\n\n\n=> 1. 
KMERS_MODULE\n-------------------\n\nKMERS_GWAS module decompose reads into kmers and create a binary table of presence/absence of kmers. This table can be filter to use only most informative kmers into the populations. PLINK format outfiles are obtained in this module.\n\n.. code-block:: yaml\n\n PARAMS:\n KMERS_MODULE:\n KMER_SIZE : 31\n MAC : 2\n P : 0.2\n MAF : 0.05\n B : 1000000 # nb kmers in each bed file\n SPLIT_LIST_SIZE : 100000\n MIN_LIST_SIZE : 50000\n\n\n**KMER_SIZE** is the length of kmers (should be between 15-31)\n\n**MAC** is the minor allele count (min allowed appearance of a kmer) \n\n**P** is the minimum percent of appearance in each strand form\n\n**MAF** is the minimum allele frequency\n\n**B** is the number of kmers in each bed file\n\n**SPLIT_LIST_SIZE** is the nb of kmers by bed file\n\n**MIN_LIST_SIZE** indicates the minimal number of kmers allowed in the smaller bed file after splitting\n\n\n=> 2. PCADAPT\n--------------\n\nPCADAPT detects kmers involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA)\n\n.. code-block:: yaml\n\n PARAMS: \n PCADAPT:\n K : 2\n SAMPLES: \"samples.txt\"\n CORRECTION: 'FDR'\n ALPHA : 0.05\n\n\n**K** : number K of principal components\n\n**SAMPLES** : you need to generate a *samples.txt* file. This file contains two columns (tab delimitations) : accession_id and phenotype_value. It will be used by PCADAPT.\n\n **accession_id** : contains exactly same name of samples in FASTQ. \n\n **phenotype_value** (int): contains sample group (wild=1, cultivated=2 for example)\n\n.. code-block:: bash\n\n accession_id\tgroup\n Clone12\t2\n Clone14\t2\n Clone16\t2\n Clone20\t2\n Clone2\t1\n Clone4\t1\n Clone8\t1\n\n**CORRECTION**: kmers outliers are obtained using a correction of BONFERONNI, BH or FDR model.\n\n**ALPHA**: modify the alpha cutoff for outlier detection\n\n\n=> 3. 
LFMM\n----------\n\nLFMM is used by iKISS for testing correlations between kmers and environmental data.\n\n.. code-block:: yaml\n\n PARAMS:\n LFMM:\n K : 2\n PHENOTYPE_FILE: \"pheno.txt\"\n PHENOTYPE_PCA_ANALYSIS : false\n CORRECTION: 'BH'\n ALPHA : 0.05\n\n\n**K** are the latent factors used in LFMM association analyses \n\n**PHENOTYPE_FILE**: an phenotype file is obligatory in LFMM analysis. You can give to iKISS PCA results, climate variables, etc.\n\nA PCA can reveal some 'structure' in the genotype data and it could help you to fix K parameter.\n\n**PHENOTYPE_PCA_ANALYSIS** \n\n * If **PHENOTYPE_PCA_ANALYSIS** is true, iKISS automatically run PCA using the file given by user in the PHENOTYPE_FILE key. This PHENOTYPE_FILE can be a PCA result for example.\n\n * If **PHENOTYPE_PCA_ANALYSIS** is false, iKISS use directly the PHENOTYPE_FILE as 'phenotype' to LFMM analysis. Kmers are used as 'genotype' data.\n\nHere, a example of a phenotype file with climate variables\n\n.. code-block:: bash\n\n accession_id\tgroup\tb2.Mean_Diurnal_Range\tb3.Isothermality\tb4.Temp_Seasonality\tb5.Max_Temp_of_Warmest_Month\tb6.Min_Temp_of_Coldest_Month\tb7.Temp_Annual_Range\tb8.Mean_Temp_of\n _Wettest_Quarter\tb9.Mean_Temp_of_Driest_Quarter\tb10.Mean_Temp_of_Warmest_Quarter\tb11.Mean_Temp_of_Coldest_Quarter\tb12.Annual_Precipitation\tb13.Precipitation_of_Wettest_Mo\n nth\tb14.Precipitation_of_Driest_Month\tb15.Precipitation_Seasonality\tb16.Precipitation_of_Wettest_Quarter\tb17.Precipitation_of_Driest_Quarter\tb18.Precipitation_of_Warmest_Quarter\tb19.Precipitation_of_Coldest_Quarter\n Clone12\t2\t99\t68\t1230\t310\t166\t144\t250\t226\t258\t226\t1462\t249\t3\t68\t573\t17\t549\t17\n Clone14\t2\t100\t68\t1235\t301\t155\t146\t241\t217\t248\t217\t1525\t259\t3\t67\t603\t18\t575\t18\n Clone16\t2\t93\t65\t1389\t310\t168\t142\t250\t223\t258\t223\t1416\t264\t0\t73\t579\t8\t544\t8\n Clone20\t2\t154\t55\t3955\t403\t123\t280\t296\t234\t315\t214\t118\t62\t0\t184\t107\t0\t45\t0\n 
Clone2\t1\t152\t55\t3617\t403\t128\t275\t287\t242\t316\t220\t173\t80\t0\t167\t153\t0\t18\t0\n Clone4\t1\t168\t51\t5719\t414\t86\t328\t315\t201\t322\t181\t20\t12\t0\t166\t18\t0\t17\t0\n Clone8\t1\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\n\n\n**CORRECTION**: kmers outliers are obtained using a correction of BONFERONNI, BH or FDR model.\n\n**ALPHA**: modify the alpha cutoff for outlier detection\n\n\n=> 4. MAPPING_KMERS\n-------------------\n\nMAPPING_KMERS section in PARAMS can optionally be used to align kmers to a genomic reference. It could give a idea of selected regions in a genome. \n\n.. code-block:: yaml\n\n PARAMS:\n MAPPING_KMERS:\n REF: \"reference.fasta\"\n MODE : bwa-aln\n INDEX_OPTIONS: \"\"\n OPTIONS : \"-n 0.04\"\n FILTER_FLAG : 4\n FILTER_QUAL : 10\n\n\nUse a reference file in the **REF** section. \n\nParametter **MODE** using *bwa-aln* or *bwa-mem2* \n\nSet up the **INDEX_OPTIONS** according to the MODE you have chosen.\n\n If *bwa-mem2* leaf empty\n \n If *bwa-aln* \"-a bwtsw\" or \"\" \n\nSet options according of chosen mapper in the **OPTIONS** key. \n\n If *bwa-mem2* default parameters -A 1 -B 4;\n \n If *bwa-aln* -n 0.04\n\nObtained bam could be filtered using **FILTER_FLAG** (-F 4 by default) and **FILTER_QUAL** (mapq>10 by defaut) params.\n\n=> 5. ASSEMBLY_KMERS\n--------------------\n\nASSEMBLY_KMERS section in PARAMS can optionally be used to assembly significant kmers obtained by pcadapt or/and lfmm.\n\nContigs are assembled by iKISS using mergeTags from dekupl package https://github.com/Transipedia/dekupl-mergeTags.\n\nChose minimal overlap size \"OVERLAP_SIZE\" allowed to assembly kmers.\n\nFeel free to filter contigs by size \"FILTER_CONTIG_SIZE\".\n\nAssembled contigs could be mapped activating **MAPPING_CONTIGS**. This mapping can be launch versus a **REF** reference file using bwa-mem2 by default.\nReference file used in this step can be a different reference from **MAPPING_KMERS** options. 
Feel free to change the mapping parameters using **MAPPING_OPTIONS**.\n\nAssembled contigs can be searched with blastn against a database; you can also try to annotate them!\n\n.. code-block:: yaml\n\n    PARAMS:\n        ASSEMBLY:\n            OVERLAP_SIZE : 15\n            FILTER_CONTIG_SIZE : 100\n            MAPPING_CONTIGS: True\n            # if MAPPING_CONTIGS is activated, ikiss maps contigs vs REF using bwamem2\n            REF: 'reference.fasta'\n            MAPPING_OPTIONS : \"\"\n\n\n=> 6. INTERSECT\n---------------\n\niKISS uses bedtools intersect to calculate how many kmers/contigs are mapped in **FEATURES** (gene by default).\n\nThese **FEATURES** are extracted from the annotation **GFF** file before running bedtools intersect.\n\niKISS filters kmers/contigs using **FILTER_MAPQ_STATS** and a minimal number of kmers/contigs per FEATURE, **FILTER_MIN_STATS**.\n\n.. code-block:: yaml\n\n    PARAMS:\n        INTERSECT:\n            GFF : 'reference.gff'\n            FEATURE : 'gene'\n            FILTER_MAPQ_STATS: '15'\n\n\n\n3.2. Adapt cluster_config.yaml\n-------------------------------\n\n\nIf you run ikiss on a cluster, adapt `cluster_config.yaml`:\n\n.. code-block:: bash\n\n ikiss edit_cluster_config\n\nInside `cluster_config.yaml`, adapt the partition to your cluster and change the memory and number of CPUs in the `__default__` key or in the rules you need:\n\n.. code-block:: yaml\n\n    __default__:\n        cpus-per-task : 4\n        mem-per-cpu : 10G\n        partition : \"normal\"\n        nodelist: node19\n        output : 'slurm_logs/stdout/{rule}/{wildcards}.o'\n        error : 'slurm_logs/error/{rule}/{wildcards}.e'\n        job-name : '{rule}.{wildcards}'\n\n    kmers_gwas_per_sample:\n        cpus-per-task : 4\n        mem-per-cpu : 10G\n\n\nRULES \n-----\n\nHere is the list of iKISS snakemake rules: \n\n.. 
code-block:: bash\n\n rule kmers_gwas_per_sample *\n rule kmers_to_use\n rule kmers_table\n rule extract_kmers_from_bed\n rule index_ref\n rule index_ref_to_assembly\n rule mapping_kmers\n rule filter_bam\n rule kmer_position_from_bam * \n rule merge_kmer_position\n rule samtools_merge\n rule pcadapt * \n rule merge_method\n rule outliers_position\n rule extracting_features_from_gff\n rule kmers_bedtools_intersect\n rule get_pca_from_phenotype\n rule lfmm * \n rule mergetags\n rule mapping_contigs\n rule contigs_bedtools_intersect\n rule intersect_and_contigs\n rule intersect_and_outliers\n rule fastq_stats\n rule report_ikiss\n rule html_ikiss\n\nRules marked with a `*` can be parallelised.\n\n\n4. Running iKISS\n================\n\nRun iKISS with `ikiss run_local` or `ikiss run_cluster`, as explained in the \"Running a datatest\" section.\n\n\n\n5. iKISS output\n================\n\nThis is an overview of the iKISS output directory:\n\n.. code-block:: bash\n\n OUTPUT-KISS/ \n config_corrected.yaml\n 0.FASTQ_STATS\n \u2514\u2500\u2500 fastq_stats.txt \n 1.KMERS_MODULE\n \u251c\u2500\u2500 Clone12\n \u251c\u2500\u2500 Clone14\n \u251c\u2500\u2500 Clone16\n \u251c\u2500\u2500 Clone2\n \u251c\u2500\u2500 Clone20\n \u251c\u2500\u2500 Clone4\n \u2514\u2500\u2500 Clone8\n 2.KMERS_TABLE\n \u251c\u2500\u2500 kmers_list_paths.txt\n \u251c\u2500\u2500 kmers_table.names\n \u251c\u2500\u2500 kmers_table.table\n \u251c\u2500\u2500 kmers_to_use\n \u251c\u2500\u2500 kmers_to_use.no_pass_kmers\n \u251c\u2500\u2500 kmers_to_use.shareness\n \u251c\u2500\u2500 kmers_to_use.stats.both\n \u251c\u2500\u2500 kmers_to_use.stats.only_canonical\n \u2514\u2500\u2500 kmers_to_use.stats.only_non_canonical\n 3.TABLE2BED\n \u251c\u2500\u2500 log\n \u251c\u2500\u2500 output_file.0.bed\n \u251c\u2500\u2500 output_file.0.bim\n \u251c\u2500\u2500 output_file.0.fam\n \u251c\u2500\u2500 output_file.1.bed\n \u251c\u2500\u2500 output_file.1.bim\n \u251c\u2500\u2500 output_file.1.fam\n \u251c\u2500\u2500 
output_file.2.bed\n \u251c\u2500\u2500 output_file.2.bim\n \u251c\u2500\u2500 output_file.2.fam\n \u251c\u2500\u2500 output_file.3.bed\n \u251c\u2500\u2500 output_file.3.bim\n \u251c\u2500\u2500 output_file.3.fam\n \u251c\u2500\u2500 output_file.4.bed\n \u251c\u2500\u2500 output_file.4.bim\n \u2514\u2500\u2500 output_file.4.fam\n 4.EXTRACT_FASTA\n \u251c\u2500\u2500 output_file.0.fasta.gz\n \u251c\u2500\u2500 output_file.1.fasta.gz\n \u251c\u2500\u2500 output_file.2.fasta.gz\n \u251c\u2500\u2500 output_file.3.fasta.gz\n \u2514\u2500\u2500 output_file.4.fasta.gz\n 5.RANGES\n \u251c\u2500\u2500 output_file.0\n \u251c\u2500\u2500 output_file.1\n \u251c\u2500\u2500 output_file.2\n \u251c\u2500\u2500 output_file.3\n \u2514\u2500\u2500 output_file.4\n 6.LFMM\n \u251c\u2500\u2500 output_file.0_10_LFMM_outliers.csv\n \u251c\u2500\u2500 output_file.0_10_LFMM_pvalues.csv\n \u251c\u2500\u2500 output_file.0_10_LFMM.rplot.pdf\n ...\n 6.LFMM_PHENO\n \u251c\u2500\u2500 PCA_from_phenotype.csv\n \u251c\u2500\u2500 PCA_from_phenotype.html\n \u2514\u2500\u2500 PCA_from_phenotype.ipynb\n 6.PCADAPT\n \u251c\u2500\u2500 output_file.0_10_PCADAPT_outliers.csv\n \u251c\u2500\u2500 output_file.0_10_PCADAPT_pvalues.csv\n \u251c\u2500\u2500 output_file.0_10_PCADAPT.rplot.pdf\n \u251c\u2500\u2500 output_file.0_10_PCADAPT_scores.csv\n ... 
\n 7.MERGED_LFMM\n \u251c\u2500\u2500 merged_LFMM_outliers.csv\n \u2514\u2500\u2500 merged_LFMM_pvalues.csv\n 7.MERGED_PCADAPT\n \u251c\u2500\u2500 merged_PCADAPT_outliers.csv\n \u2514\u2500\u2500 merged_PCADAPT_pvalues.csv\n 8.MAPPING_KMERS\n \u251c\u2500\u2500 bam_files.txt\n \u251c\u2500\u2500 output_file.0_vs_reference.bam\n \u251c\u2500\u2500 output_file.0_vs_reference_FMQ.bam\n \u251c\u2500\u2500 output_file.0_vs_reference.sai\n \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam\n \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam.bai\n \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam.idxstats\n \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam.stats\n ...\n 9.KMERPOSITION\n \u251c\u2500\u2500 output_file.0_vs_reference_KMERPOSITION.txt\n \u251c\u2500\u2500 output_file.1_vs_reference_KMERPOSITION.txt\n \u251c\u2500\u2500 output_file.2_vs_reference_KMERPOSITION.txt\n \u251c\u2500\u2500 output_file.3_vs_reference_KMERPOSITION.txt\n \u2514\u2500\u2500 output_file.4_vs_reference_KMERPOSITION.txt \n 10.MERGE_KMERPOSITION\n \u251c\u2500\u2500 kmer_position_merged.txt\n \u2514\u2500\u2500 kmer_position_samtools_merge.bam\n 11.OUTLIERS_LFMM_POSITION\n \u2514\u2500\u2500 outliers_with_position.csv\n 11.OUTLIERS_PCADAPT_POSITION\n \u2514\u2500\u2500 outliers_with_position.csv\n 12.ASSEMBLY_OUTLIERS_LFMM\n \u251c\u2500\u2500 contigs_LFMM_vs_reference.bam\n \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam\n \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam.bai\n \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam.idxstats\n \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam.stats\n \u251c\u2500\u2500 outliers_LFMM_mergetags.csv\n \u2514\u2500\u2500 outliers_LFMM_mergetags.fasta\n 12.ASSEMBLY_OUTLIERS_PCADAPT\n \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.bam\n \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.sorted.bam\n \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.sorted.bam.bai\n \u251c\u2500\u2500 
contigs_PCADAPT_vs_reference.sorted.bam.idxstats\n \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.sorted.bam.stats\n \u251c\u2500\u2500 outliers_PCADAPT_mergetags.csv\n \u2514\u2500\u2500 outliers_PCADAPT_mergetags.fasta\n 13.GFF_FEATURES\n \u2514\u2500\u2500 extracted.gff\n 14.CONTIGS_INTERSECT_LFMM\n \u2514\u2500\u2500 contigs_intersect_annotation.bed\n 14.CONTIGS_INTERSECT_PCADAPT\n \u2514\u2500\u2500 contigs_intersect_annotation.bed\n 14.KMERS_INTERSECT\n \u2514\u2500\u2500 kmers_bedtools_intersect_annotation.bed\n 15.CONTIGS_LFMM_INTERSECT\n \u2514\u2500\u2500 global_intersect_stats\n 15.CONTIGS_PCADAPT_INTERSECT\n \u2514\u2500\u2500 global_intersect_stats\n 15.OUTLIERS_LFMM_INTERSECT\n \u251c\u2500\u2500 global_intersect_stats\n \u2514\u2500\u2500 outliers_intersect_stats\n 15.OUTLIERS_PCADAPT_INTERSECT\n \u251c\u2500\u2500 global_intersect_stats\n \u2514\u2500\u2500 outliers_intersect_stats\n REF\n \u251c\u2500\u2500 reference2.fasta\n \u251c\u2500\u2500 reference2.fasta.0123\n \u251c\u2500\u2500 reference2.fasta.amb\n \u251c\u2500\u2500 reference2.fasta.ann\n \u251c\u2500\u2500 reference2.fasta.bwt.2bit.64\n \u251c\u2500\u2500 reference2.fasta.pac\n \u251c\u2500\u2500 reference.fasta\n \u251c\u2500\u2500 reference.fasta.amb\n \u251c\u2500\u2500 reference.fasta.ann\n \u251c\u2500\u2500 reference.fasta.bwt\n \u251c\u2500\u2500 reference.fasta.pac\n \u2514\u2500\u2500 reference.fasta.sa\n REPORT\n \u251c\u2500\u2500 iKISS_report.csv\n \u251c\u2500\u2500 iKISS_report.html\n \u251c\u2500\u2500 iKISS_report.ipynb\n \u251c\u2500\u2500 PCA_from_phenotype.html\n \u2514\u2500\u2500 PCA_from_phenotype.ipynb\n BENCHMARK\n LOGS\n\n\nNote: we recommend removing the 1.KMER_GWAS directory after the analysis.\n\nAuthors\n========\n\nJulie Orjuela (IRD) develops iKISS.\n\nYves Vigouroux (IRD) is the big boss with a lot of ideas and contributions! 
\n\nContributors \n==============\n\nDjamel Boubred (Bioinformatics Student at IRD) and Tram VI (Ph.D student IRD) have also contributed by debugging and testing with rice and coffea datasets. \n\nSebastien Ravel has also contributed through the development of the snakecdysis python package.\n\nThanks\n=======\n\nThanks to Ndomassi Tando (i-Trop IRD) for his administration support.\n\nThe authors acknowledge the IRD i-Trop HPC (South Green Platform) from IRD Montpellier for providing HPC resources that contributed to this work. https://bioinfo.ird.fr/ - http://www.southgreen.fr\n \nLicense\n=======\n\nLicensed under MIT.\n\nIntellectual property belongs to IRD and the authors.\n\niKISS reuses code from the culebrONT project of the SouthGreen platform https://culebront-pipeline.readthedocs.io/en/latest/.\niKISS uses the SnakEcdysis package https://snakecdysis.readthedocs.io/en/latest/package.html to perform installation and execution in local and cluster mode.\n\n.. |PythonVersions| image:: https://img.shields.io/badge/python-3.7%2B-blue\n :target: https://www.python.org/downloads\n.. |SnakemakeVersions| image:: https://img.shields.io/badge/snakemake-\u22655.10.0-brightgreen.svg?style=flat\n :target: https://snakemake.readthedocs.io\n.. |Singularity| image:: https://img.shields.io/badge/singularity-\u22653.3.0-7E4C74.svg\n :target: https://sylabs.io/docs/\n.. |readthedocs| image:: https://pbs.twimg.com/media/E5oBxcRXoAEBSp1.png\n :target: https://culebront-pipeline.readthedocs.io/en/latest/\n :width: 400px\n\n\n",
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2022 DIADE IRD / Julie Orjuela, Yves Vigouroux Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
"summary": "iKISS is a pipeline to detect kmers under selection.",
"version": "1.5.0",
"project_urls": {
"Bug Tracker": "https://forge.ird.fr/diade/iKISS/-/issues",
"Documentation": "https://forge.ird.fr/diade/iKISS/-/blob/master/README.rst",
"Downloads": "https://forge.ird.fr/diade/iKISS/-/releases/",
"Homepage": "https://forge.ird.fr/diade/iKISS.git",
"Source Code": "https://forge.ird.fr/diade/iKISS.git",
"repository": "https://forge.ird.fr/diade/iKISS.git"
},
"split_keywords": [
"snakemake",
"kmers",
"selection",
"diversity"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e31479d540fd25d0f40e66272dcac9f312098cd09d3220969ed61e441a660d9b",
"md5": "4d92ac494c1de9d2a6c4bdc23e4beaa3",
"sha256": "18359fe0394d47804679a0c53a0d7aca168b3817e0556d1def557871fa526b7a"
},
"downloads": -1,
"filename": "ikiss-1.5.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4d92ac494c1de9d2a6c4bdc23e4beaa3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 3172331,
"upload_time": "2023-09-05T10:56:46",
"upload_time_iso_8601": "2023-09-05T10:56:46.855547Z",
"url": "https://files.pythonhosted.org/packages/e3/14/79d540fd25d0f40e66272dcac9f312098cd09d3220969ed61e441a660d9b/ikiss-1.5.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5a09d19f21b90d6350bcef81a4cf59badbd7bb8677cdd1f5452431d068194f53",
"md5": "ef87b126ec539a41262a69009e2706ed",
"sha256": "551f512cfd1d03b880f77110e216f924bbb22905db9e5a406b24c3dbbaf699b2"
},
"downloads": -1,
"filename": "ikiss-1.5.0.tar.gz",
"has_sig": false,
"md5_digest": "ef87b126ec539a41262a69009e2706ed",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 3168548,
"upload_time": "2023-09-05T10:56:51",
"upload_time_iso_8601": "2023-09-05T10:56:51.643008Z",
"url": "https://files.pythonhosted.org/packages/5a/09/d19f21b90d6350bcef81a4cf59badbd7bb8677cdd1f5452431d068194f53/ikiss-1.5.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-05 10:56:51",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "ikiss"
}