ikiss


:Name: ikiss
:Version: 1.5.0
:Summary: iKISS is a pipeline to detect kmers under selection.
:Upload time: 2023-09-05 10:56:51
:Requires Python: >=3.8
:License: MIT License, Copyright (c) 2022 DIADE IRD / Julie Orjuela, Yves Vigouroux
:Keywords: snakemake, kmers, selection, diversity

.. image:: ./ikiss/logo_ikiss.png
   :width: 400
   :alt: ikiss Logo
   :align: center


|PythonVersions| |SnakemakeVersions| |Singularity|

.. contents:: Table of Contents
    :depth: 2


**Homepage:** https://forge.ird.fr/diade/iKISS

About iKISS 
===============

**iKISS (Kmer Inference sSelection)** is a Snakemake pipeline that decomposes reads into kmers and extracts kmers under selection.

iKISS uses KmersGWAS https://github.com/voichek/kmersGWAS, pcadapt https://cran.r-project.org/web/packages/pcadapt/readme/README.html and lfmm https://bcm-uga.github.io/lfmm/articles/lfmm to identify genomic regions under selection.

1. Install dependencies and clone iKISS
=============================================

iKISS depends on python3 and singularity.

Install singularity and python3 on your local machine or, if you are working on a cluster, use module load to add singularity and python3 to your environment:

.. code-block:: bash

   module load system/python/3.8.12
   module load system/singularity/3.6.0


iKISS is now available as a PyPI package (recommended):

.. code-block:: bash

   python3 -m pip install ikiss


Alternatively, you can install iKISS from the git repository:

.. code-block:: bash

   python3 -m pip install ikiss@git+https://forge.ird.fr/diade/iKISS.git 
   
   #OR
   
   git clone https://forge.ird.fr/diade/iKISS.git 
   cd iKISS
   python3 -m pip install .


1.1 Installing in cluster mode
-------------------------------

Install iKISS in cluster mode using the **singularity** container from ikiss_utilities https://itrop.ird.fr/ikiss_utilities/:

.. code-block:: bash

   ikiss install_cluster --help
   ikiss install_cluster --scheduler slurm --env singularity
   

1.2 Installing in local mode 
----------------------------

.. code-block:: bash

   ikiss install_local --help
   ikiss install_local


2. Running a datatest
=============================================

Run the test with the datatest from iKISS_utilities, in a directory named TEST:

.. code-block:: bash

   ikiss test_install --help
   ikiss test_install -d TEST


2.1 In CLUSTER mode
-------------------

Launch the command line suggested by iKISS, in CLUSTER mode.

Before the first run, please run 'ikiss create_cluster_config' and adjust threads, RAM, node and other computing resources.
iKISS copies the cluster_config.yaml file into your home directory, at "/home/$USER/.config/ikiss/cluster_config.yaml".

   
.. code-block:: bash

   ikiss run_cluster --help
   ikiss create_cluster_config

If singularity was selected when installing iKISS, you may need to pass the argument --singularity-args "--bind $HOME" to Snakemake:

.. code-block:: bash

   ikiss run_cluster --help
   ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args "--bind $HOME"
   # @IFB
   ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args "--bind /shared:/shared"
   # you can also pass Snakemake parameters such as --rerun-incomplete --nolock


**Important Note** : On the i-Trop cluster, run iKISS on a single node ONLY; the data has to be in "/scratch" of the chosen node. Use the `nodelist : nodeX` parameter inside the cluster_config.yaml file.


2.2 In LOCAL mode
-----------------

Launch the command line suggested by iKISS, in LOCAL mode:

.. code-block:: bash

   ikiss run_local --help
   ikiss run_local -t 8 -c TEST/data_test_config.yaml --singularity-args "--bind $HOME"

In local mode, it is possible to allocate threads to specific rules using the `--set-threads` Snakemake argument, for example:

.. code-block:: bash

    ikiss run_local -t 8 -c TEST/data_test_config.yaml --set-threads kmers_gwas_per_sample=4 mapping_kmers=2 filter_bam=2 kmer_position_from_bam=4 pcadapt=2 extract_kmers_from_bed=2


3. Running your data
========================


3.1. Adapt config.yaml
------------------------

Before running iKISS, generate a `config.yaml` file to adapt, using:

.. code-block:: bash

   ikiss create_config


Adapt the `config.yaml` file with the path to the fastq files (FASTQ) and the output directory (OUTPUT) in the `DATA` section.

.. code-block:: yaml

   DATA:
      FASTQ: './DATATEST/fastq'
      OUTPUT: './OUTPUT-KISS/'

**Warning**: if your reads are Illumina paired-end, you need to rename them SAMPLE_R1.fastq.gz and SAMPLE_R2.fastq.gz. For single-end reads, use SAMPLE_R1.fastq.gz.

iKISS accepts both compressed and uncompressed fastq files.
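
The naming convention above can be checked before a run. The following is only a sketch and not part of iKISS: `group_fastq` is a hypothetical helper, and the uncompressed extension `.fastq` is an assumption (the text only names the `.fastq.gz` form).

.. code-block:: python

   import re

   # Assumed convention: SAMPLE_R1.fastq(.gz) and SAMPLE_R2.fastq(.gz).
   FASTQ_NAME = re.compile(r"^(.+)_R([12])\.fastq(\.gz)?$")


   def group_fastq(filenames):
       """Return ({sample: sorted mates}, [names that do not follow the convention])."""
       samples = {}
       unexpected = []
       for name in filenames:
           match = FASTQ_NAME.match(name)
           if match is None:
               unexpected.append(name)
               continue
           samples.setdefault(match.group(1), []).append(int(match.group(2)))
       return {s: sorted(m) for s, m in samples.items()}, unexpected


   names = ["Clone12_R1.fastq.gz", "Clone12_R2.fastq.gz", "Clone2_R1.fastq.gz", "Clone4.fq.gz"]
   samples, unexpected = group_fastq(names)
   print(samples)      # samples with mates [1, 2] are paired, [1] are single
   print(unexpected)   # files that would need renaming

In practice the list of names could come from `os.listdir` on the FASTQ directory.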


3.1.1 WORKFLOW section
-----------------------

Choose the iKISS steps to run in the WORKFLOW section, and tune each of them in the PARAMS section.

In the WORKFLOW section:

   The KMERS_GWAS step (`KMERS_MODULE` key in the configuration) must always be activated.

   PCADAPT, LFMM, MAPPING_KMERS, ASSEMBLY_KMERS and INTERSECT are optional. Activate or deactivate these steps using true or false.


**KMERS_GWAS** converts reads into kmers, filters them and creates a format ready to use in population genomics.

**PCADAPT** detects genetic markers (kmers here) involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA).

**LFMM** is used by iKISS for testing correlations between kmers and environmental data.

**MAPPING_KMERS** can optionally be used to align kmers to a genomic reference (if one is available).

**ASSEMBLY_KMERS** can optionally assemble significant kmers obtained by pcadapt or lfmm.

**INTERSECT** can optionally calculate how many kmers (if MAPPING_KMERS is activated) or contigs (if ASSEMBLY_KMERS is activated) are found in FEATURES (gene by default).

.. code-block:: yaml

   WORKFLOW:
      KMERS_MODULE : true
      PCADAPT : true
      LFMM : true
      MAPPING_KMERS: true
      ASSEMBLY_KMERS: true
      INTERSECT: True
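
The dependencies between steps described in this section can be checked before launching a run. This is a hedged sketch, not iKISS code: `check_workflow` is a hypothetical helper, and it assumes the WORKFLOW section has already been loaded into a plain dictionary.

.. code-block:: python

   def check_workflow(workflow):
       """Return a list of problems found in a WORKFLOW mapping."""
       problems = []
       # The text requires the kmer module to be activated.
       if not workflow.get("KMERS_MODULE", False):
           problems.append("KMERS_MODULE must be activated")
       # INTERSECT counts mapped kmers or mapped contigs, so it needs one of them.
       if workflow.get("INTERSECT", False) and not (
           workflow.get("MAPPING_KMERS", False) or workflow.get("ASSEMBLY_KMERS", False)
       ):
           problems.append("INTERSECT needs MAPPING_KMERS or ASSEMBLY_KMERS")
       return problems


   workflow = {
       "KMERS_MODULE": True,
       "PCADAPT": True,
       "LFMM": False,
       "MAPPING_KMERS": False,
       "ASSEMBLY_KMERS": False,
       "INTERSECT": True,
   }
   print(check_workflow(workflow))

An empty list means that no inconsistency was detected by these two checks; iKISS itself may apply other rules.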

3.1.2 PARAMS section
--------------------

In the PARAMS section, the parameters of each tool can be modified and adapted.


=> 1. KMERS_MODULE
-------------------

The KMERS_GWAS module decomposes reads into kmers and creates a binary presence/absence table of kmers. This table can be filtered to keep only the most informative kmers across the populations. This module produces output files in PLINK format.

.. code-block:: yaml

   PARAMS:
      KMERS_MODULE:
         KMER_SIZE : 31
         MAC : 2
         P : 0.2
         MAF : 0.05
         B : 1000000 # nb kmers in each bed file
         SPLIT_LIST_SIZE : 100000
         MIN_LIST_SIZE : 50000


**KMER_SIZE** is the length of kmers (should be between 15 and 31)

**MAC** is the minor allele count (minimum allowed number of appearances of a kmer)

**P** is the minimum percent of appearance in each strand form

**MAF** is the minimum minor allele frequency

**B** is the number of kmers in each bed file

**SPLIT_LIST_SIZE** is the number of kmers per bed file

**MIN_LIST_SIZE** is the minimal number of kmers allowed in the smallest bed file after splitting
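
To make the MAC and MAF thresholds concrete, here is a small sketch on a single presence/absence vector. It is an illustration only: `passes_mac_maf` is a hypothetical name, and the sketch assumes that a kmer is kept when its minor count reaches MAC and its minor frequency reaches MAF. The exact rule applied by KmersGWAS may differ.

.. code-block:: python

   def passes_mac_maf(presence, mac=2, maf=0.05):
       """presence: list of 0/1 values, one per sample."""
       n = len(presence)
       if n == 0:
           return False
       present = sum(presence)
       # The minor state is the rarer of the two (kmer present / kmer absent).
       minor_count = min(present, n - present)
       return minor_count >= mac and minor_count / n >= maf


   print(passes_mac_maf([1, 1, 0, 0, 0, 0, 0]))  # minor count 2 out of 7 samples
   print(passes_mac_maf([1, 0, 0, 0, 0, 0, 0]))  # minor count 1, below MAC=2

With few samples, MAC is usually the binding threshold; MAF takes over as the number of samples grows.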


=> 2. PCADAPT
--------------

PCADAPT detects kmers involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA)

.. code-block:: yaml

   PARAMS:        
      PCADAPT:
         K : 2
         SAMPLES: "samples.txt"
         CORRECTION: 'FDR'
         ALPHA : 0.05


**K** : number of principal components

**SAMPLES** : you need to provide a *samples.txt* file. This tab-delimited file contains two columns: accession_id and phenotype_value (named `group` in the example below). It is used by PCADAPT.

   **accession_id** : must match exactly the sample names in FASTQ.

   **phenotype_value** (int): the sample group (for example wild=1, cultivated=2)

.. code-block:: bash

   accession_id	group
   Clone12	2
   Clone14	2
   Clone16	2
   Clone20	2
   Clone2	1
   Clone4	1
   Clone8	1

**CORRECTION**: kmer outliers are selected after a multiple-testing correction, one of BONFERONNI, BH or FDR.

**ALPHA**: the alpha cutoff used for outlier detection
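
The *samples.txt* layout described in this section (two tab-delimited columns, integer group) can be validated with a few lines of Python. This is a sketch under stated assumptions, not iKISS code: `read_samples` is a hypothetical helper, and the header check assumes the first column is named `accession_id` as in the example.

.. code-block:: python

   import csv
   import io


   def read_samples(handle):
       """Return {accession_id: group} from a tab-delimited samples file."""
       reader = csv.reader(handle, delimiter="\t")
       header = next(reader)
       if len(header) != 2 or header[0] != "accession_id":
           raise ValueError(f"unexpected header: {header}")
       groups = {}
       for row in reader:
           if not row:
               continue  # tolerate blank lines
           if len(row) != 2:
               raise ValueError(f"expected 2 columns, got {row}")
           accession, group = row
           groups[accession] = int(group)  # raises ValueError if not an integer
       return groups


   example = "accession_id\tgroup\nClone12\t2\nClone2\t1\n"
   print(read_samples(io.StringIO(example)))

For a real file, pass `open("samples.txt", newline="")` instead of the `io.StringIO` object, then compare the keys with the sample names found in the FASTQ directory.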


=> 3. LFMM
----------

LFMM is used by iKISS for testing correlations between kmers and environmental data.

.. code-block:: yaml

   PARAMS:
      LFMM:
         K : 2
         PHENOTYPE_FILE: "pheno.txt"
         PHENOTYPE_PCA_ANALYSIS : false
         CORRECTION: 'BH'
         ALPHA : 0.05


**K** is the number of latent factors used in LFMM association analyses

**PHENOTYPE_FILE**: a phenotype file is mandatory for the LFMM analysis. You can give iKISS PCA results, climate variables, etc.

A PCA can reveal some 'structure' in the genotype data, which can help you choose the K parameter.

**PHENOTYPE_PCA_ANALYSIS** 

   * If **PHENOTYPE_PCA_ANALYSIS** is true, iKISS automatically runs a PCA on the file given by the user in the PHENOTYPE_FILE key. This PHENOTYPE_FILE can be a PCA result for example.

   * If **PHENOTYPE_PCA_ANALYSIS** is false, iKISS uses the PHENOTYPE_FILE directly as the 'phenotype' for the LFMM analysis. Kmers are used as 'genotype' data.

Here is an example of a phenotype file with climate variables:

.. code-block:: bash

    accession_id	group	b2.Mean_Diurnal_Range	b3.Isothermality	b4.Temp_Seasonality	b5.Max_Temp_of_Warmest_Month	b6.Min_Temp_of_Coldest_Month	b7.Temp_Annual_Range	b8.Mean_Temp_of_Wettest_Quarter	b9.Mean_Temp_of_Driest_Quarter	b10.Mean_Temp_of_Warmest_Quarter	b11.Mean_Temp_of_Coldest_Quarter	b12.Annual_Precipitation	b13.Precipitation_of_Wettest_Month	b14.Precipitation_of_Driest_Month	b15.Precipitation_Seasonality	b16.Precipitation_of_Wettest_Quarter	b17.Precipitation_of_Driest_Quarter	b18.Precipitation_of_Warmest_Quarter	b19.Precipitation_of_Coldest_Quarter
    Clone12	2	99	68	1230	310	166	144	250	226	258	226	1462	249	3	68	573	17	549	17
    Clone14	2	100	68	1235	301	155	146	241	217	248	217	1525	259	3	67	603	18	575	18
    Clone16	2	93	65	1389	310	168	142	250	223	258	223	1416	264	0	73	579	8	544	8
    Clone20	2	154	55	3955	403	123	280	296	234	315	214	118	62	0	184	107	0	45	0
    Clone2	1	152	55	3617	403	128	275	287	242	316	220	173	80	0	167	153	0	18	0
    Clone4	1	168	51	5719	414	86	328	315	201	322	181	20	12	0	166	18	0	17	0
    Clone8	1	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA
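
A phenotype file like the one above can be sanity-checked before the run, for example to spot samples with missing values such as Clone8. This is a sketch only: `samples_with_missing_values` is a hypothetical helper, and how iKISS itself treats `NA` entries is not stated here.

.. code-block:: python

   import csv
   import io


   def samples_with_missing_values(handle, missing="NA"):
       """Return accession ids whose row contains at least one missing value."""
       reader = csv.reader(handle, delimiter="\t")
       header = next(reader)
       flagged = []
       for row in reader:
           if not row:
               continue  # tolerate blank lines
           if len(row) != len(header):
               raise ValueError(f"{row[0]}: {len(row)} fields, header has {len(header)}")
           if missing in row[1:]:
               flagged.append(row[0])
       return flagged


   example = (
       "accession_id\tgroup\tb2.Mean_Diurnal_Range\tb3.Isothermality\n"
       "Clone12\t2\t99\t68\n"
       "Clone8\t1\tNA\tNA\n"
   )
   print(samples_with_missing_values(io.StringIO(example)))

The field-count check also catches header lines that were accidentally wrapped or truncated.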


**CORRECTION**: kmer outliers are selected after a multiple-testing correction, one of BONFERONNI, BH or FDR.

**ALPHA**: the alpha cutoff used for outlier detection
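
To illustrate how CORRECTION and ALPHA interact, the sketch below applies the standard Benjamini-Hochberg (BH) step-up procedure to a list of p-values. It shows the general technique only; `benjamini_hochberg` is a hypothetical function and is not the code that iKISS runs.

.. code-block:: python

   def benjamini_hochberg(pvalues, alpha=0.05):
       """Return the indices of p-values declared significant at level alpha."""
       m = len(pvalues)
       order = sorted(range(m), key=lambda i: pvalues[i])
       # Find the largest rank k (1-based) such that p_(k) <= k / m * alpha.
       cutoff_rank = 0
       for rank, index in enumerate(order, start=1):
           if pvalues[index] <= rank / m * alpha:
               cutoff_rank = rank
       # Every p-value ranked at or below the cutoff is reported as an outlier.
       return sorted(order[:cutoff_rank])


   pvalues = [0.001, 0.045, 0.035, 0.2, 0.0005]
   print(benjamini_hochberg(pvalues, alpha=0.05))

Lowering ALPHA makes the per-rank thresholds stricter, so fewer kmers are reported as outliers.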


=> 4. MAPPING_KMERS
-------------------

The MAPPING_KMERS section in PARAMS can optionally be used to align kmers to a genomic reference. This can give an idea of the selected regions in a genome.

.. code-block:: yaml

   PARAMS:
      MAPPING_KMERS:
         REF: "reference.fasta"
         MODE : bwa-aln
         INDEX_OPTIONS: ""
         OPTIONS : "-n 0.04"
         FILTER_FLAG : 4
         FILTER_QUAL : 10


Provide a reference file in the **REF** key.

Set the **MODE** parameter to *bwa-aln* or *bwa-mem2*.

Set the **INDEX_OPTIONS** according to the MODE you have chosen.

   If *bwa-mem2*, leave it empty
   
   If *bwa-aln*, use "-a bwtsw" or ""

Set the options of the chosen mapper in the **OPTIONS** key.

   If *bwa-mem2*, the default parameters are -A 1 -B 4
   
   If *bwa-aln*, use -n 0.04

The resulting bam can be filtered using the **FILTER_FLAG** (-F 4 by default) and **FILTER_QUAL** (mapq>10 by default) parameters.
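
The effect of the two filters can be sketched on single alignment records. This assumes that FILTER_FLAG and FILTER_QUAL are passed to `samtools view` as `-F` and `-q`, which the "-F 4" notation suggests but which is not confirmed here; `keep_alignment` is a hypothetical helper. Note that `samtools view -q` keeps records with a mapping quality greater than or equal to the threshold.

.. code-block:: python

   def keep_alignment(flag, mapq, filter_flag=4, filter_qual=10):
       """Mimic `samtools view -F filter_flag -q filter_qual` for one record."""
       # -F drops records that have any of the given flag bits set (4 = unmapped).
       if flag & filter_flag:
           return False
       # -q drops records whose mapping quality is below the threshold.
       return mapq >= filter_qual


   print(keep_alignment(flag=0, mapq=37))   # mapped, good quality
   print(keep_alignment(flag=4, mapq=0))    # unmapped
   print(keep_alignment(flag=16, mapq=5))   # reverse strand, low quality

Because kmers are short, multi-mapping kmers tend to receive a low mapping quality, so FILTER_QUAL mostly removes ambiguous placements.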

=> 5. ASSEMBLY_KMERS
--------------------

The ASSEMBLY_KMERS section in PARAMS can optionally be used to assemble significant kmers obtained by pcadapt and/or lfmm.

Contigs are assembled by iKISS using mergeTags from the dekupl package https://github.com/Transipedia/dekupl-mergeTags.

Choose the minimal overlap size "OVERLAP_SIZE" allowed to assemble kmers.

Contigs can be filtered by size with "FILTER_CONTIG_SIZE".

Assembled contigs can be mapped by activating **MAPPING_CONTIGS**. This mapping is run against a **REF** reference file, using bwa-mem2 by default.
The reference file used in this step can be different from the one given in the **MAPPING_KMERS** options. Mapping parameters can be changed using **MAPPING_OPTIONS**.

Assembled contigs can also be searched against a database with blastn, or annotated.

.. code-block:: yaml

   PARAMS:
      ASSEMBLY:
         OVERLAP_SIZE : 15
         FILTER_CONTIG_SIZE : 100
         MAPPING_CONTIGS: True
         # if MAPPING_CONTIGS is activated, ikiss maps contigs vs REF using bwa-mem2
         REF: 'reference.fasta'
         MAPPING_OPTIONS : ""
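
The roles of OVERLAP_SIZE and FILTER_CONTIG_SIZE can be illustrated with a toy example. This is not the mergeTags algorithm: both functions are hypothetical, the merge only handles one suffix/prefix pair, and FILTER_CONTIG_SIZE is assumed to be a minimum length.

.. code-block:: python

   def merge_if_overlap(left, right, overlap_size=15):
       """Merge two sequences when a suffix of `left` equals a prefix of `right`.

       Returns the merged sequence, or None when no overlap of at least
       `overlap_size` bases exists.
       """
       longest = min(len(left), len(right))
       for size in range(longest, overlap_size - 1, -1):
           if left[-size:] == right[:size]:
               return left + right[size:]
       return None


   def filter_contigs(contigs, min_size=100):
       """Keep contigs at least `min_size` bases long."""
       return [c for c in contigs if len(c) >= min_size]


   a = "ACGTACGTTAGC"
   b = "GTTAGCCATG"
   print(merge_if_overlap(a, b, overlap_size=5))  # 6-base overlap is accepted
   print(merge_if_overlap(a, b, overlap_size=8))  # overlap too short, no merge

A larger OVERLAP_SIZE gives fewer but more reliable merges, since short overlaps are more likely to occur by chance.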


=> 6. INTERSECT
---------------

iKISS uses bedtools intersect to calculate how many kmers/contigs are mapped in **FEATURES** (gene by default).

These **FEATURES** are extracted from the annotation **GFF** file before running bedtools intersect.

iKISS filters kmers/contigs using **FILTER_MAPQ_STATS** and a minimal number of kmers/contigs per FEATURE, **FILTER_MIN_STATS**.

.. code-block:: yaml

   PARAMS:
      INTERSECT:
            GFF : 'reference.gff'
            FEATURE : 'gene'
            FILTER_MAPQ_STATS: '15'              
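
Extracting one feature type from a GFF file amounts to filtering on its third column. The sketch below shows the idea on a few in-memory lines; `extract_features` is a hypothetical helper and the records are made-up examples, not iKISS output.

.. code-block:: python

   def extract_features(gff_lines, feature="gene"):
       """Yield GFF records whose third column (feature type) equals `feature`."""
       for line in gff_lines:
           if line.startswith("#") or not line.strip():
               continue  # skip comments, directives and blank lines
           columns = line.rstrip("\n").split("\t")
           if len(columns) >= 3 and columns[2] == feature:
               yield columns


   gff = [
       "##gff-version 3",
       "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=gene1",
       "chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
       "chr1\tsrc\tgene\t1500\t2100\t.\t-\t.\tID=gene2",
   ]
   print([record[8] for record in extract_features(gff)])

Changing `feature` to another type present in the annotation (for instance `mRNA`) selects those records instead.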



3.2. Adapt cluster_config.yaml
-------------------------------


If you run iKISS on a cluster, adapt `cluster_config.yaml`:

.. code-block:: bash

   ikiss edit_cluster_config

Inside `cluster_config.yaml`, adapt the partition to your cluster and change the memory and the number of CPUs in the `__default__` key or in the rules that need it:

.. code-block:: yaml

   __default__:
      cpus-per-task : 4
      mem-per-cpu : 10G
      partition : "normal"
      nodelist: node19
      output : 'slurm_logs/stdout/{rule}/{wildcards}.o'
      error : 'slurm_logs/error/{rule}/{wildcards}.e'
      job-name : '{rule}.{wildcards}'
      
   kmers_gwas_per_sample:
      cpus-per-task : 4
      mem-per-cpu : 10G
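
In the usual Snakemake cluster-configuration convention, the `__default__` entry applies to every rule and a rule-specific entry overrides individual keys. The sketch below mimics that lookup with plain dictionaries; `resources_for` is a hypothetical helper, the override values are made up, and it is assumed (not confirmed here) that iKISS follows this convention.

.. code-block:: python

   def resources_for(rule, cluster_config):
       """Return the settings applied to `rule`: defaults overridden by rule keys."""
       settings = dict(cluster_config.get("__default__", {}))
       settings.update(cluster_config.get(rule, {}))
       return settings


   cluster_config = {
       "__default__": {"cpus-per-task": 4, "mem-per-cpu": "10G", "partition": "normal"},
       "kmers_gwas_per_sample": {"cpus-per-task": 8, "mem-per-cpu": "20G"},
   }
   print(resources_for("kmers_gwas_per_sample", cluster_config))
   print(resources_for("pcadapt", cluster_config))

Only the rules that need more (or fewer) resources than the default have to be listed explicitly.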


RULES  
-----

Here is the list of iKISS snakemake rules:

.. code-block:: bash

   rule kmers_gwas_per_sample *
   rule kmers_to_use
   rule kmers_table
   rule extract_kmers_from_bed
   rule index_ref
   rule index_ref_to_assembly
   rule mapping_kmers
   rule filter_bam
   rule kmer_position_from_bam * 
   rule merge_kmer_position
   rule samtools_merge
   rule pcadapt * 
   rule merge_method
   rule outliers_position
   rule extracting_features_from_gff
   rule kmers_bedtools_intersect
   rule get_pca_from_phenotype
   rule lfmm * 
   rule mergetags
   rule mapping_contigs
   rule contigs_bedtools_intersect
   rule intersect_and_contigs
   rule intersect_and_outliers
   rule fastq_stats
   rule report_ikiss
   rule html_ikiss

Rules marked with a `*` can be parallelised.


4. Running iKISS
================

Run iKISS with `ikiss run_local` or `ikiss run_cluster`, as explained in the "Running a datatest" section.



5. iKISS output
================

This is an overview of the iKISS output directory:

.. code-block:: bash

   OUTPUT-KISS/   
      config_corrected.yaml
      0.FASTQ_STATS
      └── fastq_stats.txt  
      1.KMERS_MODULE
      ├── Clone12
      ├── Clone14
      ├── Clone16
      ├── Clone2
      ├── Clone20
      ├── Clone4
      └── Clone8
      2.KMERS_TABLE
      ├── kmers_list_paths.txt
      ├── kmers_table.names
      ├── kmers_table.table
      ├── kmers_to_use
      ├── kmers_to_use.no_pass_kmers
      ├── kmers_to_use.shareness
      ├── kmers_to_use.stats.both
      ├── kmers_to_use.stats.only_canonical
      └── kmers_to_use.stats.only_non_canonical
      3.TABLE2BED
      ├── log
      ├── output_file.0.bed
      ├── output_file.0.bim
      ├── output_file.0.fam
      ├── output_file.1.bed
      ├── output_file.1.bim
      ├── output_file.1.fam
      ├── output_file.2.bed
      ├── output_file.2.bim
      ├── output_file.2.fam
      ├── output_file.3.bed
      ├── output_file.3.bim
      ├── output_file.3.fam
      ├── output_file.4.bed
      ├── output_file.4.bim
      └── output_file.4.fam
      4.EXTRACT_FASTA
      ├── output_file.0.fasta.gz
      ├── output_file.1.fasta.gz
      ├── output_file.2.fasta.gz
      ├── output_file.3.fasta.gz
      └── output_file.4.fasta.gz
      5.RANGES
      ├── output_file.0
      ├── output_file.1
      ├── output_file.2
      ├── output_file.3
      └── output_file.4
      6.LFMM
      ├── output_file.0_10_LFMM_outliers.csv
      ├── output_file.0_10_LFMM_pvalues.csv
      ├── output_file.0_10_LFMM.rplot.pdf
      ...
      6.LFMM_PHENO
      ├── PCA_from_phenotype.csv
      ├── PCA_from_phenotype.html
      └── PCA_from_phenotype.ipynb
      6.PCADAPT
      ├── output_file.0_10_PCADAPT_outliers.csv
      ├── output_file.0_10_PCADAPT_pvalues.csv
      ├── output_file.0_10_PCADAPT.rplot.pdf
      ├── output_file.0_10_PCADAPT_scores.csv
      ... 
      7.MERGED_LFMM
      ├── merged_LFMM_outliers.csv
      └── merged_LFMM_pvalues.csv
      7.MERGED_PCADAPT
      ├── merged_PCADAPT_outliers.csv
      └── merged_PCADAPT_pvalues.csv
      8.MAPPING_KMERS
      ├── bam_files.txt
      ├── output_file.0_vs_reference.bam
      ├── output_file.0_vs_reference_FMQ.bam
      ├── output_file.0_vs_reference.sai
      ├── output_file.0_vs_reference_sorted.bam
      ├── output_file.0_vs_reference_sorted.bam.bai
      ├── output_file.0_vs_reference_sorted.bam.idxstats
      ├── output_file.0_vs_reference_sorted.bam.stats
      ...
      9.KMERPOSITION
      ├── output_file.0_vs_reference_KMERPOSITION.txt
      ├── output_file.1_vs_reference_KMERPOSITION.txt
      ├── output_file.2_vs_reference_KMERPOSITION.txt
      ├── output_file.3_vs_reference_KMERPOSITION.txt
      └── output_file.4_vs_reference_KMERPOSITION.txt    
      10.MERGE_KMERPOSITION
      ├── kmer_position_merged.txt
      └── kmer_position_samtools_merge.bam
      11.OUTLIERS_LFMM_POSITION
      └── outliers_with_position.csv
      11.OUTLIERS_PCADAPT_POSITION
      └── outliers_with_position.csv
      12.ASSEMBLY_OUTLIERS_LFMM
      ├── contigs_LFMM_vs_reference.bam
      ├── contigs_LFMM_vs_reference.sorted.bam
      ├── contigs_LFMM_vs_reference.sorted.bam.bai
      ├── contigs_LFMM_vs_reference.sorted.bam.idxstats
      ├── contigs_LFMM_vs_reference.sorted.bam.stats
      ├── outliers_LFMM_mergetags.csv
      └── outliers_LFMM_mergetags.fasta
      12.ASSEMBLY_OUTLIERS_PCADAPT
      ├── contigs_PCADAPT_vs_reference.bam
      ├── contigs_PCADAPT_vs_reference.sorted.bam
      ├── contigs_PCADAPT_vs_reference.sorted.bam.bai
      ├── contigs_PCADAPT_vs_reference.sorted.bam.idxstats
      ├── contigs_PCADAPT_vs_reference.sorted.bam.stats
      ├── outliers_PCADAPT_mergetags.csv
      └── outliers_PCADAPT_mergetags.fasta
      13.GFF_FEATURES
      └── extracted.gff
      14.CONTIGS_INTERSECT_LFMM
      └── contigs_intersect_annotation.bed
      14.CONTIGS_INTERSECT_PCADAPT
      └── contigs_intersect_annotation.bed
      14.KMERS_INTERSECT
      └── kmers_bedtools_intersect_annotation.bed
      15.CONTIGS_LFMM_INTERSECT
      └── global_intersect_stats
      15.CONTIGS_PCADAPT_INTERSECT
      └── global_intersect_stats
      15.OUTLIERS_LFMM_INTERSECT
      ├── global_intersect_stats
      └── outliers_intersect_stats
      15.OUTLIERS_PCADAPT_INTERSECT
      ├── global_intersect_stats
      └── outliers_intersect_stats
      REF
      ├── reference2.fasta
      ├── reference2.fasta.0123
      ├── reference2.fasta.amb
      ├── reference2.fasta.ann
      ├── reference2.fasta.bwt.2bit.64
      ├── reference2.fasta.pac
      ├── reference.fasta
      ├── reference.fasta.amb
      ├── reference.fasta.ann
      ├── reference.fasta.bwt
      ├── reference.fasta.pac
      └── reference.fasta.sa
      REPORT
      ├── iKISS_report.csv
      ├── iKISS_report.html
      ├── iKISS_report.ipynb
      ├── PCA_from_phenotype.html
      └── PCA_from_phenotype.ipynb
      BENCHMARK
      LOGS


Note: we recommend removing the 1.KMERS_MODULE directory after the analysis.

Authors
========

Julie Orjuela (IRD) develops iKISS

Yves Vigouroux (IRD) is the big boss with a lot of ideas and contributions! 

Contributors
==============

Djamel Boubred (Bioinformatics student at IRD) and Tram VI (Ph.D. student, IRD) have also contributed by debugging and testing with rice and coffea datasets.

Sebastien Ravel has also contributed through the development of the snakecdysis Python package.

Thanks
=======

Thanks to Ndomassi Tando (i-Trop IRD) for his administration support.

The authors acknowledge the IRD i-Trop HPC (South Green Platform) from IRD Montpellier for providing HPC resources that contributed to this work. https://bioinfo.ird.fr/ - http://www.southgreen.fr
 
License
=======

Licensed under MIT.

Intellectual property belongs to IRD and authors.

iKISS reuses code from the culebrONT project of the SouthGreen platform https://culebront-pipeline.readthedocs.io/en/latest/.
iKISS uses SnakEcdysis package https://snakecdysis.readthedocs.io/en/latest/package.html to perform installation and execution in local and cluster mode.

.. |PythonVersions| image:: https://img.shields.io/badge/python-3.7%2B-blue
   :target: https://www.python.org/downloads
.. |SnakemakeVersions| image:: https://img.shields.io/badge/snakemake-≥5.10.0-brightgreen.svg?style=flat
   :target: https://snakemake.readthedocs.io
.. |Singularity| image:: https://img.shields.io/badge/singularity-≥3.3.0-7E4C74.svg
   :target: https://sylabs.io/docs/
.. |readthedocs| image:: https://pbs.twimg.com/media/E5oBxcRXoAEBSp1.png
   :target: https://culebront-pipeline.readthedocs.io/en/latest/
   :width: 400px



            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "ikiss",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "snakemake,kmers,selection,diversity",
    "author": "",
    "author_email": "\"Julie Orjuela (DIADE-IRD)\" <julie.orjuela@ird.fr>",
    "download_url": "https://files.pythonhosted.org/packages/5a/09/d19f21b90d6350bcef81a4cf59badbd7bb8677cdd1f5452431d068194f53/ikiss-1.5.0.tar.gz",
    "platform": null,
    "description": ".. image:: ./ikiss/logo_ikiss.png\n   :width: 400\n   :alt: ikiss Logo\n   :align: center\n\n\n|PythonVersions| |SnakemakeVersions| |Singularity|\n\n.. contents:: Table of Contents\n    :depth: 2\n\n\n**Homepage:** https://forge.ird.fr/diade/iKISS\n\nAbout iKISS \n===============\n\n**iKISS (Kmer Inference sSelection)** is a snakemake pipeline able to decompose reads into kmers and extract kmers under selection. \n\niKISS uses KmersGWAS https://github.com/voichek/kmersGWAS, pcadapt https://cran.r-project.org/web/packages/pcadapt/readme/README.html and lfmm https://bcm-uga.github.io/lfmm/articles/lfmm to select genomics regions under selection.\n\n1. Install dependencies and clone iKISS\n=============================================\n\nCheck dependencies for iKISS : python and singularity\n\nInstall  singularity and python3 in your local machine OR use module load to add singularity and python3 in your environment if you are working in a cluster :\n\n.. code-block:: bash\n\n   module load system/python/3.8.12\n   module load system/singularity/3.6.0\n\n\niKISS is NOW available as a PyPI package (recommended)\n\n.. code-block:: bash\n\n   python3 -m pip install ikiss\n\n\nOR you can also install iKISS from git repository\n\n.. code-block:: bash\n\n   python3 -m pip install ikiss@git+https://forge.ird.fr/diade/iKISS.git \n   \n   #OR\n   \n   git clone https://forge.ird.fr/diade/iKISS.git \n   cd iKISS\n   python3 -m pip install .\n\n\n1.1 Installing in cluster mode\n-------------------------------\n\nInstall iKISS in cluster mode using **singularity** container from ikiss_utilities https://itrop.ird.fr/ikiss_utilities/\n\n.. code-block:: bash\n\n   ikiss install_cluster --help\n   ikiss install_cluster --scheduler slurm --env singularity\n   \n\n1.2 Installing in local mode \n----------------------------\n\n.. code-block:: bash\n\n   ikiss install_local --help\n   ikiss install_local\n\n\n2. 
Running a datatest\n=============================================\n\nRunning test with a datatest from iKISS_utilities in a repertory TEST\n\n.. code-block:: bash\n\n   ikiss test_install --help\n   ikiss test_install -d TEST\n\n\n2.1 In CLUSTER mode\n-------------------\n\nLaunching suggested command line done by iKISS, in CLUSTER mode : \n\nPlease run command line 'ikiss create_cluster_config' before the first run and modify theads, ram, node and computer ressources. \niKISS do a copy of cluster_config.yaml file into your home \"/home/$USER/.config/ikiss/cluster_config.yaml\"\n\n   \n.. code-block:: bash\n\n   ikiss run_cluster --help\n   ikiss create_cluster_config\n\nIf singularity was selected in installation of iKISS, it could be needed to give argument --singularity-args \\\"--bind $HOME\\\" to Snakemake, by using :\n\n.. code-block:: bash\n\n   ikiss run_cluster --help\n   ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args \"--bind $HOME\"\n   # @IFB\n   ikiss run_cluster -c TEST/data_test_config.yaml --singularity-args \"--bind /shared:/shared\"\n   #you can also use snakemake parametters as --rerun-incomplete --nolock\n\n\n**Important Note** : In i-Trop cluster, run iKISS using ONLY a node, data has to be in \"/scratch\" of chosen node. Use `nodelist : nodeX` parametter inside of cluster_config.yaml file.\n\n\n2.2 In LOCAL mode\n-----------------\n\nlaunching suggested command line done by iKISS, in LOCAL mode: \n\n.. code-block:: bash\n\n   ikiss run_local --help\n   ikiss run_local -t 8 -c TEST/data_test_config.yaml --singularity-args \"--bind $HOME\"\n\nIn local mode, its possible to allocate threads to some rules using `--set-threads` snakemake argument such as\n\n.. code-block:: bash\n\n    ikiss run_local -t 8 -c TEST/data_test_config.yaml --set-threads kmers_gwas_per_sample=4 mapping_kmers=2 filter_bam=2 kmer_position_from_bam=4 pcadapt=2 extract_kmers_from_bed=2\n\n\n3. Running your data\n========================\n\n\n3.1. 
Adapt config.yaml\n------------------------\n\nBefore to run iKISS, adapt `config.yaml` by using : \n\n.. code-block:: bash\n\n   ikiss create_config\n\n\nAdapt `config.yaml` file with path to fastq files (FASTQ) and outfile (OUTPUT) in the `DATA` section. \n\n.. code-block:: yaml\n\n   DATA:\n      FASTQ: './DATATEST/fastq'\n      OUTPUT: './OUTPUT-KISS/'\n\n:warning if yours reads are ilumina paired, you need rename reads SAMPLE_R1.fastq.gz and SAMPLE_R2.fastq.gz. For single reads use SAMPLE_R1.fastq.gz\n\niKISS uses compressed ans decompressed fastq files.\n\n\n3.1.1 WORKFLOW section\n-----------------------\n\nParameter iKISS steps using the section WORKFLOW and parameter it with the PARAMS sections.\n\nIn WORKFLOW section:\n\n   KMERS_GWAS step has to be activated by default. \n\n   PCADAPT, LFMM, MAPPING or ASSEMBLY are optional. Active or deactivate these steps using true or false.\n\n\n**KMERS_GWAS** convert reads in kmers, filter them and create a format ready to use in population genomics!\n\n**PCADAPT** detects genetic markers (kmers here ^^) involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA).\n\n**LFMM** is used by iKISS for testing correlations between kmers and environmental data.\n\n**MAPPING_KMERS** can optionally be used to align kmers to a genomic reference (if it is available ! ).\n\n**ASSEMBLY_KMERS** can optionally assembly significant kmers obtained by pcadapt or lfmm\n\n**INTERSECT** can optionally calculate how many kmers (if MAPPING_KMERS is activated ) or contigs(if ASSEMBLY_KMERS is activated) are found in FEATURES (gene by default)  \n\n.. code-block:: yaml\n\n   WORKFLOW:\n      KMERS_MODULE : true\n      PCADAPT : true\n      LFMM : true\n      MAPPING_KMERS: true\n      ASSEMBLY_KMERS: true\n      INTERSECT: True\n\n3.1.2 PARAMS section\n--------------------\n\nIn the PARAMS section, tools parameters can be modified and adapted.\n\n\n=> 1. 
KMERS_MODULE\n-------------------\n\nKMERS_GWAS module decompose reads into kmers and create a binary table of presence/absence of kmers. This table can be filter to use only most informative kmers into the populations. PLINK format outfiles are obtained in this module.\n\n.. code-block:: yaml\n\n   PARAMS:\n      KMERS_MODULE:\n         KMER_SIZE : 31\n         MAC : 2\n         P : 0.2\n         MAF : 0.05\n         B : 1000000 # nb kmers in each bed file\n         SPLIT_LIST_SIZE : 100000\n         MIN_LIST_SIZE : 50000\n\n\n**KMER_SIZE** is the length of kmers (should be between 15-31)\n\n**MAC** is the minor allele count (min allowed appearance of a kmer) \n\n**P** is the minimum percent of appearance in each strand form\n\n**MAF** is the minimum allele frequency\n\n**B** is the number of kmers in each bed file\n\n**SPLIT_LIST_SIZE** is the nb of kmers by bed file\n\n**MIN_LIST_SIZE** indicates the minimal number of kmers allowed in the smaller bed file after splitting\n\n\n=> 2. PCADAPT\n--------------\n\nPCADAPT detects kmers involved in biological adaptation and provides outlier detection based on Principal Component Analysis (PCA)\n\n.. code-block:: yaml\n\n   PARAMS:        \n      PCADAPT:\n         K : 2\n         SAMPLES: \"samples.txt\"\n         CORRECTION: 'FDR'\n         ALPHA : 0.05\n\n\n**K** : number K of principal components\n\n**SAMPLES** : you need to generate a *samples.txt* file.  This file contains two columns (tab delimitations) : accession_id and phenotype_value. It will be used by PCADAPT.\n\n   **accession_id** : contains exactly same name of samples in FASTQ. \n\n   **phenotype_value** (int): contains sample group (wild=1, cultivated=2 for example)\n\n.. 
code-block:: bash\n\n   accession_id\tgroup\n   Clone12\t2\n   Clone14\t2\n   Clone16\t2\n   Clone20\t2\n   Clone2\t1\n   Clone4\t1\n   Clone8\t1\n\n**CORRECTION**: kmers outliers are obtained using a correction of BONFERONNI, BH or FDR model.\n\n**ALPHA**: modify the alpha cutoff for outlier detection\n\n\n=> 3. LFMM\n----------\n\nLFMM is used by iKISS for testing correlations between kmers and environmental data.\n\n.. code-block:: yaml\n\n   PARAMS:\n      LFMM:\n         K : 2\n         PHENOTYPE_FILE: \"pheno.txt\"\n         PHENOTYPE_PCA_ANALYSIS : false\n         CORRECTION: 'BH'\n         ALPHA : 0.05\n\n\n**K** are the latent factors used in LFMM association analyses \n\n**PHENOTYPE_FILE**: an phenotype file is obligatory in LFMM analysis. You can give to iKISS PCA results, climate variables, etc.\n\nA PCA can reveal some 'structure' in the genotype data and it could help you to fix K parameter.\n\n**PHENOTYPE_PCA_ANALYSIS** \n\n   * If **PHENOTYPE_PCA_ANALYSIS** is true, iKISS automatically run PCA using the file given by user in the PHENOTYPE_FILE key. This PHENOTYPE_FILE can be a PCA result for example.\n\n   * If **PHENOTYPE_PCA_ANALYSIS** is false, iKISS use directly the PHENOTYPE_FILE as 'phenotype' to LFMM analysis. Kmers are used as 'genotype' data.\n\nHere, a example of a phenotype file with climate variables\n\n.. 
code-block:: bash\n\n    accession_id\tgroup\tb2.Mean_Diurnal_Range\tb3.Isothermality\tb4.Temp_Seasonality\tb5.Max_Temp_of_Warmest_Month\tb6.Min_Temp_of_Coldest_Month\tb7.Temp_Annual_Range\tb8.Mean_Temp_of\n    _Wettest_Quarter\tb9.Mean_Temp_of_Driest_Quarter\tb10.Mean_Temp_of_Warmest_Quarter\tb11.Mean_Temp_of_Coldest_Quarter\tb12.Annual_Precipitation\tb13.Precipitation_of_Wettest_Mo\n    nth\tb14.Precipitation_of_Driest_Month\tb15.Precipitation_Seasonality\tb16.Precipitation_of_Wettest_Quarter\tb17.Precipitation_of_Driest_Quarter\tb18.Precipitation_of_Warmest_Quarter\tb19.Precipitation_of_Coldest_Quarter\n    Clone12\t2\t99\t68\t1230\t310\t166\t144\t250\t226\t258\t226\t1462\t249\t3\t68\t573\t17\t549\t17\n    Clone14\t2\t100\t68\t1235\t301\t155\t146\t241\t217\t248\t217\t1525\t259\t3\t67\t603\t18\t575\t18\n    Clone16\t2\t93\t65\t1389\t310\t168\t142\t250\t223\t258\t223\t1416\t264\t0\t73\t579\t8\t544\t8\n    Clone20\t2\t154\t55\t3955\t403\t123\t280\t296\t234\t315\t214\t118\t62\t0\t184\t107\t0\t45\t0\n    Clone2\t1\t152\t55\t3617\t403\t128\t275\t287\t242\t316\t220\t173\t80\t0\t167\t153\t0\t18\t0\n    Clone4\t1\t168\t51\t5719\t414\t86\t328\t315\t201\t322\t181\t20\t12\t0\t166\t18\t0\t17\t0\n    Clone8\t1\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\tNA\n\n\n**CORRECTION**: kmers outliers are obtained using a correction of BONFERONNI, BH or FDR model.\n\n**ALPHA**: modify the alpha cutoff for outlier detection\n\n\n=> 4. MAPPING_KMERS\n-------------------\n\nMAPPING_KMERS section in PARAMS can optionally be used to align kmers to a genomic reference. It could give a idea of selected regions in a genome. \n\n.. code-block:: yaml\n\n   PARAMS:\n      MAPPING_KMERS:\n         REF: \"reference.fasta\"\n         MODE : bwa-aln\n         INDEX_OPTIONS: \"\"\n         OPTIONS : \"-n 0.04\"\n         FILTER_FLAG : 4\n         FILTER_QUAL : 10\n\n\nUse a reference file in the **REF** section. 
\n\nSet the **MODE** parameter to *bwa-aln* or *bwa-mem2*.\n\nSet up the **INDEX_OPTIONS** according to the MODE you have chosen.\n\n   If *bwa-mem2*, leave empty\n   \n   If *bwa-aln*, \"-a bwtsw\" or \"\" \n\nSet the options for the chosen mapper in the **OPTIONS** key. \n\n   If *bwa-mem2*, default parameters -A 1 -B 4;\n   \n   If *bwa-aln*, -n 0.04\n\nThe resulting BAM can be filtered using the **FILTER_FLAG** (-F 4 by default) and **FILTER_QUAL** (mapq>10 by default) params.\n\n=> 5. ASSEMBLY_KMERS\n--------------------\n\nThe ASSEMBLY_KMERS section in PARAMS can optionally be used to assemble significant kmers obtained by pcadapt and/or lfmm.\n\nContigs are assembled by iKISS using mergeTags from the dekupl package https://github.com/Transipedia/dekupl-mergeTags.\n\nChoose the minimal overlap size \"OVERLAP_SIZE\" allowed when assembling kmers.\n\nFeel free to filter contigs by size with \"FILTER_CONTIG_SIZE\".\n\nAssembled contigs can be mapped by activating **MAPPING_CONTIGS**. This mapping is run against a **REF** reference file, using bwa-mem2 by default.\nThe reference file used in this step can differ from the one given in the **MAPPING_KMERS** options. Feel free to change the mapping parameters using **MAPPING_OPTIONS**.\n\nAssembled contigs can be searched with blastn against a database; you can also try to annotate them!\n\n.. code-block:: yaml\n\n   PARAMS:\n      ASSEMBLY:\n         OVERLAP_SIZE : 15\n         FILTER_CONTIG_SIZE : 100\n         MAPPING_CONTIGS: True\n         # if MAPPING_CONTIGS is activated, ikiss maps contigs vs REF using bwamem2\n         REF: 'reference.fasta'\n         MAPPING_OPTIONS : \"\"\n\n\n=> 6. INTERSECT\n---------------\n\niKISS uses bedtools intersect to calculate how many kmers/contigs are mapped in **FEATURES** (gene by default).\n\nThese **FEATURES** are extracted from the annotation **GFF** file before running bedtools intersect.\n\niKISS filters kmers/contigs using **FILTER_MAPQ_STATS** and a minimal number of kmers/contigs per FEATURE, **FILTER_MIN_STATS**. 
\n\n.. code-block:: yaml\n\n   PARAMS:\n      INTERSECT:\n            GFF : 'reference.gff'\n            FEATURE : 'gene'\n            FILTER_MAPQ_STATS: '15'              \n\n\n\n3.2. Adapt cluster_config.yaml\n-------------------------------\n\n\nIf you run iKISS on a cluster, adapt `cluster_config.yaml`:\n\n.. code-block:: bash\n\n   ikiss edit_cluster_config\n\nInside `cluster_config.yaml`, set the partition to match your cluster and change the memory and CPU count under the `__default__` key or under the specific rules you need:\n\n.. code-block:: yaml\n\n   __default__:\n      cpus-per-task : 4\n      mem-per-cpu : 10G\n      partition : \"normal\"\n      nodelist: node19\n      output : 'slurm_logs/stdout/{rule}/{wildcards}.o'\n      error : 'slurm_logs/error/{rule}/{wildcards}.e'\n      job-name : '{rule}.{wildcards}'\n      \n   kmers_gwas_per_sample:\n      cpus-per-task : 4\n      mem-per-cpu : 10G\n\n\nRULES  \n-----\n\nHere is the list of iKISS snakemake rules:\n\n.. code-block:: bash\n\n   rule kmers_gwas_per_sample *\n   rule kmers_to_use\n   rule kmers_table\n   rule extract_kmers_from_bed\n   rule index_ref\n   rule index_ref_to_assembly\n   rule mapping_kmers\n   rule filter_bam\n   rule kmer_position_from_bam * \n   rule merge_kmer_position\n   rule samtools_merge\n   rule pcadapt * \n   rule merge_method\n   rule outliers_position\n   rule extracting_features_from_gff\n   rule kmers_bedtools_intersect\n   rule get_pca_from_phenotype\n   rule lfmm * \n   rule mergetags\n   rule mapping_contigs\n   rule contigs_bedtools_intersect\n   rule intersect_and_contigs\n   rule intersect_and_outliers\n   rule fastq_stats\n   rule report_ikiss\n   rule html_ikiss\n\n* rules with a `*` can be parallelised.\n\n\n4. Running iKISS\n================\n\nRun iKISS with `ikiss run_local` or `ikiss run_cluster` as explained in the \"Running a datatest\" section.\n\n\n\n5. iKISS output\n================\n\nThis is an overview of the iKISS output directory:\n\n.. 
code-block:: bash\n\n   OUTPUT-KISS/   \n      config_corrected.yaml\n      0.FASTQ_STATS\n      \u2514\u2500\u2500 fastq_stats.txt  \n      1.KMERS_MODULE\n      \u251c\u2500\u2500 Clone12\n      \u251c\u2500\u2500 Clone14\n      \u251c\u2500\u2500 Clone16\n      \u251c\u2500\u2500 Clone2\n      \u251c\u2500\u2500 Clone20\n      \u251c\u2500\u2500 Clone4\n      \u2514\u2500\u2500 Clone8\n      2.KMERS_TABLE\n      \u251c\u2500\u2500 kmers_list_paths.txt\n      \u251c\u2500\u2500 kmers_table.names\n      \u251c\u2500\u2500 kmers_table.table\n      \u251c\u2500\u2500 kmers_to_use\n      \u251c\u2500\u2500 kmers_to_use.no_pass_kmers\n      \u251c\u2500\u2500 kmers_to_use.shareness\n      \u251c\u2500\u2500 kmers_to_use.stats.both\n      \u251c\u2500\u2500 kmers_to_use.stats.only_canonical\n      \u2514\u2500\u2500 kmers_to_use.stats.only_non_canonical\n      3.TABLE2BED\n      \u251c\u2500\u2500 log\n      \u251c\u2500\u2500 output_file.0.bed\n      \u251c\u2500\u2500 output_file.0.bim\n      \u251c\u2500\u2500 output_file.0.fam\n      \u251c\u2500\u2500 output_file.1.bed\n      \u251c\u2500\u2500 output_file.1.bim\n      \u251c\u2500\u2500 output_file.1.fam\n      \u251c\u2500\u2500 output_file.2.bed\n      \u251c\u2500\u2500 output_file.2.bim\n      \u251c\u2500\u2500 output_file.2.fam\n      \u251c\u2500\u2500 output_file.3.bed\n      \u251c\u2500\u2500 output_file.3.bim\n      \u251c\u2500\u2500 output_file.3.fam\n      \u251c\u2500\u2500 output_file.4.bed\n      \u251c\u2500\u2500 output_file.4.bim\n      \u2514\u2500\u2500 output_file.4.fam\n      4.EXTRACT_FASTA\n      \u251c\u2500\u2500 output_file.0.fasta.gz\n      \u251c\u2500\u2500 output_file.1.fasta.gz\n      \u251c\u2500\u2500 output_file.2.fasta.gz\n      \u251c\u2500\u2500 output_file.3.fasta.gz\n      \u2514\u2500\u2500 output_file.4.fasta.gz\n      5.RANGES\n      \u251c\u2500\u2500 output_file.0\n      \u251c\u2500\u2500 output_file.1\n      \u251c\u2500\u2500 output_file.2\n      
\u251c\u2500\u2500 output_file.3\n      \u2514\u2500\u2500 output_file.4\n      6.LFMM\n      \u251c\u2500\u2500 output_file.0_10_LFMM_outliers.csv\n      \u251c\u2500\u2500 output_file.0_10_LFMM_pvalues.csv\n      \u251c\u2500\u2500 output_file.0_10_LFMM.rplot.pdf\n      ...\n      6.LFMM_PHENO\n      \u251c\u2500\u2500 PCA_from_phenotype.csv\n      \u251c\u2500\u2500 PCA_from_phenotype.html\n      \u2514\u2500\u2500 PCA_from_phenotype.ipynb\n      6.PCADAPT\n      \u251c\u2500\u2500 output_file.0_10_PCADAPT_outliers.csv\n      \u251c\u2500\u2500 output_file.0_10_PCADAPT_pvalues.csv\n      \u251c\u2500\u2500 output_file.0_10_PCADAPT.rplot.pdf\n      \u251c\u2500\u2500 output_file.0_10_PCADAPT_scores.csv\n      ... \n      7.MERGED_LFMM\n      \u251c\u2500\u2500 merged_LFMM_outliers.csv\n      \u2514\u2500\u2500 merged_LFMM_pvalues.csv\n      7.MERGED_PCADAPT\n      \u251c\u2500\u2500 merged_PCADAPT_outliers.csv\n      \u2514\u2500\u2500 merged_PCADAPT_pvalues.csv\n      8.MAPPING_KMERS\n      \u251c\u2500\u2500 bam_files.txt\n      \u251c\u2500\u2500 output_file.0_vs_reference.bam\n      \u251c\u2500\u2500 output_file.0_vs_reference_FMQ.bam\n      \u251c\u2500\u2500 output_file.0_vs_reference.sai\n      \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam\n      \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam.bai\n      \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam.idxstats\n      \u251c\u2500\u2500 output_file.0_vs_reference_sorted.bam.stats\n      ...\n      9.KMERPOSITION\n      \u251c\u2500\u2500 output_file.0_vs_reference_KMERPOSITION.txt\n      \u251c\u2500\u2500 output_file.1_vs_reference_KMERPOSITION.txt\n      \u251c\u2500\u2500 output_file.2_vs_reference_KMERPOSITION.txt\n      \u251c\u2500\u2500 output_file.3_vs_reference_KMERPOSITION.txt\n      \u2514\u2500\u2500 output_file.4_vs_reference_KMERPOSITION.txt    \n      10.MERGE_KMERPOSITION\n      \u251c\u2500\u2500 kmer_position_merged.txt\n      \u2514\u2500\u2500 
kmer_position_samtools_merge.bam\n      11.OUTLIERS_LFMM_POSITION\n      \u2514\u2500\u2500 outliers_with_position.csv\n      11.OUTLIERS_PCADAPT_POSITION\n      \u2514\u2500\u2500 outliers_with_position.csv\n      12.ASSEMBLY_OUTLIERS_LFMM\n      \u251c\u2500\u2500 contigs_LFMM_vs_reference.bam\n      \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam\n      \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam.bai\n      \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam.idxstats\n      \u251c\u2500\u2500 contigs_LFMM_vs_reference.sorted.bam.stats\n      \u251c\u2500\u2500 outliers_LFMM_mergetags.csv\n      \u2514\u2500\u2500 outliers_LFMM_mergetags.fasta\n      12.ASSEMBLY_OUTLIERS_PCADAPT\n      \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.bam\n      \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.sorted.bam\n      \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.sorted.bam.bai\n      \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.sorted.bam.idxstats\n      \u251c\u2500\u2500 contigs_PCADAPT_vs_reference.sorted.bam.stats\n      \u251c\u2500\u2500 outliers_PCADAPT_mergetags.csv\n      \u2514\u2500\u2500 outliers_PCADAPT_mergetags.fasta\n      13.GFF_FEATURES\n      \u2514\u2500\u2500 extracted.gff\n      14.CONTIGS_INTERSECT_LFMM\n      \u2514\u2500\u2500 contigs_intersect_annotation.bed\n      14.CONTIGS_INTERSECT_PCADAPT\n      \u2514\u2500\u2500 contigs_intersect_annotation.bed\n      14.KMERS_INTERSECT\n      \u2514\u2500\u2500 kmers_bedtools_intersect_annotation.bed\n      15.CONTIGS_LFMM_INTERSECT\n      \u2514\u2500\u2500 global_intersect_stats\n      15.CONTIGS_PCADAPT_INTERSECT\n      \u2514\u2500\u2500 global_intersect_stats\n      15.OUTLIERS_LFMM_INTERSECT\n      \u251c\u2500\u2500 global_intersect_stats\n      \u2514\u2500\u2500 outliers_intersect_stats\n      15.OUTLIERS_PCADAPT_INTERSECT\n      \u251c\u2500\u2500 global_intersect_stats\n      \u2514\u2500\u2500 outliers_intersect_stats\n      REF\n      \u251c\u2500\u2500 
reference2.fasta\n      \u251c\u2500\u2500 reference2.fasta.0123\n      \u251c\u2500\u2500 reference2.fasta.amb\n      \u251c\u2500\u2500 reference2.fasta.ann\n      \u251c\u2500\u2500 reference2.fasta.bwt.2bit.64\n      \u251c\u2500\u2500 reference2.fasta.pac\n      \u251c\u2500\u2500 reference.fasta\n      \u251c\u2500\u2500 reference.fasta.amb\n      \u251c\u2500\u2500 reference.fasta.ann\n      \u251c\u2500\u2500 reference.fasta.bwt\n      \u251c\u2500\u2500 reference.fasta.pac\n      \u2514\u2500\u2500 reference.fasta.sa\n      REPORT\n      \u251c\u2500\u2500 iKISS_report.csv\n      \u251c\u2500\u2500 iKISS_report.html\n      \u251c\u2500\u2500 iKISS_report.ipynb\n      \u251c\u2500\u2500 PCA_from_phenotype.html\n      \u2514\u2500\u2500 PCA_from_phenotype.ipynb\n      BENCHMARK\n      LOGS\n\n\nNote: we recommend removing the 1.KMER_GWAS directory after analysis.\n\nAuthors\n========\n\nJulie Orjuela (IRD) develops iKISS\n\nYves Vigouroux (IRD) is the big boss with a lot of ideas and contributions! \n\nContributors \n==============\n\nDjamel Boubred (Bioinformatics Student at IRD) and Tram VI (Ph.D student IRD) have also contributed by debugging and testing with rice and coffea datasets. \n\nSebastien Ravel has also contributed through the development of the snakecdysis python package.\n\nThanks\n=======\n\nThanks to Ndomassi Tando (i-Trop IRD) for his administration support.\n\nThe authors acknowledge the IRD i-Trop HPC (South Green Platform) from IRD Montpellier for providing HPC resources that contributed to this work. https://bioinfo.ird.fr/ - http://www.southgreen.fr\n \nLicense\n=======\n\nLicensed under MIT.\n\nIntellectual property belongs to IRD and the authors.\n\niKISS reuses code from the culebrONT project of the SouthGreen platform https://culebront-pipeline.readthedocs.io/en/latest/.\niKISS uses the SnakEcdysis package https://snakecdysis.readthedocs.io/en/latest/package.html to perform installation and execution in local and cluster mode.\n\n.. 
|PythonVersions| image:: https://img.shields.io/badge/python-3.7%2B-blue\n   :target: https://www.python.org/downloads\n.. |SnakemakeVersions| image:: https://img.shields.io/badge/snakemake-\u22655.10.0-brightgreen.svg?style=flat\n   :target: https://snakemake.readthedocs.io\n.. |Singularity| image:: https://img.shields.io/badge/singularity-\u22653.3.0-7E4C74.svg\n   :target: https://sylabs.io/docs/\n.. |readthedocs| image:: https://pbs.twimg.com/media/E5oBxcRXoAEBSp1.png\n   :target: https://culebront-pipeline.readthedocs.io/en/latest/\n   :width: 400px\n\n\n",
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2022 DIADE IRD / Julie Orjuela, Yves Vigouroux  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "iKISS is a pipeline to detect kmers under selection.",
    "version": "1.5.0",
    "project_urls": {
        "Bug Tracker": "https://forge.ird.fr/diade/iKISS/-/issues",
        "Documentation": "https://forge.ird.fr/diade/iKISS/-/blob/master/README.rst",
        "Downloads": "https://forge.ird.fr/diade/iKISS/-/releases/",
        "Homepage": "https://forge.ird.fr/diade/iKISS.git",
        "Source Code": "https://forge.ird.fr/diade/iKISS.git",
        "repository": "https://forge.ird.fr/diade/iKISS.git"
    },
    "split_keywords": [
        "snakemake",
        "kmers",
        "selection",
        "diversity"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e31479d540fd25d0f40e66272dcac9f312098cd09d3220969ed61e441a660d9b",
                "md5": "4d92ac494c1de9d2a6c4bdc23e4beaa3",
                "sha256": "18359fe0394d47804679a0c53a0d7aca168b3817e0556d1def557871fa526b7a"
            },
            "downloads": -1,
            "filename": "ikiss-1.5.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4d92ac494c1de9d2a6c4bdc23e4beaa3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 3172331,
            "upload_time": "2023-09-05T10:56:46",
            "upload_time_iso_8601": "2023-09-05T10:56:46.855547Z",
            "url": "https://files.pythonhosted.org/packages/e3/14/79d540fd25d0f40e66272dcac9f312098cd09d3220969ed61e441a660d9b/ikiss-1.5.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5a09d19f21b90d6350bcef81a4cf59badbd7bb8677cdd1f5452431d068194f53",
                "md5": "ef87b126ec539a41262a69009e2706ed",
                "sha256": "551f512cfd1d03b880f77110e216f924bbb22905db9e5a406b24c3dbbaf699b2"
            },
            "downloads": -1,
            "filename": "ikiss-1.5.0.tar.gz",
            "has_sig": false,
            "md5_digest": "ef87b126ec539a41262a69009e2706ed",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 3168548,
            "upload_time": "2023-09-05T10:56:51",
            "upload_time_iso_8601": "2023-09-05T10:56:51.643008Z",
            "url": "https://files.pythonhosted.org/packages/5a/09/d19f21b90d6350bcef81a4cf59badbd7bb8677cdd1f5452431d068194f53/ikiss-1.5.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-05 10:56:51",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "ikiss"
}
        