picnic-bio

Name	picnic-bio
Version	1.0.0
home_page	https://picnic.cd-code.org/
Summary	PICNIC (Proteins Involved in CoNdensates In Cells) is a machine learning-based model that predicts proteins involved in biomolecular condensates.
upload_time	2024-12-19 10:42:09
maintainer	None
docs_url	None
author	Anna Hadarovich <hadarovi@mpi-cbg.de>, Soumyadeep Ghosh <soumyadeep11194@gmail.com>, Maxim Scheremetjew <schereme@mpi-cbg.de>
requires_python	>=3.9
license	None
keywords	biomolecular condensate scientific annotation tool condensate machine learning

            <h1 align="center">
<img src="https://git.mpi-cbg.de/tothpetroczylab/picnic/-/raw/main/branding/logo/logo_picnic_v1.96113169.png" width="300">
</h1><br>

# PICNIC (Proteins Involved in CoNdensates In Cells)

[![Build Status](https://git.mpi-cbg.de/tothpetroczylab/picnic/badges/main/pipeline.svg)](https://git.mpi-cbg.de/tothpetroczylab/picnic/-/pipelines)
[![Coverage Status](https://git.mpi-cbg.de/tothpetroczylab/picnic/badges/main/coverage.svg)](https://git.mpi-cbg.de/tothpetroczylab/picnic/-/pipelines)
[![PyPI Version](https://img.shields.io/pypi/v/picnic-bio.svg)](https://pypi.org/project/picnic-bio/#history)
[![PyPI Downloads](https://img.shields.io/pypi/dm/picnic-bio.svg?label=PyPI%20downloads)](
https://pypi.org/project/picnic-bio/#files)
[![Nat Commun 15, 10668 (2024)](https://img.shields.io/badge/DOI-10.1038%2Fs41467_024_55089_x-blue)](
https://doi.org/10.1038/s41467-024-55089-x)
[![Python Versions](https://img.shields.io/pypi/pyversions/picnic-bio.svg)](https://pypi.org/project/picnic-bio/#description)
[![License](https://img.shields.io/pypi/l/picnic-bio.svg)](https://git.mpi-cbg.de/tothpetroczylab/picnic/-/blob/main/LICENSE)

PICNIC (Proteins Involved in CoNdensates In Cells) is a machine learning-based model that predicts proteins involved in biomolecular condensates. The first model (PICNIC) uses sequence-based features and structure-based features derived from AlphaFold2 models. A second model (PICNIC-GO) adds an extended set of features based on Gene Ontology terms. Although PICNIC-GO is biased by the annotations already available for proteins, it provides useful insights into specific protein properties that are enriched in proteins of biomolecular condensates. Overall, we recommend PICNIC, which is an unbiased predictor, for general use and PICNIC-GO for specific cases such as experimental hypothesis generation.

- [External software](#external-software)
- [Installation instructions](#installation-instructions)
  - [Requirements](#requirements)
  - [Install external requirements](#install-external-requirements)
  - [PICNIC is available on PyPI](#picnic-is-available-on-pypi)
  - [PICNIC is also installable from source](#picnic-is-also-installable-from-source)
  - [How to install PICNIC using Conda?](#how-to-install-picnic-using-conda)
- [How to use?](#how-to-use)
  - [Usage - Using PICNIC from command line](#usage---using-picnic-from-command-line)
  - [Examples](#examples)
  - [How to run the provided Jupyter notebook?](#how-to-run-the-provided-jupyter-notebook)
- [Publication](#publication)

## External software

*IUPred2A*

IUPred2A is a tool that predicts disordered protein regions. It can be downloaded from https://iupred2a.elte.hu/download_new and
the downloaded archive should be unpacked into the "src/files/" directory.

*STRIDE*

STRIDE is a program for protein secondary structure assignment.
An installation guide is available at https://webclu.bio.wzw.tum.de/stride/

## Installation instructions

A binary installer for the latest released version is available at the Python Package Index (PyPI).

### Requirements

* Python versions >=3.9,<3.13
* Download and unpack IUPred2A
  * Add IUPred2A to PYTHONPATH
* Download and unpack STRIDE
  * Add STRIDE binary to your system PATH


### Install external requirements

#### How to install STRIDE?

A complete installation guide can be found [here](https://webclu.bio.wzw.tum.de/stride/install.html) or simply
run the following commands:

```shell
$ mkdir stride
$ cd stride
$ curl -OL https://webclu.bio.wzw.tum.de/stride/stride.tar.gz
$ tar -zxf stride.tar.gz
$ make
$ export PATH="$PATH:$PWD"
```
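
To confirm that the `export PATH` step took effect, the standard library can look the binary up the same way a shell would. This is a small sketch that assumes the build produces an executable named `stride`; adjust the name if your build differs. `find_executable` is a hypothetical helper name.

```python
import shutil


def find_executable(name="stride", path=None):
    """Return the full path of `name` if it is found on PATH, else None.

    `path` optionally overrides the PATH string that is searched.
    The default name "stride" is an assumption about the build output.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_executable()
    print("stride found at:" if location else "stride not on PATH", location or "")
```

Note that `export` only affects the current shell session, so the lookup fails again in a new terminal unless the line is added to your shell profile.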

#### How to install IUPred2A?

IUPred2A is free for academic users only and cannot be used for commercial purposes.
If you are an academic user, you can download IUPred2A by filling out the form [here](https://iupred2a.elte.hu/download_new).

```shell
# Step 1: Fill out the form above and download the IUPred2A tar ball
$ tar -zxf iupred2a.tar.gz
$ cd iupred2a
$ export PYTHONPATH="$PWD"
```
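
A quick way to check that the unpacked directory is really on `PYTHONPATH` is sketched below. The marker file name `iupred2a.py` is an assumption about the archive layout, and both function names are hypothetical; change the marker to a file that exists in your unpacked `iupred2a` directory.

```python
import os


def pythonpath_entries(environ=None):
    """Split PYTHONPATH into its non-empty entries."""
    env = os.environ if environ is None else environ
    raw = env.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def iupred2a_dirs(environ=None, marker="iupred2a.py"):
    """Return the PYTHONPATH entries that contain the marker file.

    `marker` is an assumed file name, not something this README specifies.
    """
    return [
        entry
        for entry in pythonpath_entries(environ)
        if os.path.isfile(os.path.join(entry, marker))
    ]


if __name__ == "__main__":
    print(iupred2a_dirs() or "no PYTHONPATH entry contains the marker file")
```

Be aware that `export PYTHONPATH="$PWD"` replaces any value that was set before; if you rely on other entries, append with `export PYTHONPATH="$PYTHONPATH:$PWD"` instead.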

### PICNIC is available on PyPI

PICNIC officially supports Python versions >=3.9,<3.13.

```shell
$ python3 --version
Python 3.11.5

$ python3 -m venv picnic-env
$ source picnic-env/bin/activate
(picnic-env) $ python -m pip install --upgrade pip
(picnic-env) $ python -m pip install picnic_bio
```

### PICNIC is also installable from source

```shell
$ git clone git@git.mpi-cbg.de:atplab/picnic.git
```

Once you have a copy of the source, you can embed it in your own Python package or install it into your site-packages:

```shell
$ cd picnic
$ python3 -m venv picnic-env
$ source picnic-env/bin/activate
(picnic-env) $ python -m pip install --upgrade pip
(picnic-env) $ python -m pip install .
```

### How to install PICNIC using Conda?

No binary installer is available on Conda yet, but PICNIC can be installed inside a Conda environment.

Note that in a Conda environment you must install catboost before installing picnic-bio itself; otherwise the installation fails while compiling the catboost package from source. We also recommend setting up [conda-forge](https://conda-forge.org/docs/user/introduction.html) to fetch pre-compiled versions of catboost.

The following commands work around the catboost installation issue:

```shell
$ conda config --add channels conda-forge
$ conda config --set channel_priority strict

# Choose one of the supported Python versions, when creating the Conda environment: >=3.9,<3.13
# conda create -n myenv python=[3.9, 3.10, 3.11, 3.12] catboost
# e.g.
$ conda create -n myenv python=3.11 catboost
$ conda activate myenv
(myenv) $ python -m pip install picnic_bio
```
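
Because the order matters here, it can help to confirm that catboost is already importable in the active environment before running the final `pip install`. A minimal sketch using only the standard library; `is_importable` is a hypothetical helper name.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("catboost"):
        print("catboost is available; picnic_bio can be installed with pip")
    else:
        print("catboost is missing; install it with conda first")
```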

## How to use?

### Usage - Using PICNIC from command line

```
$ picnic <is_automated> <path_af> <protein_id> <is_go> --path_fasta_file <file>

usage: PICNIC [-h] [--path_fasta_file PATH_FASTA_FILE]
              is_automated path_af protein_id is_go

PICNIC (Proteins Involved in CoNdensates In Cells) is a machine learning-based
model that predicts proteins involved in biomolecular condensates.

positional arguments:
  is_automated          True if automated pipeline (works for proteins with
                        length < 1400 aa, with precalculated Alphafold2 model,
                        deposited to UniprotKB), else manual pipeline
                        (protein_id, Alphafold2 model(s) and fasta file are
                        needed to be provided as input)
  path_af               directory with pdb files, created by Alphafold2 for
                        the protein in the format. For smaller proteins ( <
                        1400 aa length) AlphaFold2 provides one model, that
                        should be named: AF-protein_id-F1-v{j}.pdb, where j is
                        a version number. In case of large proteins Alphafold2
                        provides more than one file, and all of them should be
                        stored in one directory and named: 'AF-
                        protein_id-F{i}-v{j}.pdb', where i is a number of
                        model, j is a version number.
  protein_id            protein identifier in UniprotKB (should correspond to
                        the name 'protein_id' for Alphafold2 models, stored in
                        directory_af_models)
  is_go                 boolean flag; if 'True', picnic_go score (picnic
                        version with Gene Ontology features) will be
                        calculated, Gene Ontology terms are retrieved in this
                        case from UniprotKB by protein_id identifier;
                        otherwise default picnic score will be printed
                        (without Gene Ontology annotation)

options:
  -h, --help            show this help message and exit
  --path_fasta_file PATH_FASTA_FILE
                        directory with sequence file in fasta format
```
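
The file naming scheme described for `path_af` can be checked before starting a run. The sketch below lists the files in a directory that match `AF-<protein_id>-F<i>-v<j>.pdb`, ordered by model number `i`. It follows the help text above literally; `find_af_models` is a hypothetical helper, not part of PICNIC, and files obtained elsewhere may carry different names and need renaming first.

```python
import re
from pathlib import Path


def find_af_models(path_af, protein_id):
    """Return PDB files named AF-<protein_id>-F<i>-v<j>.pdb, sorted by i.

    The pattern is taken from the help text above; this helper is only an
    illustration and is not used by PICNIC itself.
    """
    pattern = re.compile(
        r"^AF-" + re.escape(protein_id) + r"-F(\d+)-v(\d+)\.pdb$"
    )
    found = []
    for entry in Path(path_af).iterdir():
        match = pattern.match(entry.name)
        if match and entry.is_file():
            model_number = int(match.group(1))
            found.append((model_number, entry))
    return [entry for _, entry in sorted(found)]


if __name__ == "__main__":
    for model in find_af_models("notebooks/test_files/O95613/", "O95613"):
        print(model.name)
```

An empty result usually means either the directory or the `protein_id` does not match the file names.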

### Examples

Run the automated pipeline for a given UniProt ID:
```shell
$ picnic True notebooks/test_files/Q99720/ Q99720 True
```
Run the manual pipeline for a given UniProt ID:
```shell
$ picnic False 'notebooks/test_files/O95613/' 'O95613' False --path_fasta_file 'notebooks/test_files/O95613/O95613.fasta.txt'
```
Run the manual pipeline for your own protein sequence called MY_PROTEIN, which has no reference to UniProt:
```shell
$ picnic False 'notebooks/test_files/MY_PROTEIN/' 'MY_PROTEIN' False --path_fasta_file 'notebooks/test_files/MY_PROTEIN/my_protein.fasta'
```
Further examples of using PICNIC are shown in a Jupyter notebook in the notebooks folder.
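
The same commands can be driven from a Python script, for example to loop over several proteins. This is a sketch that only wraps the command line shown above with `subprocess`; it does not use any Python API of PICNIC, and the helper names `build_picnic_command` and `run_picnic` are made up for illustration.

```python
import subprocess


def build_picnic_command(is_automated, path_af, protein_id, is_go,
                         path_fasta_file=None):
    """Assemble the argument list for the `picnic` command line.

    Booleans are rendered as the strings 'True' / 'False', matching the
    examples above.
    """
    command = [
        "picnic",
        str(bool(is_automated)),
        str(path_af),
        str(protein_id),
        str(bool(is_go)),
    ]
    if path_fasta_file is not None:
        command += ["--path_fasta_file", str(path_fasta_file)]
    return command


def run_picnic(*args, **kwargs):
    """Run picnic and return whatever it prints to standard output."""
    result = subprocess.run(
        build_picnic_command(*args, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    print(build_picnic_command(True, "notebooks/test_files/Q99720/", "Q99720", True))
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.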

### How to run the provided Jupyter notebook?

Examples of how to use and run PICNIC are shown in a provided Jupyter notebook. The notebook can be found under the
**notebooks** folder.

#### What is Jupyter Notebook?

Please read the documentation [here](https://saturncloud.io/blog/how-to-launch-jupyter-notebook-from-your-terminal/#what-is-jupyter-notebook).


#### How to create a virtual environment and install all required Python packages.

Create a virtual environment with Python's venv module:
```shell
$ python -m venv /path/to/new/virtual/environment
# e.g.
$ python -m venv my_jupyter_env
```

Then install the classic Jupyter Notebook with:
```shell
$ source my_jupyter_env/bin/activate

$ pip install notebook
```
Also install picnic-bio from source in the same virtual environment (run this from the project root directory):
```shell
$ pip install .
```

#### How to Launch Jupyter Notebook from Your Terminal?

In your terminal source the previously created virtual environment...
```shell
$ source my_jupyter_env/bin/activate
```
Launch Jupyter Notebook...
```shell
$ jupyter notebook
```
Open the example notebook called 'picnic_examples.ipynb' under the notebooks folder.  

## Publication
***PICNIC accurately predicts condensate-forming proteins regardless of their structural disorder across organisms.***
Anna Hadarovich, Hari Raj Singh, Soumyadeep Ghosh, Maxim Scheremetjew, Nadia Rostam, Anthony A. Hyman & Agnes Toth-Petroczy. 
Nature Communications volume 15, Article number: 10668 (2024). doi: [10.1038/s41467-024-55089-x](https://doi.org/10.1038/s41467-024-55089-x). PMID: [39663388](https://pubmed.ncbi.nlm.nih.gov/39663388/).

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

## Development

### Getting started

#### Add your SSH key to GitLab

Before you start, make sure you have an [SSH key generated](https://git.mpi-cbg.de/help/ssh/index#generate-an-ssh-key-pair) and the [public SSH key added to your GitLab account](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account).
You only have to do this once!

Every time you open a new console/terminal make sure your ssh-agent is running and all SSH keys are added.

```shell
$ eval `ssh-agent -s` && ssh-add
```

#### Download and unpack IUPred2A

Fill out and submit the IUPred2A web form [here](https://iupred2a.elte.hu/download_new) to request the iupred2a.tar.gz archive.
Once you have received the archive, unpack the tarball under the following project subfolder src/files/:

```shell
$ cd picnic/bin
$ cp /path/to/iupred2a.tar.gz .
$ tar -xvf iupred2a.tar.gz
$ cd iupred2a
$ export PYTHONPATH="$PWD"
```

### How to enable BuildKit?

This can be achieved in different ways. Follow the documentation [here](https://brianchristner.io/what-is-docker-buildkit/).

There are 2 main options:

* Enable the BuildKit in your local Docker Desktop
* Enable the BuildKit in a fresh terminal

#### Enable the BuildKit in a fresh terminal

```
export DOCKER_BUILDKIT=1
```

##### On Linux machines, you may also need:
```
export COMPOSE_DOCKER_CLI_BUILD=1
```

### Building your Docker images

```shell
$ docker build --build-arg="PYTHON_VERSION=3.10.13" . -f Dockerfile -t atplab/picnic-service
```

#### Run your image as a container

```shell
$ docker run atplab/picnic-service

# e.g.
$ docker run atplab/picnic-service True 'notebooks/test_files/Q99720/' 'Q99720' True
```

#### Create an interactive shell in the container

```shell
$ docker run -it --entrypoint sh atplab/picnic-service
```

## Packaging and distribution

### Getting started

#### Install packages required for building and distributing a Python project

```shell
# Create a new env - one-off task
$ python3 -m venv packaging

$ source packaging/bin/activate

(packaging) $ pip install -r requirements-packaging.txt
```

#### How to install and run the PICNIC package locally from the project root directory?

```shell
(venv) % cd /<path-to-project-root-folder>/picnic

(venv) % pip install .

# Run the picnic command to find out whether the installation was successful
(venv) % picnic                                                                            
usage: PICNIC [-h] [--path_fasta_file PATH_FASTA_FILE] is_automated path_af protein_id is_go
PICNIC: error: the following arguments are required: is_automated, path_af, protein_id, is_go
```

#### How to build the PICNIC package from the project root directory?

Run the following commands from the project root directory to build the package. This creates a dist
folder containing the wheel distribution along with a source archive (.tar.gz).
```shell
(packaging) $ cd /<path-to-project-root-folder>/picnic

(packaging) $ python -m build

(packaging) $ ls -l dist 
picnic-bio-1.0.0b1.tar.gz
picnic_bio-1.0.0b1-py3-none-any.whl

(packaging) $ twine check dist/*
```
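
Before uploading, it can be useful to confirm that the build produced both artifact types. The sketch below only inspects file names in the dist folder and complements `twine check`, which validates the metadata; `summarize_dist` is a hypothetical helper name.

```python
from pathlib import Path


def summarize_dist(dist_dir="dist"):
    """Group the files in a dist folder into wheels and source archives."""
    names = sorted(p.name for p in Path(dist_dir).iterdir() if p.is_file())
    return {
        "wheels": [n for n in names if n.endswith(".whl")],
        "sdists": [n for n in names if n.endswith(".tar.gz")],
    }


if __name__ == "__main__":
    summary = summarize_dist()
    print(summary)
    if not (summary["wheels"] and summary["sdists"]):
        print("warning: expected one wheel and one source archive")
```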

#### How to upload distribution files to PyPI?

Finally, upload these files to PyPI using Twine. Run the following commands from the project root
directory and enter your PyPI credentials when prompted to complete the upload.

```shell
# Perform a test upload on testPyPI
(packaging) $ twine upload --repository testpypi dist/*

# Finally upload the distribution to PyPI
(packaging) $ twine upload dist/*
```


#### How to deactivate the virtual environment?

```shell
(packaging) $ deactivate
```

Raw data

            {
    "_id": null,
    "home_page": "https://picnic.cd-code.org/",
    "name": "picnic-bio",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "Biomolecular condensate, Scientific Annotation Tool, condensate, machine learning",
    "author": "Anna Hadarovich <hadarovi@mpi-cbg.de>, Soumyadeep Ghosh <soumyadeep11194@gmail.com>, Maxim Scheremetjew <schereme@mpi-cbg.de>",
    "author_email": "picnic@cd-code.org",
    "download_url": "https://files.pythonhosted.org/packages/54/2c/ff5cdcd41db5495fac6f0aeaed8a4c360a12a0f3c7b89397760a85aee98e/picnic_bio-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "PICNIC (Proteins Involved in CoNdensates In Cells) is a machine learning-based model that predicts proteins involved in biomolecular condensates.",
    "version": "1.0.0",
    "project_urls": {
        "Documentation": "https://git.mpi-cbg.de/tothpetroczylab/picnic/-/blob/main/README.md",
        "Funding": "https://picnic.cd-code.org/about-us",
        "Homepage": "https://picnic.cd-code.org/",
        "Source": "https://git.mpi-cbg.de/tothpetroczylab/picnic",
        "Tracker": "https://git.mpi-cbg.de/tothpetroczylab/picnic/-/issues"
    },
    "split_keywords": [
        "biomolecular condensate",
        " scientific annotation tool",
        " condensate",
        " machine learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3ccd191f0d10f02d87fad3e289257460be0f2f6e789172ae8273e2db8b9355a4",
                "md5": "df07b4f5cf8e724e63ff28b5c4bd5b05",
                "sha256": "0e7df193a84e88b240c3c9458b636e22fab072400fa7b6b12d57b9283deaf203"
            },
            "downloads": -1,
            "filename": "picnic_bio-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "df07b4f5cf8e724e63ff28b5c4bd5b05",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 2347395,
            "upload_time": "2024-12-19T10:42:06",
            "upload_time_iso_8601": "2024-12-19T10:42:06.815576Z",
            "url": "https://files.pythonhosted.org/packages/3c/cd/191f0d10f02d87fad3e289257460be0f2f6e789172ae8273e2db8b9355a4/picnic_bio-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "542cff5cdcd41db5495fac6f0aeaed8a4c360a12a0f3c7b89397760a85aee98e",
                "md5": "bc5222f3953697f774d67e5eb8c2317b",
                "sha256": "cf3c153b7896ce6993b3441c7d2e03542a371df8d58313df6ad73ad20ca4b264"
            },
            "downloads": -1,
            "filename": "picnic_bio-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "bc5222f3953697f774d67e5eb8c2317b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 2262343,
            "upload_time": "2024-12-19T10:42:09",
            "upload_time_iso_8601": "2024-12-19T10:42:09.603745Z",
            "url": "https://files.pythonhosted.org/packages/54/2c/ff5cdcd41db5495fac6f0aeaed8a4c360a12a0f3c7b89397760a85aee98e/picnic_bio-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-19 10:42:09",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "picnic-bio"
}
