jump-portrait


- Name: jump-portrait
- Version: 0.0.27
- Summary: Tools to fetch and visualize JUMP images
- Upload time: 2025-02-10 15:13:18
- Author: Alan Munoz
- Requires Python: <3.12,>=3.10
# Table of Contents

Fetch, visualize and/or download images from the JUMP dataset (cpg0016 in the [Cell Painting Gallery](https://github.com/broadinstitute/cellpainting-gallery)).

## Workflow

### Workflow 1: Download all images for a given item and their controls

```python
from jump_portrait.save import download_item_images

item_name = "MYT1"  # Gene or compound of interest, (GC)OI
channels = ["DNA"]  # Standard channels are ER, AGP, Mito, DNA, RNA and (for most plates) Brightfield
corrections = ["Orig"]  # Can also be "Illum"
controls = True  # Fetch controls in plates alongside (GC)OI?

download_item_images(item_name, channels, corrections=corrections, controls=controls)
```

### Workflow 2: Get images from explicit metadata

Fetch a single image by passing its source, batch, plate, well and site explicitly.

```python
import polars as pl
from jump_portrait.fetch import get_jump_image, get_sample

sample = get_sample()

source, batch, plate, well, site = sample.select(pl.col(f"Metadata_{x}" for x in ("Source", "Batch", "Plate", "Well", "Site"))).row(0)
channel = "DNA"
correction = None # or "Illum"

img = get_jump_image(source, batch, plate, well, channel, site, correction)
```
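
A fetched microscopy image often occupies a narrow band of a wide intensity range, so it may look almost black when displayed directly. The sketch below shows one common way to rescale it for viewing. It assumes the fetched image is a 2D NumPy array (an assumption, since the return type is not stated here) and uses a synthetic array in its place; `rescale_for_display` is a hypothetical helper, not part of `jump_portrait`.

```python
import numpy as np


def rescale_for_display(img: np.ndarray, low: float = 1.0, high: float = 99.0) -> np.ndarray:
    """Clip to the [low, high] percentiles and rescale to 8-bit (hypothetical helper)."""
    img = img.astype(np.float64)
    lo, hi = np.percentile(img, [low, high])
    if hi <= lo:  # constant image: nothing to stretch
        return np.zeros(img.shape, dtype=np.uint8)
    scaled = (np.clip(img, lo, hi) - lo) / (hi - lo)
    return (scaled * 255).round().astype(np.uint8)


# Synthetic stand-in for a fetched image (16-bit intensities in a narrow band).
rng = np.random.default_rng(0)
fake_img = rng.integers(200, 1200, size=(64, 64)).astype(np.uint16)

display_img = rescale_for_display(fake_img)
print(display_img.dtype, display_img.shape)
```

The percentile bounds are a design choice: clipping the extremes keeps a few very bright pixels from compressing the rest of the image into the dark end of the scale.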

### Developer
First, we locate the images produced for a given perturbation.

```python 
from jump_portrait.fetch import get_item_location_info

gene = "MYT1"

location_df = get_item_location_info(gene)

```

This returns a polars DataFrame whose columns contain the metadata alongside the path and file locations:

``` python

#┌───────────┬───────────┬───────────┬───────────┬───┬───────────┬───────────┬───────────┬──────────┐
#│ Metadata_ ┆ Metadata_ ┆ Metadata_ ┆ Metadata_ ┆ … ┆ PathName_ ┆ Metadata_ ┆ Metadata_ ┆ standard │
#│ Source    ┆ Batch     ┆ Plate     ┆ Well      ┆   ┆ OrigRNA   ┆ PlateType ┆ JCP2022   ┆ _key     │
#│ ---       ┆ ---       ┆ ---       ┆ ---       ┆   ┆ ---       ┆ ---       ┆ ---       ┆ ---      │
#│ str       ┆ str       ┆ str       ┆ str       ┆   ┆ str       ┆ str       ┆ str       ┆ str      │
#╞═══════════╪═══════════╪═══════════╪═══════════╪═══╪═══════════╪═══════════╪═══════════╪══════════╡
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#└───────────┴───────────┴───────────┴───────────┴───┴───────────┴───────────┴───────────┴──────────┘
```

The columns of these dataframes are:

```
Metadata_[Source/Batch/Plate/Well/Site]:
 - Source: Source in the range 0-14.
 - Plate: Plate containing multiple wells. It is a string.
 - Batch: Collection of plates imaged at around the same time. It is a string.
 - Well: Physical location where the experiment was performed and imaged. It is a string with format [SNN] where S={A-P} and NN={01-24}.
 - Site: Field of view (frame) imaged within the well; sites are 0-9 for the ORF and CRISPR datasets and 1-6 for the compounds dataset.
[File/Path]name_[Illum/Orig][Channel]:
 - Illum: Illumination correction.
 - Orig: Original file.
 Channels include:
   - DNA: DNA channel, generally Hoechst.
   - ER: Endoplasmic reticulum channel.
   - Mito: Mitochondrial channel.
   - RNA: RNA channel.
standard_key: Gene or compound queried.

```
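
The column naming pattern above can be turned into a small lookup helper. This is a minimal sketch under stated assumptions: the `PathName_` casing follows the table output shown earlier, the `FileName_` casing is assumed by analogy, and joining the two values into a single location is also an assumption. `image_columns`, `image_uri` and the example row are hypothetical and not part of `jump_portrait`.

```python
def image_columns(channel: str, correction: str = "Orig") -> tuple[str, str]:
    """Return the (path, file) column names for a channel and correction."""
    if correction not in ("Orig", "Illum"):
        raise ValueError(f"Unknown correction: {correction!r}")
    suffix = f"{correction}{channel}"
    # "FileName_" casing is assumed by analogy with the "PathName_" column.
    return f"PathName_{suffix}", f"FileName_{suffix}"


def image_uri(row: dict, channel: str, correction: str = "Orig") -> str:
    """Join the path and file values of one row into a single location string."""
    path_col, file_col = image_columns(channel, correction)
    return f"{row[path_col].rstrip('/')}/{row[file_col]}"


# Hypothetical row with placeholder values, not real gallery locations.
row = {
    "PathName_OrigRNA": "s3://example-bucket/images/",
    "FileName_OrigRNA": "example_ch3.tiff",
}
print(image_columns("RNA"))
print(image_uri(row, "RNA"))
```

With a polars DataFrame, a row in this dictionary form could be obtained with `location_df.row(0, named=True)`.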

We can then feed this information to `jump_portrait.fetch.get_jump_image` to fetch the available images, as in Workflow 2.

Alternatively, we can pass this information directly to `jump_portrait.fetch.get_jump_image_batch` to fetch the available images in batches for the desired channels and sites.

```python
from jump_portrait.fetch import get_jump_image_batch
sub_location_df = location_df.select(["Metadata_Source", "Metadata_Batch", "Metadata_Plate", "Metadata_Well"]).unique()
channel = ["DNA", "AGP", "Mito", "ER", "RNA"] # example
site = [str(i) for i in range(10)] # every site from 0 to 9 (as this is a CRISPR plate) 
correction = "Orig" # or "Illum"
verbose = False # whether to have tqdm loading bar

iterable, img_list = get_jump_image_batch(sub_location_df, channel, site, correction, verbose)
```

Returns:
- `iterable` (list of tuple): the metadata, channel, site and correction for each requested image.
- `img_list` (list of array): the images. Note that if no image could be retrieved for a specific site (this can happen), the array is replaced by `None`.
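
Before discarding the missing entries, it can be useful to see which requests came back empty. The sketch below counts missing images per site on synthetic data. It assumes each parameter tuple is ordered as (source, batch, plate, well, channel, site, correction), which is inferred from the description above rather than confirmed; the `fake_` lists stand in for the two returned lists.

```python
from collections import Counter

import numpy as np

# Synthetic stand-ins for the two lists returned by get_jump_image_batch.
fake_iterable = [
    ("source_x", "batch_x", "plate_x", "B05", "DNA", "0", "Orig"),
    ("source_x", "batch_x", "plate_x", "B05", "DNA", "1", "Orig"),
    ("source_x", "batch_x", "plate_x", "B05", "RNA", "0", "Orig"),
    ("source_x", "batch_x", "plate_x", "B05", "RNA", "1", "Orig"),
]
fake_img_list = [np.zeros((4, 4)), None, np.zeros((4, 4)), None]

# Count missing images per site (index 5 under the assumed tuple layout).
missing_per_site = Counter(
    params[5] for params, img in zip(fake_iterable, fake_img_list) if img is None
)
n_missing = sum(missing_per_site.values())
print(f"{n_missing}/{len(fake_img_list)} images missing; per site: {dict(missing_per_site)}")
```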

From there, typical processing includes:
1. Filtering out entries where no image was retrieved (removing `None` values)
2. Stacking images along a channel axis

```python
# first, filter out parameter/image pairs where no image was retrieved
mask = [x is not None for x in img_list]
iterable_filt = [param for param, keep in zip(iterable, mask) if keep]
img_list_filt = [img for img, keep in zip(img_list, mask) if keep]
```

``` python
# second, group images per source, batch, plate, well and site, then stack on channel
from itertools import groupby

import numpy as np


def group_key(pair):
    # pair is (params, img); params indices 0-3 are source, batch, plate, well and 5 is site
    params = pair[0]
    return (params[0], params[1], params[2], params[3], params[5])


# sort by the group key, then by channel (index 4), so that groupby sees
# contiguous groups and channels are stacked in a consistent order
zip_iter_img = sorted(zip(iterable_filt, img_list_filt),
                      key=lambda x: (*group_key(x), x[0][4]))
iterable_stack, img_stack = [], []
for key, param_img in groupby(zip_iter_img, key=group_key):
    iterable_stack.append(key)
    img_stack.append(np.stack([img for _, img in param_img]))
```
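
To check the grouping logic without network access, the same steps can be run on synthetic data. The sketch below assumes the tuple layout (source, batch, plate, well, channel, site, correction), inferred from the sort keys above; `stack_channels` and the `fake_` inputs are hypothetical and exist only for this illustration.

```python
from itertools import groupby

import numpy as np


def stack_channels(params, images):
    """Group (params, image) pairs by everything except channel and stack each group."""

    def site_key(pair):
        p = pair[0]
        return (p[0], p[1], p[2], p[3], p[5])  # source, batch, plate, well, site

    # Sort by site key, then by channel name (index 4), so groups are contiguous.
    pairs = sorted(zip(params, images), key=lambda pair: (*site_key(pair), pair[0][4]))
    keys, stacks = [], []
    for key, group in groupby(pairs, key=site_key):
        keys.append(key)
        stacks.append(np.stack([img for _, img in group]))
    return keys, stacks


channels = ["DNA", "RNA", "ER"]
sites = ["0", "1"]
fake_params = [
    ("source_x", "batch_x", "plate_x", "A01", ch, s, "Orig")
    for ch in channels
    for s in sites
]
# Each fake image is filled with its own index so the stacking order can be traced.
fake_images = [np.full((8, 8), i, dtype=np.uint16) for i in range(len(fake_params))]

keys, stacks = stack_channels(fake_params, fake_images)
for key, stack in zip(keys, stacks):
    print(key, stack.shape)
```

Each stack has shape `(n_channels, height, width)`, with channels in alphabetical order because the channel name is the last element of the sort key. A site for which some channels were missing yields a stack with fewer channels, so downstream code may need to check the first dimension.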

            
