Name | jump-portrait |
Version | 0.0.27 |
home_page | None |
Summary | Tools to fetch and visualize JUMP images |
upload_time | 2025-02-10 15:13:18 |
maintainer | None |
docs_url | None |
author | Alan Munoz |
requires_python | <3.12,>=3.10 |
license | None |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Fetch, visualize and/or download images from the JUMP dataset (cpg0016 in the [Cell Painting Gallery](https://github.com/broadinstitute/cellpainting-gallery)).
## Workflow
### Workflow 1: Download all images for a given item and their controls
```python
from jump_portrait.save import download_item_images
item_name = "MYT1" # Gene or compound of interest, abbreviated (GC)OI
channels = ["DNA"] # Standard channels are ER, AGP, Mito, DNA, RNA and (for most plates) Brightfield
corrections = ["Orig"] # Can also be "Illum"
controls = True # Fetch controls in plates alongside (GC)OI?
download_item_images(item_name, channels, corrections=corrections, controls=controls)
```
### Workflow 2: Get images from explicit metadata
Fetch one image for a given item.
```python
import polars as pl
from jump_portrait.fetch import get_jump_image, get_sample
sample = get_sample()
source, batch, plate, well, site = sample.select(pl.col(f"Metadata_{x}" for x in ("Source", "Batch", "Plate", "Well", "Site"))).row(0)
channel = "DNA"
correction = None # or "Illum"
img = get_jump_image(source, batch, plate, well, channel, site, correction)
```
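To inspect a fetched image, it often helps to rescale its intensities before plotting. The sketch below assumes that `get_jump_image` returns a 2D NumPy array, which this README does not state explicitly, so a synthetic array stands in for the real image. `rescale_for_display` is a hypothetical helper and not part of jump_portrait.

```python
import numpy as np


def rescale_for_display(img, low=1.0, high=99.0):
    """Clip to the [low, high] percentiles and rescale to floats in [0, 1]."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = np.percentile(img, [low, high])
    if hi <= lo:  # constant image: avoid dividing by zero
        return np.zeros_like(img)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


# Synthetic stand-in for a 16-bit microscopy image (assumption, not real data).
rng = np.random.default_rng(0)
fake_img = rng.integers(0, 2**16, size=(64, 64), dtype=np.uint16)
display_img = rescale_for_display(fake_img)
```

The rescaled array can then be passed to any plotting library that accepts floats in the range 0 to 1.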
### Developer
First, we locate the images produced for a given perturbation.
```python
from jump_portrait.fetch import get_item_location_info
gene = "MYT1"
location_df = get_item_location_info(gene)
```
This returns a polars DataFrame whose columns contain the metadata alongside the path and file locations:
``` python
#┌───────────┬───────────┬───────────┬───────────┬───┬───────────┬───────────┬───────────┬──────────┐
#│ Metadata_ ┆ Metadata_ ┆ Metadata_ ┆ Metadata_ ┆ … ┆ PathName_ ┆ Metadata_ ┆ Metadata_ ┆ standard │
#│ Source    ┆ Batch     ┆ Plate     ┆ Well      ┆   ┆ OrigRNA   ┆ PlateType ┆ JCP2022   ┆ _key     │
#│ ---       ┆ ---       ┆ ---       ┆ ---       ┆   ┆ ---       ┆ ---       ┆ ---       ┆ ---      │
#│ str       ┆ str       ┆ str       ┆ str       ┆   ┆ str       ┆ str       ┆ str       ┆ str      │
#╞═══════════╪═══════════╪═══════════╪═══════════╪═══╪═══════════╪═══════════╪═══════════╪══════════╡
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#│ source_13 ┆ 20220914_ ┆ CP-CC9-R1 ┆ B05       ┆ … ┆ s3://cell ┆ CRISPR    ┆ JCP2022_8 ┆ MYT1     │
#│           ┆ Run1      ┆ -20       ┆           ┆   ┆ painting- ┆           ┆ 04400     ┆          │
#│           ┆           ┆           ┆           ┆   ┆ gallery/c ┆           ┆           ┆          │
#│           ┆           ┆           ┆           ┆   ┆ pg001…    ┆           ┆           ┆          │
#└───────────┴───────────┴───────────┴───────────┴───┴───────────┴───────────┴───────────┴──────────┘
```
The columns of these dataframes are:
```
Metadata_[Source/Batch/Plate/Well/Site]:
  - Source: Source identifier, numbered in the range 0-14 (e.g. source_13).
  - Plate: Plate containing a multitude of wells. It is a string.
  - Batch: Collection of plates imaged at around the same time. It is a string.
  - Well: Physical location where the experiment was performed and imaged. It is a string with format [SNN] where S={A-P} and NN={01-24}.
  - Site: Field of view (frame) imaged within the well; sites are 0-9 for the ORF and CRISPR datasets and 1-6 for the compounds dataset.
[File/Path]Name_[Illum/Orig][Channel]:
  - Illum: Illumination correction.
  - Orig: Original file.
  Channels (markers) can be:
  - DNA: DNA channel, generally Hoechst.
  - AGP: Actin, Golgi and plasma membrane channel.
  - ER: Endoplasmic reticulum channel.
  - Mito: Mitochondrial channel.
  - RNA: RNA channel.
standard_key: Gene or compound queried.
```
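As an illustration of the naming scheme above, the following sketch builds column names from a correction and a channel, and splits a well label into its row and column. Both helpers are hypothetical and not part of jump_portrait; the `PathName_`/`FileName_` pattern is an assumption inferred from the `PathName_OrigRNA` header in the example output.

```python
import re


def image_columns(channel, correction="Orig"):
    """Return the assumed (PathName, FileName) column names for one channel."""
    suffix = f"{correction}{channel}"
    return f"PathName_{suffix}", f"FileName_{suffix}"


def parse_well(well):
    """Split a well label such as 'B05' into (row letter, column number)."""
    match = re.fullmatch(r"([A-P])(\d{2})", well)
    if match is None:
        raise ValueError(f"unexpected well label: {well!r}")
    return match.group(1), int(match.group(2))


path_col, file_col = image_columns("RNA")  # ("PathName_OrigRNA", "FileName_OrigRNA")
row, col = parse_well("B05")  # ("B", 5)
```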
We can then feed this information to `jump_portrait.fetch.get_jump_image` to fetch the available images one at a time, as in Workflow 2.
Alternatively, we can pass it straight to `jump_portrait.fetch.get_jump_image_batch` to fetch the available images in batches for the desired channels and sites.
```python
from jump_portrait.fetch import get_jump_image_batch
sub_location_df = location_df.select(["Metadata_Source", "Metadata_Batch", "Metadata_Plate", "Metadata_Well"]).unique()
channel = ["DNA", "AGP", "Mito", "ER", "RNA"] # example
site = [str(i) for i in range(10)] # every site from 0 to 9 (as this is a CRISPR plate)
correction = "Orig" # or "Illum"
verbose = False # whether to show a tqdm progress bar
iterable, img_list = get_jump_image_batch(sub_location_df, channel, site, correction, verbose)
```
Returns:
- `iterable` (list of tuples): the metadata, channel, site and correction used for each image.
- `img_list` (list of arrays): the images, in the same order as `iterable`. Note that if no image could be retrieved for a specific site (this can happen), the array is replaced by `None`.
From there, typical processing includes:
1. Filtering out the entries for which no image was retrieved (removing the `None` values).
2. Stacking the remaining images along a channel axis.
```python
# first, drop every (param, img) pair for which no image was retrieved
mask = [img is not None for img in img_list]
iterable_filt = [param for param, keep in zip(iterable, mask) if keep]
img_list_filt = [img for img, keep in zip(img_list, mask) if keep]
```
``` python
# second, group images per source, batch, plate, well and site, then stack them on a channel axis
from itertools import groupby

import numpy as np


def group_key(pair):
    # pair is (param, img); param is (source, batch, plate, well, channel, site, correction)
    param = pair[0]
    return (param[0], param[1], param[2], param[3], param[5])


# sort by the grouping key, then by channel name, so that groupby sees each group contiguously
zip_iter_img = sorted(zip(iterable_filt, img_list_filt),
                      key=lambda pair: (*group_key(pair), pair[0][4]))
iterable_stack, img_stack = [], []
for key, pairs in groupby(zip_iter_img, key=group_key):
    iterable_stack.append(key)  # the common key of the group
    img_stack.append(np.stack([img for _, img in pairs]))  # channels stacked on axis 0
```
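To see what the filtering and stacking produce without downloading anything, the sketch below runs the same two steps on synthetic data. It assumes each parameter tuple is ordered as (source, batch, plate, well, channel, site, correction), as the indexing above implies; `stack_channels` is a hypothetical helper, not part of jump_portrait.

```python
from itertools import groupby

import numpy as np


def stack_channels(params, images):
    """Group images by (source, batch, plate, well, site) and stack channels."""
    # Drop every pair whose image is missing.
    pairs = [(p, img) for p, img in zip(params, images) if img is not None]

    def group_key(pair):
        return pair[0][:4] + (pair[0][5],)

    # Sort by group, then by channel name, so groupby sees contiguous groups.
    pairs.sort(key=lambda pair: (group_key(pair), pair[0][4]))
    keys, stacks = [], []
    for key, group in groupby(pairs, key=group_key):
        keys.append(key)
        stacks.append(np.stack([img for _, img in group]))
    return keys, stacks


# Synthetic input: two sites, two channels; one image is missing (None).
base = ("source_13", "20220914_Run1", "CP-CC9-R1-20", "B05")
params = [base + (ch, site, "Orig") for site in ("0", "1") for ch in ("DNA", "ER")]
images = [np.zeros((4, 4)), np.ones((4, 4)), None, np.ones((4, 4))]
keys, stacks = stack_channels(params, images)
```

Note that a group with a missing channel yields a stack with fewer channels (here, site "1" has only one), and that channels are ordered alphabetically by name rather than in the order requested, so downstream code should not assume a fixed channel count or order.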
Raw data
{
"_id": null,
"home_page": null,
"name": "jump-portrait",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.10",
"maintainer_email": null,
"keywords": null,
"author": "Alan Munoz",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/3e/75/89c2d8f2a038b042d863f86876234f5d49be6bf621e33a975718940a62ba/jump_portrait-0.0.27.tar.gz",
"platform": null,
"description": "# Table of Contents\n\nFetch, visualize and.or download images from the JUMP dataset (cpg0016 in the [Cell Painting Gallery](https://github.com/broadinstitute/cellpainting-gallery)). \n\n## Workflow\n\n### Workflow 1: Download all images for a given item and their controls\n\n```python\nfrom jump_portrait.save import download_item_images\n\nitem_name = \"MYT1\" # Item or Compound of interest - (GC)OI\nchannels = [\"DNA\"] # Standard channels are ER, AGP, Mito, DNA, RNA and (for most plates) Brightfield\ncorrections = [\"Orig\"] # Can also be \"Illum\"\ncontrols = True # Fetch controls in plates alongside (GC)OI?\n\ndownload_item_images(item_name, channels, corrections=corrections, controls=controls)\n```\n\n### Workflow 2: get images from explicit metadata\n\nFetch one image for a given item.\n```python\nimport polars as pl\nfrom jump_portrait.fetch import get_jump_image, get_sample\n\nsample = get_sample()\n\nsource, batch, plate, well, site = sample.select(pl.col(f\"Metadata_{x}\" for x in (\"Source\", \"Batch\", \"Plate\", \"Well\", \"Site\"))).row(0)\nchannel = \"DNA\"\ncorrection = None # or \"Illum\"\n\nimg = get_jump_image(source, batch, plate, well, channel, site, correction)\n```\n\n### Developer\nFirst, we Locate the images produced to a given perturbation.\n\n```python \nfrom jump_portrait.fetch import get_item_location_info\n\ngene = \"MYT1\"\n\nlocation_df = get_item_location_info(gene)\n\n```\n\nReturns a polars dataframe whose columns contain the metadata \nalongside path and file locations\n\n``` 
python\n\n#\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n#\u2502 Metadata_ \u2506 Metadata_ \u2506 Metadata_ \u2506 Metadata_ \u2506 \u2026 \u2506 PathName_ \u2506 Metadata_ \u2506 Metadata_ \u2506 standard \u2502\n#\u2502 Source \u2506 Batch \u2506 Plate \u2506 Well \u2506 \u2506 OrigRNA \u2506 PlateType \u2506 JCP2022 \u2506 _key \u2502\n#\u2502 --- \u2506 --- \u2506 --- \u2506 --- \u2506 \u2506 --- \u2506 --- \u2506 --- \u2506 --- \u2502\n#\u2502 str \u2506 str \u2506 str \u2506 str \u2506 \u2506 str \u2506 str \u2506 str \u2506 str \u2502\n#\u255e\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2561\n#\u2502 source_13 \u2506 20220914_ \u2506 CP-CC9-R1 \u2506 B05 \u2506 \u2026 \u2506 s3://cell \u2506 CRISPR \u2506 JCP2022_8 \u2506 MYT1 \u2502\n#\u2502 \u2506 Run1 \u2506 -20 \u2506 \u2506 \u2506 painting- \u2506 \u2506 04400 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 
gallery/c \u2506 \u2506 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 pg001\u2026 \u2506 \u2506 \u2506 \u2502\n#\u2502 source_13 \u2506 20220914_ \u2506 CP-CC9-R1 \u2506 B05 \u2506 \u2026 \u2506 s3://cell \u2506 CRISPR \u2506 JCP2022_8 \u2506 MYT1 \u2502\n#\u2502 \u2506 Run1 \u2506 -20 \u2506 \u2506 \u2506 painting- \u2506 \u2506 04400 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 gallery/c \u2506 \u2506 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 pg001\u2026 \u2506 \u2506 \u2506 \u2502\n#\u2502 source_13 \u2506 20220914_ \u2506 CP-CC9-R1 \u2506 B05 \u2506 \u2026 \u2506 s3://cell \u2506 CRISPR \u2506 JCP2022_8 \u2506 MYT1 \u2502\n#\u2502 \u2506 Run1 \u2506 -20 \u2506 \u2506 \u2506 painting- \u2506 \u2506 04400 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 gallery/c \u2506 \u2506 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 pg001\u2026 \u2506 \u2506 \u2506 \u2502\n#\u2502 source_13 \u2506 20220914_ \u2506 CP-CC9-R1 \u2506 B05 \u2506 \u2026 \u2506 s3://cell \u2506 CRISPR \u2506 JCP2022_8 \u2506 MYT1 \u2502\n#\u2502 \u2506 Run1 \u2506 -20 \u2506 \u2506 \u2506 painting- \u2506 \u2506 04400 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 gallery/c \u2506 \u2506 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 pg001\u2026 \u2506 \u2506 \u2506 \u2502\n#\u2502 source_13 \u2506 20220914_ \u2506 CP-CC9-R1 \u2506 B05 \u2506 \u2026 \u2506 s3://cell \u2506 CRISPR \u2506 JCP2022_8 \u2506 MYT1 \u2502\n#\u2502 \u2506 Run1 \u2506 -20 \u2506 \u2506 \u2506 painting- \u2506 \u2506 04400 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 gallery/c \u2506 \u2506 \u2506 \u2502\n#\u2502 \u2506 \u2506 \u2506 \u2506 \u2506 pg001\u2026 \u2506 \u2506 \u2506 
\u2502\n#\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n```\n\nThe columns of these dataframes are:\n\n```\nMetadata_[Source/Batch/Plate/Well/Site]:\n - Source: Source in the range 0-14.\n - Plate: Plate containing a multitude of wells. It is a string.\n - Batch: Collection of plates imaged at around the same time. It is a string.\n - Well: Physical location wherein the experiment was performed and imaged. It is a string with format [SNN] where S={A-P} and NN={00-24}.\n - Site: Foci or frame taken in a the well, these are 0-9 for the ORF and CRISPR datasets and 1-6 for the compounds dataset.\n[File/Path]name_[Illum/Orig][Channel] \n \n - Illum: Illumination correction \n - Orig: Original File\n Also, markers can be:\n - DNA: Dna channel, generally Hoecsht.\n - ER: Endoplasmatic Reticulum channel.\n - Mito: Mitochondrial channel.\n - RNA: RNA channel.\nstandard_key: Gene or compound queried\n\n```\n\nWe can then feed this information to `jump_portrait.fetch.get_jump_image` to fetch the available images as in workflow 2.\n\nOr we can feed this information straight to `jump_portrait.fetch.get_jump_image_batch` to fetch the available images in batches with desired channel and sites.\n\n```python\nfrom jump_portrait.fetch import get_jump_image_batch\nsub_location_df = location_df.select([\"Metadata_Source\", \"Metadata_Batch\", \"Metadata_Plate\", \"Metadata_Well\"]).unique()\nchannel = [\"DNA\", \"AGP\", \"Mito\", \"ER\", 
\"RNA\"] # example\nsite = [str(i) for i in range(10)] # every site from 0 to 9 (as this is a CRISPR plate) \ncorrection = \"Orig\" # or \"Illum\"\nverbose = False # whether to have tqdm loading bar\n\niterable, img_list = get_jump_image_batch(sub_location_df, channel, site, correction, verbose)\n```\n\nReturns: \n- iterable (list of tuple) > list containing the metadata, channel, site and correction\n- img_list (list of array) > list containing the images. NB, if no image has been retrieved for a specific site (this might happen), array object is replaced by a None\n\nFrom there, current processing will include:\n1. Filter out images where no image has been retrieved (remove None values) \n2. Stack images along a channel axis\n\n```python\n# first, filter out img / param where no img has been retrieved\nmask = [x is not None for x in img_list]\niterable_filt = [param for i, param in enumerate(iterable) if mask[i]]\nimg_list_filt = [param for i, param in enumerate(img_list) if mask[i]]\n```\n\n``` python\n# second, group image per source, batch, well, site > to stack on channel\nfrom itertools import groupby, starmap\nimport numpy as np\nzip_iter_img = sorted(zip(iterable_filt, img_list_filt),\n key=lambda x: (x[0][0], x[0][1], x[0][2], x[0][3], x[0][5], x[0][4]))\niterable_stack, img_stack = map(lambda tup: list(tup),\n zip(*starmap(\n lambda key, param_img: (key, np.stack(list(map(lambda x: x[1], param_img)))),\n # grouped image are returned as the common key, and then the zip of param and img, so we retrieve the img then we stack\n groupby(zip_iter_img,\n key=lambda x: (x[0][0], x[0][1], x[0][2], x[0][3], x[0][5])))))\n```\n",
"bugtrack_url": null,
"license": null,
"summary": "Tools to fetch and visualize JUMP images",
"version": "0.0.27",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "967a86892d70f49fc7cc57142a3e6bc164846505b27495c8eb14785d4f7a340a",
"md5": "fe44e4c203834ec54a907e9339c81a5a",
"sha256": "3e422784f3530b3ad7423f9c887bcf3fc05217ee9aee0898df834fefc9b67fe5"
},
"downloads": -1,
"filename": "jump_portrait-0.0.27-py3-none-any.whl",
"has_sig": false,
"md5_digest": "fe44e4c203834ec54a907e9339c81a5a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.10",
"size": 15590,
"upload_time": "2025-02-10T15:13:17",
"upload_time_iso_8601": "2025-02-10T15:13:17.381270Z",
"url": "https://files.pythonhosted.org/packages/96/7a/86892d70f49fc7cc57142a3e6bc164846505b27495c8eb14785d4f7a340a/jump_portrait-0.0.27-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3e7589c2d8f2a038b042d863f86876234f5d49be6bf621e33a975718940a62ba",
"md5": "a21909f4cf6590d725eb13c798b4d5c1",
"sha256": "be990041866cff925cacd4f9f4736a4bcb478c75ecbf5e66a7374e69658dd918"
},
"downloads": -1,
"filename": "jump_portrait-0.0.27.tar.gz",
"has_sig": false,
"md5_digest": "a21909f4cf6590d725eb13c798b4d5c1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.10",
"size": 14844,
"upload_time": "2025-02-10T15:13:18",
"upload_time_iso_8601": "2025-02-10T15:13:18.513140Z",
"url": "https://files.pythonhosted.org/packages/3e/75/89c2d8f2a038b042d863f86876234f5d49be6bf621e33a975718940a62ba/jump_portrait-0.0.27.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-10 15:13:18",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "jump-portrait"
}