tskit-arg-visualizer


Nametskit-arg-visualizer JSON
Version 0.0.1 PyPI version JSON
download
home_page
SummaryInteractive visualization method for ancestral recombination graphs
upload_time2023-11-03 23:17:21
maintainer
docs_urlNone
author
requires_python>=3.7
license
keywords population genetics tree sequence ancestral recombination graph visualization d3.js
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            <p align="center">
  <img alt="ARG Visualizer Example" src="https://raw.githubusercontent.com/kitchensjn/tskit_arg_visualizer/master/images/stylized_arg_visualizer.png" width="500">
</p>

A method for drawing ancestral recombination graphs from tskit tree sequences in Python using D3.js. ARGs are plotted using a D3's [force layout](https://github.com/d3/d3-force). All nodes have a fixed position on the y-axis set by fy. Sample nodes have a fixed position on the x-axis set by fx; the ordering of the sample nodes comes from the first tree in the tskit tree sequence (this is not always the optimal ordering but is generally a good starting point for plotting). The x positions of other nodes are set by a force simulation where all nodes repel each other countered by a linkage force between connected nodes in the graph.

Users can click and drag the nodes (including the sample) along the x-axis to further clean up the layout of the graph. The simulation does not take into account line crosses, which can often be improved with some fiddling. Once a node has been moved by a user, its position is fixed with regards to the force simulation.

## Installation

```
pip install tskit_arg_visualizer
```

This package loads D3.js using a CDN, so requires access to the internet.

## Quickstart

```
import msprime
import random
import tskit_arg_visualizer

# Generate a random tree sequence with record_full_arg=True so that you get marked recombination nodes
ts_rs = random.randint(0,10000)   
ts = msprime.sim_ancestry(
    samples=3,
    recombination_rate=1e-8,
    sequence_length=3_000,
    population_size=10_000,
    record_full_arg=True,
    random_seed=ts_rs
)

d3arg = tskit_arg_visualizer.D3ARG.from_ts(ts=ts)
d3arg.draw(
    width=750,
    height=750,
    edge_type="ortho"
)
```

The above code can be run in three ways: terminal, Jupyter Notebook, or JupyterLab. For Jupyter Notebook and JupyterLab, you will need to add the following code block to the top of the document to properly load D3.js.

### Jupyter Notebook -

```
%%javascript
require.config({ 
    paths: { 
    d3: 'https://d3js.org/d3.v7.min'
}});

require(["d3"], function(d3) {
    window.d3 = d3;
});
```

### JupyterLab - 

```
%%javascript
var script = document.createElement('script');
script.type = 'text/javascript';
script.src = 'https://d3js.org/d3.v7.min.js';
document.head.appendChild(script);
```

## Drawing Parameters

The `draw()` function for a `D3ARG` object has multiple parameters that give users a small amount of customization with their figures:

```
def draw(
    self,
    width=500,
    height=500,
    tree_highlighting=True,
    y_axis_labels=True,
    y_axis_scale="rank",
    edge_type="line",
    variable_edge_width=False,
    subset_nodes=None,
    include_node_labels=True,
    sample_order=[]
):
"""Draws the D3ARG using D3.js by sending a custom JSON object to visualizer.js 

Parameters
----------
width : int
    Width of the force layout graph plot in pixels (default=500)
height : int
    Height of the force layout graph plot in pixels (default=500)
tree_highlighting : bool
    Include the interactive chromosome at the bottom of the figure to
    to let users highlight trees in the ARG (default=True)
y_axis_labels : bool
    Includes labelled y-axis on the left of the figure (default=True)
y_axis_scale : string
    Scale used for the positioning nodes along the y-axis. Options:
        "rank" (default) - equal vertical spacing between nodes
        "time" - vertical spacing is proportional to the time
        "log_time" - proportional to the log of time
edge_type : string
    Pathing type for edges between nodes. Options:
        "line" (default) - simple straight lines between the nodes
        "ortho" - custom pathing (see pathing.md for more details, should only be used with full ARGs)
variable_edge_width : bool
    Scales the stroke width of edges in the visualization will be proportional to the fraction of
    sequence in which that edge is found. (default=False)
subset_nodes : list (EXPERIMENTAL)
    List of nodes that user wants to stand out within the ARG. These nodes and the edges between them
    will have full opacity; other nodes will be faint (default=None, parameter is ignored and all
    nodes will have opacity)
include_node_labels : bool
    Includes the node labels for each node in the ARG (default=True)
sample_order : list
    Sample nodes IDs in desired order. Must only include sample nodes IDs, but does not
    need to include all sample nodes IDs. (default=[], order is set by first tree in tree sequence)
"""
```

A quick note about line_type="ortho" (more details can be found within [pathing.md](https://github.com/kitchensjn/tskit_arg_visualizer/blob/main/docs/pathing.md)) - this parameter identifies node types based on msprime flags and applies pathing rules following those types. Because of this, "ortho" should only be used for full ARGs with proper msprime flags and where nodes have a maximum of two parents or children. Other tree sequences, including simplified tree sequences (those without marked recombination nodes marked) should use the "line" edge_type.

## Saving Figures

Each figure is actually just a JSON object that D3.js interprets and plots to the screen (see [plotting.md](https://github.com/kitchensjn/tskit_arg_visualizer/blob/main/docs/plotting.md) for more information about this object). The "Copy Source To Clipboard" button to the top left of each figure copies that specific figure's JSON object to your computer's clipboard. This object includes all of the information needed to replicate the figure in a subsequent simulation and can be pasted into a `.json` file for later. To revisualize this figure:

```
import json
import tskit_arg_visualizer

arg_json = json.load(open("example.json", "r"))
tskit_arg_visualizer.draw_D3(arg_json=arg_json)
```

Alternatively, you can pass the JSON into a D3ARG object constructor, which then gives you access to all of the D3ARG object methods, such as drawing and modifying node labels.

```
d3arg = tskit_arg_visualizer.D3ARG.from_json(json=arg_json)
d3arg.draw()
```


## Modifying Node Labels

```
d3arg.set_node_labels(labels={0:"James",6:""})
```

Node labels can be changed using the `set_node_labels()` function. The example above will change the labels of Nodes 0 and 6 to "James" and "", respectively. An empty string is equivalent to removing the label of that specific node. Labels are always converted to strings. You can then redraw `d3arg` with the updated labels. If you wish to go back to the default labelling scheme, you run:

```
d3arg.reset_node_labels()
```

## Reheating A Figure (EXPERIMENTAL)

The energy of the force layout simulation reduces overtime, causing the nodes to lose speed and settle into positions. Additionally, anytime the user moves a node by dragging, its new position becomes fixed and there on out unchanged by the simulation. The "Reheat Simulation" button in the top left of each figure unfixes the positions of all nodes except for the sample nodes at the tips, and gives a burst of energy to the simulation to allow the nodes to find new optimal positions. This feature is most useful when the starting sample node positions are not optimal; the user can rearrange them and then reheat the simulation to see if that helps with untangling.

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "tskit-arg-visualizer",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "population genetics,tree sequence,ancestral recombination graph,visualization,D3.js",
    "author": "",
    "author_email": "James Kitchens <kitchensjn@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/e4/a2/8129dfa55e21a9dfddd014649494abdeecefe2a5cfda0434437908743c00/tskit_arg_visualizer-0.0.1.tar.gz",
    "platform": null,
    "description": "<p align=\"center\">\n  <img alt=\"ARG Visualizer Example\" src=\"https://raw.githubusercontent.com/kitchensjn/tskit_arg_visualizer/master/images/stylized_arg_visualizer.png\" width=\"500\">\n</p>\n\nA method for drawing ancestral recombination graphs from tskit tree sequences in Python using D3.js. ARGs are plotted using a D3's [force layout](https://github.com/d3/d3-force). All nodes have a fixed position on the y-axis set by fy. Sample nodes have a fixed position on the x-axis set by fx; the ordering of the sample nodes comes from the first tree in the tskit tree sequence (this is not always the optimal ordering but is generally a good starting point for plotting). The x positions of other nodes are set by a force simulation where all nodes repel each other countered by a linkage force between connected nodes in the graph.\n\nUsers can click and drag the nodes (including the sample) along the x-axis to further clean up the layout of the graph. The simulation does not take into account line crosses, which can often be improved with some fiddling. Once a node has been moved by a user, its position is fixed with regards to the force simulation.\n\n## Installation\n\n```\npip install tskit_arg_visualizer\n```\n\nThis package loads D3.js using a CDN, so requires access to the internet.\n\n## Quickstart\n\n```\nimport msprime\nimport random\nimport tskit_arg_visualizer\n\n# Generate a random tree sequence with record_full_arg=True so that you get marked recombination nodes\nts_rs = random.randint(0,10000)   \nts = msprime.sim_ancestry(\n    samples=3,\n    recombination_rate=1e-8,\n    sequence_length=3_000,\n    population_size=10_000,\n    record_full_arg=True,\n    random_seed=ts_rs\n)\n\nd3arg = tskit_arg_visualizer.D3ARG.from_ts(ts=ts)\nd3arg.draw(\n    width=750,\n    height=750,\n    edge_type=\"ortho\"\n)\n```\n\nThe above code can be run in three ways: terminal, Jupyter Notebook, or JupyterLab. For Jupyter Notebook and JupyterLab, you will need to add the following code block to the top of the document to properly load D3.js.\n\n### Jupyter Notebook -\n\n```\n%%javascript\nrequire.config({ \n    paths: { \n    d3: 'https://d3js.org/d3.v7.min'\n}});\n\nrequire([\"d3\"], function(d3) {\n    window.d3 = d3;\n});\n```\n\n### JupyterLab - \n\n```\n%%javascript\nvar script = document.createElement('script');\nscript.type = 'text/javascript';\nscript.src = 'https://d3js.org/d3.v7.min.js';\ndocument.head.appendChild(script);\n```\n\n## Drawing Parameters\n\nThe `draw()` function for a `D3ARG` object has multiple parameters that give users a small amount of customization with their figures:\n\n```\ndef draw(\n    self,\n    width=500,\n    height=500,\n    tree_highlighting=True,\n    y_axis_labels=True,\n    y_axis_scale=\"rank\",\n    edge_type=\"line\",\n    variable_edge_width=False,\n    subset_nodes=None,\n    include_node_labels=True,\n    sample_order=[]\n):\n\"\"\"Draws the D3ARG using D3.js by sending a custom JSON object to visualizer.js \n\nParameters\n----------\nwidth : int\n    Width of the force layout graph plot in pixels (default=500)\nheight : int\n    Height of the force layout graph plot in pixels (default=500)\ntree_highlighting : bool\n    Include the interactive chromosome at the bottom of the figure to\n    to let users highlight trees in the ARG (default=True)\ny_axis_labels : bool\n    Includes labelled y-axis on the left of the figure (default=True)\ny_axis_scale : string\n    Scale used for the positioning nodes along the y-axis. Options:\n        \"rank\" (default) - equal vertical spacing between nodes\n        \"time\" - vertical spacing is proportional to the time\n        \"log_time\" - proportional to the log of time\nedge_type : string\n    Pathing type for edges between nodes. Options:\n        \"line\" (default) - simple straight lines between the nodes\n        \"ortho\" - custom pathing (see pathing.md for more details, should only be used with full ARGs)\nvariable_edge_width : bool\n    Scales the stroke width of edges in the visualization will be proportional to the fraction of\n    sequence in which that edge is found. (default=False)\nsubset_nodes : list (EXPERIMENTAL)\n    List of nodes that user wants to stand out within the ARG. These nodes and the edges between them\n    will have full opacity; other nodes will be faint (default=None, parameter is ignored and all\n    nodes will have opacity)\ninclude_node_labels : bool\n    Includes the node labels for each node in the ARG (default=True)\nsample_order : list\n    Sample nodes IDs in desired order. Must only include sample nodes IDs, but does not\n    need to include all sample nodes IDs. (default=[], order is set by first tree in tree sequence)\n\"\"\"\n```\n\nA quick note about line_type=\"ortho\" (more details can be found within [pathing.md](https://github.com/kitchensjn/tskit_arg_visualizer/blob/main/docs/pathing.md)) - this parameter identifies node types based on msprime flags and applies pathing rules following those types. Because of this, \"ortho\" should only be used for full ARGs with proper msprime flags and where nodes have a maximum of two parents or children. Other tree sequences, including simplified tree sequences (those without marked recombination nodes marked) should use the \"line\" edge_type.\n\n## Saving Figures\n\nEach figure is actually just a JSON object that D3.js interprets and plots to the screen (see [plotting.md](https://github.com/kitchensjn/tskit_arg_visualizer/blob/main/docs/plotting.md) for more information about this object). The \"Copy Source To Clipboard\" button to the top left of each figure copies that specific figure's JSON object to your computer's clipboard. This object includes all of the information needed to replicate the figure in a subsequent simulation and can be pasted into a `.json` file for later. To revisualize this figure:\n\n```\nimport json\nimport tskit_arg_visualizer\n\narg_json = json.load(open(\"example.json\", \"r\"))\ntskit_arg_visualizer.draw_D3(arg_json=arg_json)\n```\n\nAlternatively, you can pass the JSON into a D3ARG object constructor, which then gives you access to all of the D3ARG object methods, such as drawing and modifying node labels.\n\n```\nd3arg = tskit_arg_visualizer.D3ARG.from_json(json=arg_json)\nd3arg.draw()\n```\n\n\n## Modifying Node Labels\n\n```\nd3arg.set_node_labels(labels={0:\"James\",6:\"\"})\n```\n\nNode labels can be changed using the `set_node_labels()` function. The example above will change the labels of Nodes 0 and 6 to \"James\" and \"\", respectively. An empty string is equivalent to removing the label of that specific node. Labels are always converted to strings. You can then redraw `d3arg` with the updated labels. If you wish to go back to the default labelling scheme, you run:\n\n```\nd3arg.reset_node_labels()\n```\n\n## Reheating A Figure (EXPERIMENTAL)\n\nThe energy of the force layout simulation reduces overtime, causing the nodes to lose speed and settle into positions. Additionally, anytime the user moves a node by dragging, its new position becomes fixed and there on out unchanged by the simulation. The \"Reheat Simulation\" button in the top left of each figure unfixes the positions of all nodes except for the sample nodes at the tips, and gives a burst of energy to the simulation to allow the nodes to find new optimal positions. This feature is most useful when the starting sample node positions are not optimal; the user can rearrange them and then reheat the simulation to see if that helps with untangling.\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "Interactive visualization method for ancestral recombination graphs",
    "version": "0.0.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/kitchensjn/tskit_arg_visualizer/issues",
        "Changelog": "https://github.com/kitchensjn/tskit_arg_visualizer/blob/main/CHANGELOG.rst",
        "Homepage": "https://github.com/kitchensjn/tskit_arg_visualizer"
    },
    "split_keywords": [
        "population genetics",
        "tree sequence",
        "ancestral recombination graph",
        "visualization",
        "d3.js"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f757f9ef92bc9c95969d99fc5217db861881f4b437522bb1eb2d5632b31cb89f",
                "md5": "dbf2aeb149560a3dd4687a2f88d24d07",
                "sha256": "054e3dac771688d75986b1cebb2ca5ea4fea288af0c21274c5c5aae44211f3ec"
            },
            "downloads": -1,
            "filename": "tskit_arg_visualizer-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "dbf2aeb149560a3dd4687a2f88d24d07",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 17469,
            "upload_time": "2023-11-03T23:17:19",
            "upload_time_iso_8601": "2023-11-03T23:17:19.057244Z",
            "url": "https://files.pythonhosted.org/packages/f7/57/f9ef92bc9c95969d99fc5217db861881f4b437522bb1eb2d5632b31cb89f/tskit_arg_visualizer-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e4a28129dfa55e21a9dfddd014649494abdeecefe2a5cfda0434437908743c00",
                "md5": "3e82675d2034e1e343cad89c00491a21",
                "sha256": "10d2965ed20eb9bf8fa22bb51f490d138352c53b4b8f89010426192a76cd8ded"
            },
            "downloads": -1,
            "filename": "tskit_arg_visualizer-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "3e82675d2034e1e343cad89c00491a21",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 167489,
            "upload_time": "2023-11-03T23:17:21",
            "upload_time_iso_8601": "2023-11-03T23:17:21.049283Z",
            "url": "https://files.pythonhosted.org/packages/e4/a2/8129dfa55e21a9dfddd014649494abdeecefe2a5cfda0434437908743c00/tskit_arg_visualizer-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-03 23:17:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kitchensjn",
    "github_project": "tskit_arg_visualizer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "tskit-arg-visualizer"
}
        
Elapsed time: 0.16649s