# python-nexus
A generic phylogenetic nexus format (.nex, .trees) reader and writer for Python.
[![Build Status](https://github.com/dlce-eva/python-nexus/workflows/tests/badge.svg)](https://github.com/dlce-eva/python-nexus/actions?query=workflow%3Atests)
[![codecov](https://codecov.io/gh/dlce-eva/python-nexus/branch/master/graph/badge.svg)](https://codecov.io/gh/dlce-eva/python-nexus)
[![PyPI](https://img.shields.io/pypi/v/python-nexus.svg)](https://pypi.org/project/python-nexus)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.595426.svg)](https://doi.org/10.5281/zenodo.595426)
## Description
python-nexus provides simple nexus file-format reading/writing tools, and a small
collection of nexus manipulation scripts.
Please note that this library works with the phylogenetics data format (e.g. https://en.wikipedia.org/wiki/Nexus_file)
and not the physics data format (e.g. https://manual.nexusformat.org/).
Note: Due to a name clash with another Python package, this package must be **installed** as
`pip install python-nexus` but **imported** as `import nexus`.
## Usage
### CLI
`python-nexus` installs a command `nexus` for command-line use. You can inspect its help via
```shell
nexus -h
```
### Python API
Reading a Nexus:
```python
>>> from nexus import NexusReader
>>> n = NexusReader.from_file('tests/examples/example.nex')
```
You can also load from a string:
```python
>>> n = NexusReader.from_string('#NEXUS\n\nbegin foo; ... end;')
```
NexusReader will load each of the nexus `blocks` it identifies using specific `handlers`.
```python
>>> n.blocks
{'foo': <nexus.handlers.GenericHandler object at 0x7f55d94140f0>}
>>> n = NexusReader('tests/examples/example.nex')
>>> n.blocks
{'data': <NexusDataBlock: 2 characters from 4 taxa>}
```
A dictionary mapping blocks to handlers is available as `nexus.reader.HANDLERS`:
```python
>>> from nexus.reader import HANDLERS
>>> HANDLERS
{
'trees': <class 'nexus.handlers.tree.TreeHandler'>,
'taxa': <class 'nexus.handlers.taxa.TaxaHandler'>,
'characters': <class 'nexus.handlers.data.CharacterHandler'>,
'data': <class 'nexus.handlers.data.DataHandler'>
}
```
Any blocks that aren't in this dictionary will be parsed using `nexus.handlers.GenericHandler`.
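The lookup-with-fallback behaviour described above can be sketched in a few lines. This is only an illustration and not the library's code: the `Sketch*` classes, `SKETCH_HANDLERS`, and `pick_handler` are hypothetical stand-ins, and matching block names case-insensitively is an assumption of the sketch.

```python
# Hypothetical stand-ins for the real handler classes (illustration only).
class SketchGenericHandler:
    def __init__(self, name=None, data=None):
        self.name = name
        self.block = list(data or [])


class SketchDataHandler(SketchGenericHandler):
    pass


# Assumed shape of the registry: block name -> handler class.
SKETCH_HANDLERS = {"data": SketchDataHandler}


def pick_handler(block_name):
    """Return the registered handler class, or the generic one as a fallback."""
    # Lower-casing the name is an assumption made for this sketch.
    return SKETCH_HANDLERS.get(block_name.lower(), SketchGenericHandler)


known = pick_handler("Data")("data", ["Matrix", "Harry 00", ";"])
unknown = pick_handler("foo")("foo", ["line1", "line2"])
print(type(known).__name__, type(unknown).__name__)
```

A plain `dict.get` with a default keeps the fallback in one place, so supporting a new block type only means adding one entry to the registry.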
`NexusReader` can then write the nexus to a string using `NexusReader.write` or to another
file using `NexusReader.write_to_file`:
```python
>>> output = n.write()
>>> n.write_to_file("mynewnexus.nex")
```
Note: if you want more fine-grained control over generating nexus files, try
`NexusWriter`, discussed below.
### Block Handlers:
There are specific "Handlers" to parse certain known nexus blocks, including the
common 'data', 'trees', and 'taxa' blocks. Any blocks that are unknown will be
parsed with GenericHandler.
All handlers extend the `GenericHandler` class and have the following methods:
* `__init__(self, name=None, data=None)`
  `__init__` is called by `NexusReader` to parse the contents of the block (in `data`)
  appropriately.
* `write(self)`
  `write` is called by `NexusReader` to write the contents of a block to a string
  (i.e. for regenerating the nexus format for saving a file to disk).
#### `generic` block handler
The generic block handler simply stores each line of the block in `.block`:

    n.blockname.block
    ['line1', 'line2', ... ]
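As a rough illustration of what "storing each line of the block" means, the sketch below splits a nexus string into named blocks of lines. It is not how the library parses files: `split_blocks` is a hypothetical helper, and it assumes a simplified layout with one statement per line, no comments, and blocks closed by `end;`.

```python
import re


def split_blocks(text):
    """Map each lower-cased block name to the list of non-empty lines inside it."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        begin = re.match(r"begin\s+(\w+)\s*;", line, re.IGNORECASE)
        if begin:
            # Start collecting lines for a new block.
            current = begin.group(1).lower()
            blocks[current] = []
        elif re.match(r"end\s*;", line, re.IGNORECASE):
            # Close the current block.
            current = None
        elif current is not None:
            blocks[current].append(line)
    return blocks


example = """#NEXUS

begin foo;
    line1
    line2
end;
"""
print(split_blocks(example))  # {'foo': ['line1', 'line2']}
```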
#### `data` block handler
These are the main blocks encountered in nexus files and contain the data matrix.
So, given the following nexus file with a data block and a trees block:

    #NEXUS

    Begin data;
    Dimensions ntax=4 nchar=2;
    Format datatype=standard symbols="01" gap=-;
    Matrix
    Harry              00
    Simon              01
    Betty              10
    Louise             11
    ;
    End;

    begin trees;
    tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;
    tree B = ((Simon:0.1,Harry:0.2),Betty:0.2)Louise:0.1;
    end;

You can do the following:
Find out how many characters:
```python
>>> n.data.nchar
2
```
Ask about how many taxa:
```python
>>> n.data.ntaxa
4
```
Get the taxa names:
```python
>>> n.data.taxa
['Harry', 'Simon', 'Betty', 'Louise']
```
Get the `format` info:
```python
>>> n.data.format
{'datatype': 'standard', 'symbols': '01', 'gap': '-'}
```
The actual data matrix is a dictionary, which you can access via `.matrix`:
```python
>>> n.data.matrix
defaultdict(<class 'list'>, {'Harry': ['0', '0'], 'Simon': ['0', '1'], 'Betty': ['1', '0'], 'Louise': ['1', '1']})
```
Or you can look up a single taxon in the data matrix:
```python
>>> n.data.matrix['Simon']
['0', '1']
```
Or even loop over it like this:
```python
>>> for taxon, characters in n.data:
... print(taxon, characters)
...
Harry ['0', '0']
Simon ['0', '1']
Betty ['1', '0']
Louise ['1', '1']
```
You can also iterate over the sites (rather than the taxa):
```python
>>> for site, data in n.data.characters.items():
... print(site, data)
...
0 {'Harry': '0', 'Simon': '0', 'Betty': '1', 'Louise': '1'}
1 {'Harry': '0', 'Simon': '1', 'Betty': '0', 'Louise': '1'}
```
... or you can access the characters matrix directly:
```python
>>> n.data.characters[0]
{'Harry': '0', 'Simon': '0', 'Betty': '1', 'Louise': '1'}
```
Note that sites are zero-indexed!
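The relationship between the taxon-keyed `.matrix` and the site-keyed `.characters` view is a transposition. The sketch below reproduces it with plain dictionaries; `by_site` is a hypothetical helper written for this example, not a function of the library.

```python
# Same data as in the example matrix above, keyed by taxon.
matrix = {
    "Harry": ["0", "0"],
    "Simon": ["0", "1"],
    "Betty": ["1", "0"],
    "Louise": ["1", "1"],
}


def by_site(matrix):
    """Transpose a taxon -> states mapping into site index -> {taxon: state}."""
    sites = {}
    for taxon, states in matrix.items():
        for index, state in enumerate(states):
            # Site indices start at 0, matching the zero-indexed sites above.
            sites.setdefault(index, {})[taxon] = state
    return sites


characters = by_site(matrix)
print(characters[0])  # {'Harry': '0', 'Simon': '0', 'Betty': '1', 'Louise': '1'}
```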
#### `trees` block handler
If there's a `trees` block, then you can do the following.
You can get the number of trees:
```python
>>> n.trees.ntrees
2
```
You can access the trees via the `.trees` list:
```python
>>> n.trees.trees[0]
'tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;'
```
Or loop over them:
```python
>>> for tree in n.trees:
... print(tree)
...
tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;
tree B = ((Simon:0.1,Harry:0.2),Betty:0.2)Louise:0.1;
```
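Each entry is a nexus tree statement of the form `tree NAME = NEWICK;`. If you only need the tree name, the Newick string, or the tip labels, a small standard-library sketch like the one below may be enough. `split_tree_line` and `leaf_labels` are hypothetical helpers; they are deliberately naive and ignore quoted labels and bracketed comments.

```python
import re


def split_tree_line(line):
    """Split 'tree NAME = NEWICK;' into (name, newick)."""
    match = re.match(r"\s*tree\s+(\S+)\s*=\s*(.+?);?\s*$", line, re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a tree statement: {line!r}")
    # Re-attach the terminating semicolon so the result is a complete Newick string.
    return match.group(1), match.group(2) + ";"


def leaf_labels(newick):
    """Return tip labels: names that directly follow '(' or ','."""
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


name, newick = split_tree_line("tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;")
print(name)                 # A
print(leaf_labels(newick))  # ['Harry', 'Simon', 'Betty']
```

Note that `Louise` is not reported because it labels the root node rather than a tip.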
For further inspection of trees via the [newick package](https://pypi.org/project/newick/), you can retrieve
a `newick.Node` object for a tree:
```python
>>> print(n.trees.trees[0].newick_tree.ascii_art())
                  ┌─Harry
         ┌────────┤
──Louise─┤        └─Simon
         └─Betty
```
#### `taxa` block handler
Programs like SplitsTree understand "TAXA" blocks in Nexus files:

    BEGIN Taxa;
    DIMENSIONS ntax=4;
    TAXLABELS
    [1] 'John'
    [2] 'Paul'
    [3] 'George'
    [4] 'Ringo'
    ;
    END; [Taxa]

In a taxa block you can get the number of taxa and the taxa list:
```python
>>> n.taxa.ntaxa
4
>>> n.taxa.taxa
['John', 'Paul', 'George', 'Ringo']
```
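To make the structure of such a block concrete, here is a minimal sketch that pulls the labels out of the lines shown above. It is not the library's `TaxaHandler`: `parse_taxlabels` is a hypothetical helper that assumes one label per line, treats `[...]` as comments, and strips surrounding quotes.

```python
import re


def parse_taxlabels(lines):
    """Extract taxon labels from the lines of a TAXA block."""
    labels = []
    for line in lines:
        # Drop bracketed comments such as "[1]", then the trailing semicolon.
        cleaned = re.sub(r"\[[^\]]*\]", "", line).strip().rstrip(";").strip()
        upper = cleaned.upper()
        if not cleaned or upper == "TAXLABELS" or upper.startswith("DIMENSIONS"):
            continue
        labels.append(cleaned.strip("'\""))
    return labels


block = [
    "DIMENSIONS ntax=4;",
    "TAXLABELS",
    "[1] 'John'",
    "[2] 'Paul'",
    "[3] 'George'",
    "[4] 'Ringo'",
    ";",
]
print(parse_taxlabels(block))  # ['John', 'Paul', 'George', 'Ringo']
```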
NOTE: with this alternate nexus format the Characters blocks *should* be parsed by
DataHandler.
### Writing a Nexus File using NexusWriter
`NexusWriter` provides more fine-grained control over writing nexus files, and
is useful if you're programmatically generating a nexus file rather than loading
a pre-existing one.
```python
>>> from nexus import NexusWriter
>>> n = NexusWriter()
>>> #Add a comment to appear in the header of the file
>>> n.add_comment("I am a comment")
```
Data are added using the `add` method, which takes three arguments: a taxon,
a character name, and a value.
```python
>>> n.add('taxon1', 'Character1', 'A')
>>> n.data
{'Character1': {'taxon1': 'A'}}
>>> n.add('taxon2', 'Character1', 'C')
>>> n.add('taxon3', 'Character1', 'A')
```
Characters and values can be strings or integers (but you **cannot** mix string and
integer characters).
```python
>>> n.add('taxon1', 2, 1)
>>> n.add('taxon2', 2, 2)
>>> n.add('taxon3', 2, 3)
```
`NexusWriter` will fill in missing entries (e.g. `taxon2` for `Char3` in this case):
```python
>>> n.add('taxon1', "Char3", '4')
>>> n.add('taxon3', "Char3", '4')
```
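What filling in a missing entry amounts to can be sketched with plain dictionaries shaped like `n.data` above (character name, then taxon, then value). This is an illustration only: `build_matrix` is a hypothetical helper, and using `?` as the missing-data symbol is an assumption of the sketch rather than something stated in this README.

```python
MISSING = "?"  # assumed placeholder symbol for this sketch

# Same shape as n.data: character name -> {taxon: value}.
data = {
    "Character1": {"taxon1": "A", "taxon2": "C", "taxon3": "A"},
    "Char3": {"taxon1": "4", "taxon3": "4"},  # taxon2 was never added
}


def build_matrix(data, missing=MISSING):
    """Return taxon -> list of states, one per character, padding gaps."""
    characters = sorted(data)
    taxa = sorted({taxon for states in data.values() for taxon in states})
    return {
        taxon: [str(data[char].get(taxon, missing)) for char in characters]
        for taxon in taxa
    }


matrix = build_matrix(data)
print(matrix["taxon2"])  # ['?', 'C']
```

Sorting the character names is also why string and integer characters cannot be mixed in a sketch like this: Python refuses to order `str` against `int`.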
When you're ready, you can generate the nexus using `make_nexus` or `write_to_file`:
```python
>>> data = n.make_nexus(interleave=True, charblock=True, preserve_order=False)
>>> n.write_to_file("output.nex", interleave=True, charblock=True, preserve_order=False)
```
You can make an interleaved nexus by setting `interleave` to `True`, and you can
include a character block in the nexus (if you have character labels, for example)
by setting `charblock` to `True`. You can also preserve the order in which taxa and
characters were added by setting `preserve_order` to `True`; otherwise they will
be sorted alphanumerically.
There is rudimentary support for handling trees, e.g.:
```python
>>> n.trees.append("tree tree1 = (a,b,c);")
>>> n.trees.append("tree tree2 = (a,b,c);")
```