# yxcompgen
Xu Yuxing's personal comparative genomics tools
## Installation
```
pip install yxcompgen
```
## Usage
### Example for orthogroups analysis
Orthogroups TSV file: the format that OrthoFinder outputs (`Orthogroups.tsv`). The file is tab-delimited (`\t`): the first column holds the orthogroup ID and each remaining column holds the gene IDs from one species, with multiple gene IDs separated by a comma and a space (`, `). The file has a header line whose first column is `Orthogroup` and whose remaining columns are species names.
Read an orthogroups file:
```python
from yxcompgen import OrthoGroups
OGs = OrthoGroups(OG_tsv_file="/path/to/Orthogroups.tsv")
# get orthogroup information
OGs.get(('OG0000000', 'Ath'))
```
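To make the `Orthogroups.tsv` layout described above concrete, here is a minimal sketch that parses it with the standard library only. It does not use `yxcompgen` and is not how the package reads the file; the function name `parse_orthogroups_tsv` and the gene IDs in the example are made up for illustration.

```python
import csv
import io


def parse_orthogroups_tsv(handle):
    """Parse an OrthoFinder-style table into {orthogroup_id: {species: [gene_ids]}}.

    Hypothetical helper for illustration; not part of yxcompgen.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    species = header[1:]  # first header cell is "Orthogroup"
    orthogroups = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        og_id, cells = row[0], row[1:]
        # Pad rows whose trailing empty cells were dropped.
        cells = cells + [""] * (len(species) - len(cells))
        orthogroups[og_id] = {
            sp: [g.strip() for g in cell.split(",") if g.strip()]
            for sp, cell in zip(species, cells)
        }
    return orthogroups


# Made-up example data in the layout described above.
example = (
    "Orthogroup\tAth\tSly\n"
    "OG0000000\tAth_Gene1, Ath_Gene2\tSly_Gene1\n"
    "OG0000001\t\tSly_Gene2\n"
)
ogs = parse_orthogroups_tsv(io.StringIO(example))
print(ogs["OG0000000"]["Ath"])
print(ogs["OG0000001"]["Ath"])
```

A species with no genes in an orthogroup has an empty cell, which this sketch turns into an empty list.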
Species info file: an Excel file with the columns `sp_id` (species ID), `taxon_id` (taxon ID), `species_name` (species name), `genome_file` (genome file path), `gff_file` (GFF file path), `pt_file` (protein sequence file path), `cDNA_file` (cDNA sequence file path), and `cds_file` (CDS sequence file path).
Read a species info file:
```python
from yxcompgen import read_species_info
ref_xlsx = '/path/to/species_info.xlsx'
sp_info_dict = read_species_info(ref_xlsx)
```
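Before passing a sheet to `read_species_info`, it may help to check that its header contains the column names listed above. The sketch below assumes the header row has already been read into a list of strings by whatever means you prefer; `missing_columns` is a hypothetical helper, and whether the package actually requires every column is an assumption, not something the README states.

```python
# Column names taken from the species info file description above.
EXPECTED_COLUMNS = [
    "sp_id", "taxon_id", "species_name", "genome_file",
    "gff_file", "pt_file", "cDNA_file", "cds_file",
]


def missing_columns(header):
    """Return the expected column names that are absent from `header`.

    Hypothetical helper for illustration; not part of yxcompgen.
    """
    present = {str(name).strip() for name in header}
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Example header that lacks three of the expected columns.
header = ["sp_id", "taxon_id", "species_name", "gff_file", "pt_file"]
print(missing_columns(header))
```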
### Example for building synteny blocks
1. Input: GFF files and a gene pair file
The GFF files should be in GFF3 format, and the gene pair file should be a tab-delimited file with two columns, where each row is a gene pair from the two species.
```
Cca_Gene1 Sly_Gene1
Cca_Gene2 Sly_Gene2
...
```
```python
sp1_id = 'Cca'
sp1_gff = '/path/to/Cca.gff3'
sp2_id = 'Sly'
sp2_gff = '/path/to/Sly.gff3'
gene_pair_file = '/path/to/gene_pair.txt'
```
2. Build synteny blocks
```python
from yxcompgen import GenomeSyntenyBlockJob
sb_job = GenomeSyntenyBlockJob(
    sp1_id, sp1_gff, sp2_id, sp2_gff, gene_pair_file)
sb_job.build_synteny_blocks()
```
3. Write synteny blocks to a file
The output file is in MCScan format.
```python
mcscan_output_file = "/path/to/collinearity_output.txt"
sb_job.write_mcscan_output(mcscan_output_file)
```
4. Alternatively, read synteny blocks from a file
```python
sb_job = GenomeSyntenyBlockJob(
    sp1_id, sp1_gff, sp2_id, sp2_gff)
sb_job.read_mcscan_output(mcscan_output_file)
```
5. You can also work with only one genome
```python
sb_job = GenomeSyntenyBlockJob(
    sp1_id, sp1_gff, gene_pair_file=gene_pair_file)
```
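The gene pair file described in step 1 is simple enough to sanity-check before building synteny blocks. The following is a minimal sketch, assuming exactly two tab-separated gene IDs per non-blank line; `read_gene_pairs` is a hypothetical helper and does not reflect how `yxcompgen` parses the file internally.

```python
import io


def read_gene_pairs(handle):
    """Read a two-column, tab-delimited gene pair file into a list of tuples.

    Hypothetical helper for illustration; not part of yxcompgen.
    """
    pairs = []
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        # Each row must hold exactly two non-empty gene IDs.
        if len(fields) != 2 or not all(field.strip() for field in fields):
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated gene IDs, got {line!r}"
            )
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


# Example input matching the format shown in step 1.
example = "Cca_Gene1\tSly_Gene1\nCca_Gene2\tSly_Gene2\n"
pairs = read_gene_pairs(io.StringIO(example))
print(pairs)
```

Raising on malformed rows, rather than skipping them, surfaces files that were saved with spaces instead of tabs.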
### Example for plotting synteny blocks
```python
sb_job.plot()
highlight_sb_list = [65, 178, 237, 331]
sb_job.plot(mode='loci', reverse=True, highlight_synteny_blocks=highlight_sb_list)
```
Raw data
{
"_id": null,
"home_page": "https://github.com/SouthernCD/yxcompgen",
"name": "yxcompgen",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.5",
"maintainer_email": null,
"keywords": null,
"author": "Yuxing Xu",
"author_email": "xuyuxing@mail.kib.ac.cn",
"download_url": "https://files.pythonhosted.org/packages/ec/b8/1bf4a95e66a1ab330d5dce87f5e6809e968c6e3e32ce6d2baf028cb6401f/yxcompgen-0.0.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Xu Yuxing's personal comparative genomics tools",
"version": "0.0.1",
"project_urls": {
"Homepage": "https://github.com/SouthernCD/yxcompgen"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "731928a0fea799458ae64bbcc2a9cae2678445277ef343fa25f6f70002a1e672",
"md5": "2f5148832fc34f9c6c1deb5ee0c299e4",
"sha256": "857e311bf4830fa059b3a7f11783058ac6ab5ec1d46a4016761d81f188c3b4af"
},
"downloads": -1,
"filename": "yxcompgen-0.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2f5148832fc34f9c6c1deb5ee0c299e4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.5",
"size": 74551,
"upload_time": "2024-10-20T17:57:48",
"upload_time_iso_8601": "2024-10-20T17:57:48.224537Z",
"url": "https://files.pythonhosted.org/packages/73/19/28a0fea799458ae64bbcc2a9cae2678445277ef343fa25f6f70002a1e672/yxcompgen-0.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ecb81bf4a95e66a1ab330d5dce87f5e6809e968c6e3e32ce6d2baf028cb6401f",
"md5": "0ed4a807c4847eafeb9979134fd2c215",
"sha256": "8999fe841da5aca881c19f19fc20c5653d7a6bc6158cf6965c05e069b4184f78"
},
"downloads": -1,
"filename": "yxcompgen-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "0ed4a807c4847eafeb9979134fd2c215",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.5",
"size": 71101,
"upload_time": "2024-10-20T17:57:50",
"upload_time_iso_8601": "2024-10-20T17:57:50.559714Z",
"url": "https://files.pythonhosted.org/packages/ec/b8/1bf4a95e66a1ab330d5dce87f5e6809e968c6e3e32ce6d2baf028cb6401f/yxcompgen-0.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-20 17:57:50",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "SouthernCD",
"github_project": "yxcompgen",
"github_not_found": true,
"lcname": "yxcompgen"
}