- **Name:** dogma-data
- **Version:** 0.2.19
- **Summary:** Data processing for Dogma
- **Upload time:** 2024-11-04 21:48:10
- **Requires Python:** >=3.10
- **Keywords:** single-cell, rna-seq, embedding, pytorch, uce

# Dogma Data

**Dogma Data** is a Python package for fast parsing of FASTA files, aimed at high-performance computing workloads. It parses in parallel across all available system threads and can export the parsed data to the HDF5 file format for storage and later access.

## Installation

Install Dogma Data with **pip**:

```bash
pip install dogma-data
```

## Usage
```python
import dogma_data

vocab = {
    'a': 0,
    'g': 1,
    'c': 2,
    't': 3,
    # ... (remaining vocabulary entries elided)
}

mapping = dogma_data.FastaMapping(vocab, vocab['a'])
(tokens, sequences, (taxons)) = dogma_data.parse_fasta('input_path.fa', dogma_data.HeaderType.TaxonId, mapping)

header_info = {"taxons": taxons}

dogma_data.export_hdf5(
    'output_path.h5',
    dogma_data.Splitter(
        train_prop=0.95,
        val_prop=0.025,
        test_prop=0.025,
        length=len(sequences) - 1,
    ),
    tokens,
    sequences,
    header_info,
    mapping
)
```
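
To make the shapes in the example above more concrete, here is a minimal pure-Python sketch of what tokenizing a FASTA file against a vocabulary can look like. It is not dogma-data's implementation or API: the `tokenize_fasta` helper, the offsets layout it returns, and the reading of the second `FastaMapping` argument as a fallback token for unknown characters are all assumptions made for illustration.

```python
from io import StringIO


def tokenize_fasta(handle, vocab, default_token):
    """Illustrative helper (hypothetical, not part of dogma_data).

    Returns (tokens, offsets, headers):
      tokens  - one flat list of integer tokens for all records
      offsets - start index of each record in `tokens`, plus a final end index
      headers - header lines with the leading '>' removed
    """
    tokens, offsets, headers = [], [0], []
    in_record = False
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if in_record:
                offsets.append(len(tokens))  # close the previous record
            headers.append(line[1:])
            in_record = True
        elif in_record:
            # Assumption: characters missing from the vocabulary fall back
            # to default_token.
            tokens.extend(vocab.get(ch, default_token) for ch in line.lower())
    if in_record:
        offsets.append(len(tokens))  # close the last record
    return tokens, offsets, headers


# Two small records; 'N' is not in the vocabulary and maps to the fallback.
fasta = StringIO(">seq1\nACGT\nNN\n>seq2\nGGA\n")
vocab = {"a": 0, "g": 1, "c": 2, "t": 3}
tokens, offsets, headers = tokenize_fasta(fasta, vocab, vocab["a"])
print(tokens, offsets, headers)
```

Under this assumed layout, the offsets list has one more entry than there are records. That is one possible reading of `length=len(sequences) - 1` in the call above, but check the package source before relying on it.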

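## Requirements
- Python 3.10 or later (the 0.2.19 release ships prebuilt wheels for CPython 3.10, 3.11, and 3.12 on x86_64 Linux)
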
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Contact
For any questions, feel free to reach out:
- **Marcel Rød** - roed@stanford.edu
- **Miha Krajnc** - miha.krajnc@cs.stanford.edu

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "dogma-data",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "single-cell, RNA-seq, embedding, pytorch, uce",
    "author": null,
    "author_email": "Marcel R\u00f8d <roed@stanford.edu>",
    "download_url": null,
    "platform": null,
    "description": "# Dogma Data\n\n**Dogma Data** is a Python package built for fast and efficient parsing of FASTA files, optimized for high-performance computing. It leverages multi-threading to fully utilize all available system threads, enabling parallel processing. Additionally, the package supports exporting parsed data to the HDF5 file format for easy storage and access.\n\n## Installation\n\nTo install Dogma Data, you can use **pip**:\n\n```bash\npip install dogma-data\n```\n\n## Usage\n```python\nimport dogma_data\n\nvocab = {\n    'a': 0,\n    'g': 1,\n    'c': 2,\n    't': 3,\n    ...\n}\n\nmapping = dogma_data.FastaMapping(vocab, vocab['a'])\n(tokens, sequences, (taxons)) = dogma_data.parse_fasta('input_path.fa', dogma_data.HeaderType.TaxonId, mapping)\n\nheader_info = {\"taxons\": taxons}\n\ndogma_data.export_hdf5(\n    'output_path.h5',\n    dogma_data.Splitter(\n        train_prop=0.95,\n        val_prop=0.025,\n        test_prop=0.025,\n        length=len(sequences) - 1,\n    ),\n    tokens,\n    sequences,\n    header_info,\n    mapping\n)\n```\n\n## Requirements\n- Python 3.10\n  \n## License\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Contact\nFor any questions, feel free to reach out:\n- **Marcel R\u00f8d** - roed@stanford.edu\n- **Miha Krajnc** - miha.krajnc@cs.stanford.edu\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Data processing for Dogma",
    "version": "0.2.19",
    "project_urls": {
        "Homepage": "https://github.com/marcelroed/dogma-data",
        "Repository": "https://github.com/marcelroed/dogma-data.git"
    },
    "split_keywords": [
        "single-cell",
        " rna-seq",
        " embedding",
        " pytorch",
        " uce"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5e8643c7f73f48e94ce7acf3926b39ae6bd3057bafea7c90aca57977e5a341da",
                "md5": "f6100245b01655160e77a64389fef7e0",
                "sha256": "c529509cbdac6c84680e8c6ea45561645afcb176414c12917d61f71a36828e7b"
            },
            "downloads": -1,
            "filename": "dogma_data-0.2.19-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "f6100245b01655160e77a64389fef7e0",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.10",
            "size": 498873,
            "upload_time": "2024-11-04T21:48:10",
            "upload_time_iso_8601": "2024-11-04T21:48:10.137023Z",
            "url": "https://files.pythonhosted.org/packages/5e/86/43c7f73f48e94ce7acf3926b39ae6bd3057bafea7c90aca57977e5a341da/dogma_data-0.2.19-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "093a66da3e5b4b658089f8b982129bb4932123fdb1c6c367845989ec8572e347",
                "md5": "d2f08fe33ce26c3ef524817395505513",
                "sha256": "877785e212ae028020f45f3bb8bffd9d1be25ffa39d474a363defdd381157c48"
            },
            "downloads": -1,
            "filename": "dogma_data-0.2.19-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "d2f08fe33ce26c3ef524817395505513",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.10",
            "size": 498660,
            "upload_time": "2024-11-04T21:48:14",
            "upload_time_iso_8601": "2024-11-04T21:48:14.499967Z",
            "url": "https://files.pythonhosted.org/packages/09/3a/66da3e5b4b658089f8b982129bb4932123fdb1c6c367845989ec8572e347/dogma_data-0.2.19-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1dbd44a028a602d65c2cb2874e706b4ca62d68bfe8f950ce1478c7a454ba58b0",
                "md5": "1164291cabe1868812a7af6b52d4c669",
                "sha256": "fd3bead441a36aa4759ecbc8a7d1d0c5ef4a340970925e6311da2ae8058ec4ea"
            },
            "downloads": -1,
            "filename": "dogma_data-0.2.19-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "1164291cabe1868812a7af6b52d4c669",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.10",
            "size": 498044,
            "upload_time": "2024-11-04T21:48:08",
            "upload_time_iso_8601": "2024-11-04T21:48:08.537288Z",
            "url": "https://files.pythonhosted.org/packages/1d/bd/44a028a602d65c2cb2874e706b4ca62d68bfe8f950ce1478c7a454ba58b0/dogma_data-0.2.19-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-04 21:48:10",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "marcelroed",
    "github_project": "dogma-data",
    "github_not_found": true,
    "lcname": "dogma-data"
}
        