:Name: pdb-eda
:Version: 2.7.1
:Home page: https://github.com/MoseleyBioinformaticsLab/pdb_eda
:Summary: Methods for analyzing electron density maps in wwPDB
:Upload time: 2023-09-29 01:49:04
:Author: Sen Yao, Hunter N.B. Moseley
:License: Modified Clear BSD License
:Keywords: pdb, electron density map

pdb-eda
==========

.. image:: https://raw.githubusercontent.com/MoseleyBioinformaticsLab/pdb_eda/master/docs/_static/images/pdb_eda_logo.png
   :width: 50%
   :align: center
   :target: https://pdb-eda.readthedocs.io/

Description
-----------
The `pdb_eda` package provides a simple Python tool for parsing and analyzing electron density map data
available from the Worldwide Protein Data Bank (PDB_).

The `pdb_eda` package currently provides facilities that can:
    * Parse .ccp4 format files into their object representation.
    * Parse .pdb format files to get information that is complementary to the Bio.PDB module in the BioPython_ package.
    * Analyze the electron density maps at the atom, residue, and domain levels and
      interpret the electron densities in terms of number of electrons by estimating a density-electron ratio.

Full API documentation, a user guide, and a tutorial can be found on readthedocs_.

Citation
--------
Please cite the following papers when using pdb_eda:

Sen Yao and Hunter N.B. Moseley. "A chemical interpretation of protein electron density maps in the worldwide protein data
bank" *PLOS One* 15, e0236894 (2020).
https://doi.org/10.1371/journal.pone.0236894

Sen Yao and Hunter N.B. Moseley. "Finding high-quality metal ion-centric regions across the worldwide Protein Data Bank"
*Molecules* 24, 3179 (2019).
https://doi.org/10.3390/molecules24173179

Installation
------------
`pdb_eda` runs under Python 3.4+ and can be installed with pip.
Either install it via pip, or clone the git repository and install the dependencies listed below.

Install on Linux, Mac OS X
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: bash

   python3 -m pip install pdb_eda

GitHub Package installation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Make sure you have git_ installed:

.. code:: bash

   git clone https://github.com/MoseleyBioinformaticsLab/pdb_eda.git

Dependencies
~~~~~~~~~~~~

`pdb_eda` requires the following Python libraries:

   * Biopython_ for creating and analyzing the `pdb_eda` atom objects.
   * Cython_ for cythonizing low-level utility functions to improve computational performance.

     * Requires gcc to be installed for the cythonization process.

   * numpy_ and scipy_ for mathematical calculations.
   * docopt_ for a better command-line interface.
   * jsonpickle_ for formatted and reusable output.
   * PyCifRW_ for reading CIF-formatted files.

     * Requires gcc to be installed for compiling components of the package.

   * pymol_ for calculating crystal contacts. (This package is optional and is only needed for this functionality.)


To install dependencies manually:

.. code:: bash

   pip3 install biopython
   pip3 install cython
   pip3 install numpy
   pip3 install scipy
   pip3 install docopt
   pip3 install jsonpickle
   pip3 install PyCifRW
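
After installing, it can be useful to confirm which dependencies are actually importable.
The following is a minimal sketch, not part of `pdb_eda`; the helper name ``missing_modules``
and the mapping of distribution names to import names (for example ``Bio`` for biopython and
``CifFile`` for PyCifRW) are assumptions made for illustration.

.. code:: python

   from importlib.util import find_spec

   # Assumed import names: the name used with pip can differ from the module name.
   REQUIRED = {
       "biopython": "Bio",
       "cython": "Cython",
       "numpy": "numpy",
       "scipy": "scipy",
       "docopt": "docopt",
       "jsonpickle": "jsonpickle",
       "PyCifRW": "CifFile",
   }
   # pymol is only needed for the crystal contacts functionality.
   OPTIONAL = {"pymol": "pymol"}


   def missing_modules(packages):
       """Return the distribution names whose top-level module cannot be found."""
       return [dist for dist, module in packages.items() if find_spec(module) is None]


   if __name__ == "__main__":
       print("missing required:", missing_modules(REQUIRED))
       print("missing optional:", missing_modules(OPTIONAL))

An empty list for the required set suggests the manual installation above succeeded.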


Basic usage
-----------
The `pdb_eda` package can be used in several ways:

    * As a library for accessing and manipulating data in PDB and CCP4 format files.

        * Use the `~pdb_eda.densityAnalysis.fromPDBid` generator function, which generates
          (yields) a single `~pdb_eda.densityAnalysis` instance at a time.

        * Process each `~pdb_eda.densityAnalysis` instance:

            * Generate symmetry atoms.
            * Generate red (negative density) or green (positive density) blob lists.
            * Process PDB structures to aggregate the electron cloud.
            * Calculate atom blob list and statistics.
            * Calculate atom regional discrepancies and statistics.
            * Calculate residue regional discrepancies and statistics.

    * As a command-line tool using the pdb_eda command (or "python3 -m pdb_eda").

        * The command-line interface has the following modes.

        * single - single-structure mode:

            * Convert electron density map CCP4 files into their equivalent JSON file format.
            * Aggregate the electron density map by atom, residue, and domain, and return the results in
              either JSON or CSV format.
            * Aggregate the difference electron density map into green (positive) or red (negative) blobs,
              and return the object or statistics results in either JSON or CSV format.
            * Aggregate the difference electron density map for atom- and residue-specific regions and return
              results in either JSON or CSV format.
            * Return traditional quality metrics and statistics for atoms and residues.

        * multiple - multiple-structure mode:

            * Analyze and return cumulative statistics for a given list of PDB IDs.
            * Filter a list of PDB IDs by cumulative statistic criteria.
            * Check and redownload problematic PDB entries.
            * Run single-structure mode with multicore processing.
            * Run crystal contacts mode with multicore processing.

        * contacts - crystal contacts mode:

            * Analyze and return atoms with crystal contacts.
            * This mode requires pymol to be installed.

        * generate - parameter generation mode (rarely used):

            * Downloads the PDB chemical component list and extracts information to create atom type parameters.
            * Analyzes a list of PDB IDs for specific atom types.
            * Generates an atom type parameter file and a list of PDB IDs for their optimization.

        * optimize - parameter optimization mode (rarely used):

            * Optimizes atom type radii and B-factor density correction slopes using a given list of PDB IDs.
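
The library usage described above can be sketched as a small loop over PDB IDs. This is an
illustrative sketch only: ``iter_analyses`` is a hypothetical helper, not part of `pdb_eda`,
and it takes the loader as an argument so that it makes no assumption about the signature of
``fromPDBid`` beyond accepting a PDB ID. It handles a loader that yields instances as well as
one that returns a single object.

.. code:: python

   from collections.abc import Iterator


   def iter_analyses(pdb_ids, loader):
       """Yield (pdb_id, analysis) pairs, one analysis object at a time."""
       for pdb_id in pdb_ids:
           result = loader(pdb_id)
           if isinstance(result, Iterator):
               # The loader behaved as a generator: forward each yielded instance.
               for analysis in result:
                   yield pdb_id, analysis
           elif result is not None:
               # The loader returned a single object directly.
               yield pdb_id, result


   # Hypothetical usage, assuming pdb_eda is installed ("1cbs" is only an example ID):
   # import pdb_eda
   # for pdb_id, analysis in iter_analyses(["1cbs"], pdb_eda.densityAnalysis.fromPDBid):
   #     print(pdb_id, type(analysis).__name__)

Processing one instance at a time keeps memory use bounded when working through long lists of
PDB IDs. The methods available on each instance are described in the API documentation.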

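The command-line modes listed above can also be driven from a script. The sketch below only
assembles the argument list for ``python3 -m pdb_eda`` and checks the mode name against the
five modes named in this document; ``build_command`` is a hypothetical helper, and the
mode-specific arguments are deliberately left to the caller because they are not spelled out
here.

.. code:: python

   import sys

   # Mode names taken from the list of command-line modes in this README.
   MODES = ("single", "multiple", "contacts", "generate", "optimize")


   def build_command(mode, *args):
       """Return an argv list that runs the given pdb_eda mode with this interpreter."""
       if mode not in MODES:
           raise ValueError("unknown pdb_eda mode: {!r}".format(mode))
       return [sys.executable, "-m", "pdb_eda", mode, *args]


   # Hypothetical usage, assuming pdb_eda is installed and mode_args holds
   # the arguments documented for the chosen mode:
   # import subprocess
   # subprocess.run(build_command("single", *mode_args), check=True)

Using ``sys.executable`` ensures the command runs in the same Python environment in which
`pdb_eda` was installed.
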
CHANGELOG
---------
Version 2.4.1:

* Added -O3 compile option to cythonization.

Version 2.3.2:

* Fixed logical and runtime errors in single density --atom-mask option.
* Improved cythonization further.
* Added --optimized-radii option to single density submode.

Version 2.2.1:

* Moved previous single density submode to cloud submode.
* Created new single density submode that has near parallel options to single difference submode.
* Added --atom-mask option to single density submode.
* Improved cythonization to gain additional computational performance.
* Performed a variety of bug fixes.

Version 2.1.1:

* Over 2200 lines of additional code have been written and most of the code base has been revised and refactored.
* Computationally intensive parts of the code have been cythonized to improve execution performance.
* Many variables and functions have been renamed to greatly improve readability and understanding of the code base, API, and CLI.
* The application programming interface (API) has been greatly expanded and much of the functionality streamlined.
* The command-line interface has been greatly expanded and now includes single, multiple, contacts, generate, and optimize modes.
* Optimize mode now optimizes a new penalty function that both minimizes differences in density-electron ratio estimates and
  maximizes electron cloud aggregation. The optimization is also roughly 10-fold faster than the previous generation of the algorithm.
* The atom types have been systematically generated from the wwPDB master chemical components file.
* Both amino acid and nucleic acid type parameters have been optimized,
  so both protein and nucleic acid PDB entries can now be analyzed.


License
-------
A modified Clear BSD License

Copyright (c) 2019, Sen Yao, Hunter N.B. Moseley
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted (subject to the limitations in the disclaimer
below) provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its contributors may be used
  to endorse or promote products derived from this software without specific
  prior written permission.

* If the source code is used in a published work, then proper citation of the source
  code must be included with the published work.

NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY THIS
LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.

.. _readthedocs: https://pdb-eda.readthedocs.io/en/latest/
.. _PDB: https://www.wwpdb.org/
.. _BioPython: https://biopython.org/
.. _Cython: https://cython.readthedocs.io/en/latest/index.html
.. _git: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git/
.. _numpy: http://www.numpy.org/
.. _scipy: https://scipy.org/scipylib/index.html
.. _docopt: http://docopt.org/
.. _jsonpickle: https://github.com/jsonpickle/jsonpickle
.. _PyCifRW: https://pypi.org/project/PyCifRW/4.3/
.. _pymol: https://pymol.org/2/

            
