pymatgen-db


:Name: pymatgen-db
:Version: 2023.7.18
:Home page: https://github.com/materialsproject/pymatgen-db
:Summary: Pymatgen-db is a database add-on for the Python Materials Genomics (pymatgen) materials analysis library.
:Upload time: 2023-07-18 14:47:18
:Maintainer: Shyue Ping Ong
:Author: Shyue Ping Ong
:License: MIT
:Keywords: vasp gaussian materials project electronic structure mongo

Pymatgen-db is a database add-on for the Python Materials Genomics (pymatgen)
materials analysis library. It enables the creation of Materials
Project-style `MongoDB`_ databases for management of materials data. A query
engine is also provided to enable the easy translation of MongoDB docs to
useful pymatgen objects for analysis purposes.

Major change
------------

From v2021.5.13, pymatgen-db is now a proper namespace add-on to pymatgen. In
other words, you no longer import from matgendb but rather pymatgen.db.
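
For code written against older releases, the rename can be applied
mechanically. The sketch below is a hypothetical helper (``migrate_imports`` is
not part of pymatgen-db) that rewrites ``matgendb`` import statements to
``pymatgen.db``. It assumes submodule names were left unchanged by the move,
which should be checked against the installed version.

```python
import re

# Matches "import matgendb..." or "from matgendb..." at the start of a line,
# keeping any leading indentation. The trailing \b avoids touching unrelated
# names such as "matgendb_extra".
_OLD_IMPORT = re.compile(r"^([ \t]*)(from|import)([ \t]+)matgendb\b", re.MULTILINE)


def migrate_imports(source: str) -> str:
    """Rewrite old-style matgendb imports to the pymatgen.db namespace."""
    return _OLD_IMPORT.sub(r"\1\2\3pymatgen.db", source)


old_code = "from matgendb import QueryEngine\n"
print(migrate_imports(old_code), end="")
```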

Getting pymatgen-db
===================

Stable version
--------------

The easiest way to install pymatgen-db on any system is to use pip, as follows::

    pip install pymatgen-db

Requirements
============

All required Python dependencies should be installed automatically if you
install pymatgen-db using easy_install or pip. Otherwise, these packages are
available on `PyPI <http://pypi.python.org>`_.

1. Python 3.7+ required.
2. Pymatgen 2022+, including all dependencies associated with it. Please refer
   to the `pymatgen docs <http://pythonhosted.org//pymatgen>`_ for detailed
   installation instructions.
3. Pymongo 3.3+: For interfacing with MongoDB.
4. MongoDB 2.2+: Get it at the `MongoDB`_ website.

Usage
=====

A command-line script (mgdb) provides access to most of the features in
pymatgen-db, including database initialization, insertion of data, and
running the materials genomics UI. To see all available options, type::

    mgdb --help

Initial setup
-------------

The first step is to install and set up MongoDB on a server of your choice.
The `MongoDB manual`_ is an excellent place to start. For the purposes of
testing out the tools here, you may simply download the binary distribution
corresponding to your OS from the `MongoDB`_ website, and then run the
following commands::

    # For Mac and Linux OS.
    mkdir test_db && mongod --dbpath test_db

This will create a test database and start the Mongo daemon. Once you are
done with testing, you can simply press Ctrl-C to stop the server and delete
the "test_db" folder. Running a Mongo server this way is insecure as Mongo
does not enable authentication by default. Please refer to the `MongoDB
manual`_ when setting up your production database.
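
Before moving on, it can help to confirm that the test server is actually
accepting connections. The following is a minimal sketch using only the Python
standard library. The function name ``mongod_is_listening`` is made up for this
example, and it assumes the server runs on the local machine on MongoDB's
default port, 27017. It only checks that the TCP port is open, not that the
process behind it is MongoDB.

```python
import socket


def mongod_is_listening(host: str = "127.0.0.1", port: int = 27017,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("mongod reachable:", mongod_is_listening())
```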

After your server is up, you should create a database config file by running
the following command::

    mgdb init -c db.json

This will prompt you for a few parameters and write them to a database config
file, which makes later use of mgdb much easier. The config file can have any
name you like, but naming it "db.json" lets you run mgdb without specifying
the filename explicitly. If you are just testing with the test database,
simply hit Enter to accept the defaults for all settings.

For more advanced use of the "db.json" config file (e.g., specifying aliases,
defaults, etc.), please refer to the following `sample
<http://pythonhosted.org/pymatgen-db/_static/db.json>`_.
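
As a rough illustration of what such a config file might hold, the sketch below
writes and reads back a small JSON file. The key names and values are
assumptions chosen to mirror the connection settings mentioned in this
document (``port``, ``database``, ``collection``) plus a ``host``. The file
written by ``mgdb init`` may use different or additional keys, so treat the
linked sample as the reference.

```python
import json
import os
import tempfile

# Assumed settings for a local test server. Key names are illustrative and
# should be checked against the file that "mgdb init" actually writes.
config = {
    "host": "127.0.0.1",
    "port": 27017,          # MongoDB's default port
    "database": "vasp",     # hypothetical database name
    "collection": "tasks",  # hypothetical collection name
}


def write_config(settings, path):
    """Serialize the settings dict to a JSON file."""
    with open(path, "w") as f:
        json.dump(settings, f, indent=4)


def read_config(path):
    """Load a JSON config file back into a dict."""
    with open(path) as f:
        return json.load(f)


# Write to a temporary directory so the example leaves no stray db.json.
path = os.path.join(tempfile.mkdtemp(), "db.json")
write_config(config, path)
loaded = read_config(path)
print(loaded["host"], loaded["port"])
```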

Inserting calculations
----------------------

To insert an entire directory of runs (where the topmost directory is
"dir_name") into the database, use the following command::

    # Note that "-c db.json" may be omitted if the config file is in the
    # current directory under the default filename of db.json.

    mgdb insert -c db.json dir_name

A sample run has been provided for `download
<http://pythonhosted.org/pymatgen-db/_static/Li2O.zip>`_ for testing
purposes. Unzip the file and run the above command in the directory.

Querying a database
-------------------

Sometimes, more fine-grained querying is needed (e.g., for subsequent
postprocessing and analysis).

The mgdb script allows you to make simple queries from the command line::

    # Query for the task id and energy per atom of all calculations with
    # formula Li2O. The criteria must be specified as a JSON string. Note
    # that "-c db.json" may be omitted if the config file is in the current
    # directory under the default filename of db.json.

    mgdb query -c db.json --crit '{"pretty_formula": "Li2O"}' --props task_id energy_per_atom
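
Because the criteria must be valid JSON and must survive shell quoting, it can
be convenient to generate the command line rather than type it by hand. The
sketch below is one way to do that with the standard library. The helper name
``build_query_command`` is hypothetical, and it uses only the ``-c``,
``--crit`` and ``--props`` options shown above.

```python
import json
import shlex


def build_query_command(criteria, props, config="db.json"):
    """Return a shell-safe "mgdb query" command line as a single string."""
    args = ["mgdb", "query", "-c", config,
            "--crit", json.dumps(criteria),  # dict -> JSON string
            "--props", *props]
    # shlex.quote adds single quotes only where the shell needs them.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_query_command({"pretty_formula": "Li2O"},
                              ["task_id", "energy_per_atom"])
print(command)
```

The resulting string can be pasted into a terminal, or the unquoted argument
list can be passed to ``subprocess.run`` directly.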

For more advanced queries, you can use the QueryEngine class for which an
alias is provided at the root package. Some examples are as follows::

    >>> from pymatgen.db import QueryEngine
    # Depending on your db.json, you may need to supply keyword args below
    # for `port`, `database`, `collection`, etc.
    >>> qe = QueryEngine()

    # Print the task id and formula of all entries in the database.
    >>> for r in qe.query(properties=["pretty_formula", "task_id"]):
    ...     print("{task_id} - {pretty_formula}".format(**r))
    ...
    12 - Li2O

    # Get a pymatgen Structure from the task_id.
    >>> structure = qe.get_structure_from_id(12)

    # Get pymatgen ComputedEntries matching a criteria dict.
    >>> entries = qe.get_entries({})

The query language closely follows pymongo/MongoDB syntax, except that
QueryEngine provides useful aliases for commonly used fields as well as
translation to commonly used pymatgen objects like Structure and
ComputedEntries.
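
To make the idea of field aliases concrete, the sketch below shows how a short
field name in a criteria dict could be mapped to a longer document path before
the query is sent to MongoDB. This is an illustration of the concept only, not
QueryEngine's implementation. The function ``resolve_aliases`` and the dotted
path ``output.final_energy_per_atom`` are assumptions made for the example.
The ``$lt`` operator is standard MongoDB query syntax.

```python
def resolve_aliases(criteria, aliases):
    """Return a copy of criteria with aliased top-level keys replaced."""
    return {aliases.get(key, key): value for key, value in criteria.items()}


# Hypothetical alias table: short name -> full path inside the document.
aliases = {"energy_per_atom": "output.final_energy_per_atom"}

# MongoDB-style criteria: formula equals Li2O and energy per atom below -4.0.
criteria = {"pretty_formula": "Li2O", "energy_per_atom": {"$lt": -4.0}}

resolved = resolve_aliases(criteria, aliases)
print(resolved)
```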

Extending pymatgen-db
---------------------

Currently, pymatgen-db is written with standard VASP runs in mind. However,
it is perfectly extensible to any kind of data, e.g., other kinds of VASP runs
(bandstructure, NEB, etc.) or just any form of data in general. Developers
looking to adapt pymatgen-db for other purposes should look at the
VaspToDbTaskDrone class as an example and write similar drones for their
needs. The QueryEngine can generally be applied to any Mongo collection,
with suitable specification of aliases if desired.
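
As a rough sketch of what a custom drone might look like, the standalone class
below walks a directory tree, picks out folders containing a given file, and
turns each into a dict suitable for insertion into a Mongo collection. The
class name ``JsonResultDrone``, the file name ``result.json`` and the demo
values are all made up for illustration. The ``get_valid_paths`` and
``assimilate`` method names are assumed to resemble the drone interface that
VaspToDbTaskDrone follows, so consult that class for the actual contract.

```python
import json
import os
import tempfile


class JsonResultDrone:
    """Hypothetical drone that turns result.json files into task documents."""

    filename = "result.json"  # assumed output file name, for illustration

    def get_valid_paths(self, path):
        """Given an os.walk tuple, return the directories this drone can parse."""
        parent, subdirs, files = path
        return [parent] if self.filename in files else []

    def assimilate(self, path):
        """Parse one directory into a dict ready for insertion into MongoDB."""
        with open(os.path.join(path, self.filename)) as f:
            doc = json.load(f)
        doc["dir_name"] = os.path.abspath(path)
        return doc


def collect_documents(drone, root):
    """Walk root and assimilate every directory the drone recognizes."""
    docs = []
    for walk_tuple in os.walk(root):
        for valid_path in drone.get_valid_paths(walk_tuple):
            docs.append(drone.assimilate(valid_path))
    return docs


# Demo with a temporary directory and placeholder data.
root = tempfile.mkdtemp()
run_dir = os.path.join(root, "run1")
os.makedirs(run_dir)
with open(os.path.join(run_dir, "result.json"), "w") as f:
    json.dump({"pretty_formula": "Li2O"}, f)

docs = collect_documents(JsonResultDrone(), root)
print(docs[0]["pretty_formula"])
```

Each returned dict could then be stored with pymongo's ``insert_one`` on the
target collection.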

How to cite pymatgen-db
=======================

If you use pymatgen and pymatgen-db in your research, please consider citing
the following work:

    Shyue Ping Ong, William Davidson Richards, Anubhav Jain, Geoffroy Hautier,
    Michael Kocher, Shreyas Cholia, Dan Gunter, Vincent Chevrier, Kristin A.
    Persson, Gerbrand Ceder. *Python Materials Genomics (pymatgen) : A Robust,
    Open-Source Python Library for Materials Analysis.* Computational
    Materials Science, 2013, 68, 314-319. `doi:10.1016/j.commatsci.2012.10.028
    <http://dx.doi.org/10.1016/j.commatsci.2012.10.028>`_

.. _`MongoDB` : http://www.mongodb.org/
.. _`Github repo` : https://github.com/materialsproject/pymatgen-db
.. _`MongoDB manual` : http://docs.mongodb.org/manual/

            
