matgraphdb


- **Name**: matgraphdb
- **Version**: 0.0.3
- **Summary**: Welcome to MatGraphDB, a powerful Python package designed to interface with primary and graph databases for advanced material analysis.
- **Upload time**: 2024-10-03 02:28:51
- **Requires Python**: >=3.8
- **License**: MIT License Copyright (c) 2023 lllangWV Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- **Keywords**: materials science graph database python

# MatGraphDB

## Introduction to MatGraphDB

Welcome to **MatGraphDB**, a powerful Python package designed to interface with primary and graph databases for advanced material analysis. MatGraphDB excels in managing vast datasets of materials data, performing complex computational tasks, and encoding material properties and relationships within a graph-based analytical model.

MatGraphDB is structured around several modular components that work together to streamline data management and analysis:

- **DataManager**: Handles interactions with JSON databases and manages the extraction of information from completed calculation directories.
- **CalcManager**: Manages Density Functional Theory (DFT) calculations, including setting up directories and launching calculations within `MaterialsData`.
- **GraphDBGenerator**: Facilitates the creation of nodes and relationships for graph databases, storing this information in a specified directory and generating necessary CSV files for nodes and relationships.
- **Neo4jManager** and **Neo4jGDSManager**: Manage connections to Neo4j databases, allowing for the creation, update, and removal of databases on the Neo4j server, as well as interaction with the Neo4j Graph Data Science library for advanced graph analytics.

The ultimate goal of MatGraphDB is to leverage the capabilities of graph databases, specifically Neo4j, to enable advanced analysis and discovery in the realm of material science. By integrating data management, DFT calculations, and graph database functionalities, MatGraphDB provides a cohesive workflow for researchers and analysts to explore and understand complex material data.

This documentation provides an overview of the package, detailing how the various components interact to facilitate efficient data management, computation, and analysis, ensuring that you can make the most out of your material science research with MatGraphDB.

## Components and Their Interactions

![System Architecture of MatGraphDB](MatGraphDB.png)
*Figure 1: MatGraphDB Package Overview - This diagram illustrates the main components and their interactions within the MatGraphDB package. It highlights the initialization of the `DataManager`, the execution of DFT calculations by `CalcManager`, the generation of the graph database using `GraphDBGenerator`, and the management of Neo4j databases through `Neo4jManager` and `Neo4jGDSManager`. The workflow demonstrates how data flows from JSON files to advanced graph analytics, facilitating comprehensive materials data analysis.*

### 1. DataManager Initialization
The `DataManager` is the foundational component of the package. It is initialized with the `directory_path`, which points to the JSON database directory. The primary role of `DataManager` is to manage interactions with the JSON files and extract information from the completed calculation directories.
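
As a rough illustration of what managing a JSON database directory involves, the sketch below walks a directory of JSON files and loads each one using only the standard library. The one-file-per-material layout, the `load_json_database` helper, and the sample fields and values are assumptions made for illustration; they are not MatGraphDB's actual implementation.

```python
import json
import tempfile
from pathlib import Path


def load_json_database(directory_path):
    """Return {file stem: parsed JSON} for every *.json file in a directory."""
    records = {}
    for json_file in sorted(Path(directory_path).glob("*.json")):
        with json_file.open() as f:
            records[json_file.stem] = json.load(f)
    return records


# Build a throwaway directory holding two hypothetical records (made-up values).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "m-0.json").write_text(json.dumps({"formula": "BaTe", "band_gap": 1.0}))
    Path(tmp, "m-1.json").write_text(json.dumps({"formula": "C", "band_gap": None}))
    database = load_json_database(tmp)

print(sorted(database))
```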

### 2. DFT Calculations with CalcManager
The `CalcManager` component manages DFT calculations through interactions with `data_manager`. This includes:
- Performing DFT calculations (`dft_calcs`).
- Managing calculations within `MaterialsData`.
- Setting up directories and launching calculations.

### 3. Graph Database Generation
The `GraphDBGenerator` plays a crucial role in creating nodes and relationships for the graph database. It utilizes `data_manager` to handle these processes. To build the graph database, `GraphDBGenerator` uses the methods `create_nodes()` and `create_relationships()`. Once the database directory is set up, an additional method transforms the graph database into a GraphML file, compatible with various graph packages.

- Creates and stores information in the `graph_databases/{database_name}/neo4j_csv` directory.
- Generates node CSV files in the format `{node_filename}.csv`.
- Generates relationship CSV files in the format `{node_1_filename}-{node_2_filename}-{connection_names}.csv`.
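
The naming patterns above can be sketched with a pair of small path helpers. The helper names, the `data` root, and the example node and connection names are hypothetical; only the directory and file-name patterns come from the list above.

```python
from pathlib import Path


def node_csv_path(root, database_name, node_filename):
    """graph_databases/{database_name}/neo4j_csv/{node_filename}.csv under root."""
    return Path(root, "graph_databases", database_name, "neo4j_csv",
                f"{node_filename}.csv")


def relationship_csv_path(root, database_name, node_1_filename,
                          node_2_filename, connection_name):
    """{node_1_filename}-{node_2_filename}-{connection_names}.csv in the same directory."""
    filename = f"{node_1_filename}-{node_2_filename}-{connection_name}.csv"
    return Path(root, "graph_databases", database_name, "neo4j_csv", filename)


# Hypothetical names, chosen only to show the pattern.
nodes = node_csv_path("data", "main", "materials")
rels = relationship_csv_path("data", "main", "materials", "elements", "composed_of")
print(nodes.as_posix())
print(rels.as_posix())
```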


### 4. Managing Neo4j Database Connections
Neo4j database connections are managed by `Neo4jManager` and `Neo4jGDSManager`:
- **Neo4jManager**: After the `GraphDBGenerator` creates the graph database directory, `Neo4jManager` can be initialized with this directory to manage databases on the Neo4j server. This includes creating, removing, updating, and listing databases, as well as importing all data from the directory.
- **Neo4jGDSManager**: Once the database is imported, `Neo4jGDSManager` can be initialized. This component interacts with the Neo4j Graph Data Science library to perform various operations, such as:
  - Loading graphs into memory.
  - Removing graphs from memory.
  - Writing and exporting graphs.
  - Running algorithms on the graphs.

### Summary

MatGraphDB seamlessly integrates materials data management with advanced graph database functionalities, leveraging DFT calculations and Neo4j database management to provide a robust tool for materials science research. The interactions between `DataManager`, `CalcManager`, `GraphDBGenerator`, `Neo4jManager`, and `Neo4jGDSManager` create a cohesive workflow, from data management and DFT calculations to graph database creation and sophisticated data analysis.




## Getting Started

### Installing the data

You can install the data here:


### Setting up Conda environment 

Navigate to the root directory `MatGraphDB`, then create the environment that matches your setup.

#### Use if you are not going to use the graph-tool library
**Windows**
```bash
conda env create -f env_win.yml
```

**Linux**
```bash
conda env create -f env_linux.yml
```

#### Use if you are going to use the graph-tool library
This environment includes the graph-tool library, which is currently supported only on Linux.

**Linux**
```bash
conda env create -f env_graph_tool.yml
```

#### To activate the environment, use:

```bash
conda activate matgraphdb
```

### Adjusting configs

The project configuration is stored in the `MatGraphDB/config.yml` file, which you can adjust to your needs. The most important settings to adjust are `DB_NAME`, `USER`, `PASSWORD`, `LOCATION`, `NEO4J_DESKTOP_DIR`, and `N_CORES`.



- `DB_NAME`: The name of the database that will be created. This will search for the database with the same name in the `MatGraphDB/data/production` directory.

- `USER`: The username for the Neo4j database.

- `PASSWORD`: The password for the Neo4j database.

- `LOCATION`: The location of the Neo4j DBMS, usually `"bolt://localhost:7687"`.

- `NEO4J_DESKTOP_DIR`: The directory where Neo4j Desktop is installed. The location depends on your system:

    - **Windows** - `C:/Users/{username}/.Neo4jDesktop`
    - **Linux** - `/home/neo4j` (may differ depending on your system)

- `N_CORES`: The number of cores to be used for parallel processing.
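
Before running anything against Neo4j, it can help to confirm that all six settings are filled in. The sketch below checks a plain dictionary for missing or empty values; the `missing_config_keys` helper and the placeholder values are assumptions for illustration, and in practice the mapping would be parsed from `config.yml`.

```python
REQUIRED_KEYS = ("DB_NAME", "USER", "PASSWORD", "LOCATION",
                 "NEO4J_DESKTOP_DIR", "N_CORES")


def missing_config_keys(config):
    """Return the required keys that are absent or empty in a config mapping."""
    return [key for key in REQUIRED_KEYS if config.get(key) in (None, "")]


# Placeholder values only; PASSWORD is left empty on purpose.
config = {
    "DB_NAME": "main",
    "USER": "neo4j",
    "PASSWORD": "",
    "LOCATION": "bolt://localhost:7687",
    "NEO4J_DESKTOP_DIR": "/home/neo4j",
    "N_CORES": 4,
}

print(missing_config_keys(config))  # prints ['PASSWORD']
```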

### Neo4j Desktop
To use Neo4j, install the Neo4j Desktop application. You can download it from the [Neo4j website](https://neo4j.com/docs/operations-manual/current/installation/). Create a project, then create a new database management system (DBMS), name it `MatGraphDB`, and select `Neo4j Community Edition` as the DBMS.

You will also need to install the APOC library and the Graph Data Science library. To do this, click on your DBMS name, click `Plugins` in the panel on the right, then select each library and install it.

You will also need to set an APOC environment variable. You can do this by running the following code:

```python
from matgraphdb import Neo4jGraphDatabase

# Allow APOC to export files from the DBMS
with Neo4jGraphDatabase() as manager:
    settings={'apoc.export.file.enabled':'true'}
    manager.set_apoc_environment_variables(settings=settings)
```

After running this code, stop the DBMS and restart it so that the setting takes effect.
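
For context, APOC settings of this kind are plain `key=value` lines in a configuration file (`apoc.conf` in recent Neo4j versions). The sketch below shows one way such a line could be updated or appended in configuration text. The `upsert_settings` helper is hypothetical, and how `set_apoc_environment_variables` actually applies the setting is not documented here, so treat this purely as an illustration of the file format.

```python
def upsert_settings(conf_text, settings):
    """Return conf_text with each key=value in settings updated or appended."""
    remaining = dict(settings)
    lines = []
    for line in conf_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in remaining and not line.lstrip().startswith("#"):
            # Replace the existing value for this key.
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)
    # Append any settings that were not already present.
    lines.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


original = (
    "# APOC settings\n"
    "apoc.import.file.enabled=true\n"
    "apoc.export.file.enabled=false\n"
)
updated = upsert_settings(original, {"apoc.export.file.enabled": "true"})
print(updated)
```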




## Usage

### Interacting with the JSON database
**Checking properties**
```python
from matgraphdb import DatabaseManager

db=DatabaseManager()

success,failed=db.check_property(property_name="band_gap")

```

**Adding materials**

```python
from matgraphdb import DatabaseManager
# Assumption: Structure and Lattice are the pymatgen classes
from pymatgen.core import Lattice, Structure

db=DatabaseManager()
structure = Structure(
        Lattice.cubic(3.0),
        ["C", "C"],  # Elements
        [
            [0, 0, 0],          # Coordinates for the first C atom
            [0.25, 0.25, 0.25],  # Coordinates for the second C atom (basis of the diamond structure)
        ]
    )

# Add material by structure
db.create_material(structure=structure)

# Add material by composition
db.create_material(structure="BaTe")
```

### Creating Graph Databases
To create graph databases, you can use the `GraphGenerator` class. This class takes in a `from_scratch` parameter, which determines whether to start from scratch or use an existing graph database. The default value is `False`. This class also takes in a `skip_main_init` parameter, which determines whether to skip the initial node and relationship creation. The default value is `True`.

When the object is created with `skip_main_init=False`, it builds the main graph database from the JSON files in the `MatGraphDB/data/production/json_database` directory. The main graph database contains the initial material nodes and relationships. The files can be found in `MatGraphDB/data/production/graph_database/main/neo4j_csv`.

```python
from matgraphdb.graph.graph_generator import GraphGenerator

generator=GraphGenerator(skip_main_init=False)
```

Once the initial graph database is created, you can screen the existing materials using the `screen_graph_database` function.

```python
generator.screen_graph_database('nelements-2-2',nelements=(2,2), from_scratch=True)
generator.screen_graph_database('nelements-3-3',nelements=(3,3), from_scratch=True)

generator.screen_graph_database('spg-145',space_groups=[145], from_scratch=True)
generator.screen_graph_database('spg-145-196',space_groups=[145,196], from_scratch=True)
generator.screen_graph_database('spg-no-145',space_groups=[145], from_scratch=True, include=False)
generator.screen_graph_database('spg-no-196',space_groups=[196], from_scratch=True, include=False)

generator.screen_graph_database('elements-no-Ti',elements=["Ti"], from_scratch=True, include=False)
generator.screen_graph_database('elements-no-Fe',elements=["Fe"], from_scratch=True, include=False)
generator.screen_graph_database('elements-no-Ti-Fe',elements=["Ti","Fe"], from_scratch=True, include=False)
```

Here, `screen_graph_database` is called nine times to create nine new graph databases. The first argument is the name of the new graph database. The `nelements` parameter screens materials by their number of elements. The `space_groups` parameter lists the space groups to screen on, and the `elements` parameter lists the elements to screen on. The `from_scratch` parameter determines whether to start from scratch or use an existing graph database. The `include` parameter determines whether materials matching the specified elements or space groups are kept or, with `include=False`, excluded.
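
The effect of these parameters can be illustrated on a few toy records. The sketch below is not the package's implementation: the `screen` function and the record layout are hypothetical, and it assumes that `nelements` is an inclusive `(min, max)` range and that `elements` matches a material containing any of the listed elements.

```python
def screen(materials, nelements=None, space_groups=None, elements=None, include=True):
    """Filter toy material records to illustrate the screening parameters."""
    selected = []
    for m in materials:
        if nelements is not None:
            low, high = nelements  # assumed inclusive (min, max) range
            if not low <= len(m["elements"]) <= high:
                continue
        if space_groups is not None:
            if (m["space_group"] in space_groups) != include:
                continue
        if elements is not None:
            if any(e in m["elements"] for e in elements) != include:
                continue
        selected.append(m)
    return selected


# Made-up records, not real database entries.
materials = [
    {"id": "a", "elements": ["Ba", "Te"], "space_group": 196},
    {"id": "b", "elements": ["Ti", "O"], "space_group": 145},
    {"id": "c", "elements": ["Fe", "Ti", "O"], "space_group": 1},
]

binary = [m["id"] for m in screen(materials, nelements=(2, 2))]
not_145 = [m["id"] for m in screen(materials, space_groups=[145], include=False)]
no_ti_fe = [m["id"] for m in screen(materials, elements=["Ti", "Fe"], include=False)]
print(binary, not_145, no_ti_fe)
```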


### Writing GraphML
To write a GraphML file from a graph database, use the `write_graphml` method. It takes the name of a graph database directory as input and writes the graph to a GraphML file.

```python
generator.write_graphml(graph_dirname='nelements-2-2')
```
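
For readers unfamiliar with the format, GraphML is an XML dialect in which a `graph` element holds `node` and `edge` elements. The sketch below builds a tiny GraphML document with the standard library. The `to_graphml` helper and the node identifiers are hypothetical, and the output of `write_graphml` will contain more detail (for example, node and edge properties) than this minimal example.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize node ids and (source, target) pairs as a GraphML string."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(f"{{{GRAPHML_NS}}}graphml")
    graph = ET.SubElement(root, f"{{{GRAPHML_NS}}}graph", edgedefault="undirected")
    for node_id in nodes:
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", id=node_id)
    for source, target in edges:
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")


# Hypothetical node ids: one material connected to its two elements.
xml_text = to_graphml(
    ["material-0", "element-Ba", "element-Te"],
    [("material-0", "element-Ba"), ("material-0", "element-Te")],
)
print(xml_text)
```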


### Interacting with the Graph Database in Neo4j

**List Database Schema**
```python
from matgraphdb import Neo4jGraphDatabase

with Neo4jGraphDatabase() as session:
    schema_list=session.list_schema()
```

**Execute Cypher Statement**
```python
with Neo4jGraphDatabase() as session:
    # `query` is a Cypher string; `parameters` is a dict of query parameters
    result = session.query(query, parameters)
```

**Filter properties**
```python
with Neo4jGraphDatabase() as session:
    # Filter by material IDs and elements
    results=session.read_material(
        material_ids=['mp-1000','mp-1001'],
        elements=['Te','Ba'])

    # Additionally filter by crystal system
    results=session.read_material(
        material_ids=['mp-1000','mp-1001'],
        elements=['Te','Ba'],
        crystal_systems=['cubic'])

    results=session.read_material(
        material_ids=['mp-1000','mp-1001'],
        elements=['Te','Ba'],
        crystal_systems=['hexagonal'])

    # Additionally filter by Hall symbol
    results=session.read_material(
        material_ids=['mp-1000','mp-1001'],
        elements=['Te','Ba'],
        hall_symbols=['Fm-3m'])

    # Additionally filter by band gap, given as (value, operator) pairs
    results=session.read_material(
        material_ids=['mp-1000','mp-1001'],
        elements=['Te','Ba'],
        band_gap=[(1.0,'>')])
```
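
Filters such as `band_gap=[(1.0,'>')]` pair a value with a comparison operator. One plausible way to turn such pairs into a parameterized Cypher query is sketched below. The `build_material_query` helper is hypothetical and is not how `read_material` is known to work; the `Material` label and `band_gap` property are taken from the examples in this README.

```python
ALLOWED_OPERATORS = {">", ">=", "<", "<=", "="}


def build_material_query(**filters):
    """Turn {property: [(value, operator), ...]} into (cypher, parameters)."""
    clauses, parameters = [], {}
    for prop, conditions in filters.items():
        if not prop.isidentifier():
            raise ValueError(f"invalid property name: {prop!r}")
        for i, (value, operator) in enumerate(conditions):
            if operator not in ALLOWED_OPERATORS:
                raise ValueError(f"unsupported operator: {operator!r}")
            name = f"{prop}_{i}"
            # Values are passed as parameters rather than spliced into the query.
            clauses.append(f"m.{prop} {operator} ${name}")
            parameters[name] = value
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    return f"MATCH (m:Material){where} RETURN m", parameters


query, parameters = build_material_query(band_gap=[(1.0, ">"), (3.0, "<=")])
print(query)
print(parameters)
```

Passing values as query parameters keeps user input out of the query text; only property names and operators, which are validated first, are interpolated.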



### Interacting with the Neo4j Graph Data Science Library

**Initializing the Neo4jGDSManager**

```python
from matgraphdb import Neo4jGraphDatabase,Neo4jGDSManager

with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
```

**Listing the graphs that are loaded into the GDS system for a given database**

```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
    database_name='main'
    results=manager.list_graphs(database_name=database_name)
    print(results)

```

**Check if graph is in memory**
```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
    results=manager.is_graph_in_memory(database_name='main', graph_name='materials_chemenvElements')
    print(results)
```

**Loading a graph into the GDS system**
```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)

    database_name='main'
    graph_name='materials_chemenvElements'
    node_projections=['ChemenvElement','Material']
    relationship_projections={
        "GEOMETRIC_ELECTRIC_CONNECTS": {
            "orientation": 'UNDIRECTED',
            "properties": 'weight'
        },
        "COMPOSED_OF": {
            "orientation": 'UNDIRECTED',
            "properties": 'weight'
        }
    }
    manager.load_graph_into_memory(database_name=database_name,
                                   graph_name=graph_name,
                                   node_projections=node_projections,
                                   relationship_projections=relationship_projections)
    print(manager.get_graph_info(database_name=database_name,graph_name=graph_name))
```
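
For orientation, in GDS 2.x a native projection of this shape corresponds to the `gds.graph.project` procedure. The sketch below builds the Cypher call and its parameters without connecting to a database. The `build_projection_call` helper is hypothetical, and whether `load_graph_into_memory` issues exactly this call is an assumption.

```python
def build_projection_call(graph_name, node_projections, relationship_projections):
    """Return (cypher, parameters) for a native GDS graph projection."""
    cypher = (
        "CALL gds.graph.project("
        "$graph_name, $node_projections, $relationship_projections) "
        "YIELD graphName, nodeCount, relationshipCount"
    )
    parameters = {
        "graph_name": graph_name,
        "node_projections": node_projections,
        "relationship_projections": relationship_projections,
    }
    return cypher, parameters


cypher, params = build_projection_call(
    graph_name="materials_chemenvElements",
    node_projections=["ChemenvElement", "Material"],
    relationship_projections={
        "COMPOSED_OF": {"orientation": "UNDIRECTED", "properties": "weight"},
    },
)
print(cypher)
```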

**Dropping a graph from memory**
```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
    database_name='main'
    graph_name='materials_chemenvElements'
    results=manager.drop_graph(database_name,graph_name)
```

**Using graph algorithms**
Make sure the graph is loaded into memory before running the algorithms.
```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
    database_name='main'
    graph_name='materials_chemenvElements'
    results=manager.run_fastRP_algorithm(database_name=database_name,
                                  graph_name=graph_name,
                                  algorithm_name='pageRank',
                                  algorithm_mode='stream',
                                  embedding_dimension=128,
                                  concurrency=4,
                                  random_seed=42)
    print(results)
```
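
Once embeddings have been streamed, a common next step is to compare nodes by the cosine similarity of their vectors. The sketch below uses toy three-dimensional vectors in a plain dictionary. The format of `results` returned by `run_fastRP_algorithm` is not specified here, so the dictionary layout and the node names are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(target, embeddings):
    """Rank all other nodes by cosine similarity to `target`, best first."""
    scores = [
        (cosine_similarity(embeddings[target], vector), node)
        for node, vector in embeddings.items()
        if node != target
    ]
    return sorted(scores, reverse=True)


# Toy vectors standing in for 128-dimensional FastRP embeddings.
embeddings = {
    "material-a": [1.0, 0.0, 0.0],
    "material-b": [0.9, 0.1, 0.0],
    "material-c": [0.0, 1.0, 0.0],
}
ranking = most_similar("material-a", embeddings)
print([node for _, node in ranking])  # prints ['material-b', 'material-c']
```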

**Write to graph database**

```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
    database_name='main'
    graph_name='materials_chemenvElements'
    results=manager.run_fastRP_algorithm(database_name=database_name,
                                  graph_name=graph_name,
                                  algorithm_name='pageRank',
                                  algorithm_mode='write',
                                  embedding_dimension=128,
                                  concurrency=4,
                                  random_seed=42,
                                  write_property='fastrp-embedding')
    print(results)
```
**or**

```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
    database_name='main'
    graph_name='materials_chemenvElements'
    results=manager.run_fastRP_algorithm(database_name=database_name,
                                  graph_name=graph_name,
                                  algorithm_name='pageRank',
                                  algorithm_mode='mutate',
                                  embedding_dimension=128,
                                  concurrency=4,
                                  random_seed=42,
                                  mutate_property='fastrp-embedding')
    print(results)

    # Persist the mutated property from the in-memory graph to the database
    manager.write_graph(database_name=database_name,
                        graph_name=graph_name,
                        node_properties=['fastrp-embedding'],
                        node_labels=['Material'],
                        concurrency=4)
```

**Export graph to csv**
```python
with Neo4jGraphDatabase() as session:
    manager=Neo4jGDSManager(session)
    database_name='main'
    graph_name='materials_chemenvElements'
    results=manager.export_graph_csv(database_name=database_name,
                                  graph_name=graph_name,
                                  export_name='materials-chemenvElements.csv',
                                  concurrency=4,
                                  default_relationship_type='COMPOSED_OF',
                                  additional_node_properties=['ChemenvElement','Material'])
    print(results)
```


## Authors
Logan Lang, Aldo Romero, Eduardo Hernandez





            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "matgraphdb",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "materials, science, graph, database, python",
    "author": null,
    "author_email": "Logan Lang <lllang@mix.wvu.edu>",
    "download_url": "https://files.pythonhosted.org/packages/3c/dc/c1762fbf9c68a30eca6a9232c5a5da66221d3a8926572dd1511b41b2b340/matgraphdb-0.0.3.tar.gz",
    "platform": null,
    "description": "# MatGraphDB\n\n## Introduction to MatGraphDB\n\nWelcome to **MatGraphDB**, a powerful Python package designed to interface with primary and graph databases for advanced material analysis. MatGraphDB excels in managing vast datasets of materials data, performing complex computational tasks, and encoding material properties and relationships within a graph-based analytical model.\n\nMatGraphDB is structured around several modular components that work together to streamline data management and analysis:\n\n- **DataManager**: Handles interactions with JSON databases and manages the extraction of information from completed calculation directories.\n- **CalcManager**: Manages Density Functional Theory (DFT) calculations, including setting up directories and launching calculations within `MaterialsData`.\n- **GraphDBGenerator**: Facilitates the creation of nodes and relationships for graph databases, storing this information in a specified directory and generating necessary CSV files for nodes and relationships.\n- **Neo4jManager** and **Neo4jGDSManager**: Manage connections to Neo4j databases, allowing for the creation, update, and removal of databases on the Neo4j server, as well as interaction with the Neo4j Graph Data Science library for advanced graph analytics.\n\nThe ultimate goal of MatGraphDB is to leverage the capabilities of graph databases, specifically Neo4j, to enable advanced analysis and discovery in the realm of material science. 
By integrating data management, DFT calculations, and graph database functionalities, MatGraphDB provides a cohesive workflow for researchers and analysts to explore and understand complex material data.\n\nThis documentation provides an overview of the package, detailing how the various components interact to facilitate efficient data management, computation, and analysis, ensuring that you can make the most out of your material science research with MatGraphDB.\n\n## Components and Their Interactions\n\n![System Architecture of MatGraphDB](MatGraphDB.png)\n*Figure 1: MatGraphDB Package Overview - This diagram illustrates the main components and their interactions within the MatGraphDB package. It highlights the initialization of the `DataManager`, the execution of DFT calculations by `CalcManager`, the generation of the graph database using `GraphDBGenerator`, and the management of Neo4j databases through `Neo4jManager` and `Neo4jGDSManager`. The workflow demonstrates how data flows from JSON files to advanced graph analytics, facilitating comprehensive materials data analysis.\n\n### 1. DataManager Initialization\nThe `DataManager` is the foundational component of the package. It is initialized with the `directory_path`, which points to the JSON database directory. The primary role of `DataManager` is to manage interactions with the JSON files and extract information from the completed calculation directories.\n\n### 2. DFT Calculations with CalcManager\nThe `CalcManager` component manages DFT calculations through interactions with `data_manager`. This includes:\n- Performing DFT calculations (`dft_calcs`).\n- Managing calculations within `MaterialsData`.\n- Setting up directories and launching calculations.\n\n### 3. Graph Database Generation\nThe `GraphDBGenerator` plays a crucial role in creating nodes and relationships for the graph database. It utilizes `data_manager` to handle these processes. 
To build the graph database, `GraphDBGenerator` uses the methods `create_nodes()` and `create_relationships()`. Once the database directory is set up, an additional method transforms the graph database into a GraphML file, compatible with various graph packages.\n\n- Creates and stores information in the `graph_databases/{database_name}/neo4j_csv` directory.\n- Generates node CSV files in the format `{node_filename}.csv`.\n- Generates relationship CSV files in the format `{node_1_filename}-{node_2_filename}-{connection_names}.csv`.\n\n\n### 4. Managing Neo4j Database Connections\nNeo4j database connections are managed by `Neo4jManager` and `Neo4jGDSManager`:\n- **Neo4jManager**: After the `GraphDBGenerator` creates the graph database directory, `Neo4jManager` can be initialized with this directory to manage databases on the Neo4j server. This includes creating, removing, updating, and listing databases, as well as importing all data from the directory.\n- **Neo4jGDSManager**: Once the database is imported, `Neo4jGDSManager` can be initialized. This component interacts with the Neo4j Graph Data Science library to perform various operations, such as:\n  - Loading graphs into memory.\n  - Removing graphs from memory.\n  - Writing and exporting graphs.\n  - Running algorithms on the graphs.\n\n### Summary\n\nMatGraphDB seamlessly integrates materials data management with advanced graph database functionalities, leveraging DFT calculations and Neo4j database management to provide a robust tool for materials science research. The interactions between `DataManager`, `CalcManager`, `GraphDBGenerator`, `Neo4jManager`, and `Neo4jGDSManager` create a cohesive workflow, from data management and DFT calculations to graph database creation and sophisticated data analysis.\n\n\n\n\n## Getting Started\n\n### Installing the data\n\nYou can install the data here:\n\n\n### Setting up Conda environment \n\nNavigate to the root directory `MatGraphDB`. 
Then do the following\n#### Use if you are going to not use graph-tool library\n**Windows**\n```bash\nconda env create -f env_win.yml\n```\n\n**Linux**\n```bash\nconda env create -f env_linux.yml\n```\n\n#### Use if you are going to not use graph-tool library\nThis allows you to use the graph-tool library. Currently, we only support the graph-tool library for linux.\n\n**Linux**\n```bash\nconda env create -f env_graph_tool.yml\n```\n\n#### To activeate the enviornment use:\n\n```bash\nconda activate matgraphdb\n```\n\n### Adjusting configs\n\nTh configurations of the project are stored in the `MatGraphDB/config.yml` file. You can adjust the configurations to your needs. The most important configurations that need to be adjusted are `DB_NAME`, `USER`, `PASSWORD`, `LOCATION`, `NEO4J_DESKTOP_DIR`, and `N_CORES`.\n\n\n\n- `DB_NAME`: The name of the database that will be created. This will search for the database with the same name in the `MatGraphDB/data/production` directory.\n\n- `USER`: The username for the Neo4j database.\n\n- `PASSWORD`: The password for the Neo4j database.\n\n- `LOCATION`: This is the location of the Neo4j DBMS. This is usually `\"bolt://localhost:7687\"`\n\n- `NEO4J_DESKTOP_DIR`: This is the directory where the Neo4j Desktop is installed. This could be in various locations depending on your system.\n\n    - **Windows** - `C:/Users/{username}/.Neo4jDesktop`\n    - **Linux** - `/home/neo4j` : Might be different depending on your system\n\n- `N_CORES`: The number of cores to be used for parallel processing.\n\n### Neo4jDektop\nTo use neo4j, you will need to install the neo4j desktop application. You can download the application from the [neo4j website](https://neo4j.com/docs/operations-manual/current/installation/). Create a project and then create a new database management system (DBMS) , name it `MatGraphDB` and select the `Neo4j Community Edition` as the DBMS.\n\nYou will also need to install the APOC library and Graph Data Science Library. 
You can do this by click on you DBMS name and the on the right clicking `Plugins`, then click on the libraries and install them.\n\nYou will also need to set an apoc environment variable. You can do this by running the following code:\n\n```python\nwith Neo4jGraphDatabase() as manager:\n    settings={'apoc.export.file.enabled':'true'}\n    manager.set_apoc_environment_variables(settings=settings)\n```\n\nAfter running this code, you will need to stop the dbms and restart it.\n\n\n\n\n## Usage\n\n### Interacting with the json database:\n**Checking properties**\n```python\nfrom matgraphdb import DatabaseManager\n\ndb=DatabaseManager()\n\nsuccess,failed=db.check_property(property_name=\"band_gap\")\n\n```\n\n**Adding material properties**\n\n```python\nfrom matgraphdb import DatabaseManager\n\ndb=DatabaseManager()\nstructure = Structure(\n        Lattice.cubic(3.0),\n        [\"C\", \"C\"],  # Elements\n        [\n            [0, 0, 0],          # Coordinates for the first Si atom\n            [0.25, 0.25, 0.25],  # Coordinates for the second Si atom (basis of the diamond structure)\n        ]\n    )\n\n# Add material by structure\ndb.create_material(structure=structure)\n\n# Add material by composition\ndb.create_material(structure=\"BaTe\")\n```\n\n### Creating Graph Databases\nTo create graph databases, you can use the `GraphGenerator` class. This class takes in a `from_scratch` parameter, which determines whether to start from scratch or use an existing graph database. The default value is `False`. This class also takes in a `skip_main_init` parameter, which determines whether to skip the initial node and relationship creation. The default value is `True`.\n\nWhen the object is created, it will create the main graph database based on the json files in the `MatGraphDB/data/production/json_database` directory. The main graph database will contain the initial material nodes and relationships. 
The file can be found at `MatGraphDB/data/production/graph_database/main/neo4j_csv` \n\n```python\nfrom matgraphdb.graph.graph_generator import GraphGenerator\n\ngenerator=GraphGenerator(skip_main_init=False)\n```\n\nOnce the initial graph database is created, you can screen the existing materials using the `screen_graph_database` function.\n\n```python\ngenerator.screen_graph_database('nelements-2-2',nelements=(2,2), from_scratch=True)\ngenerator.screen_graph_database('nelements-3-3',nelements=(3,3), from_scratch=True)\n\ngenerator.screen_graph_database('spg-145',space_groups=[145], from_scratch=True)\ngenerator.screen_graph_database('spg-145-196',space_groups=[145,196], from_scratch=True)\ngenerator.screen_graph_database('spg-no-145',space_groups=[145], from_scratch=True, include=False)\ngenerator.screen_graph_database('spg-no-196',space_groups=[196], from_scratch=True, include=False)\n\ngenerator.screen_graph_database('elements-no-Ti',elements=[\"Ti\"], from_scratch=True, include=False)\ngenerator.screen_graph_database('elements-no-Fe',elements=[\"Fe\"], from_scratch=True, include=False)\ngenerator.screen_graph_database('elements-no-Ti-Fe',elements=[\"Ti\",\"Fe\"], from_scratch=True, include=False)\n```\n\nHere, we are using the `screen_graph_database` function to create a 9 new graph databases. The `nelements` parameter specifies the number of elements to include in the graph database. The `space_groups` parameter specifies the space groups to include in the graph database. The `elements` parameter specifies the elements to include in the graph database. The `from_scratch` parameter determines whether to start from scratch or use an existing graph database. The `include` parameter determines whether to include the specified elements or space groups in the graph database.\n\n\n### Writing GraphML\nTo write a graphml file from the graph, you can use the `write_graphml` function. 
This function takes a graph database name as input and writes the graph to a file in the specified format.\n\n```python\ngenerator.write_graphml(graph_dirname='nelements-2-2')\n```\n\n\n### Interacting with the Graph Databse in Neo4j\n\n**List Database Schema**\n```python\nfrom matgraphdb import Neo4jGraphDatabase\n\nwith Neo4jGraphDatabase() as session:\n    schema_list=session.list_schema()\n```\n\n**Execute Cypher Statement**\n```python\nwith Neo4jGraphDatabase() as session:\n    result = matgraphdb.query(query, parameters)\n```\n\n**Filter properties**\n```python\nwith Neo4jGraphDatabase() as session:\n    results=session.read_material(\n                            material_ids=['mp-1000','mp-1001'], \n                            elements=['Te','Ba'])\n    results=session.read_material(\n                                material_ids=['mp-1000','mp-1001'],\n                                elements=['Te','Ba'], \n                                crystal_systems=['cubic'])\n\n    results=session.read_material(\n                            material_ids=['mp-1000','mp-1001'],\n                            elements=['Te','Ba'],\n                            crystal_systems=['hexagonal'])\n                            \n    results=session.read_material(\n                            material_ids=['mp-1000','mp-1001'],\n                            elements=['Te','Ba'],\n                            hall_symbols=['Fm-3m'])\n\n    results=session.read_material(\n                            material_ids=['mp-1000','mp-1001'],\n                            elements=['Te','Ba'],\n                            band_gap=[(1.0,'>')])\n\nsuccess,failed=db.check_property(property_name=\"band_gap\")\n```\n\n\n\n### Interacting with the Neo4j Graph Datascience Library\n\n**Initializing the Neo4jGDSManager**\n\n```python\nfrom matgraphdb import Neo4jGraphDatabase,Neo4jGDSManager\n\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n```\n\n**Listing the graphs that 
are loaded into the GDS system for a given database**\n\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n    results=manager.list_graphs(database_name='main')\n    print(results)\n```\n\n**Check if graph is in memory**\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n    results=manager.is_graph_in_memory(database_name='main', graph_name='materials_chemenvElements')\n    print(results)\n```\n\n**Loading a graph into the GDS system**\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n\n    database_name='main'\n    graph_name='materials_chemenvElements'\n    node_projections=['ChemenvElement','Material']\n    relationship_projections={\n        \"GEOMETRIC_ELECTRIC_CONNECTS\": {\n            \"orientation\": 'UNDIRECTED',\n            \"properties\": 'weight'\n        },\n        \"COMPOSED_OF\": {\n            \"orientation\": 'UNDIRECTED',\n            \"properties\": 'weight'\n        }\n    }\n    manager.load_graph_into_memory(database_name=database_name,\n                                   graph_name=graph_name,\n                                   node_projections=node_projections,\n                                   relationship_projections=relationship_projections)\n    print(manager.get_graph_info(database_name=database_name,graph_name=graph_name))\n```\n\n**Dropping a graph from memory**\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n    database_name='main'\n    graph_name='materials_chemenvElements'\n    results=manager.drop_graph(database_name,graph_name)\n```\n\n**Using graph algorithms**\n\nMake sure the graph is loaded into memory before running the algorithms.\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n    database_name='main'\n    
graph_name='materials_chemenvElements'\n    results=manager.run_fastRP_algorithm(database_name=database_name,\n                                  graph_name=graph_name,\n                                  algorithm_name='pageRank',\n                                  algorithm_mode='stream',\n                                  embedding_dimension=128,\n                                  concurrency=4,\n                                  random_seed=42)\n    print(results)\n```\n\n**Write to graph database**\n\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n    database_name='main'\n    graph_name='materials_chemenvElements'\n    results=manager.run_fastRP_algorithm(database_name=database_name,\n                                  graph_name=graph_name,\n                                  algorithm_name='pageRank',\n                                  algorithm_mode='write',\n                                  embedding_dimension=128,\n                                  concurrency=4,\n                                  random_seed=42,\n                                  write_property='fastrp-embedding')\n    print(results)\n```\n**or**\n\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n    database_name='main'\n    graph_name='materials_chemenvElements'\n    results=manager.run_fastRP_algorithm(database_name=database_name,\n                                  graph_name=graph_name,\n                                  algorithm_name='pageRank',\n                                  algorithm_mode='mutate',\n                                  embedding_dimension=128,\n                                  concurrency=4,\n                                  random_seed=42,\n                                  mutate_property='fastrp-embedding')\n    print(results)\n\n    manager.write_graph(database_name=database_name,\n                        graph_name=graph_name,\n                        
node_properties=['fastrp-embedding'],\n                        node_labels=['Materials'],\n                        concurrency=4)\n```\n\n**Export graph to CSV**\n```python\nwith Neo4jGraphDatabase() as session:\n    manager=Neo4jGDSManager(session)\n    database_name='main'\n    graph_name='materials_chemenvElements'\n    results=manager.export_graph_csv(database_name=database_name,\n                                  graph_name=graph_name,\n                                  export_name='materials-chemenvElements.csv',\n                                  concurrency=4,\n                                  default_relationship_type='COMPOSED_OF',\n                                  additional_node_properties=['ChemenvElement','Material'])\n    print(results)\n```\n\n\n## Authors\nLogan Lang,\nAldo Romero,\nEduardo Hernandez\n",
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 lllangWV  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
    "summary": "Welcome to MatGraphDB, a powerful Python package designed to interface with primary and graph databases for advanced material analysis.",
    "version": "0.0.3",
    "project_urls": {
        "Changelog": "https://github.com/romerogroup/MatGraphDB/CHANGELOG.md",
        "Issues": "https://github.com/romerogroup/MatGraphDB/issues",
        "Repository": "https://github.com/romerogroup/MatGraphDB"
    },
    "split_keywords": [
        "materials",
        " science",
        " graph",
        " database",
        " python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "705ab2d06df2468425987ab5868512cfc75a4db3871a11e66523b544f0c32cdb",
                "md5": "ed36cd58a99258a35789d4daa63f66f6",
                "sha256": "7e7b58224fd5b6b7df6d9d1d37e1b1158b1c14f9dd8eeb15041e6dd2073c7689"
            },
            "downloads": -1,
            "filename": "matgraphdb-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ed36cd58a99258a35789d4daa63f66f6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 613108,
            "upload_time": "2024-10-03T02:28:49",
            "upload_time_iso_8601": "2024-10-03T02:28:49.597688Z",
            "url": "https://files.pythonhosted.org/packages/70/5a/b2d06df2468425987ab5868512cfc75a4db3871a11e66523b544f0c32cdb/matgraphdb-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3cdcc1762fbf9c68a30eca6a9232c5a5da66221d3a8926572dd1511b41b2b340",
                "md5": "4febe2950408cdc52eed52c1bf4bed3d",
                "sha256": "d6673cd665db3b090d121aae5c6dce1d57c74aefb761850ea4da0578a23d2f17"
            },
            "downloads": -1,
            "filename": "matgraphdb-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "4febe2950408cdc52eed52c1bf4bed3d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 15737281,
            "upload_time": "2024-10-03T02:28:51",
            "upload_time_iso_8601": "2024-10-03T02:28:51.962248Z",
            "url": "https://files.pythonhosted.org/packages/3c/dc/c1762fbf9c68a30eca6a9232c5a5da66221d3a8926572dd1511b41b2b340/matgraphdb-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-03 02:28:51",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "romerogroup",
    "github_project": "MatGraphDB",
    "github_not_found": true,
    "lcname": "matgraphdb"
}
        