* Name: graphdatascience
* Version: 1.10
* Home page: https://neo4j.com/product/graph-data-science/
* Summary: A Python client for the Neo4j Graph Data Science (GDS) library
* Upload time: 2024-04-03 13:19:58
* Author: Neo4j
* Requires Python: >=3.8
* License: Apache License 2.0

# Neo4j Graph Data Science Client

[![Latest version](https://img.shields.io/pypi/v/graphdatascience)](https://pypi.org/project/graphdatascience/)
[![PyPI downloads month](https://img.shields.io/pypi/dm/graphdatascience)](https://pypi.org/project/graphdatascience/)
![Python versions](https://img.shields.io/pypi/pyversions/graphdatascience)
[![Documentation](https://img.shields.io/badge/Documentation-latest-blue)](https://neo4j.com/docs/graph-data-science-client/current/)
[![Discord](https://img.shields.io/discord/787399249741479977?label=Chat&logo=discord)](https://discord.gg/neo4j)
[![Community forum](https://img.shields.io/website?down_color=lightgrey&down_message=offline&label=Forums&logo=discourse&up_color=green&up_message=online&url=https%3A%2F%2Fcommunity.neo4j.com%2F)](https://community.neo4j.com)
[![License](https://img.shields.io/pypi/l/graphdatascience)](https://www.apache.org/licenses/LICENSE-2.0)

`graphdatascience` is a Python client for operating and working with the [Neo4j Graph Data Science (GDS) library](https://github.com/neo4j/graph-data-science).
It enables users to write pure Python code to project graphs, run algorithms, and define and use machine learning pipelines in GDS.

The API is designed to mimic the GDS Cypher procedure API in Python code.
It abstracts the necessary operations of the [Neo4j Python driver](https://neo4j.com/docs/python-manual/current/) to offer a simpler surface.
Additionally, the client-specific graph, model, and pipeline objects offer convenience methods that greatly reduce the need to write Cypher to access and operate on these GDS resources.
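
To make the correspondence between the two APIs concrete, here is a minimal sketch that renders the Cypher procedure call a client call roughly stands for. The helper `cypher_equivalent` and the graph name `myGraph` are hypothetical and exist only for illustration; this is not how the client builds its queries internally, and it only handles scalar and list configuration values.

```python
import json


def cypher_equivalent(endpoint: str, graph_name: str, **config) -> str:
    """Render the Cypher call that a client call such as
    gds.pageRank.mutate(G, ...) roughly corresponds to (illustrative only)."""
    # GDS procedures take the graph name followed by a configuration map.
    # json.dumps is used to quote strings and format numbers, booleans and lists.
    config_map = ", ".join(
        f"{key}: {json.dumps(value)}" for key, value in config.items()
    )
    return f"CALL {endpoint}({json.dumps(graph_name)}, {{{config_map}}})"


print(
    cypher_equivalent(
        "gds.pageRank.mutate", "myGraph", tolerance=0.5, mutateProperty="pagerank"
    )
)
```

The dotted endpoint name and the keyword arguments of the Python call line up with the procedure name and the configuration map of the Cypher call.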

`graphdatascience` is only guaranteed to work with GDS versions 2.0+.
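
A client application may want to fail early when the server runs an older GDS version. The sketch below checks a version string against the 2.0 minimum; the helper names are hypothetical, and it assumes the version string starts with `major.minor` (for example, as one would expect from `gds.version()`).

```python
import re
from typing import Tuple

# Minimum GDS library version stated in this README.
MINIMUM_GDS_VERSION = (2, 0)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.6.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version: str) -> bool:
    """Return True if the given GDS version is 2.0 or newer."""
    return parse_major_minor(version) >= MINIMUM_GDS_VERSION


# Hypothetical usage once connected: is_supported(gds.version())
print(is_supported("2.6.0"), is_supported("1.8.9"))
```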

Please leave any feedback as issues on [the source repository](https://github.com/neo4j/graph-data-science-client).
Happy coding!


## Installation

To install the latest released version of `graphdatascience`, run:

```bash
pip install graphdatascience
```
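
To confirm that the installation succeeded, the installed version can be read from the package metadata. This is a small sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "graphdatascience") -> Optional[str]:
    """Return the installed version of the package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the client is installed, otherwise None.
print(installed_version())
```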


## Getting started

To use the GDS Python Client, we first need to instantiate a `GraphDataScience` object.
Then, we can project graphs, create pipelines, train models, and run algorithms.

```python
from graphdatascience import GraphDataScience

# Configure the driver with AuraDS-recommended settings
gds = GraphDataScience("neo4j+s://my-aura-ds.databases.neo4j.io:7687", auth=("neo4j", "my-password"), aura_ds=True)

# Import the Cora common dataset to GDS
G = gds.graph.load_cora()
assert G.node_count() == 2708

# Run PageRank in mutate mode on G
pagerank_result = gds.pageRank.mutate(G, tolerance=0.5, mutateProperty="pagerank")
assert pagerank_result["nodePropertiesWritten"] == G.node_count()

# Create a Node Classification pipeline
pipeline = gds.nc_pipe("myPipe")
assert pipeline.type() == "Node classification training pipeline"

# Add a Degree Centrality feature to the pipeline
pipeline.addNodeProperty("degree", mutateProperty="rank")
pipeline.selectFeatures("rank")
features = pipeline.feature_properties()
assert len(features) == 1
assert features[0]["feature"] == "rank"

# Add a training method
pipeline.addLogisticRegression(penalty=(0.1, 2))

# Train a model on G
# "subject" is assumed to be the class label property of the Cora dataset loaded above
model, train_result = pipeline.train(G, modelName="myModel", targetProperty="subject", metrics=["ACCURACY"])
assert model.metrics()["ACCURACY"]["test"] > 0
assert train_result["trainMillis"] >= 0

# Compute predictions in stream mode
predictions = model.predict_stream(G)
assert len(predictions) == G.node_count()
```

The example above assumes an AuraDS instance.
For additional examples and extensive documentation of all capabilities, please refer to the [GDS Python Client Manual](https://neo4j.com/docs/graph-data-science-client/current/).
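
For setups that do not hard-code credentials, the connection arguments can be assembled separately from the call that opens the connection. The following is a sketch under stated assumptions: the environment variable names (`NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`, `NEO4J_AURA_DS`) and both helper functions are hypothetical, and the default URI assumes a local, non-AuraDS database. Only the `GraphDataScience(..., auth=..., aura_ds=...)` call mirrors the example above.

```python
from typing import Any, Dict, Mapping


def connection_settings(env: Mapping[str, str]) -> Dict[str, Any]:
    """Build connection settings from an environment-style mapping."""
    return {
        # Assumed default: a local database rather than AuraDS.
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        # The password has no default on purpose; a missing value raises KeyError.
        "auth": (env.get("NEO4J_USER", "neo4j"), env["NEO4J_PASSWORD"]),
        "aura_ds": env.get("NEO4J_AURA_DS", "false").lower() == "true",
    }


def connect(env: Mapping[str, str]):
    """Open a client connection using the settings above."""
    # Imported lazily so the settings helper works without the client installed.
    from graphdatascience import GraphDataScience

    settings = connection_settings(env)
    return GraphDataScience(
        settings["uri"], auth=settings["auth"], aura_ds=settings["aura_ds"]
    )


print(connection_settings({"NEO4J_PASSWORD": "my-password"})["uri"])
```

In an application, `connect(os.environ)` would then return the `gds` object used in the example above.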

Full end-to-end examples in ready-to-run Jupyter notebooks can be found in the [`examples` source directory](https://github.com/neo4j/graph-data-science-client/tree/main/examples):

* [Machine learning pipelines: Node classification](https://github.com/neo4j/graph-data-science-client/tree/main/examples/ml-pipelines-node-classification.ipynb)
* [Node Regression with Subgraph and Graph Sample projections](https://github.com/neo4j/graph-data-science-client/tree/main/examples/node-regression-with-subgraph-and-graph-sample.ipynb)
* [Product recommendations with kNN based on FastRP embeddings](https://github.com/neo4j/graph-data-science-client/tree/main/examples/fastrp-and-knn.ipynb)
* [Sampling, Export and Integration with PyG example](https://github.com/neo4j/graph-data-science-client/tree/main/examples/import-sample-export-gnn.ipynb)
* [Load data to a projected graph via graph construction](https://github.com/neo4j/graph-data-science-client/tree/main/examples/load-data-via-graph-construction.ipynb)
* [Heterogeneous Node Classification with HashGNN and Autotuning](https://github.com/neo4j/graph-data-science-client/tree/main/examples/heterogeneous-node-classification-with-hashgnn.ipynb)
* [Perform inference using pre-trained KGE models](https://github.com/neo4j/graph-data-science-client/tree/main/examples/kge-predict-transe-pyg-train.ipynb)


## Documentation

The primary source for learning everything about the GDS Python Client is the manual, hosted at https://neo4j.com/docs/graph-data-science-client/current/.
The manual is versioned to cover all GDS Python Client versions, so make sure to read the version that matches your installed client.


## Known limitations

Operations known not to work with `graphdatascience`:

* [Numeric utility functions](https://neo4j.com/docs/graph-data-science/current/management-ops/utility-functions/#utility-functions-numeric) (will never be supported)
* [Cypher on GDS](https://neo4j.com/docs/graph-data-science/current/management-ops/create-cypher-db/) (might be supported in the future)
* [Projecting graphs using Cypher Aggregation](https://neo4j.com/docs/graph-data-science/current/management-ops/projections/graph-project-cypher-aggregation/) (might be supported in the future)
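
Since the numeric utility functions will not be exposed by the client, similar checks can be done client-side in plain Python. The sketch below is an approximation based on the GDS manual's description of `gds.util.NaN`, `gds.util.infinity`, `gds.util.isFinite`, and `gds.util.isInfinite`. It assumes that `isInfinite` is the negation of `isFinite` and that `null` maps to `None`; the Python names are hypothetical and not part of `graphdatascience`.

```python
import math
from typing import Optional

# Client-side stand-ins for gds.util.NaN() and gds.util.infinity().
NAN = math.nan
INFINITY = math.inf


def is_finite(value: Optional[float]) -> bool:
    """True for ordinary numbers; False for infinities, NaN and None."""
    return value is not None and math.isfinite(value)


def is_infinite(value: Optional[float]) -> bool:
    """Assumed to be the negation of is_finite, so NaN and None count too."""
    return not is_finite(value)


print(is_finite(1.5), is_finite(INFINITY), is_infinite(NAN))
```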


## License

`graphdatascience` is licensed under the Apache License, Version 2.0.
All content is copyright © Neo4j Sweden AB.


## Acknowledgements

This work has been inspired by the great work done in the following libraries:

* [pygds](https://github.com/stellasia/pygds) by stellasia
* [gds-python](https://github.com/moxious/gds-python) by moxious

            

Raw data

            {
    "_id": null,
    "home_page": "https://neo4j.com/product/graph-data-science/",
    "name": "graphdatascience",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "Neo4j",
    "author_email": "team-gds@neo4j.org",
    "download_url": "https://files.pythonhosted.org/packages/cd/38/bb0725819ef0dc150ef9d63267862014934b1b6f0dec943054e78002740d/graphdatascience-1.10.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "A Python client for the Neo4j Graph Data Science (GDS) library",
    "version": "1.10",
    "project_urls": {
        "Bug Tracker": "https://github.com/neo4j/graph-data-science-client/issues",
        "Documentation": "https://neo4j.com/docs/graph-data-science-client/current/",
        "Homepage": "https://neo4j.com/product/graph-data-science/",
        "Source": "https://github.com/neo4j/graph-data-science-client"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d7729828d934df6de76f3f7eada486e37dfc035968a96f6c0b4fb542c4264b33",
                "md5": "6259f7ef801d391f261cccb2a2387db5",
                "sha256": "88b7a5432f28dd340bd3f84e162041b9b6a6e9b298cac42973fc25a36f5eaad4"
            },
            "downloads": -1,
            "filename": "graphdatascience-1.10-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6259f7ef801d391f261cccb2a2387db5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 1640160,
            "upload_time": "2024-04-03T13:19:55",
            "upload_time_iso_8601": "2024-04-03T13:19:55.488507Z",
            "url": "https://files.pythonhosted.org/packages/d7/72/9828d934df6de76f3f7eada486e37dfc035968a96f6c0b4fb542c4264b33/graphdatascience-1.10-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cd38bb0725819ef0dc150ef9d63267862014934b1b6f0dec943054e78002740d",
                "md5": "8a710173e9c4c02257abd794c56e3b05",
                "sha256": "7699b15417a923c89c5b7f07ffb4e073141f6cf711b0082cb82341fcee968101"
            },
            "downloads": -1,
            "filename": "graphdatascience-1.10.tar.gz",
            "has_sig": false,
            "md5_digest": "8a710173e9c4c02257abd794c56e3b05",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 1598178,
            "upload_time": "2024-04-03T13:19:58",
            "upload_time_iso_8601": "2024-04-03T13:19:58.280801Z",
            "url": "https://files.pythonhosted.org/packages/cd/38/bb0725819ef0dc150ef9d63267862014934b1b6f0dec943054e78002740d/graphdatascience-1.10.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-03 13:19:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "neo4j",
    "github_project": "graph-data-science-client",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "graphdatascience"
}
        