# AutoRDF2GML
## Overview
AutoRDF2GML is a framework designed to transform RDF data into graph representations suitable for graph-based machine learning methods, e.g., Graph Neural Networks (GNNs). It uniquely generates content-based features from RDF datatype properties and topology-based features from RDF object properties, enabling the effective integration of Semantic Web technologies with Graph Machine Learning.
## Installation
To install the current PyPI version, run:
```sh
pip install autordf2gml
```
We recommend installing the library in an **isolated environment, such as venv or conda**. Please note that the current version has only been tested with **Python versions 3.8 to 3.9.9**.
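As a quick pre-install check, the following minimal sketch tests whether the running interpreter falls inside the tested range stated above. The helper name `is_tested_python` is hypothetical and not part of AutoRDF2GML; only the version bounds come from this README.

```python
import sys

# Bounds taken from this README: tested with Python 3.8 up to 3.9.9.
MIN_VERSION = (3, 8, 0)
MAX_VERSION = (3, 9, 9)


def is_tested_python(version=None):
    """Return True if a (major, minor, micro) version lies in the tested range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    current = tuple(version or sys.version_info[:3])
    return MIN_VERSION <= current <= MAX_VERSION


if __name__ == "__main__":
    if not is_tested_python():
        found = ".".join(str(part) for part in sys.version_info[:3])
        print("Warning: Python " + found + " is outside the tested range 3.8 to 3.9.9.")
```

Running the check inside the freshly created venv or conda environment confirms that the environment, not the system interpreter, is the one being tested.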
## Usage
To start using AutoRDF2GML, you need **(1) an RDF file** and **(2) a configuration file** that describes the transformation. In the configuration file, define the RDF classes and properties needed for your project. See the quick example below.
## Quick Example
This example uses the [semopenalex-C1793878-sample.nt](https://github.com/davidlamprecht/AutoRDF2GML/blob/main/example/semopenalex-C1793878-sample.nt) RDF file, a curated subset from [SemOpenAlex](https://semopenalex.org).
#### 1. Preparing the configuration file
Fill in all the required fields in the config file. See [config-soa-cb.ini](https://github.com/davidlamprecht/AutoRDF2GML/blob/main/example/config-soa-cb.ini) and [config-soa-tb.ini](https://github.com/davidlamprecht/AutoRDF2GML/blob/main/example/example-topologyfeatures/config-soa-tb.ini) for examples of the content-based and topology-based transformations, respectively. The following shows an example of the config file format:
```ini
[InputPath] ;required
input_path = semopenalex-C1793878-sample.nt
[SavePath] ;required
save_path_numeric_graph = semopenalex/numeric-graph/
save_path_mapping = semopenalex/mapping/
[NLD] ;required
nld_class = work
[EMBEDDING] ;required
embedding_model = allenai/scibert_scivocab_uncased
[Nodes] ;required
classes = work, author, institution, source, concept, publisher
work = https://semopenalex.org/class/Work
author = https://semopenalex.org/class/Author
institution = https://semopenalex.org/class/Institution
; entries for the other listed classes follow the same pattern (omitted in this excerpt)
[SimpleEdges] ;required
edge_names = author_institution
author_institution_start_node = author
author_institution_properties = http://www.w3.org/ns/org#memberOf
author_institution_end_node = institution
```
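Before running a transformation, it can help to sanity-check the config file. The sketch below is an independent pre-flight check written with Python's standard `configparser`; it does not reflect how AutoRDF2GML itself parses the file. It assumes that every section marked `;required` above must be present, that each name in `classes` needs its own URI entry, and that each edge needs `_start_node`, `_properties`, and `_end_node` keys, all inferred from the example format. The function name `check_config` is hypothetical.

```python
import configparser

# Sections marked ";required" in the example config above.
REQUIRED_SECTIONS = ["InputPath", "SavePath", "NLD", "EMBEDDING", "Nodes", "SimpleEdges"]
# Key suffixes inferred from the example's naming pattern (an assumption).
EDGE_SUFFIXES = ["_start_node", "_properties", "_end_node"]

EXAMPLE_CONFIG = """\
[InputPath] ;required
input_path = semopenalex-C1793878-sample.nt
[SavePath] ;required
save_path_numeric_graph = semopenalex/numeric-graph/
save_path_mapping = semopenalex/mapping/
[NLD] ;required
nld_class = work
[EMBEDDING] ;required
embedding_model = allenai/scibert_scivocab_uncased
[Nodes] ;required
classes = work, author, institution, source, concept, publisher
work = https://semopenalex.org/class/Work
author = https://semopenalex.org/class/Author
institution = https://semopenalex.org/class/Institution
[SimpleEdges] ;required
edge_names = author_institution
author_institution_start_node = author
author_institution_properties = http://www.w3.org/ns/org#memberOf
author_institution_end_node = institution
"""


def split_names(raw):
    """Split a comma-separated config value into a list of clean names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def check_config(text):
    """Report missing sections, class URIs, and edge keys (hypothetical helper)."""
    # ";" inline comments are stripped; interpolation is off so "%" in URIs is safe.
    cfg = configparser.ConfigParser(inline_comment_prefixes=(";",), interpolation=None)
    cfg.read_string(text)

    missing_sections = [s for s in REQUIRED_SECTIONS if not cfg.has_section(s)]

    classes = split_names(cfg.get("Nodes", "classes", fallback=""))
    missing_class_uris = [c for c in classes if not cfg.has_option("Nodes", c)]

    edges = split_names(cfg.get("SimpleEdges", "edge_names", fallback=""))
    missing_edge_keys = [
        edge + suffix
        for edge in edges
        for suffix in EDGE_SUFFIXES
        if not cfg.has_option("SimpleEdges", edge + suffix)
    ]

    return {
        "missing_sections": missing_sections,
        "missing_class_uris": missing_class_uris,
        "missing_edge_keys": missing_edge_keys,
    }


if __name__ == "__main__":
    for problem, names in check_config(EXAMPLE_CONFIG).items():
        print(problem, names)
```

Applied to the excerpt above, the check flags `source`, `concept`, and `publisher` as classes without a URI entry, which is the kind of gap worth closing before starting a long transformation run.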
#### 2. Using the library
```python
import autordf2gml

# Run the content-based transformation
autordf2gml.content_feature("config-soa-cb.ini")

# Run the topology-based transformation
autordf2gml.topology_feature("config-soa-tb.ini")

# Run the content-based transformation using only simple edges
autordf2gml.simpleedges_feature("config-aifb-cb-simple.ini")
```
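After a transformation finishes, one way to see what was written is to list the files under the folders named in the `[SavePath]` section. The sketch below does only that. It assumes relative paths in the config resolve against the current working directory, and it makes no claim about the names or formats of the files AutoRDF2GML produces. The helper `list_outputs` is hypothetical and not part of the library.

```python
import configparser
from pathlib import Path


def list_outputs(config_path):
    """Map each [SavePath] key to the files found under its folder.

    Hypothetical helper: folders that do not exist yet map to an empty list.
    """
    cfg = configparser.ConfigParser(inline_comment_prefixes=(";",), interpolation=None)
    with open(config_path, encoding="utf-8") as handle:
        cfg.read_file(handle)

    outputs = {}
    for key, folder in cfg["SavePath"].items():
        root = Path(folder)
        if root.is_dir():
            # Collect every file below the folder, relative to the folder itself.
            outputs[key] = sorted(
                path.relative_to(root).as_posix()
                for path in root.rglob("*")
                if path.is_file()
            )
        else:
            outputs[key] = []
    return outputs


if __name__ == "__main__":
    config_file = Path("config-soa-cb.ini")
    if config_file.is_file():
        for key, files in list_outputs(config_file).items():
            print(key + ": " + str(len(files)) + " file(s)")
```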
## Our GitHub
The most recent updates, documentation, and examples can be accessed through the following repository:
- <https://github.com/davidlamprecht/AutoRDF2GML>
## Package metadata
- Name: autordf2gml
- Version: 0.0.1
- Summary: AutoRDF2GML: A Framework for Transforming RDF Data into Graph Representations for Graph Machine Learning.
- License: MIT License
- Requires Python: >=3.8, <=3.9.9
- Authors: Michael Faerber <michael.faerber@tu-dresden.de>, David Lamprecht <lamprecht.david@web.de>, Yuni Susanti <yuni.susanti@mailbox.tu-dresden.de>
- Keywords: rdf, graph dataset, knowledge graph, autordf2gml, gml, gnn
- Homepage: <https://github.com/davidlamprecht/AutoRDF2GML>
- Issues: <https://github.com/davidlamprecht/AutoRDF2GML/issues>
- Uploaded: 2024-06-13 08:37:39
- Distributions: autordf2gml-0.0.1-py3-none-any.whl (wheel, 19925 bytes), autordf2gml-0.0.1.tar.gz (sdist, 15726 bytes)