henhoe2vec


Name: henhoe2vec
Version: 1.0.4
Summary: Implementation of the HeNHoE-2vec algorithm by Valentini et al. (2021).
upload_time: 2023-08-22 15:26:09
requires_python: >=3.8
license: MIT License Copyright (c) 2023 Robert Giesler Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords: embeddings, graph embeddings, network embeddings, node embeddings, multilayer graph embeddings, multilayer network embeddings, node2vec, henhoe-2vec, het-node2vec, multilayer networks
requirements: gensim, pre-commit, pytest, networkx, pandas, numpy

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
![PyPI](https://img.shields.io/pypi/v/henhoe2vec)
[![Tests](https://github.com/Bertr0/HeNHoE-2vec/actions/workflows/tests.yml/badge.svg)](https://github.com/Bertr0/HeNHoE-2vec/actions/workflows/tests.yml)

# HeNHoE-2vec
A Python implementation of the HeNHoE-2vec algorithm by [Valentini et al.](https://arxiv.org/abs/2101.01425) for the embedding of networks with heterogeneous nodes and homogeneous edges (HeNHoE).

_Note_: HeNHoE networks are analogous to multilayer networks: in a HeNHoE network, each node has exactly one node type, and in a multilayer network, each node belongs to exactly one layer. The terms `type` and `layer` may therefore be regarded as synonymous. Throughout the code and the remainder of this documentation, we use the terms `multilayer network` and `layer` instead of `HeNHoE network` and `type`.
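
To make the terminology concrete, here is a minimal sketch of a tiny multilayer network in plain Python. The representation (nodes as `(name, layer)` tuples, edges as weighted pairs) and all names in it are illustrative assumptions, not the data model used inside the package.

```python
from collections import Counter

# Hypothetical toy network: every node is a (name, layer) tuple,
# every edge is (source_node, target_node, weight).
edges = [
    (("a", "layer1"), ("b", "layer1"), 1.0),  # stays within layer1
    (("b", "layer1"), ("c", "layer2"), 0.5),  # goes from layer1 to layer2
    (("c", "layer2"), ("d", "layer2"), 2.0),  # stays within layer2
]


def layer_pair(edge):
    """Return the directed (source_layer, target_layer) pair of an edge."""
    (_, src_layer), (_, dst_layer), _ = edge
    return (src_layer, dst_layer)


# Count how many edges connect each directed pair of layers.
pair_counts = Counter(layer_pair(e) for e in edges)

# Edges whose endpoints lie in different layers are the ones on which
# a random walk would move from one layer to another.
switching_edges = [e for e in edges if layer_pair(e)[0] != layer_pair(e)[1]]
```

Here `pair_counts` groups edges by directed layer pair, which is the granularity at which the switching parameters described under Script Arguments are specified.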

## Installation
Install the package from PyPI by running the following command:
```
$ pip install henhoe2vec
```

Alternatively, clone this repository by running
```
$ git clone git@github.com:Bertr0/HeNHoE-2vec.git
```

and then install the package by running `pip install .` from the root of the repository.

## Usage
HeNHoE-2vec can be used in two ways: as a package whose modules are imported by other Python projects, or as a Python script run from a clone of this repository. Either way, it computes node embeddings for a multilayer network given as an edge list.

### As a Package
After installing the package using `pip`, its modules may be imported using
```python
import henhoe2vec
```

The individual steps of HeNHoE-2vec are combined in a single `run()` function in the `henhoe2vec.henhoe2vec` module. HeNHoE-2vec can be run from start to finish as follows:
```python
import henhoe2vec as hh2v

hh2v.henhoe2vec.run(input_csv, output_dir)
```

`input_csv` is the path to the multilayer edge list of the network to be embedded (csv file with no index). `output_dir` is the path to the output directory where the embedding files will be saved. The `run()` function also accepts a number of optional parameters for configuring HeNHoE-2vec. A comprehensive overview of these parameters can be found in the code documentation.
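
As a minimal sketch, the call can be wrapped in a small helper that checks the paths first. The helper name `embed_network` is hypothetical and not part of the package; only the two positional arguments documented above are passed to `run()`, and the submodule is imported explicitly so the sketch does not depend on what the package's `__init__` exposes.

```python
from pathlib import Path


def embed_network(input_csv, output_dir):
    """Validate the paths, then call run() with its two documented arguments."""
    input_path = Path(input_csv)
    if not input_path.is_file():
        raise FileNotFoundError(f"Edge list not found: {input_path}")

    out_path = Path(output_dir)
    # Creating the directory up front is a precaution; whether run()
    # creates it on its own is not stated in this README.
    out_path.mkdir(parents=True, exist_ok=True)

    # Imported lazily so the path checks work even if the package is missing.
    from henhoe2vec import henhoe2vec as hh2v_main

    hh2v_main.run(str(input_path), str(out_path))
    return out_path
```

Calling `embed_network("edges.csv", "embeddings_out")` would then fail early with a clear error if the edge list is missing, instead of failing somewhere inside the pipeline.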

### As a Python Script
To run HeNHoE-2vec as a script, clone this repository using
```
$ git clone git@github.com:Bertr0/HeNHoE-2vec.git
```
Then install the requirements listed in `requirements.txt` and run the following command from the root of the repository:
```
$ python3 -m src.henhoe2vec --input <input_path> --output_dir <output_dir_path>
```

This generates node embeddings for the network specified by the multilayer edge list at `<input_path>` and saves the embedding files in `<output_dir_path>`.
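
When the script is driven from another Python program, the invocation above can be assembled as an argument list and passed to `subprocess.run`. The following is a sketch under assumptions: the helper names `build_command` and `run_script` are hypothetical, and only a few of the flags documented under Script Arguments are wired up.

```python
import subprocess
import sys


def build_command(input_path, output_dir, dimensions=None, is_directed=False):
    """Assemble the argument list for the script invocation shown above."""
    cmd = [
        sys.executable, "-m", "src.henhoe2vec",
        "--input", str(input_path),
        "--output_dir", str(output_dir),
    ]
    if dimensions is not None:
        cmd += ["--dimensions", str(dimensions)]
    if is_directed:
        # store_true flag: it is either present or absent and takes no value.
        cmd.append("--is_directed")
    return cmd


def run_script(repo_root, *args, **kwargs):
    """Run the script from the repository root; raises CalledProcessError on failure."""
    return subprocess.run(build_command(*args, **kwargs), cwd=repo_root, check=True)
```

Passing a list rather than a single shell string avoids quoting problems when paths contain spaces.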

Run `python3 -m src.henhoe2vec --help` from the root of the repository for an overview of all arguments accepted by the script. The following table lists them as well:

#### Script Arguments
| Argument | Type | Description | Default Value |
| -------- | ---- | ----------- | ------------- |
| `--input` | str | Path to the multilayer edge list of the network to be embedded (csv file with no index). | - |
| `--sep` | str | Delimiter of the input csv edge list. | "\t" |
| `--header` | store_true | Pass this argument if the input csv edge list has a header. | - |
| `--output_name` | str | Name of the output .csv file (without suffix). | "embeddings" |
| `--is_directed` | store_true | Pass this argument if the network is directed. | - |
| `--edges_are_distances` | store_true | Pass this argument if edge weights indicate the distance between nodes (as opposed to weight/similarity). | - |
| `--output_dir` | str | Path of the output directory where the embedding files will be saved. | - |
| `--dimensions` | int | The dimensionality of the embeddings. | 128 |
| `--walk_length` | int | Length of each random walk. | 20 |
| `--num_walks` | int | Number of random walks to simulate for each node. | 10 |
| `--p` | float | Return parameter `p` from the node2vec algorithm. | 1.0 |
| `--q` | float | In-out parameter `q` from the node2vec algorithm. | 0.5 |
| `--s` | float | Default switching parameter for layer pairs which are not specified in the `--s_dict` argument. | 1.0 |
| `--s_dict` | list | Switching parameters for specific layer pairs in a dict-like manner. Pass the names of each layer pair followed by its switching parameter, separated by whitespace. E.g., if the switching parameter from `layer1` to `layer2` is `0.5` and the switching parameter from `layer2` to `layer1` is `0.7`, you would pass `layer1 layer2 0.5 layer2 layer1 0.7`. Note that layer pairs are directed. For all layer pairs which are not specified here, the default parameter `--s` is used. | empty list |
| `--window_size` | int | Context size for the word2vec optimization. | 10 |
| `--epochs` | int | Number of epochs in SGD. | 1 |
| `--workers` | int | Number of parallel workers (threads). | 8 |
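
The flat `--s_dict` format can be read as a sequence of `source_layer target_layer value` triples. The sketch below shows one way to turn such a token list into a lookup table with a fallback to the `--s` default. It illustrates the format only; the function names are hypothetical and this is not necessarily how the package parses the argument internally.

```python
def parse_s_dict(tokens):
    """Turn ['layer1', 'layer2', '0.5', ...] into {('layer1', 'layer2'): 0.5, ...}."""
    if len(tokens) % 3 != 0:
        raise ValueError("expected triples of: source_layer target_layer value")
    s_dict = {}
    for i in range(0, len(tokens), 3):
        src, dst, value = tokens[i:i + 3]
        # The key is a directed layer pair, so (a, b) and (b, a) are distinct.
        s_dict[(src, dst)] = float(value)
    return s_dict


def switching_parameter(s_dict, src_layer, dst_layer, default_s=1.0):
    """Look up a directed layer pair, falling back to the default given by --s."""
    return s_dict.get((src_layer, dst_layer), default_s)


# The example from the table above.
s_dict = parse_s_dict("layer1 layer2 0.5 layer2 layer1 0.7".split())
```

With this table, `switching_parameter(s_dict, "layer1", "layer2")` yields the pair-specific value, while any pair that was not listed falls back to `default_s`.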

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "henhoe2vec",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "embeddings,graph embeddings,network embeddings,node embeddings,multilayer graph embeddings,multilayer network embeddings,node2vec,HeNHoE-2vec,Het-node2vec,multilayer networks",
    "author": "",
    "author_email": "Robert Giesler <robert.giesler@rwth-aachen.de>",
    "download_url": "https://files.pythonhosted.org/packages/00/1d/72206c78730d3ac77ccbbffe9f43197fdf020dc3b344646bd7f6844134b4/henhoe2vec-1.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 Robert Giesler  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Implementation of the HeNHoE-2vec algorithm by Valentini et al. (2021).",
    "version": "1.0.4",
    "project_urls": {
        "Repository": "https://github.com/RobertGiesler/HeNHoE-2vec"
    },
    "split_keywords": [
        "embeddings",
        "graph embeddings",
        "network embeddings",
        "node embeddings",
        "multilayer graph embeddings",
        "multilayer network embeddings",
        "node2vec",
        "henhoe-2vec",
        "het-node2vec",
        "multilayer networks"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "55ddd3b8e96c05c36357807c58fd2f2b727e097a9f65ac4c1aba6ea969bd6524",
                "md5": "ec1dd28f9a4b86615ebf9ca41def1ca2",
                "sha256": "2cd46441216617a96b9fa00036f2d844ed94078e85b7964db239f8e0732c0945"
            },
            "downloads": -1,
            "filename": "henhoe2vec-1.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ec1dd28f9a4b86615ebf9ca41def1ca2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 15207,
            "upload_time": "2023-08-22T15:26:07",
            "upload_time_iso_8601": "2023-08-22T15:26:07.573704Z",
            "url": "https://files.pythonhosted.org/packages/55/dd/d3b8e96c05c36357807c58fd2f2b727e097a9f65ac4c1aba6ea969bd6524/henhoe2vec-1.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "001d72206c78730d3ac77ccbbffe9f43197fdf020dc3b344646bd7f6844134b4",
                "md5": "e3a8e465f747d9a72aff94b35f7e3a8c",
                "sha256": "60f4b3aee8bd7b82e63463378e610826dee312ac67f28885c6604f4494fe8ada"
            },
            "downloads": -1,
            "filename": "henhoe2vec-1.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "e3a8e465f747d9a72aff94b35f7e3a8c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 17167,
            "upload_time": "2023-08-22T15:26:09",
            "upload_time_iso_8601": "2023-08-22T15:26:09.461585Z",
            "url": "https://files.pythonhosted.org/packages/00/1d/72206c78730d3ac77ccbbffe9f43197fdf020dc3b344646bd7f6844134b4/henhoe2vec-1.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-08-22 15:26:09",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "RobertGiesler",
    "github_project": "HeNHoE-2vec",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "gensim",
            "specs": [
                [
                    "==",
                    "4.3.1"
                ]
            ]
        },
        {
            "name": "pre-commit",
            "specs": [
                [
                    "==",
                    "3.3.2"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    "==",
                    "7.3.1"
                ]
            ]
        },
        {
            "name": "networkx",
            "specs": [
                [
                    "==",
                    "3.1"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "2.0.2"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "1.24.3"
                ]
            ]
        }
    ],
    "lcname": "henhoe2vec"
}
        