| Field | Value |
|---|---|
| Name | NEExT |
| Version | 0.2.10 |
| home_page | None |
| Summary | Network Embedding Experimentation Toolkit - A powerful framework for graph analysis, embedding computation, and machine learning on graph-structured data |
| upload_time | 2025-07-11 02:45:36 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | <3.13,>=3.9 |
| license | MIT |
| keywords | embedding, graph, graph-ml, machine-learning, network, network-analysis |

# NEExT: Network Embedding Experimentation Toolkit
NEExT is a powerful Python framework for graph analysis, embedding computation, and machine learning on graph-structured data. It provides a unified interface for working with different graph backends (NetworkX and iGraph), computing node features, generating graph embeddings, and training machine learning models.
## 📚 Documentation
Detailed documentation is available in the `docs` directory. Build it locally or visit the online documentation at [NEExT Documentation](https://neext.readthedocs.io/en/latest/).
## 🌟 Features
- **Flexible Graph Handling**
  - Support for both NetworkX and iGraph backends
  - Automatic graph reindexing and largest component filtering
  - Node sampling capabilities for large graphs
  - Rich attribute support for nodes and edges
- **Comprehensive Node Features**
  - PageRank
  - Degree Centrality
  - Closeness Centrality
  - Betweenness Centrality
  - Eigenvector Centrality
  - Clustering Coefficient
  - Local Efficiency
  - LSME (Local Structural Motif Embeddings)
- **Graph Embeddings**
  - Approximate Wasserstein
  - Exact Wasserstein
  - Sinkhorn Vectorizer
  - Customizable embedding dimensions
- **Machine Learning Integration**
  - Classification and regression support
  - Dataset balancing options
  - Cross-validation with customizable splits
  - Feature importance analysis
### Custom Node Feature Functions
NEExT lets you define your own node feature functions and compute them alongside the built-in ones, so you can experiment with graph metrics that NEExT does not ship.
**Defining a Custom Feature Function:**
Your custom feature function must adhere to the following structure:
1. **Input**: It must accept a single argument, a `graph` object. This object exposes the graph's structure (nodes, edges) and properties, e.g. `graph.nodes`, `graph.graph_id`, and `graph.G` (the underlying NetworkX or iGraph object).
2. **Output**: It must return a `pandas.DataFrame` with the following columns, in this order:
   * `"node_id"`: Identifiers of the nodes for which features are computed.
   * `"graph_id"`: The identifier of the graph to which these nodes belong.
   * One or more feature columns holding the computed values. If the feature has multiple components or is expanded over hops, name the columns `your_feature_name_0`, `your_feature_name_1`, etc.; a single column named `your_feature_name` is also acceptable.
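
The column contract above can be checked before a function is handed to NEExT. The following is a minimal sketch, not part of NEExT: `check_feature_columns` is a hypothetical helper that only inspects the ordered column names, so it works on `list(df.columns)` without needing pandas itself.

```python
def check_feature_columns(columns):
    """Return a list of problems with a custom feature table's column layout.

    `columns` is the ordered list of column names, e.g. list(df.columns).
    An empty result means the layout matches the contract described above.
    """
    columns = list(columns)
    problems = []
    # The two identifier columns must come first, in this order.
    if columns[:2] != ["node_id", "graph_id"]:
        problems.append("first two columns must be 'node_id' and 'graph_id'")
    # Anything after the identifiers counts as a feature column.
    if len(columns) < 3:
        problems.append("at least one feature column is required")
    return problems


print(check_feature_columns(["node_id", "graph_id", "degree_sq_0"]))
print(check_feature_columns(["graph_id", "node_id"]))
```

Calling it on the frame your function returns, just before the `return` statement, catches ordering mistakes early.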
**Example:**
Here's how you can define a simple custom feature function and use it:
```python
import networkx as nx
import pandas as pd

# 1. Define your custom feature function
# This function must be defined at the top level of your script/module
# if you plan to use multiprocessing (n_jobs != 1).
def my_node_degree_squared(graph):
    nodes = list(graph.nodes)  # or range(graph.G.vcount()) for igraph if nodes are 0-indexed
    graph_id = graph.graph_id

    if hasattr(graph.G, 'degree'):  # Handles both NetworkX and iGraph
        if isinstance(graph.G, nx.Graph):  # NetworkX
            degrees = [graph.G.degree(n) for n in nodes]
        else:  # iGraph
            degrees = graph.G.degree(nodes)
    else:
        raise TypeError("Graph object does not have a degree method.")

    degree_squared_values = [d**2 for d in degrees]

    df = pd.DataFrame({
        'node_id': nodes,
        'graph_id': graph_id,
        'degree_sq_0': degree_squared_values
    })
    # Ensure the correct column order
    return df[['node_id', 'graph_id', 'degree_sq_0']]

# 2. Prepare the list of custom feature methods
my_feature_methods = [
    {"feature_name": "my_degree_squared", "feature_function": my_node_degree_squared}
]

# 3. Pass it to compute_node_features
# Initialize NEExT and load your graph_collection as shown in the Quick Start
# nxt = NEExT()
# graph_collection = nxt.read_from_csv(...)
features = nxt.compute_node_features(
    graph_collection=graph_collection,
    feature_list=["page_rank", "my_degree_squared"],  # Include your custom feature name
    feature_vector_length=3,  # Applies to built-in features that use it
    my_feature_methods=my_feature_methods
)
print(features.features_df.head())
```
When you include `"my_degree_squared"` in the `feature_list` and provide `my_feature_methods`, NEExT will automatically register and compute your custom function. If `"all"` is in `feature_list`, your custom registered function will also be included in the computation.
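
To make the registration behaviour concrete, here is a minimal sketch of how a name-to-function lookup with an `"all"` shortcut could work. It is an illustration under assumptions, not NEExT's implementation: `resolve_features`, `BUILT_IN`, and the two stand-in built-ins are hypothetical names.

```python
# Hypothetical stand-ins for built-in feature functions (not NEExT code).
def page_rank(graph):
    return f"page_rank({graph})"


def degree_centrality(graph):
    return f"degree_centrality({graph})"


BUILT_IN = {"page_rank": page_rank, "degree_centrality": degree_centrality}


def resolve_features(feature_list, my_feature_methods=None):
    """Map requested feature names to callables, built-in or custom."""
    registry = dict(BUILT_IN)
    # Custom entries use the same dict shape as my_feature_methods above.
    for entry in my_feature_methods or []:
        registry[entry["feature_name"]] = entry["feature_function"]
    # "all" expands to every registered name, custom ones included.
    names = list(registry) if "all" in feature_list else list(feature_list)
    unknown = [name for name in names if name not in registry]
    if unknown:
        raise ValueError(f"unknown feature(s): {unknown}")
    return {name: registry[name] for name in names}


def my_node_degree_squared(graph):
    return f"degree_squared({graph})"


custom = [{"feature_name": "my_degree_squared",
           "feature_function": my_node_degree_squared}]
print(list(resolve_features(["all"], custom)))
```

In this sketch an unregistered name fails fast with a `ValueError`; how NEExT itself reports an unknown feature name is not stated here.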
## 📦 Installation
### Basic Installation
```bash
pip install NEExT
```
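
The package metadata declares `requires_python` as `<3.13,>=3.9`, so installation is only expected to work on Python 3.9 through 3.12. A small sketch for checking an interpreter against that range before installing (`is_supported` is a hypothetical helper, not part of NEExT):

```python
import sys

# Bounds taken from the package metadata: requires_python "<3.13,>=3.9".
MIN_VERSION = (3, 9)
MAX_EXCLUSIVE = (3, 13)


def is_supported(version=None):
    """Return True if a (major, minor) version falls in the declared range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor < MAX_EXCLUSIVE


# Check the interpreter running this script.
print(is_supported())
```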
### Development Installation
```bash
# Clone the repository
git clone https://github.com/ashdehghan/NEExT.git
cd NEExT
# Install with development dependencies
pip install -e ".[dev]"
```
### Additional Components
```bash
# For running tests
pip install -e ".[test]"
# For building documentation
pip install -e ".[docs]"
# For running experiments
pip install -e ".[experiments]"
# Install all components
pip install -e ".[dev,test,docs,experiments]"
```
## 🚀 Quick Start
### Basic Usage
```python
from NEExT import NEExT
# Initialize the framework
nxt = NEExT()
nxt.set_log_level("INFO")
# Load graph data
graph_collection = nxt.read_from_csv(
    edges_path="edges.csv",
    node_graph_mapping_path="node_graph_mapping.csv",
    graph_label_path="graph_labels.csv",
    reindex_nodes=True,
    filter_largest_component=True,
    graph_type="igraph"
)
# Compute node features
features = nxt.compute_node_features(
    graph_collection=graph_collection,
    feature_list=["all"],
    feature_vector_length=3
)
# Compute graph embeddings
embeddings = nxt.compute_graph_embeddings(
    graph_collection=graph_collection,
    features=features,
    embedding_algorithm="approx_wasserstein",
    embedding_dimension=3
)
# Train a classifier
model_results = nxt.train_ml_model(
    graph_collection=graph_collection,
    embeddings=embeddings,
    model_type="classifier",
    sample_size=50
)
```
### Working with Large Graphs
NEExT supports node sampling for handling large graphs:
```python
# Load graphs with 70% of nodes
graph_collection = nxt.read_from_csv(
    edges_path="edges.csv",
    node_graph_mapping_path="node_graph_mapping.csv",
    node_sample_rate=0.7  # Use 70% of nodes
)
```
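
How NEExT samples internally (per graph or globally, and what happens to edges that lose an endpoint) is not described here. As a rough mental model only, the sketch below keeps a random fraction of the nodes and the edges whose endpoints both survive; `sample_nodes` and its parameters are hypothetical.

```python
import random


def sample_nodes(nodes, edges, rate, seed=None):
    """Keep a random fraction of nodes and the edges between kept nodes."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    rng = random.Random(seed)  # seeded for reproducible samples
    keep_count = min(len(nodes), max(1, round(len(nodes) * rate)))
    kept = set(rng.sample(sorted(nodes), keep_count))
    # Assumption: an edge survives only if both endpoints were kept.
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, kept_edges


nodes = list(range(10))
edges = [(i, (i + 1) % 10) for i in range(10)]  # a 10-node ring
kept, kept_edges = sample_nodes(nodes, edges, rate=0.7, seed=0)
print(len(kept), len(kept_edges))
```

Under this model, sampling can disconnect a graph, which may be worth keeping in mind when combining a sample rate with `filter_largest_component=True`.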
### Feature Importance Analysis
```python
# Compute feature importance
importance_df = nxt.compute_feature_importance(
    graph_collection=graph_collection,
    features=features,
    feature_importance_algorithm="supervised_fast",
    embedding_algorithm="approx_wasserstein"
)
```
## 📊 Experiments
NEExT includes several pre-built experiments in the `examples/experiments` directory:
### Node Sampling Experiment
Investigates the effect of node sampling on classifier accuracy:
```bash
cd examples/experiments
python node_sampling_experiments.py
```
## 📝 Input File Formats
### edges.csv
```csv
src_node_id,dest_node_id
0,1
1,2
...
```
### node_graph_mapping.csv
```csv
node_id,graph_id
0,1
1,1
2,2
...
```
### graph_labels.csv
```csv
graph_id,graph_label
1,0
2,1
...
```
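
For a quick trial, the three files can be generated with the standard library. This is a minimal sketch using the column headers shown above; the toy data (a triangle and a three-node path) and the helper name `write_toy_dataset` are made up for illustration, and it assumes node IDs are unique across graphs, as in the samples.

```python
import csv
import tempfile
from pathlib import Path


def write_toy_dataset(directory):
    """Write two tiny graphs (a triangle and a path) in the three-file layout."""
    directory = Path(directory)
    tables = {
        "edges.csv": (["src_node_id", "dest_node_id"],
                      [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5)]),
        "node_graph_mapping.csv": (["node_id", "graph_id"],
                                   [(0, 1), (1, 1), (2, 1), (3, 2), (4, 2), (5, 2)]),
        "graph_labels.csv": (["graph_id", "graph_label"], [(1, 0), (2, 1)]),
    }
    for filename, (header, rows) in tables.items():
        # newline="" lets the csv module control line endings.
        with open(directory / filename, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(rows)
    return [directory / name for name in tables]


with tempfile.TemporaryDirectory() as tmp:
    for path in write_toy_dataset(tmp):
        print(path.name, path.read_text().splitlines()[0])
```

Pointing `write_toy_dataset` at a persistent directory instead of a temporary one gives you files that can be passed to `read_from_csv` as in the Quick Start.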
## 🛠️ Development
### Running Tests
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=NEExT
# Run specific test file
pytest tests/test_node_sampling.py
```
### Building Documentation
```bash
cd docs
make html
```
### Code Style
The project uses several tools for code quality:
```bash
# Format code
black .
# Sort imports
isort .
# Check style
flake8 .
# Type checking
mypy .
```
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests
5. Submit a pull request
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 👥 Authors
- Ash Dehghan - [ash.dehghan@gmail.com](mailto:ash.dehghan@gmail.com)
## 🙏 Acknowledgments
- NetworkX team for the graph algorithms
- iGraph team for the efficient graph operations
- Scikit-learn team for machine learning components
## 📧 Contact
For questions and support:
- Email: ash@anomalypoint.com
- GitHub Issues: [NEExT Issues](https://github.com/ashdehghan/NEExT/issues)
## 🔄 Version History
- 0.1.0
  - Initial release
  - Basic graph operations
  - Node feature computation
  - Graph embeddings
  - Machine learning integration
Raw data
{
"_id": null,
"home_page": null,
"name": "NEExT",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.9",
"maintainer_email": "Ash Dehghan <ash.dehghan@gmail.com>",
"keywords": "embedding, graph, graph-ml, machine-learning, network, network-analysis",
"author": null,
"author_email": "Ash Dehghan <ash.dehghan@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/52/b2/cf37d6a388ae8e447e4da15f13163ec825e2edc514bdadf5727bd93b0bc6/neext-0.2.10.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Network Embedding Experimentation Toolkit - A powerful framework for graph analysis, embedding computation, and machine learning on graph-structured data",
"version": "0.2.10",
"project_urls": {
"Documentation": "https://neext.readthedocs.io",
"Homepage": "https://github.com/ashdehghan/NEExT",
"Issues": "https://github.com/ashdehghan/NEExT/issues",
"Repository": "https://github.com/ashdehghan/NEExT"
},
"split_keywords": [
"embedding",
" graph",
" graph-ml",
" machine-learning",
" network",
" network-analysis"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "7f0b8dad70732bb1beb4e028d5126b29e60587d15c99b8e8b1d8fd3e547c3de8",
"md5": "2c4561b1ca24fbc01a6daef3879f71c9",
"sha256": "1af50636c6303776b06c50f213fbaf08db6286292f73b8613f880febf197b7e5"
},
"downloads": -1,
"filename": "neext-0.2.10-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2c4561b1ca24fbc01a6daef3879f71c9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.9",
"size": 52299,
"upload_time": "2025-07-11T02:45:34",
"upload_time_iso_8601": "2025-07-11T02:45:34.057087Z",
"url": "https://files.pythonhosted.org/packages/7f/0b/8dad70732bb1beb4e028d5126b29e60587d15c99b8e8b1d8fd3e547c3de8/neext-0.2.10-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "52b2cf37d6a388ae8e447e4da15f13163ec825e2edc514bdadf5727bd93b0bc6",
"md5": "77cb802f724040ab03b8d8e14b385235",
"sha256": "a61eaa6da61262215d3d914d14c2022f1b3fd762289f2bd0c287c3c4fe0766dc"
},
"downloads": -1,
"filename": "neext-0.2.10.tar.gz",
"has_sig": false,
"md5_digest": "77cb802f724040ab03b8d8e14b385235",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.9",
"size": 1924214,
"upload_time": "2025-07-11T02:45:36",
"upload_time_iso_8601": "2025-07-11T02:45:36.134314Z",
"url": "https://files.pythonhosted.org/packages/52/b2/cf37d6a388ae8e447e4da15f13163ec825e2edc514bdadf5727bd93b0bc6/neext-0.2.10.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-11 02:45:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ashdehghan",
"github_project": "NEExT",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "neext"
}
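
The `sha256` digests listed in the raw data can be used to check a downloaded archive before installing it. A minimal sketch with the standard library follows; `sha256_of` and `matches_expected` are hypothetical helper names, and the expected value is copied from the `sdist` entry above.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest copied from the "sdist" entry in the raw data.
EXPECTED_SDIST_SHA256 = "a61eaa6da61262215d3d914d14c2022f1b3fd762289f2bd0c287c3c4fe0766dc"


def matches_expected(path, expected=EXPECTED_SDIST_SHA256):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected.lower()
```

After downloading `neext-0.2.10.tar.gz`, `matches_expected("neext-0.2.10.tar.gz")` should return `True` if the file is intact.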