| Field | Value |
|---|---|
| Name | arangoimport |
| Version | 0.1.11 |
| Summary | A high-performance tool for importing Neo4j JSONL graph data exports into ArangoDB |
| Author | Trent Leslie |
| Maintainer | None |
| Upload time | 2025-02-04 04:23:45 |
| Requires Python | >=3.11 |
| Home page | None |
| Docs URL | None |
| License | None |
| Requirements | No requirements were recorded. |
# ArangoImport
A high-performance tool for importing Neo4j JSONL graph data exports into ArangoDB.
## Features
- Import Neo4j database exports into ArangoDB
- Efficient parallel processing of large JSONL files
- Support for both local and Docker ArangoDB instances
- Dynamic memory management and batch sizing
- Connection pooling for optimal performance
- Progress tracking and detailed logging
- Available as both CLI tool and Python package
## Installation
```bash
pip install arangoimport
```
## Quick Start
1. Export your Neo4j database to JSONL:
```cypher
CALL apoc.export.json.all("path/to/export.jsonl", {useTypes: true})
```
2. Import into ArangoDB using either method:
### A. Command Line Interface (CLI)
After installation, the `arangoimport` command is available in your terminal:
```bash
# Show help and available options
arangoimport --help
# Import data with default settings (will prompt for password)
arangoimport import-data /path/to/neo4j_export.jsonl
# Import with custom settings
arangoimport import-data /path/to/neo4j_export.jsonl \
--db-name my_graph \
--host arangodb.example.com \
--port 8530 \
--username graph_user
```
### B. Python API
```python
from arangoimport.connection import ArangoConfig
from arangoimport.importer import parallel_load_data
# Configure database connection
db_config = ArangoConfig(
    host="localhost",
    port=8529,
    username="root",
    password="your_password",  # Or use ARANGO_PASSWORD env var
    db_name="db_name",
)
# Import the data
nodes, edges = parallel_load_data(
    "path/to/neo4j_export.jsonl",
    dict(db_config),
    num_processes=None,  # None means use (CPU count - 1)
)
print(f"Successfully imported {nodes:,} nodes and {edges:,} edges!")
```
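Before running a long import, it can help to check how many nodes and relationships the export actually contains, so the totals printed after the import can be compared against them. The following is a minimal sketch using only the standard library. It assumes each line of the APOC export is a JSON object with a `"type"` field of `"node"` or `"relationship"`. The helper name `count_record_types` and the sample records are hypothetical and not part of `arangoimport`.

```python
import json
from collections import Counter
from typing import Iterable


def count_record_types(lines: Iterable[str]) -> Counter:
    """Count JSONL records by their "type" field, skipping blank lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: APOC writes "type": "node" or "type": "relationship".
        counts[record.get("type", "unknown")] += 1
    return counts


# Hypothetical sample records standing in for a real export file.
sample = [
    '{"type": "node", "id": "0", "labels": ["Person"], "properties": {"name": "Ada"}}',
    '{"type": "node", "id": "1", "labels": ["Person"], "properties": {"name": "Bob"}}',
    '{"type": "relationship", "id": "0", "label": "KNOWS", "start": {"id": "0"}, "end": {"id": "1"}}',
]
counts = count_record_types(sample)
print(f"{counts['node']:,} nodes and {counts['relationship']:,} relationships")
```

For a real export, pass an open file handle instead of the sample list, for example `count_record_types(open("path/to/neo4j_export.jsonl", encoding="utf-8"))`. The file is read line by line, so large exports are not loaded into memory at once.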
## Environment Variables
- `ARANGO_PASSWORD`: Database password (avoid hardcoding in scripts)
- `ARANGO_USER`: Username (default: root)
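When calling the Python API from a script, the same variables can be read explicitly so that no password appears in source code. This is a rough sketch, not the package's own lookup logic: the helper `connection_settings_from_env` is hypothetical, and the dictionary keys simply mirror the `ArangoConfig` fields shown in the Quick Start.

```python
import os
from typing import Mapping


def connection_settings_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Build connection settings from environment variables."""
    password = env.get("ARANGO_PASSWORD")
    if not password:
        # Fail early rather than falling back to a hardcoded password.
        raise RuntimeError("Set ARANGO_PASSWORD before running the import")
    return {
        "host": "localhost",
        "port": 8529,
        "username": env.get("ARANGO_USER", "root"),  # documented default: root
        "password": password,
        "db_name": "db_name",
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = connection_settings_from_env({"ARANGO_PASSWORD": "example-only"})
print(settings["username"])
```

Passing the mapping as a parameter keeps the helper easy to test. In a real script, call it with no arguments to read from `os.environ`.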
## CLI Options
### General Options
- `--file <string>`: The file to import ("-" for stdin)
- `--type <string>`: Input format (auto/csv/json/jsonl/tsv, default: auto)
- `--collection <string>`: Target collection name
- `--create-collection <boolean>`: Create collection if missing (default: false)
- `--create-collection-type <string>`: Collection type if created (document/edge, default: document)
- `--create-database <boolean>`: Create database if missing (default: false)
- `--threads <uint32>`: Number of parallel import threads (default: 32)
- `--batch-size <uint64>`: Data batch size in bytes (default: 8MB)
- `--progress <boolean>`: Show progress (default: true)
### Server Connection
- `--server.database <string>`: Target database (default: "_system")
- `--server.endpoint <string>`: Server endpoint (default: "http+tcp://127.0.0.1:8529")
- `--server.username <string>`: Username (default: "root")
- `--server.password <string>`: Password (prompted if not provided)
- `--server.authentication <boolean>`: Require authentication (default: true)
### Performance Options
- `--auto-rate-limit <boolean>`: Auto-adjust loading rate (default: false)
- `--compress-transfer <boolean>`: Compress data transfer (default: false)
- `--max-errors <uint64>`: Maximum errors before stopping (default: 20)
- `--skip-validation <boolean>`: Skip schema validation (default: false)
For a complete list of options, run:
```bash
arangoimport --help
```
## Docker Support
When using Docker, ensure your ArangoDB container is running:
```bash
docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD=yourpassword arangodb:latest
```
Then import using either the CLI or Python API, pointing to the exposed port.
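A container can take a few seconds to start listening after `docker run` returns. One way to avoid starting an import too early is to check that the published port accepts TCP connections first. The sketch below uses only the standard library. `is_port_open` is a hypothetical helper, and an open port only shows that something is listening, not that ArangoDB has finished initializing.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8529, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Port 8529 is accepting connections")
    else:
        print("Nothing is listening on port 8529 yet")
```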
## Performance Tuning
The importer automatically optimizes for:
- Available system memory
- CPU cores (uses CPU count - 1 by default)
- Network conditions
You can fine-tune performance with:
- `--threads`: Control parallel threads
- `--batch-size`: Adjust batch size
- `--auto-rate-limit`: Enable automatic rate limiting
- `--compress-transfer`: Enable data compression
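To make the two main knobs concrete, the sketch below illustrates the "CPU count - 1" default and size-based batching in general terms. It is an illustration under stated assumptions, not the package's actual implementation: both function names are hypothetical, and the batching rule (start a new batch when adding a line would exceed the byte limit) is one simple choice among several.

```python
import os
from typing import Iterable, Iterator


def default_process_count() -> int:
    """Leave one core free, but always use at least one process."""
    cpus = os.cpu_count() or 1
    return max(1, cpus - 1)


def batch_by_bytes(lines: Iterable[str], max_bytes: int) -> Iterator[list[str]]:
    """Group lines into batches whose UTF-8 size stays within max_bytes.

    A single line larger than max_bytes is emitted as its own batch.
    """
    batch: list[str] = []
    size = 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        if batch and size + line_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(line)
        size += line_size
    if batch:
        yield batch


batches = list(batch_by_bytes(["aaaa", "bbbb", "cc", "dddddddd"], max_bytes=8))
print(batches)  # [['aaaa', 'bbbb'], ['cc'], ['dddddddd']]
```

Larger batches generally mean fewer round trips to the server at the cost of more memory per worker, which is the trade-off to keep in mind when adjusting `--batch-size` and `--threads` together.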
## Contributing
Contributions are welcome. Feel free to submit a pull request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "arangoimport",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": null,
"author": "Trent Leslie",
"author_email": "trent.leslie@phenomehealth.com",
"download_url": "https://files.pythonhosted.org/packages/64/ad/b9d4ec9b4717083b12bc3cdeb17255bbe7453237cadab8f3d51c075e48ed/arangoimport-0.1.11.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A high-performance tool for importing Neo4j JSONL graph data exports into ArangoDB",
"version": "0.1.11",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8a1502cb3b20c042d5ec2e78486683480704ca3e5e4c6b95d8341c67d1bfae54",
"md5": "4c8514b71b74a79e2f191f96b817f05a",
"sha256": "162a3ca56dbc9318259ac5cd5673da526ea47d77ea439de35d51fb5643c4853d"
},
"downloads": -1,
"filename": "arangoimport-0.1.11-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4c8514b71b74a79e2f191f96b817f05a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 18197,
"upload_time": "2025-02-04T04:23:42",
"upload_time_iso_8601": "2025-02-04T04:23:42.902406Z",
"url": "https://files.pythonhosted.org/packages/8a/15/02cb3b20c042d5ec2e78486683480704ca3e5e4c6b95d8341c67d1bfae54/arangoimport-0.1.11-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "64adb9d4ec9b4717083b12bc3cdeb17255bbe7453237cadab8f3d51c075e48ed",
"md5": "4809f36b8f91b33935e82310301dceca",
"sha256": "0c94addde473b27b6423cabb41b1239fd4981dc799d3eeb2c324e1c10b1a5394"
},
"downloads": -1,
"filename": "arangoimport-0.1.11.tar.gz",
"has_sig": false,
"md5_digest": "4809f36b8f91b33935e82310301dceca",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 17483,
"upload_time": "2025-02-04T04:23:45",
"upload_time_iso_8601": "2025-02-04T04:23:45.233004Z",
"url": "https://files.pythonhosted.org/packages/64/ad/b9d4ec9b4717083b12bc3cdeb17255bbe7453237cadab8f3d51c075e48ed/arangoimport-0.1.11.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-04 04:23:45",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "arangoimport"
}