# IRIS Vector Graph
A knowledge graph system built on InterSystems IRIS that combines graph traversal, vector similarity search, and full-text search in a single database.
> **NEW**: [Interactive Demo Server](src/iris_demo_server/) showcasing fraud detection + biomedical capabilities
**Proven at Scale Across Industries**:
- **Financial Services**: Real-time fraud detection (130M+ transactions), bitemporal audit trails, <10ms queries
- **Biomedical Research**: Protein interaction networks (100K+ proteins), drug discovery, <50ms multi-hop queries
Same IRIS platform. Different domains. Powerful results.
---
## Table of Contents
- [Quick Start](#quick-start)
  - [Option A: Fraud Detection (Financial Services)](#option-a-fraud-detection-financial-services)
  - [Option B: Biomedical Graph (Life Sciences)](#option-b-biomedical-graph-life-sciences)
- [Use Cases by Industry](#use-cases-by-industry)
- [Architecture](#architecture)
- [Key Features](#key-features)
- [Performance](#performance)
- [Documentation](#documentation)
---
## Quick Start
**Two Deployment Modes**:
1. **External** (DEFAULT - simpler): Python app connects to IRIS via `iris.connect()`
2. **Embedded** (ADVANCED - optional): Python app runs INSIDE IRIS container
### Option A: Fraud Detection (Financial Services)
#### External Mode (Default - Simpler)
```bash
# 1. Start IRIS database
docker-compose up -d
# 2. Install Python dependencies
pip install 'iris-vector-graph[dev]'
# 3. Load fraud schema
docker exec -i iris /usr/irissys/bin/irissession IRIS -U USER < sql/fraud/schema.sql
# 4. Start fraud API (external Python)
PYTHONPATH=src python -m iris_fraud_server
# 5. Test fraud scoring API (from a second terminal)
curl -X POST http://localhost:8000/fraud/score \
-H 'Content-Type: application/json' \
-d '{"mode":"MLP","payer":"acct:test","device":"dev:laptop","amount":1000.0}'
```
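
The same request can be issued from Python. This is a minimal sketch using only the standard library; it assumes the endpoint and JSON field names shown in the curl example above, and it makes no assumption about the response schema beyond it being JSON. The helper names `build_score_request` and `score_transaction` are illustrative and not part of the package.

```python
import json
import urllib.request

# Assumption: URL and field names mirror the curl example; adjust the port
# to 8100 when targeting the embedded deployment.
FRAUD_SCORE_URL = "http://localhost:8000/fraud/score"


def build_score_request(payer, device, amount, mode="MLP", url=FRAUD_SCORE_URL):
    """Build the POST request without sending it."""
    body = json.dumps(
        {"mode": mode, "payer": payer, "device": device, "amount": amount}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_transaction(payer, device, amount, timeout=5.0):
    """Send the request to a running fraud API and return the decoded JSON."""
    request = build_score_request(payer, device, amount)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `score_transaction("acct:test", "dev:laptop", 1000.0)` sends the same payload as the curl command.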
#### Embedded Mode (Advanced - Optional)
```bash
# Run FastAPI INSIDE IRIS container (licensed IRIS required)
docker-compose -f docker-compose.fraud-embedded.yml up -d
# Test fraud scoring API (~2 min startup)
curl -X POST http://localhost:8100/fraud/score \
-H 'Content-Type: application/json' \
-d '{"mode":"MLP","payer":"acct:test","device":"dev:laptop","amount":1000.0}'
```
**What you get**:
- FastAPI fraud scoring (external `:8000` or embedded `:8100`)
- Bitemporal data (track when transactions happened vs. when you learned about them)
- Complete audit trails (regulatory compliance: SOX, MiFID II)
- Direct IRIS queries (no middleware)
**Learn more**: [`examples/bitemporal/README.md`](examples/bitemporal/README.md) - Fraud scenarios, chargeback defense, model tracking
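
To make the bitemporal idea concrete, here is a small in-memory sketch of an "as of" lookup. It is not the repository's schema: the `TxVersion` class and its field names are hypothetical, chosen to mirror the valid-time and system-time distinction described above (with `system_to` left empty for the current version).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TxVersion:
    """One recorded version of a transaction (hypothetical field names)."""
    tx_id: str
    amount: float
    valid_from: datetime            # when the event happened in the real world
    system_from: datetime           # when this version was recorded
    system_to: Optional[datetime]   # None means this is the current version


def as_of(versions, tx_id, system_time):
    """Return the version of tx_id that the system held at system_time."""
    for v in versions:
        if (v.tx_id == tx_id
                and v.system_from <= system_time
                and (v.system_to is None or system_time < v.system_to)):
            return v
    return None


history = [
    # Original record, later superseded by a correction (never updated in place).
    TxVersion("tx1", 1000.0, datetime(2025, 1, 1),
              datetime(2025, 1, 1, 12), datetime(2025, 1, 5)),
    TxVersion("tx1", 100.0, datetime(2025, 1, 1),
              datetime(2025, 1, 5), None),
]

before = as_of(history, "tx1", datetime(2025, 1, 3))
after = as_of(history, "tx1", datetime(2025, 1, 6))
print(before.amount, after.amount)  # 1000.0 100.0
```
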
---
### Option B: Biomedical Graph (Life Sciences)
#### External Mode (Default - Simpler)
```bash
# 1. Start IRIS database
docker-compose up -d
# 2. Install dependencies
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync && source .venv/bin/activate
# 3. Load STRING protein database (10K proteins, ~1 minute)
python scripts/performance/string_db_scale_test.py --max-proteins 10000
# 4. Start interactive demo server (external Python)
PYTHONPATH=src python -m iris_demo_server.app
# 5. Open browser (macOS; on Linux use xdg-open)
open http://localhost:8200/bio
```
#### Embedded Mode (Advanced - Optional)
```bash
# Run demo server INSIDE IRIS container (licensed IRIS required)
# Coming soon - currently only external mode supported for biomedical demo
```
**What you get**:
- **Interactive protein search** with vector similarity (EGFR, TP53, etc.)
- **D3.js graph visualization** with click-to-expand nodes showing interaction networks
- **Pathway analysis** between proteins using BFS graph traversal
- **Real STRING DB data** (10K proteins, 37K interactions)
- **<100ms queries** powered by direct IRIS integration (no API middleware)
- **20/20 contract tests passing** - production-ready biomedical demo
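
The BFS pathway analysis mentioned in the list above can be illustrated with a short, self-contained sketch. It runs on a toy in-memory adjacency map rather than on IRIS, and the edges are illustrative only, not STRING DB data; the function name `bfs_path` is an assumption for this example.

```python
from collections import deque


def bfs_path(adjacency, start, goal):
    """Return a shortest path (fewest hops) from start to goal, or None."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start node.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Toy undirected graph (illustrative edges only).
edges = [("EGFR", "GRB2"), ("GRB2", "SOS1"), ("SOS1", "KRAS"), ("EGFR", "TP53")]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, []).append(b)
    adjacency.setdefault(b, []).append(a)

print(bfs_path(adjacency, "EGFR", "KRAS"))  # ['EGFR', 'GRB2', 'SOS1', 'KRAS']
```

Because BFS expands nodes level by level, the first time the goal is reached the path has the minimum number of hops.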
**Learn more**:
- [`docs/biomedical-demo-setup.md`](docs/biomedical-demo-setup.md) - Complete setup guide with scaling options
- [`biomedical/README.md`](biomedical/README.md) - Architecture and development patterns
---
## Use Cases by Industry
### Financial Services (IDFS)
| Use Case | Features | Performance |
|----------|----------|-------------|
| **Real-Time Fraud Detection** | Graph-based scoring, MLP models, device fingerprinting | <10ms scoring, 130M+ transactions |
| **Bitemporal Audit Trails** | Valid time vs. system time, chargeback defense, compliance | <10ms time-travel queries |
| **Late Arrival Detection** | Settlement delay analysis, backdated transaction flagging | Pattern detection across 130M events |
| **Regulatory Compliance** | SOX, GDPR, MiFID II, Basel III reporting | Complete audit trail preservation |
**Files**:
- `examples/bitemporal/` - Fraud scenarios, audit queries, Python API
- `sql/bitemporal/` - Schema (2 tables, 3 views, 8 indexes)
- `src/iris_fraud_server/` - FastAPI fraud scoring server
- `docker-compose.fraud-embedded.yml` - Licensed IRIS + embedded Python
**Quick Links**:
- [Bitemporal Fraud Detection README](examples/bitemporal/README.md)
- [Fraud API Documentation](src/iris_fraud_server/README.md)
---
### Biomedical Research
| Use Case | Features | Performance |
|----------|----------|-------------|
| **Protein Interaction Networks** | STRING DB integration, pathway analysis, vector similarity | <50ms multi-hop queries (100K+ proteins) |
| **Drug Discovery** | Compound similarity, target identification, graph analytics | <10ms vector search (HNSW) |
| **Literature Mining** | Hybrid search (embeddings + BM25), entity extraction | RRF fusion, sub-second queries |
| **Pathway Analysis** | Multi-hop traversal, PageRank, connected components | NetworkX integration, embedded Python |
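
Reciprocal Rank Fusion (RRF), listed for hybrid search above, combines ranked result lists by summing `1 / (k + rank)` per document. The sketch below shows the standard formula in plain Python; it is not the body of the repository's SQL procedure, and `k = 60` is a commonly used constant assumed here rather than a value taken from this project.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists; returns (doc_id, score) pairs, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. from embedding similarity
bm25_hits = ["doc_b", "doc_d", "doc_a"]     # e.g. from full-text search

fused = rrf_fuse([vector_hits, bm25_hits])
print([doc_id for doc_id, _ in fused])  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that appear near the top of both lists (`doc_b`, `doc_a`) outrank documents found by only one retriever, which is the point of the fusion step.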
**Files**:
- `biomedical/` - Protein queries, pathway examples
- `sql/schema.sql` - Graph schema (nodes, edges, properties, embeddings)
- `iris_vector_graph_core/` - Core Python graph engine
- `docker-compose.acorn.yml` - ACORN-1 with HNSW optimization
**Quick Links**:
- [Biomedical Examples](biomedical/README.md)
- [STRING DB Integration](docs/setup/STRING_DB.md)
---
### Graph Algorithms (TSP Examples)
Two standalone implementations of the **Traveling Salesman Problem** demonstrating graph algorithms on IRIS:
#### Option A: Python + NetworkX (Biomedical)
Find optimal pathways through protein interaction networks:
```bash
# Test with 10 cancer-related proteins
python scripts/algorithms/tsp_demo.py --proteins 10 --compare-methods
```
- **Algorithms**: Greedy (1ms), Christofides (15ms), 2-opt (8ms)
- **Use case**: Optimize the order in which to study protein interactions in cancer pathways
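
As a rough illustration of two of the heuristics named above, here is a pure-Python sketch of greedy (nearest-neighbour) tour construction followed by 2-opt improvement. It works on 2D points with Euclidean distances, which is an assumption made for brevity; it is not the code in `tsp_demo.py`, and Christofides is omitted.

```python
import math


def tour_length(points, tour):
    """Length of the closed tour visiting points in the given index order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def greedy_tour(points, start=0):
    """Nearest-neighbour construction: always visit the closest unvisited point."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: (math.dist(last, points[j]), j))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(points, tour):
    """Reverse segments for as long as a reversal shortens the closed tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(points, candidate) < tour_length(points, best) - 1e-12:
                    best, improved = candidate, True
    return best


# Corners of a unit square, listed in an order that makes the tour cross itself.
points = [(0, 0), (1, 1), (1, 0), (0, 1)]
crossing = [0, 1, 2, 3]
print(round(tour_length(points, crossing), 3))                   # 4.828
print(round(tour_length(points, two_opt(points, crossing)), 3))  # 4.0
print(greedy_tour(points))                                       # [0, 2, 1, 3]
```

Greedy construction is fast but can leave crossings; 2-opt removes them at the cost of repeated passes over the tour.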
#### Option B: ObjectScript (Healthcare Interoperability)
Optimize caregiver routes for home healthcare:
```bash
# Load sample data (8 patients, 26 travel edges)
docker exec -i iris /usr/irissys/bin/irissession IRIS -U USER < sql/caregiver_routing_demo.sql
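# Open an IRIS Terminal session, then run the demo at its prompt (ObjectScript, not bash)
docker exec -it iris /usr/irissys/bin/irissession IRIS -U USER
# USER> Do ^TestCaregiverRouter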
```
- **Performance**: <2ms for 8-patient routes
- **Integration**: Direct Business Process method calls
- **Impact**: 53% travel time reduction (75min → 35min)

**What you get**:
- **Python approach**: NetworkX integration, multiple algorithms, FastAPI endpoint example
- **ObjectScript approach**: Zero dependencies, Interoperability production integration, bitemporal audit
- **Comprehensive docs**: Neo4j comparison, performance benchmarks, real-world use cases
**Files**:
- `scripts/algorithms/tsp_demo.py` - Python demo (works with STRING protein data)
- `iris/src/Graph/CaregiverRouter.cls` - ObjectScript TSP optimizer
- `iris/src/Graph/ScheduleOptimizationProcess.cls` - Business Process integration
- `sql/caregiver_routing_demo.sql` - Sample healthcare data
**Learn more**:
- [`docs/algorithms/TSP_ANALYSIS.md`](docs/algorithms/TSP_ANALYSIS.md) - Deep dive and Neo4j comparison
- [`docs/algorithms/TSP_IMPLEMENTATION_SUMMARY.md`](docs/algorithms/TSP_IMPLEMENTATION_SUMMARY.md) - Overview and benchmarks
- [`docs/examples/CAREGIVER_ROUTING_DEMO.md`](docs/examples/CAREGIVER_ROUTING_DEMO.md) - Step-by-step tutorial
---
## Architecture
**Deployment Options**:
- **External (Default)**: Python app connects to IRIS via `iris.connect()` - simpler setup, easier debugging
- **Embedded (Advanced)**: Python app runs inside IRIS container - maximum performance, requires licensed IRIS
```
External Deployment (DEFAULT)       Embedded Deployment (OPTIONAL)
┌────────────────────────┐          ┌──────────────────────────────┐
│  FastAPI Server        │          │  IRIS Container              │
│  (external Python)     │          │ ┌──────────────────────────┐ │
│                        │          │ │ FastAPI Server           │ │
│  iris.connect()   ─────┼──────────┤►│ (/usr/irissys/bin/       │ │
│  to localhost:1972     │          │ │  irispython)             │ │
└────────────────────────┘          │ └──────────────────────────┘ │
                                    │ ┌──────────────────────────┐ │
                                    │ │ IRIS Database Engine     │ │
                                    │ │ (Bitemporal/Graph/Vector)│ │
                                    │ └──────────────────────────┘ │
                                    └──────────────────────────────┘

  Same Platform: InterSystems IRIS
  Same Features: Vector Search, Graph Traversal, Bitemporal Audit
  Different Domains: Finance vs. Life Sciences
```
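
The external path in the diagram can be sketched as follows. This assumes the InterSystems Python driver is installed and that its DB-API `connect()` accepts the keyword arguments shown; the `IRIS_*` environment variable names and the fallback credentials are hypothetical placeholders, so check them against your driver version and container configuration.

```python
import os


def iris_connection_settings(env=os.environ):
    """Collect connection settings (the IRIS_* variable names are hypothetical)."""
    return {
        "hostname": env.get("IRIS_HOST", "localhost"),
        "port": int(env.get("IRIS_PORT", "1972")),
        "namespace": env.get("IRIS_NAMESPACE", "USER"),
        # Placeholder credentials; replace with the ones your instance uses.
        "username": env.get("IRIS_USERNAME", "_SYSTEM"),
        "password": env.get("IRIS_PASSWORD", "SYS"),
    }


def connect_external(env=os.environ):
    """Open a connection from an external Python process."""
    import iris  # InterSystems Python driver, imported lazily so the helper above stays testable

    return iris.connect(**iris_connection_settings(env))
```

Keeping the settings in a separate function makes it easy to point the same application at a different host or port without touching the code that issues queries.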
**Core Components**:
- **IRIS Globals**: Storage layer, written here in an append-only pattern (well suited to audit trails and graph data)
- **Embedded Python**: Run ML models and graph algorithms in-database
- **SQL Procedures**: `kg_KNN_VEC` (vector search), `kg_RRF_FUSE` (hybrid search)
- **HNSW Indexing**: 100x faster vector similarity (requires IRIS 2025.3+ or ACORN-1)
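
For context on what an HNSW index replaces, the sketch below shows exact ("flat") nearest-neighbour search: every stored vector is compared with the query, so cost grows linearly with the number of vectors. It is a plain-Python illustration with hypothetical names (`flat_knn`, the `p1`..`p3` IDs), not the `kg_KNN_VEC` procedure.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def flat_knn(query, vectors, k=2):
    """Exact top-k search by scanning every stored vector (O(n) per query)."""
    scored = (
        (cosine_similarity(query, vec), node_id)
        for node_id, vec in vectors.items()
    )
    return [node_id for _, node_id in heapq.nlargest(k, scored)]


vectors = {"p1": [1.0, 0.0], "p2": [0.9, 0.1], "p3": [0.0, 1.0]}
print(flat_knn([1.0, 0.0], vectors))  # ['p1', 'p2']
```
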
---
## Key Features
### Cross-Domain Capabilities
| Feature | Financial Services Use | Biomedical Use |
|---------|------------------------|----------------|
| **Embedded Python** | Fraud ML models in-database | Graph analytics (PageRank, etc.) |
| **Temporal Queries** | Bitemporal audit ("what did we know when?") | Time-series biomarker analysis |
| **Graph Traversal** | Fraud ring detection (multi-hop) | Protein interaction pathways |
| **Vector Search** | Transaction similarity | Protein/compound similarity |
| **Partial Indexes** | `WHERE system_to IS NULL` (10x faster) | `WHERE label = 'protein'` |
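
As an example of the graph analytics row above, here is a compact power-iteration PageRank in plain Python. It is a teaching sketch under simple assumptions (every edge target also appears as a key, damping of 0.85, a fixed number of iterations); in practice a library routine such as `networkx.pagerank` would normally be used.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of out-neighbours."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in adjacency.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Toy directed graph: "a" is pointed to by both "c" and "d".
graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # a
```
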
### IRIS-Native Optimizations
- **Globals Storage**: Append-only write pattern (no UPDATE contention)
- **Partial Indexes**: Filter at index level (`WHERE system_to IS NULL`)
- **Temporal Views**: Pre-filter current versions
- **Foreign Key Constraints**: Referential integrity across graph
- **HNSW Vector Index**: 100x faster than flat search (ACORN-1)
---
## Performance
### Financial Services (Fraud Detection)
| Metric | Community IRIS | Licensed IRIS |
|--------|---------------|---------------|
| **Transactions** | 30M | 130M |
| **Database Size** | 5.3GB | 22.1GB |
| **Fraud Scoring** | <10ms | <10ms |
| **Bitemporal Queries** | <10ms (indexed) | <10ms (indexed) |
| **Time-Travel Queries** | <50ms | <50ms |
| **Late Arrival Detection** | Pattern search across 30M | Pattern search across 130M |
### Biomedical (Protein Networks)
| Metric | Standard IRIS | ACORN-1 (HNSW) |
|--------|--------------|----------------|
| **Vector Search** | 5800ms (flat) | 1.7ms (HNSW) |
| **Multi-hop Queries** | <50ms | <50ms |
| **Hybrid Search (RRF)** | <100ms | <20ms |
| **Graph Analytics** | NetworkX integration | Embedded Python |
**Tested At Scale**:
- ✅ 130M fraud transactions (licensed IRIS)
- ✅ 100K+ protein interactions (STRING DB)
- ✅ 768-dimensional embeddings (biomedical models)
---
## Documentation
### Getting Started
- [Fraud Detection Quick Start](examples/bitemporal/README.md)
- [Biomedical Graph Setup](biomedical/README.md)
- [Installation Guide](docs/setup/INSTALLATION.md)
### Architecture & Design
- [System Architecture](docs/architecture/ACTUAL_SCHEMA.md)
- [IRIS-Native Features](docs/architecture/IRIS_NATIVE.md)
- [Performance Benchmarks](docs/performance/)
### API Reference
- [REST API](docs/api/REST_API.md)
- [Python SDK](iris_vector_graph_core/README.md)
- [SQL Procedures](sql/operators.sql)
### Examples
- [Bitemporal Fraud Detection](examples/bitemporal/)
- [Protein Interaction Networks](biomedical/)
- [Migration to NodePK](scripts/migrations/)
---
## Repository Structure
```
sql/
  schema.sql                 # Core graph schema
  bitemporal/                # Fraud detection schema
  fraud/                     # Transaction tables

examples/
  bitemporal/                # Financial services (fraud, audit)

biomedical/                  # Life sciences (proteins, pathways)

iris_vector_graph_core/      # Python graph engine

src/iris_fraud_server/       # FastAPI fraud API

scripts/
  fraud/                     # 130M loader, benchmarks
  migrations/                # NodePK migration

docker/
  Dockerfile.fraud-embedded  # Licensed IRIS + fraud API
  start-fraud-server.sh      # Embedded Python startup
```
---
## License
MIT License - See [LICENSE](LICENSE)
---
## Contributing
We welcome contributions! This repo demonstrates IRIS versatility across:
- **Financial Services**: Fraud detection, bitemporal data, regulatory compliance
- **Biomedical Research**: Protein networks, drug discovery, literature mining
Feel free to add examples from other domains or improve existing implementations.
---
**Production-Ready**: Proven with 130M+ financial transactions and 100K+ biomedical interactions on InterSystems IRIS.
Raw data
{
"_id": null,
"home_page": null,
"name": "iris-vector-graph",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "Thomas Dyar <thomas.dyar@intersystems.com>",
"keywords": "biomedical, bitemporal, fraud-detection, graph-database, hnsw, intersystems-iris, knowledge-graph, protein-interactions, rag, vector-search",
"author": null,
"author_email": "Thomas Dyar <thomas.dyar@intersystems.com>",
"download_url": "https://files.pythonhosted.org/packages/5b/a6/bbfd5608423de7024e2200be741721a6a4685ccc54d92d4e531c5c2f584c/iris_vector_graph-1.0.0.tar.gz",
"platform": null,
"description": "# IRIS Vector Graph\n\nA knowledge graph system built on InterSystems IRIS that combines graph traversal, vector similarity search, and full-text search in a single database.\n\n> **NEW**: [Interactive Demo Server](src/iris_demo_server/) showcasing fraud detection + biomedical capabilities\n\n**Proven at Scale Across Industries**:\n- **Financial Services**: Real-time fraud detection (130M+ transactions), bitemporal audit trails, <10ms queries\n- **Biomedical Research**: Protein interaction networks (100K+ proteins), drug discovery, <50ms multi-hop queries\n\nSame IRIS platform. Different domains. Powerful results.\n\n---\n\n## Table of Contents\n\n- [Quick Start](#quick-start)\n - [Option A: Fraud Detection (Financial Services)](#option-a-fraud-detection-financial-services)\n - [Option B: Biomedical Graph (Life Sciences)](#option-b-biomedical-graph-life-sciences)\n- [Use Cases by Industry](#use-cases-by-industry)\n- [Architecture](#architecture)\n- [Key Features](#key-features)\n- [Performance](#performance)\n- [Documentation](#documentation)\n\n---\n\n## Quick Start\n\n**Two Deployment Modes**:\n1. **External** (DEFAULT - simpler): Python app connects to IRIS via `iris.connect()`\n2. **Embedded** (ADVANCED - optional): Python app runs INSIDE IRIS container\n\n### Option A: Fraud Detection (Financial Services)\n\n#### External Mode (Default - Simpler)\n\n```bash\n# 1. Start IRIS database\ndocker-compose up -d\n\n# 2. Install Python dependencies\npip install iris-vector-graph[dev]\n\n# 3. Load fraud schema\ndocker exec -i iris /usr/irissys/bin/irissession IRIS -U USER < sql/fraud/schema.sql\n\n# 4. 
Start fraud API (external Python)\nPYTHONPATH=src python -m iris_fraud_server\n\n# Test fraud scoring API\ncurl -X POST http://localhost:8000/fraud/score \\\n -H 'Content-Type: application/json' \\\n -d '{\"mode\":\"MLP\",\"payer\":\"acct:test\",\"device\":\"dev:laptop\",\"amount\":1000.0}'\n```\n\n#### Embedded Mode (Advanced - Optional)\n\n```bash\n# Run FastAPI INSIDE IRIS container (licensed IRIS required)\ndocker-compose -f docker-compose.fraud-embedded.yml up -d\n\n# Test fraud scoring API (~2 min startup)\ncurl -X POST http://localhost:8100/fraud/score \\\n -H 'Content-Type: application/json' \\\n -d '{\"mode\":\"MLP\",\"payer\":\"acct:test\",\"device\":\"dev:laptop\",\"amount\":1000.0}'\n```\n\n**What you get**:\n- FastAPI fraud scoring (external `:8000` or embedded `:8100`)\n- Bitemporal data (track when transactions happened vs. when you learned about them)\n- Complete audit trails (regulatory compliance: SOX, MiFID II)\n- Direct IRIS queries (no middleware)\n\n**Learn more**: [`examples/bitemporal/README.md`](examples/bitemporal/README.md) - Fraud scenarios, chargeback defense, model tracking\n\n---\n\n### Option B: Biomedical Graph (Life Sciences)\n\n#### External Mode (Default - Simpler)\n\n```bash\n# 1. Start IRIS database\ndocker-compose up -d\n\n# 2. Install dependencies\ncurl -LsSf https://astral.sh/uv/install.sh | sh\nuv sync && source .venv/bin/activate\n\n# 3. Load STRING protein database (10K proteins, ~1 minute)\npython scripts/performance/string_db_scale_test.py --max-proteins 10000\n\n# 4. Start interactive demo server (external Python)\nPYTHONPATH=src python -m iris_demo_server.app\n\n# 5. 
Open browser\nopen http://localhost:8200/bio\n```\n\n#### Embedded Mode (Advanced - Optional)\n\n```bash\n# Run demo server INSIDE IRIS container (licensed IRIS required)\n# Coming soon - currently only external mode supported for biomedical demo\n```\n\n**What you get**:\n- **Interactive protein search** with vector similarity (EGFR, TP53, etc.)\n- **D3.js graph visualization** with click-to-expand nodes showing interaction networks\n- **Pathway analysis** between proteins using BFS graph traversal\n- **Real STRING DB data** (10K proteins, 37K interactions)\n- **<100ms queries** powered by direct IRIS integration (no API middleware)\n- **20/20 contract tests passing** - production-ready biomedical demo\n\n**Learn more**:\n- [`docs/biomedical-demo-setup.md`](docs/biomedical-demo-setup.md) - Complete setup guide with scaling options\n- [`biomedical/README.md`](biomedical/README.md) - Architecture and development patterns\n\n---\n\n## Use Cases by Industry\n\n### Financial Services (IDFS)\n\n| Use Case | Features | Performance |\n|----------|----------|-------------|\n| **Real-Time Fraud Detection** | Graph-based scoring, MLP models, device fingerprinting | <10ms scoring, 130M+ transactions |\n| **Bitemporal Audit Trails** | Valid time vs. 
system time, chargeback defense, compliance | <10ms time-travel queries |\n| **Late Arrival Detection** | Settlement delay analysis, backdated transaction flagging | Pattern detection across 130M events |\n| **Regulatory Compliance** | SOX, GDPR, MiFID II, Basel III reporting | Complete audit trail preservation |\n\n**Files**:\n- `examples/bitemporal/` - Fraud scenarios, audit queries, Python API\n- `sql/bitemporal/` - Schema (2 tables, 3 views, 8 indexes)\n- `src/iris_fraud_server/` - FastAPI fraud scoring server\n- `docker-compose.fraud-embedded.yml` - Licensed IRIS + embedded Python\n\n**Quick Links**:\n- [Bitemporal Fraud Detection README](examples/bitemporal/README.md)\n- [Fraud API Documentation](src/iris_fraud_server/README.md)\n\n---\n\n### Biomedical Research\n\n| Use Case | Features | Performance |\n|----------|----------|-------------|\n| **Protein Interaction Networks** | STRING DB integration, pathway analysis, vector similarity | <50ms multi-hop queries (100K+ proteins) |\n| **Drug Discovery** | Compound similarity, target identification, graph analytics | <10ms vector search (HNSW) |\n| **Literature Mining** | Hybrid search (embeddings + BM25), entity extraction | RRF fusion, sub-second queries |\n| **Pathway Analysis** | Multi-hop traversal, PageRank, connected components | NetworkX integration, embedded Python |\n\n**Files**:\n- `biomedical/` - Protein queries, pathway examples\n- `sql/schema.sql` - Graph schema (nodes, edges, properties, embeddings)\n- `iris_vector_graph_core/` - Core Python graph engine\n- `docker-compose.acorn.yml` - ACORN-1 with HNSW optimization\n\n**Quick Links**:\n- [Biomedical Examples](biomedical/README.md)\n- [STRING DB Integration](docs/setup/STRING_DB.md)\n\n---\n\n### Graph Algorithms (TSP Examples)\n\nTwo standalone implementations of the **Traveling Salesman Problem** demonstrating graph algorithms on IRIS:\n\n#### Option A: Python + NetworkX (Biomedical)\n\nFind optimal pathways through protein interaction 
networks:\n\n```bash\n# Test with 10 cancer-related proteins\npython scripts/algorithms/tsp_demo.py --proteins 10 --compare-methods\n```\n\n**Algorithms**: Greedy (1ms), Christofides (15ms), 2-opt (8ms)\n**Use case**: Optimize order to study protein interactions in cancer pathways\n\n#### Option B: ObjectScript (Healthcare Interoperability)\n\nOptimize caregiver routes for home healthcare:\n\n```bash\n# Load sample data (8 patients, 26 travel edges)\ndocker exec -i iris /usr/irissys/bin/irissession IRIS -U USER < sql/caregiver_routing_demo.sql\n\n# Run optimization demo (IRIS Terminal)\nDo ^TestCaregiverRouter\n```\n\n**Performance**: <2ms for 8-patient routes\n**Integration**: Direct Business Process method calls\n**Impact**: 53% travel time reduction (75min \u2192 35min)\n\n**What you get**:\n- **Python approach**: NetworkX integration, multiple algorithms, FastAPI endpoint example\n- **ObjectScript approach**: Zero dependencies, Interoperability production integration, bitemporal audit\n- **Comprehensive docs**: Neo4j comparison, performance benchmarks, real-world use cases\n\n**Files**:\n- `scripts/algorithms/tsp_demo.py` - Python demo (works with STRING protein data)\n- `iris/src/Graph/CaregiverRouter.cls` - ObjectScript TSP optimizer\n- `iris/src/Graph/ScheduleOptimizationProcess.cls` - Business Process integration\n- `sql/caregiver_routing_demo.sql` - Sample healthcare data\n\n**Learn more**:\n- [`docs/algorithms/TSP_ANALYSIS.md`](docs/algorithms/TSP_ANALYSIS.md) - Deep dive and Neo4j comparison\n- [`docs/algorithms/TSP_IMPLEMENTATION_SUMMARY.md`](docs/algorithms/TSP_IMPLEMENTATION_SUMMARY.md) - Overview and benchmarks\n- [`docs/examples/CAREGIVER_ROUTING_DEMO.md`](docs/examples/CAREGIVER_ROUTING_DEMO.md) - Step-by-step tutorial\n\n---\n\n## Architecture\n\n**Deployment Options**:\n- **External (Default)**: Python app connects to IRIS via `iris.connect()` - simpler setup, easier debugging\n- **Embedded (Advanced)**: Python app runs inside IRIS container - 
maximum performance, requires licensed IRIS\n\n```\nExternal Deployment (DEFAULT) Embedded Deployment (OPTIONAL)\n\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510 \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n\u2502 FastAPI Server \u2502 \u2502 IRIS Container \u2502\n\u2502 (external Python) \u2502 \u2502 \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510 \u2502\n\u2502 \u2502 \u2502 \u2502 FastAPI Server \u2502 \u2502\n\u2502 iris.connect() \u2500\u2500\u2500\u2500\u2500\u253c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2524\u25ba\u2502 (/usr/irissys/bin/ \u2502 \u2502\n\u2502 to localhost:1972 \u2502 \u2502 \u2502 irispython) \u2502 \u2502\n\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518 \u2502 \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518 \u2502\n \u2502 \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510 \u2502\n \u2502 \u2502 IRIS Database Engine \u2502 \u2502\n \u2502 \u2502 (Bitemporal/Graph/Vector)\u2502 \u2502\n \u2502 \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518 \u2502\n 
\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n\n Same Platform: InterSystems IRIS\n Same Features: Vector Search, Graph Traversal, Bitemporal Audit\n Different Domains: Finance vs. Life Sciences\n```\n\n**Core Components**:\n- **IRIS Globals**: Append-only storage (perfect for audit trails + graph data)\n- **Embedded Python**: Run ML models and graph algorithms in-database\n- **SQL Procedures**: `kg_KNN_VEC` (vector search), `kg_RRF_FUSE` (hybrid search)\n- **HNSW Indexing**: 100x faster vector similarity (requires IRIS 2025.3+ or ACORN-1)\n\n---\n\n## Key Features\n\n### Cross-Domain Capabilities\n\n| Feature | Financial Services Use | Biomedical Use |\n|---------|------------------------|----------------|\n| **Embedded Python** | Fraud ML models in-database | Graph analytics (PageRank, etc.) |\n| **Temporal Queries** | Bitemporal audit (\"what did we know when?\") | Time-series biomarker analysis |\n| **Graph Traversal** | Fraud ring detection (multi-hop) | Protein interaction pathways |\n| **Vector Search** | Transaction similarity | Protein/compound similarity |\n| **Partial Indexes** | `WHERE system_to IS NULL` (10x faster) | `WHERE label = 'protein'` |\n\n### IRIS-Native Optimizations\n\n- **Globals Storage**: Append-only (no UPDATE contention)\n- **Partial Indexes**: Filter at index level (`WHERE system_to IS NULL`)\n- **Temporal Views**: Pre-filter current versions\n- **Foreign Key Constraints**: Referential integrity across graph\n- **HNSW Vector Index**: 100x faster than flat search (ACORN-1)\n\n---\n\n## Performance\n\n### Financial Services (Fraud Detection)\n\n| Metric | Community IRIS | Licensed IRIS |\n|--------|---------------|---------------|\n| **Transactions** | 30M | 130M |\n| **Database Size** | 5.3GB | 22.1GB |\n| **Fraud Scoring** | <10ms | <10ms |\n| **Bitemporal Queries** | <10ms (indexed) | 
<10ms (indexed) |\n| **Time-Travel Queries** | <50ms | <50ms |\n| **Late Arrival Detection** | Pattern search across 30M | Pattern search across 130M |\n\n### Biomedical (Protein Networks)\n\n| Metric | Standard IRIS | ACORN-1 (HNSW) |\n|--------|--------------|----------------|\n| **Vector Search** | 5800ms (flat) | 1.7ms (HNSW) |\n| **Multi-hop Queries** | <50ms | <50ms |\n| **Hybrid Search (RRF)** | <100ms | <20ms |\n| **Graph Analytics** | NetworkX integration | Embedded Python |\n\n**Tested At Scale**:\n- \u2705 130M fraud transactions (licensed IRIS)\n- \u2705 100K+ protein interactions (STRING DB)\n- \u2705 768-dimensional embeddings (biomedical models)\n\n---\n\n## Documentation\n\n### Getting Started\n- [Fraud Detection Quick Start](examples/bitemporal/README.md)\n- [Biomedical Graph Setup](biomedical/README.md)\n- [Installation Guide](docs/setup/INSTALLATION.md)\n\n### Architecture & Design\n- [System Architecture](docs/architecture/ACTUAL_SCHEMA.md)\n- [IRIS-Native Features](docs/architecture/IRIS_NATIVE.md)\n- [Performance Benchmarks](docs/performance/)\n\n### API Reference\n- [REST API](docs/api/REST_API.md)\n- [Python SDK](iris_vector_graph_core/README.md)\n- [SQL Procedures](sql/operators.sql)\n\n### Examples\n- [Bitemporal Fraud Detection](examples/bitemporal/)\n- [Protein Interaction Networks](biomedical/)\n- [Migration to NodePK](scripts/migrations/)\n\n---\n\n## Repository Structure\n\n```\nsql/\n schema.sql # Core graph schema\n bitemporal/ # Fraud detection schema\n fraud/ # Transaction tables\n\nexamples/\n bitemporal/ # Financial services (fraud, audit)\n\nbiomedical/ # Life sciences (proteins, pathways)\n\niris_vector_graph_core/ # Python graph engine\n\nsrc/iris_fraud_server/ # FastAPI fraud API\n\nscripts/\n fraud/ # 130M loader, benchmarks\n migrations/ # NodePK migration\n\ndocker/\n Dockerfile.fraud-embedded # Licensed IRIS + fraud API\n start-fraud-server.sh # Embedded Python startup\n```\n\n---\n\n## License\n\nMIT License - See 
[LICENSE](LICENSE)\n\n---\n\n## Contributing\n\nWe welcome contributions! This repo demonstrates IRIS versatility across:\n- **Financial Services**: Fraud detection, bitemporal data, regulatory compliance\n- **Biomedical Research**: Protein networks, drug discovery, literature mining\n\nFeel free to add examples from other domains or improve existing implementations.\n\n---\n\n**Production-Ready**: Proven with 130M+ financial transactions and 100K+ biomedical interactions on InterSystems IRIS.\n",
"bugtrack_url": null,
"license": null,
"summary": "High-performance biomedical knowledge graph powered by InterSystems IRIS",
"version": "1.0.0",
"project_urls": {
"Documentation": "https://github.com/intersystems-community/iris-vector-graph/tree/main/docs",
"Homepage": "https://github.com/intersystems-community/iris-vector-graph",
"Issues": "https://github.com/intersystems-community/iris-vector-graph/issues",
"Repository": "https://github.com/intersystems-community/iris-vector-graph"
},
"split_keywords": [
"biomedical",
" bitemporal",
" fraud-detection",
" graph-database",
" hnsw",
" intersystems-iris",
" knowledge-graph",
" protein-interactions",
" rag",
" vector-search"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "9b50da49a06478df249e26fa1664b4a4725276716a32f7185873157363d71ae8",
"md5": "0b118df2d6937a6cd1578b6498cf8cae",
"sha256": "e78e7694ff115ecac40f8bd5d5550cc0a33d6aafb505f91b7fda303726974bdf"
},
"downloads": -1,
"filename": "iris_vector_graph-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0b118df2d6937a6cd1578b6498cf8cae",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 33517,
"upload_time": "2025-11-05T21:44:26",
"upload_time_iso_8601": "2025-11-05T21:44:26.264317Z",
"url": "https://files.pythonhosted.org/packages/9b/50/da49a06478df249e26fa1664b4a4725276716a32f7185873157363d71ae8/iris_vector_graph-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5ba6bbfd5608423de7024e2200be741721a6a4685ccc54d92d4e531c5c2f584c",
"md5": "1dd011d689320979744d8fc8172ac8b9",
"sha256": "488f89ea9b02617c32e4cefb2e835a474fc19e399859177c2cf14c7788a50c7f"
},
"downloads": -1,
"filename": "iris_vector_graph-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "1dd011d689320979744d8fc8172ac8b9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 724475,
"upload_time": "2025-11-05T21:44:27",
"upload_time_iso_8601": "2025-11-05T21:44:27.608415Z",
"url": "https://files.pythonhosted.org/packages/5b/a6/bbfd5608423de7024e2200be741721a6a4685ccc54d92d4e531c5c2f584c/iris_vector_graph-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-11-05 21:44:27",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "intersystems-community",
"github_project": "iris-vector-graph",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "iris-vector-graph"
}