# TDCSophiread
High-performance Python and C++ library for processing TPX3 neutron imaging data with **96M+ hits/sec** throughput. TDCSophiread provides complete hit extraction and neutron clustering capabilities using TDC-only timing (detector-expert approved).
## 🚀 Key Features
- **🏃 High Performance**: **96M+ hits/sec** with Intel TBB parallel processing
- **🧠 Smart Clustering**: 4 algorithms (ABS, Graph, DBSCAN, Grid) for neutron event reconstruction
- **⚡ Zero-Copy Processing**: Memory-efficient temporal batching with structured numpy arrays
- **🔍 TDC-Only Timing**: Detector-expert-approved approach (no unreliable GDC)
- **🐍 Python Integration**: Complete Python API with Jupyter notebook examples
- **📊 Production Ready**: Real-world performance validated on 12GB datasets
## Quick Start
### Installation
```bash
# Clone repository
git clone https://github.com/ornlneutronimaging/mcpevent2hist.git
cd mcpevent2hist/sophiread
# Set up environment (pixi recommended)
pixi install && pixi shell
# Build and install
pixi run cmake -B build && pixi run cmake --build build && pip install -e .
```
### Get Sample Data (12GB)
```bash
# Download real TPX3 datasets for testing
git submodule update --init notebooks/data
```
### Python Usage
```python
import tdcsophiread
# 1. Extract hits from TPX3 file
hits = tdcsophiread.process_tpx3("data.tpx3", parallel=True)
print(f"Extracted {len(hits):,} hits")
# 2. Process hits to neutrons using clustering
neutrons = tdcsophiread.process_hits_to_neutrons(hits)
print(f"Found {len(neutrons):,} neutrons")
# 3. Try different clustering algorithms
config = tdcsophiread.NeutronProcessingConfig.venus_defaults()
config.clustering.algorithm = "dbscan" # or "abs", "graph", "grid"
neutrons = tdcsophiread.process_hits_to_neutrons(hits, config)
```
### Performance Monitoring
```python
# Get detailed performance statistics
config = tdcsophiread.NeutronProcessingConfig.venus_defaults()
processor = tdcsophiread.TemporalNeutronProcessor(config)
neutrons = processor.processHits(hits)
stats = processor.getStatistics()
print(f"Hit rate: {stats.hits_per_second/1e6:.1f} M hits/sec")
print(f"Neutron efficiency: {stats.neutron_efficiency:.3f}")
print(f"Parallel efficiency: {stats.parallel_efficiency:.2f}")
```
## 🧬 Architecture
TDCSophiread implements a modern, high-performance pipeline with parallel temporal processing:
```mermaid
flowchart TD
A[TPX3 Raw Data] --> B[TDCProcessor]
B --> |Memory-mapped I/O<br/>Section-aware processing| C[std::vector<TDCHit><br/>Temporally ordered hits]
C --> D[TemporalNeutronProcessor]
subgraph TemporalNeutronProcessor
direction TB
E[Phase 1: Statistical Analysis<br/>• Analyze hit distribution<br/>• Calculate optimal batch sizes<br/>• Determine overlaps]
E --> F[Phase 2: Parallel Worker Pool]
subgraph ParallelWorkerPool
direction LR
Worker0[Worker 0]
Worker1[Worker 1]
WorkerN[Worker N]
end
subgraph Worker0Details
direction TB
G1[Hit Clustering<br/>Algorithm Selection]
G1 --> G1a["ABS<br/>O(n) - Fastest"]
G1 --> G1b["Graph<br/>O(n log n) - Balanced"]
G1 --> G1c["DBSCAN<br/>O(n log n) - Noise handling"]
G1 --> G1d["Grid<br/>O(n) - Geometry optimized"]
G1a --> G2[Neutron Extraction<br/>TOT-weighted centroids]
G1b --> G2
G1c --> G2
G1d --> G2
end
subgraph Worker1Details
direction TB
H1[Hit Clustering] --> H2[Neutron Extraction]
end
subgraph WorkerNDetails
direction TB
I1[Hit Clustering] --> I2[Neutron Extraction]
end
Worker0 --> Worker0Details
Worker1 --> Worker1Details
WorkerN --> WorkerNDetails
F --> J[Phase 3: Result Aggregation<br/>• Combine worker results<br/>• Remove overlap duplicates<br/>• Generate statistics]
end
J --> K[std::vector<TDCNeutron><br/>Final neutron events<br/>96M+ hits/sec performance]
style A fill:#e1f5fe
style K fill:#e8f5e8
style TemporalNeutronProcessor fill:#f3e5f5
style G1a fill:#ffecb3
style G1b fill:#fff3e0
style G1c fill:#fce4ec
style G1d fill:#e0f2f1
```
### Phase 1: Hit Extraction
- **Memory-mapped I/O**: Efficient processing of large TPX3 files
- **Section-aware processing**: Respects TPX3 data structure constraints
- **TDC state propagation**: Sequential processing for reliable timing
- **Parallel chunk processing**: Intel TBB for maximum throughput
### Phase 2: Temporal Neutron Processing
- **Statistical analysis**: Optimal batching based on hit distribution
- **Parallel worker pool**: Each worker has dedicated algorithm instances
- **4 clustering algorithms**: ABS, Graph, DBSCAN, Grid with different performance characteristics
- **Zero-copy processing**: Iterator-based interfaces minimize memory overhead
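The "TOT-weighted centroids" step in neutron extraction can be sketched in a few lines of NumPy. This is an illustration with synthetic values, not the library's internal code; field names follow the hit dtype documented in the Data Format section.
```python
import numpy as np
# Sketch of TOT-weighted centroiding for a single cluster of hits.
# Synthetic values; field names follow the library's hit dtype.
cluster = np.array(
    [(10, 20, 100), (11, 20, 300), (10, 21, 100)],
    dtype=[('x', 'u2'), ('y', 'u2'), ('tot', 'u2')],
)
w = cluster['tot'].astype(np.float64)
cx = np.sum(cluster['x'] * w) / w.sum()  # TOT-weighted centroid X
cy = np.sum(cluster['y'] * w) / w.sum()  # TOT-weighted centroid Y
print(cx, cy)  # brighter (higher-TOT) hits pull the centroid toward them
```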
### Phase 3: Result Aggregation
- **Parallel result combination**: Efficient merging from multiple workers
- **Overlap deduplication**: Remove duplicate neutrons from batch boundaries
- **Performance statistics**: Detailed metrics for optimization
## 🎯 Clustering Algorithms
| Algorithm | Performance | Use Case | Complexity |
|-----------|-------------|----------|------------|
| **ABS** | Fastest | General purpose, high throughput | O(n) |
| **Graph** | Fast | Balanced speed/accuracy | O(n log n) |
| **DBSCAN** | Medium | Noise handling, complex patterns | O(n log n) |
| **Grid** | Fast | Detector geometry optimization | O(n) |
### Algorithm Configuration
```python
config = tdcsophiread.NeutronProcessingConfig.venus_defaults()
# ABS (Adaptive Bucket Sort) - Fastest
config.clustering.algorithm = "abs"
config.clustering.abs.radius = 5.0
config.clustering.abs.neutron_correlation_window = 75.0 # nanoseconds
# DBSCAN - Best noise handling
config.clustering.algorithm = "dbscan"
config.clustering.dbscan.epsilon = 4.0
config.clustering.dbscan.min_points = 3
# Process with custom configuration
neutrons = tdcsophiread.process_hits_to_neutrons(hits, config)
```
## 📊 Performance
### Measured Performance (Real Hardware)
| System | Hit Rate | Clustering | Notes |
|--------|----------|------------|-------|
| M2 Max | 20M+ hits/sec | ABS | Development system |
| AMD EPYC 9174F | 96M+ hits/sec | ABS | Production target |

Memory usage is ~40-60 bytes/hit for all algorithms, including clustering overhead.
### Performance by File Size
- **< 100MB**: 20-40 M hits/sec (single-threaded sufficient)
- **100MB-1GB**: 50-80 M hits/sec (parallel recommended)
- **1GB-10GB**: 80-96 M hits/sec (optimal parallel)
- **> 10GB**: 90-96 M hits/sec (streaming mode)
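For a rough sense of scale, a back-of-envelope estimate for the 12GB sample dataset (assuming the standard 8-byte TPX3 packet size and that nearly all packets are hits; real files also contain TDC and header packets):
```python
# Back-of-envelope throughput estimate; packet size and hit fraction
# are assumptions, not measured values.
file_bytes = 12e9                  # the 12GB sample dataset
packet_bytes = 8                   # standard TPX3 packet size
hits = file_bytes / packet_bytes   # ~1.5e9 hits upper bound
seconds = hits / 96e6              # at the 96M hits/sec target rate
print(f"~{seconds:.0f} s end-to-end")
```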
## 🔧 Build System
### Development Workflow
```bash
# Core workflow
pixi run build # Configure and build C++
pixi run test # Run C++ tests
pixi run install # Install Python bindings (editable)
pixi run python-test # Test Python import
# Data setup (12GB sample data)
pixi run setup-data # Download sample TPX3 files
pixi run notebooks # Launch Jupyter with notebooks
```
### Build Options
```bash
# Debug build (if needed)
cmake -B build -DCMAKE_BUILD_TYPE=Debug
# Legacy components (not recommended)
cmake -B build -DBUILD_LEGACY=ON
```
**⚠️ Legacy Warning**: Legacy components use unreliable GDC timing and will be removed in the next major release.
## 📚 Documentation & Examples
### Jupyter Notebooks (Real Data)
```bash
# Start Jupyter with sample notebooks
pixi run notebooks
```
**Available Notebooks:**
- `notebooks/hits_extraction_from_tpx3_Ni.ipynb` - Hit extraction (96M+ hits/sec)
- `notebooks/neutrons_extraction_from_tpx3_Ni.ipynb` - Complete neutron processing
- `notebooks/clustering_abs_ni.ipynb` - ABS clustering demo
- `notebooks/clustering_graph_ni.ipynb` - Graph clustering demo
- `notebooks/clustering_dbscan_Ni.ipynb` - DBSCAN clustering demo
- `notebooks/clustering_grid_Ni.ipynb` - Grid clustering demo
### Documentation
- **📖 Quick Start**: [`docs/quickstart.md`](docs/quickstart.md)
- **📋 API Reference**: [`docs/api_reference.md`](docs/api_reference.md)
- **🏗️ Architecture**: [`TDCSOPHIREAD_ARCHITECTURE_2025.md`](TDCSOPHIREAD_ARCHITECTURE_2025.md)
- **🧬 TPX3 Format**: [`TPX3.md`](TPX3.md)
## 🗂️ Data Format
### Hit Data (Structured NumPy Array)
```python
hits = tdcsophiread.process_tpx3("data.tpx3")
print(f"Fields: {hits.dtype.names}")
# ('tof', 'x', 'y', 'timestamp', 'tot', 'chip_id', 'cluster_id')
# Access hit properties
x_coords = hits['x'] # Global X coordinates (uint16)
y_coords = hits['y'] # Global Y coordinates (uint16)
tof_values = hits['tof'] # Time-of-flight (uint32, 25ns units)
tot_values = hits['tot'] # Time-over-threshold (uint16)
chip_ids = hits['chip_id'] # Chip ID 0-3 (uint8)
```
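Because hits arrive as a structured array, per-chip bookkeeping is plain NumPy. A small sketch with synthetic values (real arrays come from `process_tpx3`):
```python
import numpy as np
# Counting hits per chip from the structured hit array (synthetic data).
hits = np.zeros(6, dtype=[('x', 'u2'), ('y', 'u2'),
                          ('tot', 'u2'), ('chip_id', 'u1')])
hits['chip_id'] = [0, 0, 1, 2, 3, 3]
per_chip = np.bincount(hits['chip_id'], minlength=4)
print(per_chip)  # hits per chip: [2 1 1 2]
```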
### Neutron Data (Structured NumPy Array)
```python
neutrons = tdcsophiread.process_hits_to_neutrons(hits)
print(f"Fields: {neutrons.dtype.names}")
# ('x', 'y', 'tof', 'tot', 'n_hits', 'chip_id', 'reserved')
# Access neutron properties
x_subpixel = neutrons['x'] # Sub-pixel X coordinates (float64)
y_subpixel = neutrons['y'] # Sub-pixel Y coordinates (float64)
tof_neutron = neutrons['tof'] # Representative TOF (uint32, 25ns units)
cluster_size = neutrons['n_hits'] # Number of hits in cluster (uint16)
```
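A common next step is binning the sub-pixel neutron positions into an image. A minimal sketch with synthetic coordinates; the 512x512-pixel active area (a 2x2 quad of 256x256 chips) is an assumption here, while the factor-8 super-resolution matches the unit conversions below:
```python
import numpy as np
# Binning sub-pixel neutron positions into a pixel-resolution image.
# Synthetic coordinates; 512x512-pixel quad layout assumed.
x_subpixel = np.array([800.0, 804.0, 2400.0])    # stand-in for neutrons['x']
y_subpixel = np.array([1600.0, 1601.0, 3200.0])  # stand-in for neutrons['y']
image, _, _ = np.histogram2d(
    y_subpixel / 8.0, x_subpixel / 8.0,  # sub-pixel -> pixel (factor 8)
    bins=512, range=((0, 512), (0, 512)),
)
print(image.shape, int(image.sum()))
```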
### Unit Conversions
```python
# Time conversions
tof_ms = hits['tof'] * 25 / 1e6 # 25ns units → milliseconds
timestamp_s = hits['timestamp'] * 25 / 1e9 # 25ns units → seconds
# Coordinate conversions
pixel_x = neutrons['x'] / 8.0 # Sub-pixel → pixel (factor=8)
pixel_y = neutrons['y'] / 8.0
```
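These conversions feed directly into a time-of-flight spectrum. A sketch with synthetic TOF values in the documented 25ns units; the 1/60 s upper range assumes the 60 Hz TDC frequency from the detector configuration below:
```python
import numpy as np
# TOF spectrum in milliseconds (synthetic values in 25 ns units).
tof_units = np.array([4_000, 80_000, 400_000], dtype=np.uint32)
tof_ms = tof_units * 25 / 1e6                    # 0.1, 2.0, 10.0 ms
counts, edges = np.histogram(tof_ms, bins=100, range=(0.0, 1e3 / 60.0))
print(int(counts.sum()), "hits in spectrum")
```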
## ⚙️ Configuration
### JSON Configuration
```json
{
"clustering": {
"algorithm": "abs",
"abs": {
"radius": 5.0,
"neutron_correlation_window": 75.0
}
},
"extraction": {
"algorithm": "simple_centroid",
"super_resolution_factor": 8.0,
"weighted_by_tot": true
},
"temporal": {
"num_workers": 0,
"max_batch_size": 100000
}
}
```
### Detector Configuration
```json
{
"detector": {
"timing": {
"tdc_frequency_hz": 60.0,
"enable_missing_tdc_correction": true
},
"chip_layout": {
"chip_size_x": 256,
"chip_size_y": 256
}
}
}
```
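The 60 Hz TDC frequency above implies a time-of-flight frame of 1/60 s; expressed in the library's 25ns timestamp units:
```python
# Deriving the TOF frame length from the configured TDC frequency.
tdc_frequency_hz = 60.0
frame_s = 1.0 / tdc_frequency_hz   # ~16.67 ms per frame
frame_units = frame_s / 25e-9      # ~666,667 units of 25 ns
print(f"{frame_s * 1e3:.2f} ms = {frame_units:,.0f} x 25 ns")
```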
## 🔬 Scientific Context
### TPX3 Data Constraints
TDCSophiread respects the physical constraints of TPX3 data:
- **Variable section sizes**: No padding or fixed boundaries
- **Local time disorder**: Packets within sections not time-ordered
- **Missing TDC packets**: Hardware may drop TDC packets (corrected automatically)
- **Sequential dependencies**: TDC state must propagate in order
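One plausible form of the automatic missing-TDC correction is to re-insert dropped pulses wherever the gap between consecutive TDC timestamps spans several expected periods. This is an illustrative reconstruction, not necessarily the library's exact algorithm:
```python
def fill_missing_tdc(tdc_times, period):
    """Insert synthetic TDC timestamps where the hardware dropped pulses.
    Illustrative sketch only; assumes gaps are near-integer multiples of
    the expected TDC period (e.g. 1/60 s for a 60 Hz source)."""
    filled = [tdc_times[0]]
    for t in tdc_times[1:]:
        n_missing = round((t - filled[-1]) / period) - 1
        for _ in range(n_missing):            # re-insert dropped pulses
            filled.append(filled[-1] + period)
        filled.append(t)
    return filled

print(fill_missing_tdc([0, 300, 400], 100))  # [0, 100, 200, 300, 400]
```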
## 🛠️ Development
### Requirements
- **C++20** compiler (GCC 10+, Clang 11+, MSVC 2019+)
- **Intel TBB** for parallel processing
- **HDF5** for data I/O
- **Python 3.9+** with NumPy
- **CMake 3.20+**
### Environment Setup
```bash
# Install pixi (cross-platform package manager)
curl -sSL https://pixi.sh/install | bash
# Clone and setup
git clone https://github.com/ornlneutronimaging/mcpevent2hist.git
cd mcpevent2hist/sophiread
pixi install
```
### Code Style
- **C++20** with modern practices
- **Google C++ Style** (2-space indentation)
- **Test-Driven Development** with Google Test
- **Zero-copy** design patterns
- **Stateless algorithms** for parallelization
## 🔗 Legacy Components
Previous implementations (FastSophiread, CLI/GUI applications) have been moved to `legacy/` and are **deprecated**:
- ❌ **Unreliable GDC timing** (disapproved by detector experts)
- ❌ **Template complexity** (hard to maintain)
**Migration**: All legacy functionality is available in TDCSophiread with improved performance and reliability.
## 📈 Benchmarks
### Real-World Performance
Using sample data from `notebooks/data/`:
```python
# Ni powder diffraction data (>1M hits)
sample_file = "notebooks/data/Run_8217_April25_2025_Ni_Powder_MCP_TPX3_0_8C_1_9_AngsMin_serval_000000.tpx3"
import time
start = time.time()
hits = tdcsophiread.process_tpx3(sample_file, parallel=True)
neutrons = tdcsophiread.process_hits_to_neutrons(hits)
elapsed = time.time() - start
print(f"Performance: {len(hits) / elapsed / 1e6:.1f} M hits/sec")
print(f"Found {len(neutrons):,} neutrons from {len(hits):,} hits")
```
### Memory Efficiency
- **Before optimization**: 48GB peak memory
- **After optimization**: 20GB peak memory (**58% reduction**)
- **Current streaming**: 512MB chunks for any file size
## 🤝 Contributing
1. **Fork** the repository
2. **Create** a feature branch
3. **Add tests** for new functionality
4. **Submit** a pull request
### Issue Reporting
- **🐛 Bugs**: [GitHub Issues](https://github.com/ornlneutronimaging/mcpevent2hist/issues)
- **💬 Discussions**: [GitHub Discussions](https://github.com/ornlneutronimaging/mcpevent2hist/discussions)
- **📧 Contact**: neutronimaging@ornl.gov
## 📄 License
GPL-3.0+ License - see [LICENSE](LICENSE) file for details.
---
**Ready to process neutron data at 96M+ hits/sec?** 🚀
Get started: [`docs/quickstart.md`](docs/quickstart.md)