pyngb

Name: pyngb
Version: 0.0.1
Summary: Unofficial parser for NETZSCH STA (Simultaneous Thermal Analysis) NGB instrument binary files. Not affiliated with, endorsed by, or approved by NETZSCH-Gerätebau GmbH.
Upload time: 2025-08-15 01:20:59
Requires Python: >=3.9
License: MIT (Copyright (c) 2025-present GraysonBellamy <gbellamy@umd.edu>)
Keywords: binary-parsing, netzsch, ngb, scientific-data, sta, thermal-analysis, unofficial
# pyNGB - NETZSCH STA File Parser

[![PyPI version](https://badge.fury.io/py/pyngb.svg)](https://badge.fury.io/py/pyngb)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Tests](https://github.com/GraysonBellamy/pyngb/workflows/Tests/badge.svg)](https://github.com/GraysonBellamy/pyngb/actions)

A comprehensive Python library for parsing and analyzing NETZSCH STA (Simultaneous Thermal Analysis) NGB files with high performance, extensive metadata extraction, and robust batch processing capabilities.

## 🚨 Disclaimer

**This package and its author are not affiliated with, endorsed by, or approved by NETZSCH-Gerätebau GmbH.** This is an independent, open-source project created to provide Python support for parsing NGB (NETZSCH binary) file formats. NETZSCH is a trademark of NETZSCH-Gerätebau GmbH.

## ✨ Features

### Core Capabilities
- 🚀 **High-Performance Parsing**: Optimized binary parsing with NumPy and PyArrow
- 📊 **Rich Metadata Extraction**: Complete instrument settings, sample information, and measurement parameters
- 🔧 **Flexible Data Access**: Multiple APIs for different use cases
- 📦 **Modern Data Formats**: PyArrow tables with embedded metadata
- 🔍 **Data Validation**: Built-in quality checking and validation tools
- ⚡ **Batch Processing**: Parallel processing of multiple files
- 🛠️ **Command Line Interface**: Production-ready CLI for automation

### Advanced Features
- πŸ—οΈ **Modular Architecture**: Extensible and maintainable design
- πŸ”’ **Type Safety**: Full type hints and static analysis support
- πŸ§ͺ **Comprehensive Testing**: 300+ tests including integration and stress tests
- πŸ”„ **Format Conversion**: Export to Parquet, CSV, and JSON
- πŸ“ˆ **Dataset Management**: Tools for managing collections of NGB files
- πŸ”€ **Concurrent Processing**: Thread-safe operations and parallel execution
- πŸ“ **Rich Documentation**: Complete API documentation with examples

## 🚀 Quick Start

### Installation

```bash
pip install pyngb
```

### Basic Usage

```python
from pyngb import read_ngb

# Quick data loading (recommended for most users)
data = read_ngb("sample.ngb-ss3")
print(f"Loaded {data.num_rows} rows with {data.num_columns} columns")
print(f"Columns: {data.column_names}")

# Access embedded metadata
import json
metadata = json.loads(data.schema.metadata[b'file_metadata'])
print(f"Sample: {metadata.get('sample_name', 'Unknown')}")
print(f"Instrument: {metadata.get('instrument', 'Unknown')}")

# Separate metadata and data (for advanced analysis)
metadata, data = read_ngb("sample.ngb-ss3", return_metadata=True)
```

### Data Analysis

```python
import polars as pl

# Convert the loaded table to a DataFrame (`data` from the Basic Usage example)
df = pl.from_arrow(data)

# Basic exploration
print(df.describe())
print(f"Temperature range: {df['sample_temperature'].min():.1f} to {df['sample_temperature'].max():.1f} °C")

# Simple plotting
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 8))
plt.subplot(2, 1, 1)
plt.plot(df['time'], df['sample_temperature'])
plt.ylabel('Temperature (°C)')
plt.title('Temperature Program')

plt.subplot(2, 1, 2)
plt.plot(df['time'], df['mass'])
plt.xlabel('Time (s)')
plt.ylabel('Mass (mg)')
plt.title('Mass Loss')
plt.show()
```

## 📋 Complete Usage Guide

### 1. Single File Processing

```python
import json

import polars as pl

from pyngb import read_ngb

# Method 1: Unified data and metadata (recommended)
table = read_ngb("experiment.ngb-ss3")
# Access data
df = pl.from_arrow(table)
# Access metadata
metadata = json.loads(table.schema.metadata[b'file_metadata'])

# Method 2: Separate metadata and data
metadata, data = read_ngb("experiment.ngb-ss3", return_metadata=True)
```

### 2. Batch Processing

```python
from pyngb import BatchProcessor

# Initialize batch processor
processor = BatchProcessor(max_workers=4, verbose=True)

# Process multiple files
results = processor.process_files(
    ["file1.ngb-ss3", "file2.ngb-ss3", "file3.ngb-ss3"],
    output_format="both",  # Parquet and CSV
    output_dir="./processed_data/"
)

# Check results
successful = [r for r in results if r["status"] == "success"]
print(f"Successfully processed {len(successful)} files")
```

### 3. Dataset Management

```python
from pyngb import NGBDataset

# Create dataset from directory
dataset = NGBDataset.from_directory("./sta_experiments/")

# Get overview
summary = dataset.summary()
print(f"Dataset contains {summary['file_count']} files")
print(f"Unique instruments: {summary['unique_instruments']}")

# Export metadata for analysis
dataset.export_metadata("dataset_metadata.csv", format="csv")

# Filter dataset (e.g., keep samples heavier than 10 mg)
heavy_samples = dataset.filter_by_metadata(
    lambda meta: meta.get('sample_mass', 0) > 10.0
)
```

### 4. Data Validation

```python
from pyngb.validation import QualityChecker, validate_sta_data

# Quick validation
issues = validate_sta_data(df)
print(f"Found {len(issues)} data quality issues")

# Comprehensive validation
checker = QualityChecker(df)
result = checker.full_validation()

print(f"Validation passed: {result.is_valid}")
print(f"Errors: {result.summary()['error_count']}")
print(f"Warnings: {result.summary()['warning_count']}")
```

### 5. Advanced Parser Configuration

```python
from pyngb import NGBParser, PatternConfig

# Custom configuration
config = PatternConfig()
config.column_map["custom_id"] = "custom_column"
config.metadata_patterns["custom_field"] = (b"\x99\x99", b"\x88\x88")

# Use custom parser
parser = NGBParser(config)
metadata, data = parser.parse("sample.ngb-ss3")
```

## 🖥️ Command Line Interface

### Basic Commands

```bash
# Convert single file to Parquet
python -m pyngb sample.ngb-ss3

# Convert to CSV with verbose output
python -m pyngb sample.ngb-ss3 -f csv -v

# Convert to all formats (Parquet, CSV, JSON metadata)
python -m pyngb sample.ngb-ss3 -f all -o ./output/
```

### Batch Processing

```bash
# Process all files in directory
python -m pyngb *.ngb-ss3 -f parquet -o ./processed/

# Process with specific output formats
python -m pyngb experiments/*.ngb-ss3 -f both -o ./results/

# Get help
python -m pyngb --help
```

### Advanced CLI Usage

```bash
# Process directory with pattern matching
find ./data -name "*.ngb-ss3" | xargs python -m pyngb -f parquet -o ./output/

# Automated processing pipeline
python -m pyngb $(find ./incoming -name "*.ngb-ss3" -mtime -1) -f all -o ./daily_processing/
```

## πŸ—οΈ Architecture

pyngb uses a modular, extensible architecture designed for performance and maintainability:

```
pyngb/
β”œβ”€β”€ api/                    # High-level user interface
β”‚   └── loaders.py         # Main loading functions
β”œβ”€β”€ binary/                # Low-level binary parsing
β”‚   β”œβ”€β”€ parser.py          # Binary structure parsing
β”‚   └── handlers.py        # Data type handlers
β”œβ”€β”€ core/                  # Core orchestration
β”‚   └── parser.py          # Main parser coordination
β”œβ”€β”€ extractors/            # Data extraction modules
β”‚   β”œβ”€β”€ metadata.py        # Metadata extraction
β”‚   └── streams.py         # Data stream processing
β”œβ”€β”€ batch.py               # Batch processing tools
β”œβ”€β”€ validation.py          # Data quality validation
β”œβ”€β”€ constants.py           # Configuration and constants
β”œβ”€β”€ exceptions.py          # Custom exception hierarchy
└── util.py               # Utility functions
```

### Design Principles

- **Performance First**: Optimized for speed and memory efficiency
- **Extensibility**: Easy to add new data types and extraction patterns
- **Reliability**: Comprehensive error handling and validation
- **Usability**: Multiple APIs for different user needs
- **Maintainability**: Clean separation of concerns and thorough testing

## 📊 Data Output

### Supported Columns

Common data columns extracted from NGB files:

| Column | Description | Units |
|--------|-------------|--------|
| `time` | Measurement time | seconds |
| `sample_temperature` | Sample temperature | °C |
| `furnace_temperature` | Furnace temperature | °C |
| `mass` | Sample mass | mg |
| `dsc_signal` | DSC heat flow | µV/mg |
| `purge_flow_1` | Primary purge gas flow | mL/min |
| `purge_flow_2` | Secondary purge gas flow | mL/min |
| `protective_flow` | Protective gas flow | mL/min |
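These columns map directly onto common STA calculations. As a minimal, pyngb-free sketch, percent mass loss can be derived from the `mass` series like this (plain lists stand in for DataFrame columns):

```python
def mass_loss_percent(mass):
    """Percent mass loss relative to the initial mass."""
    m0 = mass[0]
    if m0 == 0:
        raise ValueError("initial mass must be nonzero")
    return [100.0 * (m0 - m) / m0 for m in mass]

# Example: a 10 mg sample losing mass over the run
print(mass_loss_percent([10.0, 9.5, 8.0]))  # [0.0, 5.0, 20.0]
```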

### Metadata Fields

Comprehensive metadata extraction including:

- **Instrument Information**: Model, version, calibration data
- **Sample Details**: Name, mass, material, crucible type
- **Experimental Conditions**: Operator, date, lab, project
- **Temperature Program**: Complete heating/cooling profiles
- **Gas Flows**: MFC settings and gas types
- **System Parameters**: PID settings, acquisition rates
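Because the metadata travels as a JSON blob in the table schema, it is best read defensively — not every file populates every field. A sketch using a hypothetical payload (the keys mirror the examples above; actual files may differ):

```python
import json

# Hypothetical metadata blob, shaped like the b'file_metadata' schema entry
raw = b'{"sample_name": "PMMA", "sample_mass": 12.4, "instrument": "STA 449"}'
metadata = json.loads(raw)

print(metadata.get("sample_name", "Unknown"))  # PMMA
print(metadata.get("operator", "Unknown"))     # Unknown (key absent)
```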

## 🔧 Advanced Features

### Performance Optimization

```python
# Memory-efficient processing of large files
table = read_ngb("large_file.ngb-ss3")
# Process in chunks to manage memory
chunk_size = 10000
for i in range(0, table.num_rows, chunk_size):
    chunk = table.slice(i, chunk_size)
    # Process chunk...
```

### Custom Data Types

```python
import struct

from pyngb.binary.handlers import DataTypeHandler, DataTypeRegistry

class CustomHandler(DataTypeHandler):
    def can_handle(self, data_type: bytes) -> bool:
        return data_type == b'\x99'

    def parse(self, data: bytes) -> list:
        # Custom parsing logic
        return [struct.unpack('<f', data[i:i+4])[0] for i in range(0, len(data), 4)]

# Register custom handler
registry = DataTypeRegistry()
registry.register(CustomHandler())
```

### Validation Customization

```python
from pyngb.validation import QualityChecker

class CustomQualityChecker(QualityChecker):
    def custom_check(self):
        """Add custom validation logic."""
        if "custom_column" in self.data.columns:
            values = self.data["custom_column"]
            if values.min() < 0:
                self.result.add_error("Custom column has negative values")
```

## 🧪 Testing and Quality

pyngb includes a comprehensive test suite ensuring reliability:

- **300+ Tests**: Unit, integration, and end-to-end tests
- **Real Data Testing**: Tests using actual NGB files
- **Stress Testing**: Memory management and concurrent processing
- **Edge Case Coverage**: Corrupted files, extreme data values
- **Performance Testing**: Large file processing benchmarks

Run tests locally:

```bash
# Install development dependencies
uv sync --extra dev

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run only fast tests
pytest -m "not slow"
```

## 🀝 Contributing

We welcome contributions! Here's how to get started:

### Development Setup

```bash
# Clone repository
git clone https://github.com/GraysonBellamy/pyngb.git
cd pyngb

# Install with development dependencies
uv sync --extra dev

# Install pre-commit hooks
pre-commit install

# Run tests
pytest
```

### Contributing Guidelines

1. **Fork the repository** and create a feature branch
2. **Write tests** for new functionality
3. **Follow code style** (ruff + mypy)
4. **Update documentation** for new features
5. **Submit a pull request** with clear description

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

## 📚 Documentation

- **[Complete Documentation](https://graysonbellamy.github.io/pyngb/)**: Full user guide and API reference
- **[Quick Start Guide](docs/quickstart.md)**: Get up and running quickly
- **[API Reference](docs/api.md)**: Detailed function documentation
- **[Development Guide](docs/development.md)**: Contributing and development setup
- **[Troubleshooting](docs/troubleshooting.md)**: Common issues and solutions

## 🚀 Performance

pyngb is optimized for performance:

- **Fast Parsing**: Typical files parse in 0.1-2 seconds
- **Memory Efficient**: Uses PyArrow for optimal memory usage
- **Parallel Processing**: Multi-core batch processing
- **Scalable**: Handles files from KB to GB sizes

### Benchmarks

| Operation | Performance |
|-----------|-------------|
| Parse 10MB file | ~0.5 seconds |
| Extract metadata | ~0.1 seconds |
| Batch process 100 files | ~30 seconds (4 cores) |
| Memory usage | ~2x file size |
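These figures depend on hardware and file contents, so they are worth re-measuring on your own data. A small timing harness (the `sum` workload is a self-contained stand-in; substitute `read_ngb("your_file.ngb-ss3")` in practice):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - t0

# Stand-in workload; replace with read_ngb on a real file
result, elapsed = timed(sum, range(1_000_000))
print(f"workload took {elapsed:.3f}s")
```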

## 🔗 Integration

pyngb integrates well with the scientific Python ecosystem:

```python
# With Pandas
import pandas as pd
import polars as pl

df_pandas = pl.from_arrow(table).to_pandas()

# With NumPy
import numpy as np
temperature_array = table['sample_temperature'].to_numpy()

# With Matplotlib/Seaborn
import matplotlib.pyplot as plt
import seaborn as sns

# With Jupyter notebooks
from IPython.display import display
display(df.head())
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE.txt](LICENSE.txt) file for details.

## πŸ™ Acknowledgments

- NETZSCH-Gerätebau GmbH for creating the STA instruments (no affiliation)
- The PyArrow and Polars teams for excellent data processing libraries
- The scientific Python community for the foundational tools

## 📞 Support

- **Issues**: [GitHub Issues](https://github.com/GraysonBellamy/pyngb/issues)
- **Discussions**: [GitHub Discussions](https://github.com/GraysonBellamy/pyngb/discussions)
- **Documentation**: [Full Documentation](https://graysonbellamy.github.io/pyngb/)

---

Made with ❀️ for the scientific community

            
