timestrader-preprocessing


| Field | Value |
|-------|-------|
| Name | timestrader-preprocessing |
| Version | 1.0.8 |
| Summary | Data preprocessing pipeline for TimeStrader AI trading system - Google Colab optimized |
| Upload time | 2025-09-09 04:24:35 |
| Requires Python | >=3.8 |
| License | MIT |
| Keywords | trading, ai, timeseries, preprocessing, colab, finance |
| Requirements | No requirements were recorded. |

# TimeStrader Preprocessing

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![PyPI version](https://badge.fury.io/py/timestrader-preprocessing.svg)](https://badge.fury.io/py/timestrader-preprocessing)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A pip-installable package providing TimeStrader data processing capabilities optimized for Google Colab training and retraining workflows.

## 🚀 Quick Start

### Installation

#### For Google Colab (Recommended)
```bash
pip install "timestrader-preprocessing[colab]"
```

#### Basic Installation
```bash
pip install timestrader-preprocessing
```

#### Production Environment
```bash
pip install "timestrader-preprocessing[production]"
```

### Basic Usage

> **⚠️ Important**: As of v1.0.3, the API has been simplified for better Google Colab compatibility. Use `HistoricalProcessor` as the main entry point.

```python
import timestrader_preprocessing as tsp

# Check environment
print(f"Running in Colab: {tsp.is_colab_environment()}")
print(f"Environment info: {tsp.ENVIRONMENT_INFO}")

# Load and process historical data
processor = tsp.HistoricalProcessor()
data = processor.load_from_csv("mnq_historical.csv")
indicators = processor.calculate_indicators(data)
normalized, params = processor.normalize_data(indicators)

print(f"Processed {len(data)} candles")
print(f"Data quality: {processor.get_quality_metrics()}")
```
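The quick start assumes a file named `mnq_historical.csv` already exists. The expected column layout is not documented in this README, so the following is only a sketch for producing a small synthetic file to experiment with. The column names (`timestamp`, `open`, `high`, `low`, `close`, `volume`) and the helper names are assumptions, not the package's confirmed schema.

```python
import csv
import io
import random
from datetime import datetime, timedelta


def make_synthetic_ohlcv(n_candles=10, start_price=15000.0, seed=0):
    """Return synthetic 5-minute OHLCV rows built from a simple random walk.

    Column names are assumed for illustration; check the package docs
    for the schema that load_from_csv actually expects.
    """
    rng = random.Random(seed)
    ts = datetime(2024, 1, 2, 9, 30)
    price = start_price
    rows = []
    for _ in range(n_candles):
        open_ = price
        close = open_ + rng.uniform(-5.0, 5.0)
        # High/low must bracket both open and close.
        high = max(open_, close) + rng.uniform(0.0, 2.0)
        low = min(open_, close) - rng.uniform(0.0, 2.0)
        rows.append({
            "timestamp": ts.isoformat(sep=" "),
            "open": round(open_, 2),
            "high": round(high, 2),
            "low": round(low, 2),
            "close": round(close, 2),
            "volume": rng.randint(100, 1000),
        })
        ts += timedelta(minutes=5)
        price = close
    return rows


def write_csv(rows, file_obj):
    """Write the rows with a header line to any text file object."""
    writer = csv.DictWriter(file_obj, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


rows = make_synthetic_ohlcv()
buffer = io.StringIO()  # swap for open("mnq_historical.csv", "w", newline="")
write_csv(rows, buffer)
```

Writing to a `StringIO` buffer keeps the sketch free of side effects; replace it with a real file handle to create a file on disk.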

## 🔄 Version 1.0.3 Updates

### API Simplification
The package API has been streamlined for better Google Colab compatibility:

```python
# ✅ Correct Usage (v1.0.3+)
from timestrader_preprocessing import HistoricalProcessor

processor = HistoricalProcessor()

# Load raw OHLCV data first (file name is illustrative)
raw_data = processor.load_from_csv("mnq_historical.csv")

# Step-by-step processing
validation_results = processor.validate_data(raw_data)
indicators_data = processor.calculate_indicators(raw_data, indicators=['vwap', 'rsi', 'atr', 'ema9', 'ema21', 'stoch'])
normalized_data, params = processor.normalize_data(indicators_data, window_size=288, method='zscore')
sequences = processor.generate_training_sequences(normalized_data, sequence_length=144)
```
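The last step above turns a normalized table into fixed-length training sequences. How `generate_training_sequences` does this internally is not shown in this README; the following is a minimal sketch of the usual sliding-window technique, assuming a 2-D array of shape `(n_candles, n_features)` and a stride of one candle. The function name `make_sequences` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_sequences(features, sequence_length=144):
    """Build overlapping windows of shape (n_windows, sequence_length, n_features).

    Illustrative only: stride 1, no labels, no gap handling.
    """
    features = np.asarray(features)
    if features.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_candles, n_features)")
    if len(features) < sequence_length:
        # Not enough candles for even one window.
        return np.empty((0, sequence_length, features.shape[1]), dtype=features.dtype)
    # sliding_window_view appends the window axis last: (n_windows, n_features, L).
    windows = sliding_window_view(features, sequence_length, axis=0)
    # Move the window axis next to the batch axis: (n_windows, L, n_features).
    return np.moveaxis(windows, -1, 1)


demo = np.arange(20, dtype=float).reshape(10, 2)  # 10 candles, 2 features
sequences = make_sequences(demo, sequence_length=4)
```

With 5-minute candles, `sequence_length=144` corresponds to 12 hours of history per sequence. `sliding_window_view` returns a read-only view, so copy the result before modifying it in place.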

### Deprecated Usage
```python
# ❌ No longer available (caused import errors in Colab)
from timestrader_preprocessing import UnifiedDataProcessor, TechnicalIndicators
from timestrader_preprocessing.core.config import ProcessingMode
from timestrader_preprocessing.core.data_structures import MarketData
```

### Method Changes
| Old Method (v1.0.0-1.0.2) | New Method (v1.0.3+) | Status |
|---------------------------|----------------------|---------|
| `UnifiedDataProcessor()` | `HistoricalProcessor()` | ✅ Simplified |
| `process_historical_data()` | `calculate_indicators()` + `normalize_data()` | ✅ Split for clarity |
| `MarketData` dataclass | pandas DataFrame | ✅ Standard format |
| `ProcessingMode.TRAINING` | Direct method calls | ✅ Simplified |

## 📋 Features

### Historical Data Processing
- **OHLCV Data Loading**: CSV and pandas DataFrame support
- **Technical Indicators**: VWAP, RSI, ATR, EMA9, EMA21, Stochastic
- **Data Validation**: Comprehensive outlier detection and quality scoring
- **Normalization**: Z-score normalization with rolling windows
- **Parameter Export**: Export normalization parameters for production consistency
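To make the normalization bullet concrete, here is a minimal pandas sketch of rolling-window z-score normalization. It is not the package's implementation: the function name, the use of population standard deviation, the `eps` guard, and the shape of the returned parameter dictionary are all assumptions.

```python
import pandas as pd


def rolling_zscore(df, window_size=288, eps=1e-8):
    """Z-score each column against its trailing window (current row included).

    Returns the normalized frame plus the statistics of the final window,
    which is one plausible thing to export for later reuse.
    """
    rolling = df.rolling(window=window_size, min_periods=window_size)
    mean = rolling.mean()
    std = rolling.std(ddof=0)
    # Rows before the first full window stay NaN; eps avoids division by zero.
    normalized = (df - mean) / (std + eps)
    params = {
        col: {"mean": float(mean[col].iloc[-1]), "std": float(std[col].iloc[-1])}
        for col in df.columns
    }
    return normalized, params


prices = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0]})
normalized, params = rolling_zscore(prices, window_size=3)
```

The first `window_size - 1` rows have no full window and come out as `NaN`, so downstream code typically drops them before training.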

### Google Colab Optimization
- **Fast Installation**: < 2 minutes in Colab environment
- **Quick Import**: < 10 seconds package initialization
- **CPU-Only Dependencies**: No CUDA/GPU requirements for basic functionality
- **Memory Efficient**: < 100MB package overhead after import
- **Environment Detection**: Automatic Colab/Jupyter detection

### Real-time Components (Production)
- **Streaming Normalization**: Real-time data processing with exported parameters
- **Production Integration**: Compatible with TimeStrader VPS deployment
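The idea behind streaming normalization is that production applies the statistics computed during training instead of recomputing them. The sketch below assumes a JSON layout of `{"feature": {"mean": ..., "std": ...}}`; the file format actually written by `export_normalization_parameters` is not described in this README, and the class name `StreamingNormalizer` is hypothetical.

```python
import json


class StreamingNormalizer:
    """Apply fixed, previously exported z-score parameters to single candles."""

    def __init__(self, params, eps=1e-8):
        self.params = params
        self.eps = eps

    @classmethod
    def from_json(cls, text):
        """Build a normalizer from a JSON string in the assumed layout."""
        return cls(json.loads(text))

    def transform(self, candle):
        """Return z-scores for every feature that has stored parameters."""
        out = {}
        for name, stats in self.params.items():
            if name not in candle:
                raise KeyError(f"candle is missing feature {name!r}")
            out[name] = (candle[name] - stats["mean"]) / (stats["std"] + self.eps)
        return out


normalizer = StreamingNormalizer.from_json('{"close": {"mean": 100.0, "std": 2.0}}')
result = normalizer.transform({"close": 104.0, "volume": 350})
```

Features without stored parameters (here `volume`) are ignored, while a missing expected feature raises immediately, which tends to be the safer failure mode for a live feed.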

## 📖 Detailed Documentation

### Historical Processor API

```python
from timestrader_preprocessing import HistoricalProcessor

# Initialize processor
processor = HistoricalProcessor(config_path="config.yaml")

# Load data (supports file paths, StringIO for Colab)
data = processor.load_from_csv(
    file_path="data.csv",
    progress_bar=True  # Show progress for large files
)

# Calculate technical indicators
indicators = processor.calculate_indicators(
    data=data,
    indicators=['vwap', 'rsi', 'atr', 'ema9', 'ema21', 'stoch']
)

# Normalize data with rolling window
normalized, params = processor.normalize_data(
    data=indicators,
    window_size=288,  # 24 hours for 5-min candles
    method='zscore'
)

# Export parameters for production
processor.export_normalization_parameters(
    params=params,
    output_path="normalization_params.json"
)

# Get data quality metrics
quality = processor.get_quality_metrics()
print(f"Quality score: {quality.score:.2%}")
```
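For readers who want to see what indicator names such as `ema9`, `ema21`, and `vwap` typically compute, here is a small pandas sketch. It is an illustration under assumptions, not the package's code: the input columns (`high`, `low`, `close`, `volume`), the recursive EMA form, and a VWAP that accumulates over the whole frame without session resets are all choices made for this example.

```python
import pandas as pd


def add_basic_indicators(df):
    """Return a copy of df with ema9, ema21 and a cumulative VWAP added."""
    out = df.copy()
    # Recursive exponential moving averages with smoothing 2 / (span + 1).
    out["ema9"] = out["close"].ewm(span=9, adjust=False).mean()
    out["ema21"] = out["close"].ewm(span=21, adjust=False).mean()
    # VWAP over the typical price; real pipelines often reset this per session.
    typical = (out["high"] + out["low"] + out["close"]) / 3.0
    out["vwap"] = (typical * out["volume"]).cumsum() / out["volume"].cumsum()
    return out


candles = pd.DataFrame({
    "high": [11.0, 12.0, 13.0],
    "low": [9.0, 10.0, 11.0],
    "close": [10.0, 11.0, 12.0],
    "volume": [100, 100, 200],
})
with_indicators = add_basic_indicators(candles)
```

RSI, ATR, and the stochastic oscillator follow the same pattern of rolling or exponentially weighted statistics over the OHLC columns.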

### Environment Detection

```python
import timestrader_preprocessing as tsp

# Check environment
if tsp.is_colab_environment():
    print("Running in Google Colab")
    # Colab-specific optimizations
elif tsp.is_jupyter_environment():
    print("Running in Jupyter notebook")
else:
    print("Running in standard Python environment")

# Access environment information
info = tsp.ENVIRONMENT_INFO
print(f"Python version: {info['python_version']}")
print(f"Package version: {info['package_version']}")
```
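The README does not say how the package detects its environment. A common heuristic, sketched here with hypothetical function names, is to check which modules the running interpreter has already loaded: Colab kernels import `google.colab` at startup, and Jupyter kernels load `ipykernel`.

```python
import sys


def is_colab(modules=None):
    """True if the google.colab module is loaded (heuristic)."""
    modules = sys.modules if modules is None else modules
    return "google.colab" in modules


def is_jupyter(modules=None):
    """True if an IPython kernel is loaded (heuristic)."""
    modules = sys.modules if modules is None else modules
    return "ipykernel" in modules


def describe_environment(modules=None):
    """Classify the runtime; Colab is checked first because it also runs ipykernel."""
    if is_colab(modules):
        return "colab"
    if is_jupyter(modules):
        return "jupyter"
    return "python"


print(f"Detected environment: {describe_environment()}")
```

The optional `modules` argument exists only so the logic can be exercised with a plain dictionary outside a notebook.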

### Configuration Management

```python
from timestrader_preprocessing.config import get_default_config

# Get default configuration for current environment
config = get_default_config()

# Colab-specific configuration
colab_config = get_default_config(environment='colab')

# Production configuration  
prod_config = get_default_config(environment='production')
```
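The structure returned by `get_default_config` is not documented here. As a purely hypothetical sketch of how environment-specific defaults can be layered over a shared base, the example below uses a frozen dataclass. The field names reuse values seen elsewhere in this README (`window_size=288`, `method='zscore'`, `sequence_length=144`); the `progress_bar` overrides and all names are invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical configuration object; not the package's real schema."""
    window_size: int = 288        # 24 hours of 5-minute candles
    method: str = "zscore"
    sequence_length: int = 144
    progress_bar: bool = False


# Invented per-environment overrides applied on top of the base defaults.
_OVERRIDES = {
    "colab": {"progress_bar": True},
    "production": {"progress_bar": False},
}


def sketch_default_config(environment=None):
    """Return base defaults, optionally adjusted for a named environment."""
    base = PipelineConfig()
    if environment is None:
        return base
    if environment not in _OVERRIDES:
        raise ValueError(f"unknown environment: {environment!r}")
    return replace(base, **_OVERRIDES[environment])


colab_cfg = sketch_default_config("colab")
prod_cfg = sketch_default_config("production")
```

Keeping the base immutable and deriving variants with `replace` means training and production settings can only differ where an override says so.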

## 🧪 Testing

```bash
# Run all tests
pytest

# Run specific test categories
pytest -m unit          # Fast unit tests
pytest -m integration   # Integration tests  
pytest -m colab        # Colab-specific tests
pytest -m package      # Package installation tests

# Run with coverage
pytest --cov=timestrader_preprocessing --cov-report=html
```
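Selecting tests with `-m` relies on custom markers, which pytest expects to be registered. How this project registers them is not shown; one option is a `conftest.py` hook like the sketch below. The marker descriptions are assumptions based on the comments in the commands above.

```python
# conftest.py (sketch): register the custom markers used with `pytest -m`.
MARKERS = {
    "unit": "fast unit tests",
    "integration": "integration tests",
    "colab": "Colab-specific tests",
    "package": "package installation tests",
}


def pytest_configure(config):
    """pytest hook: declare each marker so unknown-marker warnings do not fire."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

A test then opts in with a decorator such as `@pytest.mark.unit`. The same markers can be declared in `pytest.ini` or `pyproject.toml` instead.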

## 📊 Performance Benchmarks

| Metric | Target | Typical |
|--------|--------|---------|
| Installation Time (Colab) | < 2 minutes | ~1.5 minutes |
| Import Time | < 10 seconds | ~3 seconds |
| Package Size | < 50MB | ~35MB |
| Memory Overhead | < 100MB | ~65MB |
| Processing Speed | 441K candles < 5 min | ~3.5 minutes |
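To put the processing row in per-second terms and to check import time on your own machine, a small stdlib sketch follows. The helper names are hypothetical, and `json` stands in for the package so the snippet runs anywhere; substitute `timestrader_preprocessing` in a fresh interpreter for a meaningful measurement.

```python
import importlib
import time


def time_import(module_name):
    """Return the wall-clock seconds taken to import a module.

    A module that is already loaded returns almost instantly, so measure
    in a fresh interpreter for a realistic number.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


def candles_per_second(n_candles, seconds):
    """Convert a batch timing into throughput."""
    return n_candles / seconds


target_rate = candles_per_second(441_000, 5 * 60)     # target: under 5 minutes
typical_rate = candles_per_second(441_000, 3.5 * 60)  # typical: about 3.5 minutes
import_seconds = time_import("json")                  # stand-in module

print(f"Target throughput:  {target_rate:.0f} candles/s")
print(f"Typical throughput: {typical_rate:.0f} candles/s")
print(f"Import time:        {import_seconds:.4f} s")
```

The 5-minute target works out to 1,470 candles per second, and the typical 3.5-minute run to 2,100 candles per second.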

## 🔧 Development

### Local Development Setup

```bash
# Clone repository
git clone https://github.com/timestrader/timestrader-v05
cd timestrader-v05/timestrader-preprocessing

# Install development dependencies
pip install -e ".[dev]"

# Format code
black src/ tests/
isort src/ tests/

# Type checking
mypy src/

# Run tests
pytest
```

### Building and Publishing

```bash
# Build package
python -m build

# Check package
twine check dist/*

# Upload to PyPI (requires authentication)
twine upload dist/*

# Test installation
pip install timestrader-preprocessing
```

## 📝 Changelog

See [CHANGELOG.md](CHANGELOG.md) for version history and updates.

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🆘 Support

- **Documentation**: https://timestrader.readthedocs.io
- **Issues**: https://github.com/timestrader/timestrader-v05/issues
- **Discussions**: https://github.com/timestrader/timestrader-v05/discussions

## 🏗️ Architecture

This package is part of the TimeStrader AI trading system:

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Google Colab   │    │  PyPI Package    │    │ VPS Production  │
│                 │    │                  │    │                 │
│ Model Training  │◄───┤ timestrader-     │───►│  Real-time      │
│ Data Processing │    │ preprocessing    │    │  Trading        │
│                 │    │                  │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```

- **Training Phase**: Use this package in Google Colab for historical data processing and model training
- **Production Phase**: Export parameters and models to VPS for real-time trading
- **Retraining**: Weekly updates using the same preprocessing pipeline for consistency

---

**TimeStrader Team** - Building the future of AI-powered trading

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "timestrader-preprocessing",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "trading, ai, timeseries, preprocessing, colab, finance",
    "author": null,
    "author_email": "Carlos Guimar\u00e3es <guimaraes.dpf@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/4a/20/3ac26802331c6c1ff1b2b680cdee9029893625d6dd37b1bd5f9efc106c9c/timestrader_preprocessing-1.0.8.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Data preprocessing pipeline for TimeStrader AI trading system - Google Colab optimized",
    "version": "1.0.8",
    "project_urls": {
        "Changelog": "https://github.com/timestrader/timestrader-v05/blob/main/timestrader-preprocessing/CHANGELOG.md",
        "Documentation": "https://timestrader.readthedocs.io",
        "Homepage": "https://github.com/timestrader/timestrader-v05",
        "Issues": "https://github.com/timestrader/timestrader-v05/issues",
        "Repository": "https://github.com/timestrader/timestrader-v05"
    },
    "split_keywords": [
        "trading",
        " ai",
        " timeseries",
        " preprocessing",
        " colab",
        " finance"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "949991dd837a8a8f0db470931b94e1ed2997c3a432fb60338c45a9c56ec05997",
                "md5": "9cd4449ff6b7a489860ef50b0255720f",
                "sha256": "51b8896f88904c51fd9b1716ccf872b66cfba236015b9f5cee6cd7b28754629c"
            },
            "downloads": -1,
            "filename": "timestrader_preprocessing-1.0.8-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9cd4449ff6b7a489860ef50b0255720f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 41669,
            "upload_time": "2025-09-09T04:24:32",
            "upload_time_iso_8601": "2025-09-09T04:24:32.919129Z",
            "url": "https://files.pythonhosted.org/packages/94/99/91dd837a8a8f0db470931b94e1ed2997c3a432fb60338c45a9c56ec05997/timestrader_preprocessing-1.0.8-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4a203ac26802331c6c1ff1b2b680cdee9029893625d6dd37b1bd5f9efc106c9c",
                "md5": "53e45d6a4555fe32f10661c3d5f2dd6e",
                "sha256": "380b4ee4b4623f9e8a394b7ef094e015ab2bc7a17761be7a991d5503497049b3"
            },
            "downloads": -1,
            "filename": "timestrader_preprocessing-1.0.8.tar.gz",
            "has_sig": false,
            "md5_digest": "53e45d6a4555fe32f10661c3d5f2dd6e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 53695,
            "upload_time": "2025-09-09T04:24:35",
            "upload_time_iso_8601": "2025-09-09T04:24:35.083737Z",
            "url": "https://files.pythonhosted.org/packages/4a/20/3ac26802331c6c1ff1b2b680cdee9029893625d6dd37b1bd5f9efc106c9c/timestrader_preprocessing-1.0.8.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-09 04:24:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "timestrader",
    "github_project": "timestrader-v05",
    "github_not_found": true,
    "lcname": "timestrader-preprocessing"
}
        