# TimeStrader Preprocessing
A pip-installable package providing TimeStrader data processing capabilities optimized for Google Colab training and retraining workflows.
## 🚀 Quick Start
### Installation
#### For Google Colab (Recommended)
```bash
pip install "timestrader-preprocessing[colab]"
```
#### Basic Installation
```bash
pip install timestrader-preprocessing
```
#### Production Environment
```bash
pip install "timestrader-preprocessing[production]"
```
### Basic Usage
> **⚠️ Important**: As of v1.0.3, the API has been simplified for better Google Colab compatibility. Use `HistoricalProcessor` as the main entry point.
```python
import timestrader_preprocessing as tsp
# Check environment
print(f"Running in Colab: {tsp.is_colab_environment()}")
print(f"Environment info: {tsp.ENVIRONMENT_INFO}")
# Load and process historical data
processor = tsp.HistoricalProcessor()
data = processor.load_from_csv("mnq_historical.csv")
indicators = processor.calculate_indicators(data)
normalized, params = processor.normalize_data(indicators)
print(f"Processed {len(data)} candles")
print(f"Data quality: {processor.get_quality_metrics()}")
```
## 🔄 Version 1.0.3 Updates
### API Simplification
The package API has been streamlined for better Google Colab compatibility:
```python
# ✅ Correct Usage (v1.0.3+)
from timestrader_preprocessing import HistoricalProcessor
processor = HistoricalProcessor()
# Step-by-step processing
validation_results = processor.validate_data(raw_data)
indicators_data = processor.calculate_indicators(raw_data, indicators=['vwap', 'rsi', 'atr', 'ema9', 'ema21', 'stoch'])
normalized_data, params = processor.normalize_data(indicators_data, window_size=288, method='zscore')
sequences = processor.generate_training_sequences(normalized_data, sequence_length=144)
```
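The internals of `generate_training_sequences` are not shown in this README. As a rough illustration of what `sequence_length=144` implies, a sliding-window split can be sketched with NumPy. The helper name `make_sequences` and the array shapes are assumptions for illustration, not the package's implementation.

```python
import numpy as np


def make_sequences(features: np.ndarray, sequence_length: int = 144) -> np.ndarray:
    """Split a (n_rows, n_features) array into overlapping windows.

    Hypothetical helper: the result has shape
    (n_rows - sequence_length + 1, sequence_length, n_features).
    """
    if features.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_rows, n_features)")
    if len(features) < sequence_length:
        raise ValueError("not enough rows for a single sequence")
    # sliding_window_view puts the window axis last, so move it to the middle.
    windows = np.lib.stride_tricks.sliding_window_view(features, sequence_length, axis=0)
    return np.moveaxis(windows, -1, 1)


demo = np.arange(300 * 6, dtype=float).reshape(300, 6)  # 300 candles, 6 features
sequences = make_sequences(demo, sequence_length=144)
print(sequences.shape)
```

With 5-minute candles, 144 rows cover 12 hours, so each window here would hold half a trading day of features.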
### Deprecated Usage
```python
# ❌ No longer available (caused import errors in Colab)
from timestrader_preprocessing import UnifiedDataProcessor, TechnicalIndicators
from timestrader_preprocessing.core.config import ProcessingMode
from timestrader_preprocessing.core.data_structures import MarketData
```
### Method Changes
| Old Method (v1.0.0-1.0.2) | New Method (v1.0.3+) | Status |
|---------------------------|----------------------|---------|
| `UnifiedDataProcessor()` | `HistoricalProcessor()` | ✅ Simplified |
| `process_historical_data()` | `calculate_indicators()` + `normalize_data()` | ✅ Split for clarity |
| `MarketData` dataclass | pandas DataFrame | ✅ Standard format |
| `ProcessingMode.TRAINING` | Direct method calls | ✅ Simplified |
## 📋 Features
### Historical Data Processing
- **OHLCV Data Loading**: CSV and pandas DataFrame support
- **Technical Indicators**: VWAP, RSI, ATR, EMA9, EMA21, Stochastic
- **Data Validation**: Comprehensive outlier detection and quality scoring
- **Normalization**: Z-score normalization with rolling windows
- **Parameter Export**: Export normalization parameters for production consistency
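
For readers unfamiliar with rolling z-score normalization, the idea can be sketched in plain pandas. This is a minimal sketch, assuming a trailing window and population standard deviation; the function name `rolling_zscore` is hypothetical and the package may handle warm-up rows and flat windows differently.

```python
import numpy as np
import pandas as pd


def rolling_zscore(series: pd.Series, window_size: int = 288) -> pd.Series:
    """Z-score each value against the trailing window that ends at that value."""
    rolling = series.rolling(window=window_size, min_periods=window_size)
    mean = rolling.mean()
    std = rolling.std(ddof=0)
    # A flat window has zero spread; leave those points as NaN instead of dividing by zero.
    return (series - mean) / std.replace(0.0, np.nan)


prices = pd.Series(np.linspace(100.0, 110.0, 600))
z = rolling_zscore(prices, window_size=288)
print(z.iloc[287:290])
```

The first `window_size - 1` rows come out as NaN because no full window exists yet, which is one reason a training pipeline usually drops the warm-up period.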
### Google Colab Optimization
- **Fast Installation**: < 2 minutes in Colab environment
- **Quick Import**: < 10 seconds package initialization
- **CPU-Only Dependencies**: No CUDA/GPU requirements for basic functionality
- **Memory Efficient**: < 100MB package overhead after import
- **Environment Detection**: Automatic Colab/Jupyter detection
### Real-time Components (Production)
- **Streaming Normalization**: Real-time data processing with exported parameters
- **Production Integration**: Compatible with TimeStrader VPS deployment
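
Reusing training-time statistics on live data can be sketched as follows. The `ZScoreParams` class, the JSON layout, and `normalize_tick` are assumptions made for illustration; the schema actually written by `export_normalization_parameters` is not documented here and may differ.

```python
import json
import math
from dataclasses import asdict, dataclass


@dataclass
class ZScoreParams:
    """Hypothetical per-feature parameters; the package's real JSON schema may differ."""

    mean: float
    std: float


def normalize_tick(value: float, params: ZScoreParams) -> float:
    """Apply stored training statistics to one incoming value."""
    if params.std == 0 or math.isnan(params.std):
        return 0.0  # no spread recorded: treat the value as being at the mean
    return (value - params.mean) / params.std


# Round-trip through JSON, as a production process would after training exports the file.
exported = json.dumps({"close": asdict(ZScoreParams(mean=100.0, std=2.0))})
loaded = {name: ZScoreParams(**fields) for name, fields in json.loads(exported).items()}

print(normalize_tick(104.0, loaded["close"]))
```

Keeping the statistics in a file rather than recomputing them live means the model sees inputs on the same scale it was trained on.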
## 📖 Detailed Documentation
### Historical Processor API
```python
from timestrader_preprocessing import HistoricalProcessor
# Initialize processor
processor = HistoricalProcessor(config_path="config.yaml")
# Load data (supports file paths, StringIO for Colab)
data = processor.load_from_csv(
    file_path="data.csv",
    progress_bar=True  # Show progress for large files
)
# Calculate technical indicators
indicators = processor.calculate_indicators(
    data=data,
    indicators=['vwap', 'rsi', 'atr', 'ema9', 'ema21', 'stoch']
)
# Normalize data with rolling window
normalized, params = processor.normalize_data(
    data=indicators,
    window_size=288,  # 24 hours for 5-min candles
    method='zscore'
)
# Export parameters for production
processor.export_normalization_parameters(
    params=params,
    output_path="normalization_params.json"
)
# Get data quality metrics
quality = processor.get_quality_metrics()
print(f"Quality score: {quality.score:.2%}")
```
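
To give a sense of what two of the indicator names stand for, here is a sketch of EMA and RSI using textbook formulas in pandas. It assumes a `close` price series and a 14-period RSI with Wilder-style smoothing; the periods and exact formulas used inside `calculate_indicators` are not stated in this README and may differ.

```python
import pandas as pd


def ema(close: pd.Series, span: int) -> pd.Series:
    """Exponential moving average with the usual alpha = 2 / (span + 1)."""
    return close.ewm(span=span, adjust=False).mean()


def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative Strength Index using Wilder-style smoothing (alpha = 1 / period)."""
    delta = close.diff()
    gains = delta.clip(lower=0.0)
    losses = (-delta).clip(lower=0.0)
    avg_gain = gains.ewm(alpha=1.0 / period, adjust=False, min_periods=period).mean()
    avg_loss = losses.ewm(alpha=1.0 / period, adjust=False, min_periods=period).mean()
    relative_strength = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + relative_strength)


close = pd.Series([float(100 + i) for i in range(40)])  # steadily rising prices
print(ema(close, span=9).iloc[-1], rsi(close).iloc[-1])
```

A series that only rises has no losses, so its RSI saturates at 100, which makes it a convenient sanity check.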
### Environment Detection
```python
import timestrader_preprocessing as tsp
# Check environment
if tsp.is_colab_environment():
    print("Running in Google Colab")
    # Colab-specific optimizations
elif tsp.is_jupyter_environment():
    print("Running in Jupyter notebook")
else:
    print("Running in standard Python environment")
# Access environment information
info = tsp.ENVIRONMENT_INFO
print(f"Python version: {info['python_version']}")
print(f"Package version: {info['package_version']}")
```
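
A common heuristic for this kind of detection is to look at which modules the interpreter has already loaded. The sketch below shows that approach; it is an assumption about how such checks are typically written, not a description of how `is_colab_environment()` or `is_jupyter_environment()` are implemented, and `detect_environment` is a hypothetical name.

```python
import sys


def detect_environment(modules=None) -> str:
    """Classify the runtime as 'colab', 'jupyter', or 'python'.

    `modules` defaults to sys.modules; passing a dict makes the check testable.
    """
    loaded = sys.modules if modules is None else modules
    if "google.colab" in loaded:
        return "colab"
    if "ipykernel" in loaded:
        return "jupyter"
    return "python"


print(detect_environment())
```

Colab is checked first because a Colab runtime also loads `ipykernel`, so the order of the two tests matters.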
### Configuration Management
```python
from timestrader_preprocessing.config import get_default_config
# Get default configuration for current environment
config = get_default_config()
# Colab-specific configuration
colab_config = get_default_config(environment='colab')
# Production configuration
prod_config = get_default_config(environment='production')
```
## 🧪 Testing
```bash
# Run all tests
pytest
# Run specific test categories
pytest -m unit # Fast unit tests
pytest -m integration # Integration tests
pytest -m colab # Colab-specific tests
pytest -m package # Package installation tests
# Run with coverage
pytest --cov=timestrader_preprocessing --cov-report=html
```
## 📊 Performance Benchmarks
| Metric | Target | Typical |
|--------|--------|---------|
| Installation Time (Colab) | < 2 minutes | ~1.5 minutes |
| Import Time | < 10 seconds | ~3 seconds |
| Package Size | < 50MB | ~35MB |
| Memory Overhead | < 100MB | ~65MB |
| Processing Speed | 441K candles < 5 min | ~3.5 minutes |
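
The import-time figure can be checked locally with a few lines of standard-library code. This is a sketch only: `time_import` is a hypothetical helper, and `json` is used as a stand-in module so the snippet runs without the package installed.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken to import a module in this process."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# "json" is a stand-in so the sketch runs anywhere; substitute "timestrader_preprocessing".
elapsed = time_import("json")
print(f"import took {elapsed:.3f} s")
```

A module that is already imported returns almost instantly from the cache, so a meaningful measurement needs a fresh interpreter or runtime.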
## 🔧 Development
### Local Development Setup
```bash
# Clone repository
git clone https://github.com/timestrader/timestrader-v05
cd timestrader-v05/timestrader-preprocessing
# Install development dependencies
pip install -e ".[dev]"
# Format code
black src/ tests/
isort src/ tests/
# Type checking
mypy src/
# Run tests
pytest
```
### Building and Publishing
```bash
# Build package
python -m build
# Check package
twine check dist/*
# Upload to PyPI (requires authentication)
twine upload dist/*
# Test installation
pip install timestrader-preprocessing
```
## 📝 Changelog
See [CHANGELOG.md](CHANGELOG.md) for version history and updates.
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🆘 Support
- **Documentation**: https://timestrader.readthedocs.io
- **Issues**: https://github.com/timestrader/timestrader-v05/issues
- **Discussions**: https://github.com/timestrader/timestrader-v05/discussions
## 🏗️ Architecture
This package is part of the TimeStrader AI trading system:
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ Google Colab    │    │ PyPI Package     │    │ VPS Production  │
│                 │    │                  │    │                 │
│ Model Training  │◄───┤ timestrader-     │───►│ Real-time       │
│ Data Processing │    │ preprocessing    │    │ Trading         │
│                 │    │                  │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```
- **Training Phase**: Use this package in Google Colab for historical data processing and model training
- **Production Phase**: Export parameters and models to VPS for real-time trading
- **Retraining**: Weekly updates using the same preprocessing pipeline for consistency
---
**TimeStrader Team** - Building the future of AI-powered trading