| Name | stacking-sats-pipeline |
| Version | 0.4.0 |
| download | https://files.pythonhosted.org/packages/64/2b/6ff4fa087fa0b355867363be3013cb41ef52969d3d86f5923238b17504cf/stacking_sats_pipeline-0.4.0.tar.gz |
| home_page | None |
| Summary | Hypertrial's Stacking Sats Library - Optimized Bitcoin DCA |
| upload_time | 2025-07-09 03:43:48 |
| maintainer | None |
| author | None |
| author_email | Matt Faltyn <matt@trilemmacapital.com> |
| docs_url | None |
| requires_python | >=3.11 |
| license | MIT License (full text below) |
| keywords | bitcoin, dca, backtesting, cryptocurrency, trading, strategy |
| VCS | GitHub: https://github.com/hypertrial/stacking_sats_pipeline |
| bugtrack_url | None |
| requirements | pandas (>=1.5.0), numpy (>=1.21.0), requests (>=2.28.0), pyarrow (>=10.0.0), python-dotenv (>=0.19.0), pytest (>=7.0.0), ruff (>=0.1.0), autopep8 (>=2.0.0) |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

MIT License

Copyright (c) 2025 Hypertrial

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# Stacking Sats Pipeline
A data engineering pipeline for extracting, loading, and merging cryptocurrency and financial data from multiple sources.
## Requirements
- Python 3.11 or 3.12
- pip
## Installation
```bash
pip install stacking-sats-pipeline
```
## Quick Start
### Data Extraction
Extract all data sources to local files for offline analysis:
#### CLI Usage
```bash
# Extract all data to CSV format
stacking-sats --extract-data csv
# Extract all data to Parquet format (smaller files, better compression)
stacking-sats --extract-data parquet
# Extract to specific directory
stacking-sats --extract-data csv --output-dir data/
stacking-sats --extract-data parquet -o exports/
```
#### Python API
```python
from stacking_sats_pipeline import extract_all_data
# Extract all data to CSV in current directory
extract_all_data("csv")
# Extract all data to Parquet in specific directory
extract_all_data("parquet", "data/exports/")
```
### Data Loading
```python
from stacking_sats_pipeline import load_data
# Load Bitcoin price data
df = load_data()
# Load specific data source
from stacking_sats_pipeline.data import CoinMetricsLoader
loader = CoinMetricsLoader()
btc_data = loader.load_from_web()
```
**What gets extracted:**
- 📈 **Bitcoin Price Data** (CoinMetrics) → `btc_coinmetrics.csv/parquet`
- 😨 **Fear & Greed Index** (Alternative.me) → `fear_greed.csv/parquet`
- 💵 **U.S. Dollar Index** (FRED) → `dxy_fred.csv/parquet`\*
_\*Requires `FRED_API_KEY` environment variable. Get a free key at [FRED API](https://fred.stlouisfed.org/docs/api/api_key.html)_
**File Format Benefits:**
- **CSV**: Human-readable, universally compatible
- **Parquet**: ~50% smaller files, faster loading, preserves data types
### Multi-Source Data Loading
```python
from stacking_sats_pipeline.data import MultiSourceDataLoader
# Load and merge data from all available sources
loader = MultiSourceDataLoader()
available_sources = loader.get_available_sources()
merged_df = loader.load_and_merge(available_sources)
# Available sources: coinmetrics, feargreed, fred (if API key available)
print(f"Available data sources: {available_sources}")
print(f"Merged data shape: {merged_df.shape}")
```
## Data Sources
### CoinMetrics (Bitcoin Price Data)
```python
from stacking_sats_pipeline.data import CoinMetricsLoader
loader = CoinMetricsLoader(data_dir="data/")
df = loader.load_from_web() # Fetch latest data
df = loader.load_from_file() # Load cached data (fetches if missing)
# Extract to files
csv_path = loader.extract_to_csv()
parquet_path = loader.extract_to_parquet()
```
### Fear & Greed Index
```python
from stacking_sats_pipeline.data import FearGreedLoader
loader = FearGreedLoader(data_dir="data/")
df = loader.load_from_web()
```
### FRED (Federal Reserve Economic Data)
```python
import os
os.environ['FRED_API_KEY'] = 'your_api_key_here'
from stacking_sats_pipeline.data import FREDLoader
loader = FREDLoader(data_dir="data/")
df = loader.load_from_web() # DXY (Dollar Index) data
```
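Since `python-dotenv` is a declared dependency, the API key can also live in a `.env` file instead of being set inline. A minimal sketch, assuming a `.env` file in the working directory containing `FRED_API_KEY=...` (the loading pattern here is standard python-dotenv usage, not pipeline-specific API):

```python
import os

from dotenv import load_dotenv

from stacking_sats_pipeline.data import FREDLoader

load_dotenv()  # read .env into the process environment

if os.getenv("FRED_API_KEY"):
    loader = FREDLoader(data_dir="data/")
    dxy = loader.load_from_web()
    print(dxy.head())
else:
    print("FRED_API_KEY not set; skipping DXY download")
```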
## Development
For development and testing:
**Requirements**: Python 3.11 or 3.12
```bash
# Clone the repository
git clone https://github.com/hypertrial/stacking_sats_pipeline.git
cd stacking_sats_pipeline
# Set up development environment (installs dependencies + pre-commit hooks)
make setup-dev
# OR manually:
pip install -e ".[dev]"
pre-commit install
# Run tests
make test
# OR: pytest
# Code quality (MANDATORY - CI will fail if not clean)
make lint # Fix linting issues
make format # Format code
make check # Check without fixing (CI-style)
# Run specific test categories
pytest -m "not integration" # Skip integration tests
pytest -m integration # Run only integration tests
```
### Code Quality Standards
**⚠️ MANDATORY**: All code must pass ruff linting and formatting checks.
- **Linting/Formatting**: We use [ruff](https://docs.astral.sh/ruff/) for both linting and code formatting
- **Pre-commit hooks**: Automatically run on every commit to catch issues early
- **CI enforcement**: Pull requests will fail if code doesn't meet standards
**Quick commands:**
```bash
make help # Show all available commands
make lint # Fix ALL issues (autopep8 + ruff + format)
make autopep8 # Fix line length issues specifically
make format # Format code with ruff only
make format-all # Comprehensive formatting (autopep8 + ruff)
make check # Check code quality (what CI runs)
```
For detailed testing documentation, see [TESTS.md](tests/TESTS.md).
### Contributing Data Sources
The data loading system is designed to be modular and extensible. To add new data sources (exchanges, APIs, etc.), see the [Data Loader Contribution Guide](stacking_sats_pipeline/data/CONTRIBUTE.md) which provides step-by-step instructions for implementing new data loaders.
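As a rough illustration of the shape a loader takes, here is a hypothetical sketch: the class name, endpoint, and payload format below are invented, and the actual interface required by `MultiSourceDataLoader` is specified in CONTRIBUTE.md. The key obligations are fetching raw data and returning a DataFrame indexed by timestamps normalized to midnight UTC so it merges cleanly with the other sources:

```python
import pandas as pd
import requests


class ExampleLoader:
    """Hypothetical data source returning a daily metric from a placeholder API."""

    def __init__(self, data_dir: str = "data/"):
        self.data_dir = data_dir

    def load_from_web(self) -> pd.DataFrame:
        resp = requests.get("https://example.com/api/daily-metric", timeout=30)
        resp.raise_for_status()
        # Assumed payload shape: {"data": [{"date": "...", "value": ...}, ...]}
        df = pd.DataFrame(resp.json()["data"])
        df["date"] = pd.to_datetime(df["date"], utc=True).dt.normalize()
        return df.set_index("date").sort_index()
```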
## Command Line Options
```bash
# Extract data
stacking-sats --extract-data csv --output-dir data/
stacking-sats --extract-data parquet -o exports/
# Show help
stacking-sats --help
```
## Project Structure
```
├── stacking_sats_pipeline/
│   ├── main.py                   # Pipeline orchestrator and CLI
│   ├── config.py                 # Configuration constants
│   ├── data/                     # Modular data loading system
│   │   ├── coinmetrics_loader.py # CoinMetrics data source
│   │   ├── fear_greed_loader.py  # Fear & Greed Index data source
│   │   ├── fred_loader.py        # FRED economic data source
│   │   ├── data_loader.py        # Multi-source data loader
│   │   └── CONTRIBUTE.md         # Guide for adding data sources
│   └── __init__.py               # Package exports
├── tutorials/examples.py         # Interactive examples
└── tests/                        # Comprehensive test suite
```
## API Reference
### Core Functions
```python
from stacking_sats_pipeline import (
    extract_all_data,             # Extract all data sources to files
    load_data,                    # Load Bitcoin price data
    validate_price_data,          # Validate price data quality
    extract_btc_data_to_csv,      # Extract Bitcoin data to CSV
    extract_btc_data_to_parquet,  # Extract Bitcoin data to Parquet
)
```
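Taken together, these compose into a short fetch-validate-export script. A sketch, assuming `validate_price_data` returns a truthy result (as in the validation example below) and that `extract_btc_data_to_csv` can be called without arguments:

```python
from stacking_sats_pipeline import (
    extract_btc_data_to_csv,
    load_data,
    validate_price_data,
)

df = load_data()               # Bitcoin price history as a DataFrame
if validate_price_data(df):    # sanity-check before exporting
    extract_btc_data_to_csv()  # write a local CSV snapshot
```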
### Configuration Constants
```python
from stacking_sats_pipeline import (
    BACKTEST_START,  # Default start date for data range
    BACKTEST_END,    # Default end date for data range
    CYCLE_YEARS,     # Default cycle period
    MIN_WEIGHT,      # Minimum weight threshold
    PURCHASE_FREQ,   # Default purchase frequency
)
```
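For example, the defaults can be used to trim loaded data to the pipeline's standard window. A sketch, assuming the constants are date-like values compatible with pandas label slicing on the datetime index that `load_data` returns:

```python
from stacking_sats_pipeline import BACKTEST_START, BACKTEST_END, load_data

df = load_data()
window = df.loc[BACKTEST_START:BACKTEST_END]  # slice to the default backtest range
print(window.index.min(), window.index.max())
```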
## Data Validation
All data sources include built-in validation:
```python
from stacking_sats_pipeline import load_data, validate_price_data

# Validate Bitcoin price data
df = load_data()
is_valid = validate_price_data(df)

# Custom validation with specific requirements
requirements = {
    'required_columns': ['PriceUSD', 'Volume'],
    'min_price': 100,
    'max_price': 1000000,
}
is_valid = validate_price_data(df, **requirements)
```
## File Format Support
The pipeline supports both CSV and Parquet formats:
- **CSV**: Universal compatibility, human-readable
- **Parquet**: Better compression (~50% smaller), faster loading, preserves data types
```python
# CSV format
extract_all_data("csv", "output_dir/")
# Parquet format
extract_all_data("parquet", "output_dir/")
```
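The "preserves data types" point is about round-tripping: CSV stores everything as text, so dtypes must be re-parsed on load, while Parquet keeps them intact. A small plain-pandas illustration (independent of this package; `to_parquet` uses the pyarrow engine, which is a pipeline dependency):

```python
import pandas as pd

df = pd.DataFrame(
    {"PriceUSD": [42000.5, 43100.0]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02"], utc=True),
)
df.to_csv("prices.csv")
df.to_parquet("prices.parquet")

print(pd.read_csv("prices.csv", index_col=0).index.dtype)  # object: plain strings
print(pd.read_parquet("prices.parquet").index.dtype)       # datetime64[ns, UTC]: preserved
```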
## Timestamp Handling
All data sources normalize timestamps to midnight UTC for consistent merging:
```python
from stacking_sats_pipeline.data import MultiSourceDataLoader

loader = MultiSourceDataLoader()
merged_df = loader.load_and_merge(['coinmetrics', 'fred'])
# All timestamps are normalized to 00:00:00 UTC
print(merged_df.index.tz) # UTC
print(merged_df.index.time[0]) # 00:00:00
```
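In plain pandas terms, the normalization amounts to something like the following sketch (illustrating the convention, not the pipeline's internal code):

```python
import pandas as pd

idx = pd.to_datetime(["2024-01-01 14:30:00", "2024-01-02 09:15:00"])
print(idx.tz_localize("UTC").normalize())
# DatetimeIndex(['2024-01-01 00:00:00+00:00', '2024-01-02 00:00:00+00:00'], ...)
```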
## Error Handling
The pipeline includes comprehensive error handling:
```python
from stacking_sats_pipeline import extract_all_data

try:
    df = extract_all_data("csv")
except Exception as e:
    print(f"Data extraction failed: {e}")
    # Partial extraction may have succeeded
```
Individual data sources fail gracefully: if one source is unavailable, the others are still extracted.
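The same per-source resilience is easy to reproduce in your own scripts by isolating each loader's failure. A sketch using the loaders shown earlier:

```python
from stacking_sats_pipeline.data import CoinMetricsLoader, FearGreedLoader

frames = {}
for name, loader in [("coinmetrics", CoinMetricsLoader()), ("feargreed", FearGreedLoader())]:
    try:
        frames[name] = loader.load_from_web()
    except Exception as exc:  # one failing source shouldn't sink the rest
        print(f"Skipping {name}: {exc}")
```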