atio

- Name: atio
- Version: 3.1.0
- Summary: Safe atomic file writer for Pandas, Polars, NumPy, and other data objects
- Upload time: 2025-10-24 01:43:23
- Requires Python: >=3.9
- Keywords: atomic, file, writer, pandas, polars, numpy, data
- Requirements: pandas, pyarrow, polars, numpy, sqlalchemy, openpyxl, connectorx, fsspec, fastcdc, typer, rich, pytz, pytest, build, twine, alabaster, myst-parser

<div align="center">

<img width="250" alt="atio-logo" src="https://github.com/user-attachments/assets/e34f2740-0182-4e34-b56c-ff6eb3e9fce4">

# Atio

**🛡️ Safe Atomic File Writing Library for Python**

[![Python](https://img.shields.io/badge/Python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](LICENSE)
[![PyPI](https://img.shields.io/badge/PyPI-3.1.0-orange.svg)](https://pypi.org/project/atio/)
[![Documentation](https://img.shields.io/badge/Documentation-Read%20the%20Docs-blue.svg)](https://seojaeohcode.github.io/atio/)
[![Discord](https://img.shields.io/badge/Discord-Community-5865F2?logo=discord&logoColor=white)](https://discord.gg/EVxgByVh)

![Pandas](https://img.shields.io/badge/pandas-2.0+-green.svg?style=for-the-badge&logo=pandas&logoColor=white) ![Polars](https://img.shields.io/badge/polars-1.0+-orange.svg?style=for-the-badge&logo=polars&logoColor=white) ![NumPy](https://img.shields.io/badge/numpy-1.20+-red.svg?style=for-the-badge&logo=numpy&logoColor=white) ![PyArrow](https://img.shields.io/badge/pyarrow-17.0+-purple.svg?style=for-the-badge&logo=apache-arrow&logoColor=white) ![SQLAlchemy](https://img.shields.io/badge/sqlalchemy-2.0+-blue.svg?style=for-the-badge&logo=sqlalchemy&logoColor=white) ![OpenPyXL](https://img.shields.io/badge/openpyxl-3.1+-green.svg?style=for-the-badge&logo=openpyxl&logoColor=white)

</div>

---

## 📋 Table of Contents

- [🎯 Overview](#-overview)
- [🚀 30-Second Quick Start](#-30-second-quick-start)
- [📊 Supported Formats & Libraries](#-supported-formats--libraries)
- [🏗️ Architecture](#️-architecture)
- [💡 Real-World Use Cases](#-real-world-use-cases)
- [🎯 Core Features](#-core-features)
- [🔧 Advanced Usage](#-advanced-usage)
- [🛠️ Installation](#️-installation)
- [📚 Documentation & Examples](#-documentation--examples)
- [🏆 Why Choose Atio?](#-why-choose-atio)
- [📄 License](#-license)

---

## 🎯 Overview

**Atio** is a Python library for safe, atomic file writing. It writes to a temporary file first and replaces the target only after the write succeeds, so existing data survives errors that occur mid-write. It works with several data formats and with database connections.

### ✨ Why Atio?

- 🔒 **Zero Data Loss**: Atomic operations guarantee file integrity
- ⚡ **High Performance**: Minimal overhead with maximum safety
- 🔄 **Auto Rollback**: Automatic recovery when errors occur
- 📊 **Universal Support**: Works with Pandas, Polars, NumPy, and more
- 🎯 **Simple API**: Drop-in replacement for existing code

## 🚀 30-Second Quick Start

```bash
pip install atio
```

```python
import atio
import pandas as pd

# Create sample data
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Seoul", "Busan", "Incheon"]
})

# Safe atomic writing
atio.write(df, "users.parquet", format="parquet")
# ✅ File saved safely with atomic operation!
```

## 📊 Supported Formats & Libraries

| Format | Pandas | Polars | NumPy | Description |
|--------|--------|--------|-------|-------------|
| **CSV** | ✅ | ✅ | ✅ | Comma-separated values |
| **Parquet** | ✅ | ✅ | ❌ | Columnar storage format |
| **Excel** | ✅ | ✅ | ❌ | Microsoft Excel files |
| **JSON** | ✅ | ✅ | ❌ | JavaScript Object Notation |
| **SQL** | ✅ | ❌ | ❌ | SQL database storage |
| **Database** | ❌ | ✅ | ❌ | Direct database connection |
| **NPY/NPZ** | ❌ | ❌ | ✅ | NumPy binary formats |
| **Pickle** | ✅ | ❌ | ❌ | Python serialization |
| **HTML** | ✅ | ❌ | ❌ | HTML table format |
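
The table can also be expressed as data, which is handy for failing early before a write is attempted. The sketch below is not part of Atio; `SUPPORT`, `is_supported`, and `formats_for` are hypothetical names, and the lowercase keys `"csv"`, `"excel"`, `"database"`, `"pickle"`, and `"html"` are assumed spellings (only `"parquet"`, `"json"`, `"sql"`, `"npy"`, and `"npz"` appear in the examples in this README).

```python
# Support matrix transcribed from the table above.
# Keys not used elsewhere in the README are assumed spellings.
SUPPORT = {
    "csv": {"pandas", "polars", "numpy"},
    "parquet": {"pandas", "polars"},
    "excel": {"pandas", "polars"},
    "json": {"pandas", "polars"},
    "sql": {"pandas"},
    "database": {"polars"},
    "npy": {"numpy"},
    "npz": {"numpy"},
    "pickle": {"pandas"},
    "html": {"pandas"},
}


def is_supported(library, fmt):
    """Return True when the table lists `fmt` as supported for `library`."""
    return library.lower() in SUPPORT.get(fmt.lower(), set())


def formats_for(library):
    """List the formats the table marks as supported for `library`, sorted."""
    return sorted(fmt for fmt, libs in SUPPORT.items() if library.lower() in libs)


print(formats_for("numpy"))
print(is_supported("polars", "sql"))
```

A caller could check `is_supported("pandas", "parquet")` before dispatching to a writer and raise a clear error otherwise.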

## 🏗️ Architecture

### Atomic Writing Process

```mermaid
graph LR
    A[Data Object] --> B[Temp File]
    B --> C[Validation]
    C --> D[Atomic Replace]
    D --> E[Success Flag]
    
    C -->|Error| F[Rollback]
    F --> G[Original File Preserved]
    
    style A fill:#e1f5fe
    style E fill:#c8e6c9
    style F fill:#ffcdd2
    style G fill:#c8e6c9
```

### Key Components

- **🛡️ Atomic Operations**: Temporary file → Validation → Atomic replacement
- **🔄 Rollback Mechanism**: Automatic recovery on failure
- **📈 Progress Monitoring**: Real-time progress for large files
- **📋 Version Management**: Snapshot-based data versioning
- **🧹 Auto Cleanup**: Automatic cleanup of temporary files
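
The temporary file → validation → atomic replacement flow can be sketched with the standard library alone. This is a minimal illustration of the general technique, not Atio's actual implementation; `atomic_write_bytes` and its `validate` callback are hypothetical names introduced here.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(path, data, validate=None):
    """Write `data` to `path` so readers see either the old file or the new one."""
    target = Path(path)
    # Create the temp file next to the target so the final rename stays on
    # one filesystem, which is what lets os.replace swap it in atomically.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=f".{target.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        if validate is not None and not validate(Path(tmp_name)):
            raise ValueError("validation failed; target left untouched")
        os.replace(tmp_name, target)  # atomic swap into place
    except BaseException:
        # Rollback: discard the temp file; the target was never modified.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "report.txt"
    atomic_write_bytes(target, b"version 1")
    try:
        # An empty payload fails validation, so "version 1" must survive.
        atomic_write_bytes(target, b"", validate=lambda p: p.stat().st_size > 0)
    except ValueError as exc:
        print(f"write rejected: {exc}")
    print(target.read_bytes())
```

Catching `BaseException` rather than `Exception` means the temporary file is also cleaned up on `KeyboardInterrupt`, which matches the Ctrl+C scenario described later in this README.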

## 💡 Real-World Use Cases

### 🔥 Data Pipeline Protection
```python
# ETL pipeline with automatic rollback
try:
    atio.write(processed_data, "final_results.parquet", format="parquet")
    print("✅ Pipeline completed successfully")
except Exception as e:
    print("❌ Pipeline failed, but original data is safe")
    # Original file remains untouched
```

### 🧪 Machine Learning Experiments
```python
# Version-controlled experiment results
atio.write_snapshot(model_results, "experiment_v1", mode="overwrite")
atio.write_snapshot(improved_results, "experiment_v1", mode="append")

# Rollback to previous version if needed
atio.rollback("experiment_v1", version_id=1)
```

### 📊 Large Data Processing
```python
# Progress monitoring for large datasets
atio.write(large_df, "big_data.parquet", 
          format="parquet", 
          show_progress=True)
# Shows: ⠋ Writing big_data.parquet... [ 45.2 MB | 12.3 MB/s | 00:15 ]
```

## 🎯 Core Features

### 1. **Atomic File Writing**
```python
# Safe writing with automatic rollback
atio.write(df, "data.parquet", format="parquet")
# Creates: data.parquet + .data.parquet._SUCCESS
```
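
A downstream reader can use the success flag to decide whether a file is safe to load. The sketch below assumes the marker sits next to the data file and follows the `.<filename>._SUCCESS` pattern shown in the comment above; `success_flag_path` and `is_write_complete` are hypothetical helpers, not Atio functions.

```python
from pathlib import Path


def success_flag_path(data_path):
    """Map a data file to its marker, e.g. data.parquet -> .data.parquet._SUCCESS."""
    p = Path(data_path)
    return p.with_name(f".{p.name}._SUCCESS")


def is_write_complete(data_path):
    """True only when both the data file and its success marker exist."""
    p = Path(data_path)
    return p.is_file() and success_flag_path(p).is_file()


print(success_flag_path("data.parquet"))
print(is_write_complete("data.parquet"))
```

A pipeline stage could call `is_write_complete("data.parquet")` and skip or retry when it returns `False`.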

### 2. **Database Integration**
```python
# Direct database storage
from sqlalchemy import create_engine
engine = create_engine('postgresql://user:pass@localhost/db')
atio.write(df, format="sql", name="users", con=engine, if_exists="replace")
```

### 3. **Version Management**
```python
# Snapshot-based versioning
atio.write_snapshot(df, "my_table", mode="overwrite")  # v1
atio.write_snapshot(new_df, "my_table", mode="append") # v2

# Read specific version
df_v1 = atio.read_table("my_table", version=1)
```
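
To make the idea of snapshot-based versioning concrete, here is a toy model built on JSON files. It is an illustration under stated assumptions, not Atio's storage layout: `save_version`, `load_version`, and `list_versions` are hypothetical names, each snapshot is stored as an immutable `v<N>.json` file, and `mode="append"` is assumed to mean "previous rows plus new rows".

```python
import json
import tempfile
from pathlib import Path


def list_versions(table_dir):
    """Return the version numbers stored under `table_dir`, ascending."""
    return sorted(int(p.stem[1:]) for p in Path(table_dir).glob("v*.json"))


def load_version(table_dir, version=None):
    """Load one snapshot; `version=None` means the latest."""
    versions = list_versions(table_dir)
    if not versions:
        raise FileNotFoundError(f"no snapshots in {table_dir}")
    chosen = versions[-1] if version is None else version
    return json.loads((Path(table_dir) / f"v{chosen}.json").read_text())


def save_version(table_dir, rows, mode="overwrite"):
    """Store `rows` as a new snapshot and return its version number."""
    if mode not in ("overwrite", "append"):
        raise ValueError(f"unknown mode: {mode}")
    table = Path(table_dir)
    table.mkdir(parents=True, exist_ok=True)
    versions = list_versions(table)
    rows = list(rows)
    if mode == "append" and versions:
        # Assumed semantics: the new snapshot is the latest rows plus the new rows.
        rows = load_version(table) + rows
    new_version = (versions[-1] if versions else 0) + 1
    # Earlier snapshots are never modified, so any version stays readable.
    (table / f"v{new_version}.json").write_text(json.dumps(rows))
    return new_version


with tempfile.TemporaryDirectory() as root:
    table = Path(root) / "my_table"
    save_version(table, [{"id": 1}])                 # v1
    save_version(table, [{"id": 2}], mode="append")  # v2
    print(load_version(table, version=1))
    print(load_version(table))
```

Because old snapshots are kept, rolling back in this model amounts to reading an earlier version number.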

### 4. **Progress Monitoring**
```python
# Real-time progress for large files
atio.write(large_df, "data.parquet", 
          format="parquet", 
          show_progress=True,
          verbose=True)
```

## 🔧 Advanced Usage

### Multi-Format Support
```python
import polars as pl
import numpy as np

# Polars DataFrame
pl_df = pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
atio.write(pl_df, "data.parquet", format="parquet")

# NumPy Arrays
arr = np.random.randn(1000, 100)
atio.write(arr, "array.npy", format="npy")

# Multiple arrays
atio.write({'arr1': arr, 'arr2': arr*2}, "arrays.npz", format="npz")
```

### Error Handling & Recovery
```python
# Automatic rollback on failure
try:
    atio.write(df, "data.parquet", format="parquet")
except Exception as e:
    print(f"Write failed: {e}")
    # Original file is automatically preserved
```

### Performance Monitoring
```python
# Detailed performance analysis
atio.write(df, "data.parquet", format="parquet", verbose=True)
# Output:
# [INFO] Temporary directory created: /tmp/tmp12345
# [INFO] Writer to use: to_parquet (format: parquet)
# [INFO] ✅ File writing completed (total time: 0.1234s)
```

## 🛠️ Installation

### Basic Installation
```bash
pip install atio
```

### With Optional Dependencies
```bash
# For Excel support
pip install atio[excel]

# For database support
pip install atio[database]

# For all features
pip install atio[all]
```
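
After installing an extra, it can help to confirm that the modules it is meant to provide are importable. The sketch below uses only `importlib` from the standard library. The mapping from extras to modules is an assumption based on the package's requirement list (`openpyxl`, `sqlalchemy`, `connectorx`); the README does not state which modules each extra installs.

```python
from importlib.util import find_spec

# Assumed mapping from extras to importable modules (not confirmed by the README).
EXTRA_MODULES = {
    "excel": ["openpyxl"],
    "database": ["sqlalchemy", "connectorx"],
}


def missing_modules(module_names):
    """Return the names from `module_names` that cannot be imported here."""
    return [name for name in module_names if find_spec(name) is None]


for extra, modules in EXTRA_MODULES.items():
    missing = missing_modules(modules)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"atio[{extra}]: {status}")
```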

### Development Installation
```bash
git clone https://github.com/seojaeohcode/atio.git
cd atio
pip install -e .
```

## 📚 Documentation & Examples

### 📖 Documentation
- **[Complete Documentation](https://seojaeohcode.github.io/atio/)** - Full API reference
- **[Quick Start Guide](https://seojaeohcode.github.io/atio/quickstart.html)** - Get started in minutes
- **[Advanced Usage](https://seojaeohcode.github.io/atio/advanced_usage.html)** - Power user features

### 🎯 Examples

#### 📝 **Basic Usage** - Simple file operations
```python
import atio
import pandas as pd

# Create sample data
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Seoul", "Busan", "Incheon"]
})

# Safe atomic writing
atio.write(df, "users.parquet", format="parquet")
print("✅ File saved safely!")

# Read back to verify
df_read = pd.read_parquet("users.parquet")
print(df_read)
```

#### 📊 **Progress Monitoring** - Large file handling
```python
import atio
import pandas as pd
import numpy as np

# Create large dataset
large_df = pd.DataFrame(np.random.randn(200000, 5), columns=list("ABCDE"))

# Save with progress monitoring
atio.write(large_df, "large_data.parquet", 
          format="parquet", 
          show_progress=True)
# Shows: ⠋ Writing large_data.parquet... [ 45.2 MB | 12.3 MB/s | 00:15 ]
```

#### 📋 **Snapshot Management** - Version control
```python
import atio
import pandas as pd

# Version 1: Initial data
df_v1 = pd.DataFrame({"id": [1, 2, 3], "value": ["A", "B", "C"]})
atio.write_snapshot(df_v1, "my_table", mode="overwrite")

# Version 2: Append new data
df_v2 = pd.DataFrame({"score": [95, 87, 92]})
atio.write_snapshot(df_v2, "my_table", mode="append")

# Read specific version
df_latest = atio.read_table("my_table")  # Latest version
df_v1 = atio.read_table("my_table", version=1)  # Version 1
```

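#### ⚡ **Performance Testing** - Benchmarking
```python
import atio
import numpy as np
import pandas as pd
import time

# Performance comparison
# String column names keep Parquet writers happy across pandas versions
df = pd.DataFrame(np.random.randn(100000, 10), columns=[f"c{i}" for i in range(10)])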

# Standard pandas
start = time.time()
df.to_parquet("standard.parquet")
pandas_time = time.time() - start

# Atio with safety
start = time.time()
atio.write(df, "safe.parquet", format="parquet", verbose=True)
atio_time = time.time() - start

print(f"Pandas: {pandas_time:.3f}s")
print(f"Atio: {atio_time:.3f}s")
print(f"Safety overhead: {((atio_time/pandas_time - 1) * 100):.1f}%")
```
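
A single timed run is noisy, so the overhead figure above can swing from run to run. A sketch of a steadier measurement follows: it repeats each write several times and compares medians. `median_seconds` and `overhead_percent` are hypothetical helpers, not part of Atio; in the benchmark above one would pass `lambda: df.to_parquet("standard.parquet")` and `lambda: atio.write(df, "safe.parquet", format="parquet")` as `func`.

```python
import statistics
import time


def median_seconds(func, repeats=5):
    """Run `func` `repeats` times and return the median wall-clock duration."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution timer
        func()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def overhead_percent(baseline_seconds, measured_seconds):
    """Express `measured_seconds` as a percentage increase over `baseline_seconds`."""
    return (measured_seconds / baseline_seconds - 1) * 100


# Stand-in workload so the sketch runs without pandas or atio installed.
baseline = median_seconds(lambda: sum(range(100_000)))
print(f"baseline median: {baseline:.6f}s")
print(f"2.0s -> 2.5s is {overhead_percent(2.0, 2.5):.1f}% overhead")
```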

### 🧪 Test Scenarios

#### ⌨️ **Keyboard Interrupt** - Ctrl+C safety
```python
# test_interrupt.py
import os

import atio
import pandas as pd
import numpy as np

print("Creating large dataset...")
# String column names keep Parquet writers happy across pandas versions
df = pd.DataFrame(np.random.randn(1000000, 10), columns=[f"c{i}" for i in range(10)])

print("Starting write operation...")
print("Press Ctrl+C to test interrupt safety!")

try:
    atio.write(df, "test_interrupt.parquet",
              format="parquet",
              show_progress=True)
    print("✅ Write completed successfully!")
except KeyboardInterrupt:
    print("❌ Interrupted by user!")
    print("🔍 Checking file safety...")
    if os.path.exists("test_interrupt.parquet"):
        print("⚠️  File exists but may be corrupted")
    else:
        print("✅ No corrupted file left behind!")
```

#### 💾 **Out of Memory** - Memory failure handling
```python
# test_oom.py
import atio
import pandas as pd
import numpy as np

def simulate_oom():
    print("Creating extremely large dataset...")
    try:
        # Allocate inside the try block so a MemoryError raised here is caught too.
        # 10,000,000 x 100 float64 values need about 8 GB of RAM.
        huge_df = pd.DataFrame(np.random.randn(10000000, 100),
                               columns=[f"c{i}" for i in range(100)])

        print("Attempting to save...")
        atio.write(huge_df, "huge_data.parquet", format="parquet")
        print("✅ Successfully saved!")
    except MemoryError:
        print("❌ Out of Memory error!")
        print("✅ But original file is safe!")
    except Exception as e:
        print(f"❌ Error: {e}")
        print("✅ Atio protected your data!")

# Run the test
simulate_oom()
```

#### 🚀 **CI/CD Pipeline** - Automated deployment safety
```python
# ci_pipeline.py
import atio
import pandas as pd
import os

def deploy_artifacts():
    """Simulate CI/CD pipeline deployment"""
    
    # Generate deployment artifacts
    config = pd.DataFrame({
        "service": ["api", "web", "db"],
        "version": ["v1.2.3", "v1.2.3", "v1.2.3"],
        "status": ["ready", "ready", "ready"]
    })
    
    metrics = pd.DataFrame({
        "metric": ["cpu", "memory", "disk"],
        "value": [75.5, 68.2, 45.1],
        "unit": ["%", "%", "%"]
    })
    
    print("🚀 Starting deployment...")

    try:
        # Each write is atomic on its own; if a later write fails,
        # the except block below removes the files written earlier.
        atio.write(config, "deployment_config.json", format="json")
        atio.write(metrics, "deployment_metrics.parquet", format="parquet")

        # Create success marker
        atio.write(pd.DataFrame({"status": ["deployed"]}),
                  "deployment_success.parquet", format="parquet")

        print("✅ Deployment completed successfully!")
        return True

    except Exception as e:
        print(f"❌ Deployment failed: {e}")
        print("🔄 Rolling back...")

        # Clean up any partial files
        for file in ["deployment_config.json", "deployment_metrics.parquet"]:
            if os.path.exists(file):
                os.remove(file)

        print("✅ Rollback completed - system is clean!")
        return False

# Test the pipeline
deploy_artifacts()
```

## 🏆 Why Choose Atio?

### ✅ **Data Safety First**
- **Zero data loss** even during system failures
- **Automatic rollback** on any error
- **File integrity** guaranteed by atomic operations

### ⚡ **Performance Optimized**
- **Minimal overhead** (1.1-1.2x vs native libraries)
- **Progress monitoring** for large files
- **Memory efficient** processing

### 🔧 **Developer Friendly**
- **Drop-in replacement** for existing code
- **Simple API** with powerful features
- **Comprehensive documentation** and examples

### 🌐 **Universal Compatibility**
- **Multiple data formats** (CSV, Parquet, Excel, JSON, etc.)
- **Multiple libraries** (Pandas, Polars, NumPy)
- **Database integration** (SQL, NoSQL)

## 📄 License

This project is distributed under the **Apache 2.0 License**. See the [LICENSE](LICENSE) file for details.

---

<div align="center">

**🛡️ Atio** - Because your data deserves to be safe

[![GitHub stars](https://img.shields.io/github/stars/seojaeohcode/atio?style=social)](https://github.com/seojaeohcode/atio)
[![GitHub forks](https://img.shields.io/github/forks/seojaeohcode/atio?style=social)](https://github.com/seojaeohcode/atio)
[![GitHub watchers](https://img.shields.io/github/watchers/seojaeohcode/atio?style=social)](https://github.com/seojaeohcode/atio)

</div>

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "atio",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "atomic, file, writer, pandas, polars, numpy, data",
    "author": null,
    "author_email": "Seo Jae Oh <seojaeohcoder@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/b5/02/fc51ea72d046d11bc56f41a6ef0a9d6d01c7d9966ffec82fd53d99847154/atio-3.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Safe atomic file writer for Pandas, Polars, NumPy, and other data objects",
    "version": "3.1.0",
    "project_urls": {
        "Homepage": "https://github.com/seojaeohcode/atomic-writer"
    },
    "split_keywords": [
        "atomic",
        " file",
        " writer",
        " pandas",
        " polars",
        " numpy",
        " data"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "bc96b6085a3c14464ed6b82b53d7ed9717f8f5b6e1691639de1dcedadce75407",
                "md5": "67de26219f23655b204364858caa713e",
                "sha256": "2cdc2ee846587954dc8022965423fbd571e085e1c1fd8425641312965c586700"
            },
            "downloads": -1,
            "filename": "atio-3.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "67de26219f23655b204364858caa713e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 34629,
            "upload_time": "2025-10-24T01:43:22",
            "upload_time_iso_8601": "2025-10-24T01:43:22.094003Z",
            "url": "https://files.pythonhosted.org/packages/bc/96/b6085a3c14464ed6b82b53d7ed9717f8f5b6e1691639de1dcedadce75407/atio-3.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b502fc51ea72d046d11bc56f41a6ef0a9d6d01c7d9966ffec82fd53d99847154",
                "md5": "8b6ee16bc3a6718a3a5c7353625fd18d",
                "sha256": "705a83faf459d571e03013883ade50619139d1ecb888fc14193b15cf5921494f"
            },
            "downloads": -1,
            "filename": "atio-3.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "8b6ee16bc3a6718a3a5c7353625fd18d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 49202,
            "upload_time": "2025-10-24T01:43:23",
            "upload_time_iso_8601": "2025-10-24T01:43:23.669551Z",
            "url": "https://files.pythonhosted.org/packages/b5/02/fc51ea72d046d11bc56f41a6ef0a9d6d01c7d9966ffec82fd53d99847154/atio-3.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-24 01:43:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "seojaeohcode",
    "github_project": "atomic-writer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "2.2.0"
                ]
            ]
        },
        {
            "name": "pyarrow",
            "specs": [
                [
                    ">=",
                    "15.0"
                ]
            ]
        },
        {
            "name": "polars",
            "specs": [
                [
                    ">=",
                    "1.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.26"
                ],
                [
                    "<",
                    "2.0"
                ]
            ]
        },
        {
            "name": "sqlalchemy",
            "specs": [
                [
                    ">=",
                    "2.0"
                ]
            ]
        },
        {
            "name": "openpyxl",
            "specs": [
                [
                    ">=",
                    "3.1"
                ]
            ]
        },
        {
            "name": "connectorx",
            "specs": [
                [
                    ">=",
                    "0.3.0"
                ]
            ]
        },
        {
            "name": "fsspec",
            "specs": [
                [
                    ">=",
                    "2024.5.0"
                ]
            ]
        },
        {
            "name": "fastcdc",
            "specs": [
                [
                    ">=",
                    "1.7.0"
                ]
            ]
        },
        {
            "name": "typer",
            "specs": [
                [
                    ">=",
                    "0.12.0"
                ]
            ]
        },
        {
            "name": "rich",
            "specs": [
                [
                    ">=",
                    "13.0"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    ">=",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    "~=",
                    "8.4"
                ]
            ]
        },
        {
            "name": "build",
            "specs": [
                [
                    "~=",
                    "1.2"
                ]
            ]
        },
        {
            "name": "twine",
            "specs": [
                [
                    "~=",
                    "5.1"
                ]
            ]
        },
        {
            "name": "alabaster",
            "specs": [
                [
                    "~=",
                    "0.7"
                ]
            ]
        },
        {
            "name": "myst-parser",
            "specs": [
                [
                    "==",
                    "0.18.0"
                ]
            ]
        }
    ],
    "lcname": "atio"
}
        