sheetwise


- Name: sheetwise
- Version: 2.2.0
- Home page: https://github.com/Khushiyant/sheetwise
- Summary: A Python package for encoding spreadsheets for Large Language Models, implementing the SpreadsheetLLM research framework
- Upload time: 2025-08-19 15:46:22
- Author: Khushiyant Chauhan
- Requires Python: <4.0.0,>=3.8.1
- License: MIT
- Keywords: spreadsheet, llm, encoding, compression, data-processing

# SheetWise

A Python package for encoding spreadsheets for Large Language Models, implementing the SpreadsheetLLM research framework.

[![PyPI version](https://img.shields.io/pypi/v/sheetwise.svg)](https://pypi.org/project/sheetwise/)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/release/python-380/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Overview

SheetWise is a Python package that implements the key components from Microsoft Research's SpreadsheetLLM paper for efficiently encoding spreadsheets for use with Large Language Models. The package provides:

- **SheetCompressor**: Efficient encoding framework with three compression modules
- **Chain of Spreadsheet**: Multi-step reasoning approach for spreadsheet analysis
- **Vanilla Encoding**: Traditional cell-by-cell encoding methods
- **Token Optimization**: Significant reduction in token usage
- **Formula Analysis**: Extract and simplify Excel formulas
- **Multi-Sheet Support**: Process entire workbooks with cross-sheet references
- **Visualization Tools**: Generate visual reports of compression results
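
To make "cell-by-cell encoding" concrete, here is a minimal sketch of a vanilla encoder written in plain Python. It is not SheetWise's `encode_vanilla` implementation; the `address,value` pair format, the `|` separator, and the helper names `column_letter` and `encode_vanilla` below are assumptions chosen for illustration.

```python
from typing import List, Optional

Cell = Optional[object]


def column_letter(index: int) -> str:
    """Convert a 0-based column index to an Excel-style letter (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def encode_vanilla(grid: List[List[Cell]]) -> str:
    """Emit one `address,value` pair per cell, row by row, including empty cells."""
    lines = []
    for r, row in enumerate(grid):
        pairs = []
        for c, value in enumerate(row):
            text = "" if value is None else str(value)
            pairs.append(f"{column_letter(c)}{r + 1},{text}")
        lines.append("|".join(pairs))
    return "\n".join(lines)


# Hypothetical 3x3 sheet; None marks an empty cell.
grid = [
    ["Region", "Q1", "Q2"],
    ["North", 120, None],
    [None, None, None],
]
print(encode_vanilla(grid))
```

Every cell, empty or not, costs tokens in this form, which is why sparse sheets are expensive to encode without compression.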

## Key Features

- **Intelligent Compression**: Up to 96% reduction in token usage while preserving semantic information
- **Auto-Configuration**: Automatically optimizes compression settings based on spreadsheet characteristics  
- **Multi-Table Support**: Handles complex spreadsheets with multiple tables and regions
- **Structural Analysis**: Identifies and preserves important structural elements
- **Format-Aware**: Preserves data type and formatting information
- **Enhanced Algorithms**: Improved range detection and contiguous cell grouping
- **Easy Integration**: Simple API for immediate use
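
As an illustration of what "contiguous cell grouping" can mean, the sketch below collapses row numbers within one column into A1-style ranges. It is an assumption about the general technique, not the package's own algorithm; `group_rows_into_ranges` is a hypothetical helper.

```python
from typing import List


def _format_range(column: str, start: int, end: int) -> str:
    """Render a single cell (A5) or a vertical range (A1:A3)."""
    return f"{column}{start}" if start == end else f"{column}{start}:{column}{end}"


def group_rows_into_ranges(column: str, rows: List[int]) -> List[str]:
    """Collapse 1-based row numbers in one column into contiguous ranges."""
    ordered = sorted(set(rows))
    if not ordered:
        return []
    ranges = []
    start = prev = ordered[0]
    for row in ordered[1:]:
        if row == prev + 1:
            prev = row  # still inside the current run
            continue
        ranges.append(_format_range(column, start, prev))
        start = prev = row  # begin a new run
    ranges.append(_format_range(column, start, prev))
    return ranges


print(group_rows_into_ranges("A", [1, 2, 3, 5, 8, 9]))  # -> ['A1:A3', 'A5', 'A8:A9']
```

Six cell addresses shrink to three range strings here; longer runs save proportionally more.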

## Installation

### Using pip

```bash
pip install sheetwise
```

### Using Poetry

```bash
poetry add sheetwise
```

### Development Installation

```bash
git clone https://github.com/Khushiyant/sheetwise.git
cd sheetwise
poetry install
```

## Quick Start

### Basic Usage

```python
import pandas as pd
from sheetwise import SpreadsheetLLM

# Initialize the framework
sllm = SpreadsheetLLM()

# Load your spreadsheet
df = pd.read_excel("your_spreadsheet.xlsx")

# Compress and encode for LLM use
llm_ready_text = sllm.compress_and_encode_for_llm(df)

# Copy and paste this text directly into ChatGPT/Claude
print(llm_ready_text)
```

### Advanced Usage

```python
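import pandas as pd
from sheetwise import SpreadsheetLLM, SheetCompressor

# Load the spreadsheet used throughout this example
df = pd.read_excel("your_spreadsheet.xlsx")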

# Auto-configuration
sllm = SpreadsheetLLM(enable_logging=True)
auto_compressed = sllm.compress_with_auto_config(df)  # Automatically optimizes settings

# Manual configuration
compressor = SheetCompressor(
    k=2,  # Structural anchor neighborhood size
    use_extraction=True,
    use_translation=True, 
    use_aggregation=True
)

# Compress the spreadsheet
compressed_result = compressor.compress(df)
print(f"Compression ratio: {compressed_result['compression_ratio']:.1f}x")
print(f"Compressed shape: {compressed_result['compressed_df'].shape}")

# Or use with SpreadsheetLLM for full pipeline
sllm = SpreadsheetLLM(compression_params={
    'k': 2,
    'use_extraction': True,
    'use_translation': True, 
    'use_aggregation': True
})

# Get detailed statistics
stats = sllm.get_encoding_stats(df)
print(f"Token reduction: {stats['token_reduction_ratio']:.1f}x")

# Process QA queries
result = sllm.process_qa_query(df, "What was the total revenue in 2023?")
```

### Enhanced Features Usage (v2.0+)

```python
from sheetwise import (
    SpreadsheetLLM, 
    FormulaParser, 
    WorkbookManager, 
    CompressionVisualizer, 
    SmartTableDetector
)

# Formula extraction and analysis
formula_parser = FormulaParser()
formulas = formula_parser.extract_formulas("your_spreadsheet.xlsx")
formula_parser.build_dependency_graph()
impact = formula_parser.get_formula_impact("Sheet1!A1")
formula_text = formula_parser.encode_formulas_for_llm()

# Multi-sheet support
workbook = WorkbookManager()
sheets = workbook.load_workbook("your_workbook.xlsx")
cross_refs = workbook.detect_cross_sheet_references()
sllm = SpreadsheetLLM()
compressed = workbook.compress_workbook(sllm.compressor)
encoded = workbook.encode_workbook_for_llm(compressed)

# Visualization
visualizer = CompressionVisualizer()
df = sllm.load_from_file("your_spreadsheet.xlsx")
compressed_result = sllm.compress_spreadsheet(df)
fig = visualizer.create_data_density_heatmap(df)
fig.savefig("heatmap.png")
html_report = visualizer.generate_html_report(df, compressed_result)

# Advanced table detection
detector = SmartTableDetector()
tables = detector.detect_tables(df)
extracted_tables = detector.extract_tables_to_dataframes(df)
```

### Command Line Interface

```bash
# Basic usage
sheetwise input.xlsx -o output.txt --stats

# Auto-configure compression
sheetwise input.xlsx --auto-config --verbose

# Run demo with sample data
sheetwise --demo --auto-config

# Use vanilla encoding instead of compression
sheetwise input.xlsx --vanilla

# Output in JSON format
sheetwise input.xlsx --format json
```

### Enhanced CLI Features (v2.0+)

```bash
# Extract and analyze formulas
sheetwise your_spreadsheet.xlsx --extract-formulas

# Process all sheets in a workbook
sheetwise your_workbook.xlsx --multi-sheet

# Generate visualizations
sheetwise your_spreadsheet.xlsx --visualize

# Detect and extract tables
sheetwise your_spreadsheet.xlsx --detect-tables

# Generate an HTML report
sheetwise your_spreadsheet.xlsx --format html
```

## Benchmarks & Visualization

SheetWise includes a benchmarking script to evaluate compression, speed, and memory usage across spreadsheets. This helps you understand performance and compare results visually.

### Running Benchmarks

1. Place your sample spreadsheets in `benchmarks/samples/` (supports .xlsx and .csv).
2. Run the benchmark script:

```bash
python scripts/generate_benchmarks.py
```

3. Results and charts will be saved in `benchmarks/results/` and `benchmarks/charts/`.
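
The contents of `scripts/generate_benchmarks.py` are not shown here. As a rough sketch of how time and peak memory can be measured for one workload using only the standard library, consider the following; the `measure` helper and the stand-in workload are assumptions, not the script's actual code.

```python
import time
import tracemalloc
from typing import Callable, Dict


def measure(func: Callable[[], object]) -> Dict[str, float]:
    """Run `func` once; report wall-clock seconds and peak traced memory in bytes."""
    tracemalloc.start()
    started = time.perf_counter()
    try:
        func()
    finally:
        elapsed = time.perf_counter() - started
        _, peak = tracemalloc.get_traced_memory()  # returns (current, peak)
        tracemalloc.stop()
    return {"seconds": elapsed, "peak_bytes": float(peak)}


# Stand-in workload; a real benchmark would load and compress one spreadsheet here.
result = measure(lambda: [str(i) for i in range(100_000)])
print(f"{result['seconds']:.4f} s, peak {result['peak_bytes'] / 1024:.0f} KiB")
```

`tracemalloc` only sees allocations made through Python's allocator, so memory used inside native extensions may be under-reported.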

### Example Output

- **Compression Ratio vs. File Size**
  ![Compression Ratio](benchmarks/charts/compression_vs_size.png)
- **Processing Time vs. File Size**
  ![Processing Time](benchmarks/charts/time_vs_size.png)
- **Max Memory Usage per File**
  ![Memory Usage](benchmarks/charts/memory_usage.png)

These charts can be included in your documentation or website to showcase SheetWise's efficiency and scalability.

---

## Core Components

### 1. SheetCompressor

The main compression framework with three modules:

- **Structural Anchor Extraction**: Identifies and preserves structurally important rows/columns
- **Inverted Index Translation**: Creates efficient value-to-location mappings
- **Data Format Aggregation**: Groups cells by data type and format
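
The last two modules can be pictured as the same operation with different keys: group cell addresses by value (inverted index) or by data type (format aggregation). The sketch below is a simplified illustration under that assumption, not SheetCompressor's implementation; `address`, `type_label`, and `invert` are hypothetical names.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Grid = List[List[Optional[object]]]


def address(row: int, col: int) -> str:
    """A1-style address for 0-based indices (sketch: columns A..Z only)."""
    return f"{chr(ord('A') + col)}{row + 1}"


def type_label(value: object) -> str:
    """Coarse data-type label; bool is checked first because bool subclasses int."""
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return "text"


def invert(grid: Grid, key: Callable[[object], str]) -> Dict[str, List[str]]:
    """Map key(value) -> addresses of non-empty cells, in row-major order."""
    index: Dict[str, List[str]] = defaultdict(list)
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is not None:  # empty cells are dropped entirely
                index[key(value)].append(address(r, c))
    return dict(index)


# Hypothetical sheet; None marks an empty cell.
grid = [
    ["Status", "Amount"],
    ["open", 10],
    ["open", 12.5],
    ["closed", None],
]
by_value = invert(grid, key=str)        # inverted index: value -> locations
by_type = invert(grid, key=type_label)  # format aggregation: type -> locations
print(by_value["open"])  # -> ['A2', 'A3']
print(by_type["text"])   # -> ['A1', 'B1', 'A2', 'A3', 'A4']
```

Repeated values are stored once with a list of locations, and empty cells disappear, which is where most of the savings on sparse or repetitive sheets come from.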

### 2. Chain of Spreadsheet

Multi-step reasoning approach:

1. **Table Identification**: Automatically detects table regions
2. **Compression**: Applies SheetCompressor to reduce size
3. **Query Processing**: Identifies relevant regions for specific queries
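
Steps 1 and 3 can be sketched in a few lines of plain Python (step 2, compression, is left out). This is a toy illustration under strong assumptions: tables are separated by fully empty rows, and relevance is scored by word overlap between the query and each table's header row. `split_tables` and `pick_table` are hypothetical helpers, not part of the SheetWise API.

```python
from typing import List, Optional, Tuple

Grid = List[List[Optional[object]]]
Block = Tuple[int, int]  # (first_row, last_row), 0-based and inclusive


def split_tables(grid: Grid) -> List[Block]:
    """Return row blocks separated by fully empty rows."""
    tables: List[Block] = []
    start = None
    for r, row in enumerate(grid):
        is_empty = all(cell is None for cell in row)
        if not is_empty and start is None:
            start = r  # a new block begins
        elif is_empty and start is not None:
            tables.append((start, r - 1))  # the block ended on the previous row
            start = None
    if start is not None:
        tables.append((start, len(grid) - 1))
    return tables


def pick_table(grid: Grid, tables: List[Block], query: str) -> Block:
    """Choose the block whose header row shares the most words with the query."""
    words = set(query.lower().split())

    def score(block: Block) -> int:
        header = grid[block[0]]
        return sum(1 for cell in header if cell is not None and str(cell).lower() in words)

    return max(tables, key=score)


# Hypothetical sheet holding two tables separated by one empty row.
grid = [
    ["Region", "Revenue"],
    ["North", 120],
    [None, None],
    ["Employee", "Salary"],
    ["Ada", 90],
]
tables = split_tables(grid)
print(tables)                                               # -> [(0, 1), (3, 4)]
print(pick_table(grid, tables, "total revenue by region"))  # -> (0, 1)
```

Only the selected block would then be compressed and sent to the model, which keeps the prompt focused on the region the question is about.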

### 3. Enhanced Modules (v2.0+)

- **FormulaParser**: Extracts and analyzes Excel formulas
- **WorkbookManager**: Handles multi-sheet workbooks and cross-references
- **CompressionVisualizer**: Generates visualizations and reports
- **SmartTableDetector**: Advanced table detection and classification
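
To illustrate the kind of analysis a formula dependency graph enables, here is a minimal standard-library sketch that finds every cell affected by a change to one input. It is not FormulaParser's implementation; the regular expression handles only plain references such as `A1` or `$B$2` (no ranges or sheet prefixes, and function names containing digits would be misread), and `build_dependents` and `impacted_cells` are hypothetical names.

```python
import re
from collections import defaultdict, deque
from typing import Dict, Set

# Plain cell references only: optional $ markers, 1-3 letters, then digits.
CELL_REF = re.compile(r"\$?([A-Z]{1,3})\$?([0-9]+)")


def build_dependents(formulas: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each referenced cell to the formula cells that read it directly."""
    dependents: Dict[str, Set[str]] = defaultdict(set)
    for cell, formula in formulas.items():
        for col, row in CELL_REF.findall(formula):
            dependents[f"{col}{row}"].add(cell)
    return dependents


def impacted_cells(dependents: Dict[str, Set[str]], changed: str) -> Set[str]:
    """Breadth-first walk: every cell whose value depends on `changed`."""
    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, ()):
            if nxt not in seen:  # the seen set also guards against cycles
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical formulas keyed by the cell that contains them.
formulas = {"C1": "=A1+B1", "D1": "=C1*2", "E1": "=SUM(A1,D1)"}
dependents = build_dependents(formulas)
print(sorted(impacted_cells(dependents, "A1")))  # -> ['C1', 'D1', 'E1']
```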

## Examples

### Working with Financial Data

```python
from sheetwise import SpreadsheetLLM
from sheetwise.utils import create_realistic_spreadsheet

# Create sample financial spreadsheet
df = create_realistic_spreadsheet()

sllm = SpreadsheetLLM()

# Analyze the data
stats = sllm.get_encoding_stats(df)
print(f"Original size: {stats['original_shape']}")
print(f"Sparsity: {stats['sparsity_percentage']:.1f}% empty cells")
print(f"Compression: {stats['compression_ratio']:.1f}x smaller")

# Generate LLM-ready output
encoded = sllm.compress_and_encode_for_llm(df)
print("\nReady for LLM:")
print(encoded[:300] + "...")
```
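
The sparsity figure printed above is labelled as the percentage of empty cells. How SheetWise computes it internally is not shown in this README; the sketch below assumes a cell counts as empty when it is `None` or an empty string, and `sparsity_percentage` here is a standalone hypothetical function rather than the library's.

```python
from typing import List, Optional


def sparsity_percentage(grid: List[List[Optional[object]]]) -> float:
    """Share of empty cells (None or "") among all cells, as a percentage."""
    total = sum(len(row) for row in grid)
    if total == 0:
        return 0.0  # an empty grid has no cells to count
    empty = sum(1 for row in grid for cell in row if cell is None or cell == "")
    return 100.0 * empty / total


# Hypothetical 2x3 sheet with three empty cells.
grid = [
    ["Revenue", None, 2023],
    [None, "", 1500],
]
print(f"{sparsity_percentage(grid):.1f}% empty cells")  # -> 50.0% empty cells
```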

### Visualizing Compression

```python
from sheetwise import SpreadsheetLLM, SheetCompressor, CompressionVisualizer
import pandas as pd

# Load your data
df = pd.read_excel("complex_spreadsheet.xlsx")

# Compress the data
sllm = SpreadsheetLLM()
compressed_result = sllm.compress_spreadsheet(df)

# Create visualizations
visualizer = CompressionVisualizer()

# Generate heatmap of data density
fig1 = visualizer.create_data_density_heatmap(df)
fig1.savefig("density_heatmap.png")

# Compare original vs compressed
fig2 = visualizer.compare_original_vs_compressed(df, compressed_result)
fig2.savefig("compression_comparison.png")

# Generate HTML report with all visualizations
html_report = visualizer.generate_html_report(df, compressed_result)
with open("compression_report.html", "w") as f:
    f.write(html_report)

# Compare different compression strategies
configs = [
    {"name": "Extraction Only", "use_translation": False, "use_aggregation": False},
    {"name": "Translation Only", "use_extraction": False, "use_aggregation": False}, 
    {"name": "All Modules", "use_extraction": True, "use_translation": True, "use_aggregation": True}
]

for config in configs:
    compressor = SheetCompressor(**{k: v for k, v in config.items() if k != "name"})
    result = compressor.compress(df)
    print(f"{config['name']}: {result['compression_ratio']:.1f}x compression")
```



## Performance

Compared with vanilla cell-by-cell encoding, the SpreadsheetLLM pipeline uses far fewer tokens and copes better with sparse and multi-table spreadsheets:

| Metric | Vanilla | SpreadsheetLLM | Improvement |
|--------|---------|----------------|-------------|
| Token Count | ~25,000 | ~1,200 | **96% reduction** |
| Sparsity Handling | Poor | Excellent | **Removes empty regions** |
| Multi-Table Support | Limited | Native | **Preserves structure** |
| Format Preservation | Basic | Advanced | **Type-aware grouping** |
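
The README reports savings both as a percentage reduction and as an "x" ratio, so it helps to see how the two relate. The short calculation below uses the rounded token counts from the table; with those rounded inputs the reduction works out to about 95.2%, close to the quoted 96% figure. The function names are illustrative, not part of the package.

```python
def reduction_percent(original_tokens: float, compressed_tokens: float) -> float:
    """Percentage of tokens saved relative to the original encoding."""
    return 100.0 * (1 - compressed_tokens / original_tokens)


def compression_ratio(original_tokens: float, compressed_tokens: float) -> float:
    """How many times smaller the compressed encoding is."""
    return original_tokens / compressed_tokens


def ratio_for_reduction(percent: float) -> float:
    """Ratio implied by a given percentage reduction (96% -> 25x)."""
    return 100.0 / (100.0 - percent)


print(round(reduction_percent(25_000, 1_200), 1))  # -> 95.2
print(round(compression_ratio(25_000, 1_200), 1))  # -> 20.8
print(ratio_for_reduction(96))                     # -> 25.0
```

Small changes in the percentage matter a lot at the high end: moving from 95% to 96% takes the ratio from 20x to 25x.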

## API Reference

### SpreadsheetLLM Class

The main interface for the framework.

#### Methods

- `compress_and_encode_for_llm(df)`: One-step compression and encoding
- `compress_spreadsheet(df)`: Apply compression pipeline  
- `encode_vanilla(df)`: Traditional encoding
- `get_encoding_stats(df)`: Detailed compression statistics
- `process_qa_query(df, query)`: Chain of Spreadsheet reasoning
- `load_from_file(filepath)`: Load spreadsheet from file

### SheetCompressor Class

Core compression framework.

#### Parameters

- `k`: Structural anchor neighborhood size (default: 4)
- `use_extraction`: Enable structural extraction (default: True)
- `use_translation`: Enable inverted index translation (default: True)
- `use_aggregation`: Enable format aggregation (default: True)
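
To give a feel for what the neighborhood size `k` controls, the sketch below keeps only the rows that lie within `k` rows of a structural anchor. It assumes the anchor rows are already known and is not SheetCompressor's actual extraction logic; `rows_to_keep` is a hypothetical helper.

```python
from typing import Iterable, List


def rows_to_keep(n_rows: int, anchors: Iterable[int], k: int) -> List[int]:
    """Return sorted 0-based row indices within `k` rows of any anchor row."""
    keep = set()
    for anchor in anchors:
        low = max(0, anchor - k)            # clamp at the top of the sheet
        high = min(n_rows - 1, anchor + k)  # clamp at the bottom of the sheet
        keep.update(range(low, high + 1))
    return sorted(keep)


# Hypothetical 20-row sheet with anchors at rows 0 and 12.
print(rows_to_keep(20, anchors=[0, 12], k=2))       # -> [0, 1, 2, 10, 11, 12, 13, 14]
print(len(rows_to_keep(20, anchors=[0, 12], k=4)))  # -> 14
```

A smaller `k` discards more rows and compresses harder, while a larger `k` keeps more context around each anchor.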

## Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

### Development Setup

```bash
# Clone the repository
git clone https://github.com/Khushiyant/sheetwise.git
cd sheetwise

# Install development dependencies
poetry install

# Run tests
poetry run pytest

# Run linting
poetry run black src tests
poetry run isort src tests
poetry run flake8 src tests
```

### Running Tests

```bash
# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov=src

# Run specific test file
poetry run pytest tests/test_core.py
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Citation

If you use SpreadsheetLLM in your research, please cite:

```bibtex
@article{spreadsheetllm2024,
  title={SpreadsheetLLM: Encoding Spreadsheets for Large Language Models},
  author={Microsoft Research Team},
  journal={arXiv preprint},
  year={2024}
}
```


## Support

- [Documentation](https://khushiyant.github.io/sheetwise)
- [Issue Tracker](https://github.com/Khushiyant/sheetwise/issues)
- [Discussions](https://github.com/Khushiyant/sheetwise/discussions)

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/Khushiyant/sheetwise",
    "name": "sheetwise",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0.0,>=3.8.1",
    "maintainer_email": null,
    "keywords": "spreadsheet, llm, encoding, compression, data-processing",
    "author": "Khushiyant Chauhan",
    "author_email": "khushiyant@example.com",
    "download_url": "https://files.pythonhosted.org/packages/e0/b4/636e9cd7750638526d4407b02e07d29b484d69adb96cc7cc72c67620f46a/sheetwise-2.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python package for encoding spreadsheets for Large Language Models, implementing the SpreadsheetLLM research framework",
    "version": "2.2.0",
    "project_urls": {
        "Documentation": "https://khushiyant.github.io/sheetwise",
        "Homepage": "https://github.com/Khushiyant/sheetwise",
        "Repository": "https://github.com/Khushiyant/sheetwise"
    },
    "split_keywords": [
        "spreadsheet",
        " llm",
        " encoding",
        " compression",
        " data-processing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6a10e9c1e127845d963b15815007cb619b111b3904df9b3a441da65d6b9d415f",
                "md5": "6b414be94eb0bf315c96169da6ad4bb5",
                "sha256": "439340efea490a47b4ddd6e55a3824764fd4aa9b77ce30f8062a8c0467e7f9c8"
            },
            "downloads": -1,
            "filename": "sheetwise-2.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6b414be94eb0bf315c96169da6ad4bb5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0.0,>=3.8.1",
            "size": 37856,
            "upload_time": "2025-08-19T15:46:20",
            "upload_time_iso_8601": "2025-08-19T15:46:20.493640Z",
            "url": "https://files.pythonhosted.org/packages/6a/10/e9c1e127845d963b15815007cb619b111b3904df9b3a441da65d6b9d415f/sheetwise-2.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e0b4636e9cd7750638526d4407b02e07d29b484d69adb96cc7cc72c67620f46a",
                "md5": "8a77a0137ce73d4fef08e62ccbba7083",
                "sha256": "9b170e396822d02303b519e70a98034a12b97cefc5895658aa8976f026d3d4ae"
            },
            "downloads": -1,
            "filename": "sheetwise-2.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "8a77a0137ce73d4fef08e62ccbba7083",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0.0,>=3.8.1",
            "size": 34679,
            "upload_time": "2025-08-19T15:46:22",
            "upload_time_iso_8601": "2025-08-19T15:46:22.272529Z",
            "url": "https://files.pythonhosted.org/packages/e0/b4/636e9cd7750638526d4407b02e07d29b484d69adb96cc7cc72c67620f46a/sheetwise-2.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-19 15:46:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Khushiyant",
    "github_project": "sheetwise",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "sheetwise"
}
        