# FunPuter - Intelligent Imputation Analysis
**Intelligent imputation analysis with automatic data validation and metadata inference**
FunPuter analyzes your data and recommends imputation methods based on data patterns, missingness mechanisms, and metadata constraints. Each suggestion comes with a confidence score and a rationale, so you can decide how to handle every column that has missing values.
## 🚀 Quick Start
### Installation
```bash
pip install funputer
```
### 30-Second Example
**Auto-Inference Mode** (Zero Configuration)
```python
import funputer
# Point to your CSV - FunPuter figures out everything automatically
suggestions = funputer.analyze_imputation_requirements("your_data.csv")
# Get intelligent suggestions with confidence scores
for suggestion in suggestions:
    if suggestion.missing_count > 0:
        print(f"📊 {suggestion.column_name}: {suggestion.proposed_method}")
        print(f"   Confidence: {suggestion.confidence_score:.3f}")
        print(f"   Reason: {suggestion.rationale}")
        print(f"   Missing: {suggestion.missing_count} ({suggestion.missing_percentage:.1f}%)")
```
**Production Mode** (Full Control)
```python
import funputer
import pandas as pd
from funputer.models import ColumnMetadata

# Load your data (the file name is a placeholder)
your_dataframe = pd.read_csv("your_data.csv")

# Define your data structure with constraints
metadata = [
    ColumnMetadata('customer_id', 'integer', unique_flag=True, nullable=False),
    ColumnMetadata('age', 'integer', min_value=18, max_value=100),
    ColumnMetadata('income', 'float', min_value=0),
    ColumnMetadata('category', 'categorical', allowed_values='A,B,C'),
]

# Get production-grade suggestions
suggestions = funputer.analyze_dataframe(your_dataframe, metadata)
```
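To make the constraint fields above concrete, here is a minimal sketch of what range and allowed-value checks amount to. It does not call FunPuter: `check_value` is a hypothetical helper, and only the field names `min_value`, `max_value`, and `allowed_values` (a comma-separated string) are taken from the example above.

```python
from typing import Optional


def check_value(value, min_value: Optional[float] = None,
                max_value: Optional[float] = None,
                allowed_values: Optional[str] = None) -> list:
    """Return a list of constraint violations for one value (hypothetical helper)."""
    problems = []
    if value is None:
        # Missing values are the job of imputation, so they are not flagged here
        return problems
    if min_value is not None and value < min_value:
        problems.append(f"{value} is below min_value={min_value}")
    if max_value is not None and value > max_value:
        problems.append(f"{value} is above max_value={max_value}")
    if allowed_values is not None:
        # Same comma-separated format as allowed_values='A,B,C' above
        allowed = {item.strip() for item in allowed_values.split(",")}
        if str(value) not in allowed:
            problems.append(f"{value!r} is not one of {sorted(allowed)}")
    return problems


print(check_value(17, min_value=18, max_value=100))
print(check_value("D", allowed_values="A,B,C"))
print(check_value(42, min_value=18, max_value=100))
```

Constraints like these matter for imputation because a filled-in value should also satisfy them; an imputed `age` of 12 would violate the `min_value=18` rule above.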
## 🎯 Key Features
- **🤖 Automatic Metadata Inference** - Intelligent data type and constraint detection
- **📊 Missing Data Analysis** - MCAR, MAR, MNAR mechanism detection
- **⚡ Data Validation** - Real-time constraint checking and validation
- **🎯 Smart Recommendations** - Context-aware imputation method suggestions
- **📈 Confidence Scoring** - Transparent reliability estimates for each recommendation
- **🛡️ Pre-flight Checks** - Comprehensive data validation before analysis
- **💻 CLI & Python API** - Flexible usage via command line or programmatic access
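The MCAR/MAR distinction in the feature list can be illustrated with a simple check: if rows where one column is missing differ systematically on another, fully observed column, the data are unlikely to be missing completely at random. The sketch below uses only the standard library and is not FunPuter's detection logic; `standardized_mean_difference` and the toy data are assumptions made for illustration.

```python
from statistics import mean, pstdev


def standardized_mean_difference(target, observed):
    """Compare `observed` between rows where `target` is missing and rows where it is present.

    Values near 0 are consistent with MCAR; large absolute values suggest the
    missingness depends on `observed` (MAR). MNAR cannot be detected this way,
    because it depends on the unobserved values themselves.
    """
    when_missing = [o for t, o in zip(target, observed) if t is None]
    when_present = [o for t, o in zip(target, observed) if t is not None]
    if not when_missing or not when_present:
        return 0.0  # nothing to compare
    spread = pstdev(observed)
    if spread == 0:
        return 0.0  # constant column carries no information
    return (mean(when_missing) - mean(when_present)) / spread


# Toy data: income is missing mostly for older respondents
age = [22, 25, 31, 38, 45, 52, 60, 67]
income = [30_000, 32_000, 41_000, 50_000, None, 58_000, None, None]

print(f"SMD = {standardized_mean_difference(income, age):.2f}")
```

A large difference here points toward MAR, where a method that conditions on `age` is a better fit than filling in a single global mean.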
## 📊 Data Validation System
Validation runs automatically before analysis so that unreadable or malformed input is caught early. It covers:
- **File validation**: Format detection, encoding, accessibility
- **Structure validation**: Column analysis, data type inference
- **Memory estimation**: Resource usage prediction
- **Advisory recommendations**: Guided workflow suggestions
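As a rough picture of what structure validation involves, the sketch below counts empty cells per column over a sample of rows using only the standard library. It is not FunPuter's preflight implementation; `profile_csv` and its `sample_rows` parameter are hypothetical names chosen for this example.

```python
import csv
import io


def profile_csv(text: str, sample_rows: int = 5000) -> dict:
    """Count empty cells per column over the first `sample_rows` data rows."""
    reader = csv.DictReader(io.StringIO(text))
    missing = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        if rows >= sample_rows:
            break
        rows += 1
        for name in missing:
            # Short rows yield None for absent cells; treat those as empty too
            if (row.get(name) or "").strip() == "":
                missing[name] += 1
    return {"columns": list(missing), "rows_sampled": rows, "missing": missing}


sample = "id,age,city\n1,34,Oslo\n2,,Lima\n3,29,\n"
report = profile_csv(sample)
print(report)
```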
**Independent Usage:**
```bash
# Basic validation check
funputer preflight -d your_data.csv
# With custom options
funputer preflight -d data.csv --sample-rows 5000 --encoding utf-8
# JSON report output
funputer preflight -d data.csv --json-out report.json
```
**Exit Codes:**
- `0`: Ready for analysis
- `2`: OK with warnings (can proceed)
- `10`: Hard error (cannot proceed)
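The exit codes make preflight easy to script. The following sketch wraps the documented `funputer preflight -d` command with `subprocess`; the helper names `interpret_exit_code` and `run_preflight` and the status strings are assumptions, and treating any undocumented code as an error is a conservative choice rather than documented behaviour.

```python
import subprocess

# Exit codes as documented above
PREFLIGHT_STATUS = {
    0: "ready",      # ready for analysis
    2: "warnings",   # OK with warnings, can proceed
    10: "error",     # hard error, cannot proceed
}


def interpret_exit_code(code: int) -> str:
    """Map a preflight exit code to a short status; unknown codes count as errors."""
    return PREFLIGHT_STATUS.get(code, "error")


def run_preflight(data_path: str) -> str:
    """Run `funputer preflight -d <data_path>` and return the interpreted status."""
    result = subprocess.run(["funputer", "preflight", "-d", data_path])
    return interpret_exit_code(result.returncode)


for code in (0, 2, 10):
    print(code, interpret_exit_code(code))
```

In a pipeline, call `run_preflight("your_data.csv")` and stop when it returns `"error"`; it requires the `funputer` command to be installed and on your `PATH`.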
## 💻 Command Line Interface
```bash
# Generate metadata template from your data
funputer init -d data.csv -o metadata.csv
# Analyze with auto-inference
funputer analyze -d data.csv
# Analyze with custom metadata
funputer analyze -d data.csv -m metadata.csv --verbose
# Run a data quality check before analysis
funputer preflight -d data.csv
```
## 📚 Usage Examples
### Basic Analysis
```python
import funputer
# Simple analysis with auto-inference
suggestions = funputer.analyze_imputation_requirements("sales_data.csv")
# Display recommendations
for suggestion in suggestions:
    print(f"Column: {suggestion.column_name}")
    print(f"Method: {suggestion.proposed_method}")
    print(f"Confidence: {suggestion.confidence_score:.3f}")
    print(f"Missing: {suggestion.missing_count} values")
    print()
```
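Once you have suggestions, a common next step is to separate those you can apply directly from those that deserve a manual look. The sketch below shows one way to do that. The `Suggestion` dataclass is a stand-in that mirrors the attribute names used above so the example runs without FunPuter; the `triage` helper, the `0.7` cut-off, and the method names in the demo data are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """Stand-in with the attribute names used in the examples above."""
    column_name: str
    proposed_method: str
    confidence_score: float
    missing_count: int


def triage(suggestions, min_confidence=0.7):
    """Split suggestions for columns with missing data into (accept, review) lists."""
    with_missing = [s for s in suggestions if s.missing_count > 0]
    accept = [s for s in with_missing if s.confidence_score >= min_confidence]
    review = [s for s in with_missing if s.confidence_score < min_confidence]
    # Least confident first, so the riskiest columns are reviewed first
    return accept, sorted(review, key=lambda s: s.confidence_score)


# Illustrative values only, not real FunPuter output
demo = [
    Suggestion("age", "median", 0.85, 12),
    Suggestion("income", "regression", 0.55, 40),
    Suggestion("customer_id", "none", 0.95, 0),
]
accept, review = triage(demo)
print([s.column_name for s in accept])
print([s.column_name for s in review])
```

The same function should work on the list returned by `funputer.analyze_imputation_requirements`, provided its items expose these attributes as in the examples above.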
### Advanced Configuration
```python
import pandas as pd
from funputer.models import ColumnMetadata, AnalysisConfig
from funputer.analyzer import ImputationAnalyzer

# Load your data (the file name is a placeholder)
df = pd.read_csv("your_data.csv")

# Custom metadata with business rules
metadata = [
    ColumnMetadata('product_id', 'string', unique_flag=True, max_length=10),
    ColumnMetadata('price', 'float', min_value=0, max_value=10000),
    ColumnMetadata('category', 'categorical', allowed_values='Electronics,Books,Clothing'),
    ColumnMetadata('rating', 'float', min_value=1.0, max_value=5.0),
]

# Custom analysis configuration
config = AnalysisConfig(
    missing_percentage_threshold=0.3,  # 30% threshold, expressed as a fraction
    skip_columns=['internal_id'],
    outlier_threshold=0.1,
)

# Run analysis
analyzer = ImputationAnalyzer(config)
suggestions = analyzer.analyze_dataframe(df, metadata)
```
### Industry-Specific Examples
**E-commerce Analytics**
```python
import funputer
from funputer.models import ColumnMetadata

# customer_df is your pandas DataFrame of customer records
metadata = [
    ColumnMetadata('customer_id', 'integer', unique_flag=True, nullable=False),
    ColumnMetadata('age', 'integer', min_value=13, max_value=120),
    ColumnMetadata('purchase_amount', 'float', min_value=0),
    ColumnMetadata('customer_segment', 'categorical', allowed_values='Premium,Standard,Basic'),
]
suggestions = funputer.analyze_dataframe(customer_df, metadata)
```
**Healthcare Data**
```python
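import funputer
from funputer.models import ColumnMetadata, AnalysisConfig

# patient_df is your pandas DataFrame of patient records
metadata = [
    ColumnMetadata('patient_id', 'integer', unique_flag=True, nullable=False),
    ColumnMetadata('age', 'integer', min_value=0, max_value=150),
    ColumnMetadata('blood_pressure', 'integer', min_value=50, max_value=300),
    ColumnMetadata('diagnosis', 'categorical', nullable=False),
]
# Low tolerance for missing data in healthcare.
# Note: the Advanced Configuration example names this setting
# missing_percentage_threshold; check which name your installed version accepts.
config = AnalysisConfig(missing_threshold=0.05)
suggestions = funputer.analyze_dataframe(patient_df, metadata, config)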
```
**Financial Risk Assessment**
```python
import funputer
from funputer.models import ColumnMetadata, AnalysisConfig

# loan_df is your pandas DataFrame of loan applications
metadata = [
    ColumnMetadata('application_id', 'integer', unique_flag=True, nullable=False),
    ColumnMetadata('credit_score', 'integer', min_value=300, max_value=850),
    ColumnMetadata('debt_to_income', 'float', min_value=0.0, max_value=10.0),
    ColumnMetadata('loan_purpose', 'categorical', allowed_values='home,auto,personal,business'),
]
# Skip sensitive columns
config = AnalysisConfig(skip_columns=['ssn', 'account_number'])
suggestions = funputer.analyze_dataframe(loan_df, metadata, config)
```
## ⚙️ Requirements
- **Python**: 3.9 or higher
- **Dependencies**: pandas, numpy, scipy, pydantic, click, pyyaml
## 🔧 Installation from Source
```bash
git clone https://github.com/RajeshRamachander/funputer.git
cd funputer
pip install -e .
```
## 📚 Documentation
- **API Reference**: Complete docstrings and type hints throughout the codebase
- **Examples**: See usage examples above and in the codebase
- **Test Coverage**: 84% coverage with comprehensive test suite
## 📄 License
Proprietary License - Source code is available for inspection but not for derivative works.
---
**Focus**: Get intelligent imputation recommendations, not complex infrastructure.
Raw data
{
"_id": null,
"home_page": null,
"name": "funputer",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "imputation, missing-data, data-science, machine-learning, pandas, auto-inference, metadata, preflight, validation",
"author": null,
"author_email": "Rajesh Ramachander <rajeshr.technocraft@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/e3/87/c80b17641a9bcdf0aa3a6c8617052b32d106cf762f657038bf795ddc7191/funputer-1.5.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Intelligent imputation analysis with automatic data validation and metadata inference",
"version": "1.5.2",
"project_urls": {
"Documentation": "https://pypi.org/project/funputer/",
"Homepage": "https://pypi.org/project/funputer/"
},
"split_keywords": [
"imputation",
" missing-data",
" data-science",
" machine-learning",
" pandas",
" auto-inference",
" metadata",
" preflight",
" validation"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "34635dede034e64458361783c88c1b5a17449b19c7dad599c6a9bdb6f229139e",
"md5": "81af9506e41f612724e05c8430e2fbaa",
"sha256": "3c857a56e2a21c4886417c616f8f0d5584b73900a5278d54410dff6d8220c97d"
},
"downloads": -1,
"filename": "funputer-1.5.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "81af9506e41f612724e05c8430e2fbaa",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 36251,
"upload_time": "2025-08-19T17:21:22",
"upload_time_iso_8601": "2025-08-19T17:21:22.310760Z",
"url": "https://files.pythonhosted.org/packages/34/63/5dede034e64458361783c88c1b5a17449b19c7dad599c6a9bdb6f229139e/funputer-1.5.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "e387c80b17641a9bcdf0aa3a6c8617052b32d106cf762f657038bf795ddc7191",
"md5": "2ed0611a20a8447f91233b9f1ae6204b",
"sha256": "ba99a039e9e8e6984501918dabe8ee574856bcccdd594e5b0a0ce70bf0b4b6d8"
},
"downloads": -1,
"filename": "funputer-1.5.2.tar.gz",
"has_sig": false,
"md5_digest": "2ed0611a20a8447f91233b9f1ae6204b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 107101,
"upload_time": "2025-08-19T17:21:23",
"upload_time_iso_8601": "2025-08-19T17:21:23.603747Z",
"url": "https://files.pythonhosted.org/packages/e3/87/c80b17641a9bcdf0aa3a6c8617052b32d106cf762f657038bf795ddc7191/funputer-1.5.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-19 17:21:23",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "funputer"
}