askpandas

Name	askpandas
Version	0.1.1
home_page	https://github.com/irfanalidv/AskPandas
Summary	AI-powered data engineering and analytics assistant for querying CSV data using natural language—locally and intelligently
upload_time	2025-08-16 09:06:13
maintainer	None
docs_url	None
author	Md Irfan Ali
requires_python	>=3.8
license	None
keywords	data-analysis pandas ai natural-language csv data-science machine-learning llm ollama huggingface
requirements	pandas numpy matplotlib seaborn requests transformers torch faker pytest pytest-cov scipy psutil
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# AskPandas: AI-Powered Data Engineering & Analytics Assistant

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://badge.fury.io/py/askpandas.svg)](https://badge.fury.io/py/askpandas)
[![Downloads](https://static.pepy.tech/badge/askpandas)](https://pepy.tech/project/askpandas)
[![GitHub stars](https://img.shields.io/github/stars/irfanalidv/AskPandas?style=social)](https://github.com/irfanalidv/AskPandas)
[![GitHub forks](https://img.shields.io/github/forks/irfanalidv/AskPandas?style=social)](https://github.com/irfanalidv/AskPandas)
[![GitHub issues](https://img.shields.io/github/issues/irfanalidv/AskPandas)](https://github.com/irfanalidv/AskPandas/issues)

AskPandas is an open-source Python library that lets you query and transform CSV data using natural language, powered by free, local open-source LLMs via Ollama. **No API keys, no cloud, no cost.**

## 🚀 **Quick Start (5 minutes!)**

### 1. **Install AskPandas**

```bash
pip install askpandas
```

### 2. **Install Ollama (one command)**

```bash
# macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows: Download from https://ollama.com/download
```

### 3. **Pull a lightweight model**

```bash
ollama pull phi3:mini    # Very small, very fast
```

### 4. **Start Ollama**

```bash
# Run in a separate terminal; skip this if the Ollama app or service is already running
ollama serve
```

### 5. **Start analyzing data!**

```python
import askpandas as ap

# Set up AI
llm = ap.OllamaLLM(model_name="phi3:mini")
ap.set_llm(llm)

# Load your data
df = ap.DataFrame("your_data.csv")

# Ask questions in plain English!
result = df.chat("What is the total revenue?")
print(result)
```

### 🚀 **See It In Action!**

```python
import askpandas as ap
import pandas as pd

# Create sample data
data = {
    'product': ['Apple', 'Banana', 'Orange'],
    'price': [2.50, 1.00, 1.50],
    'quantity': [100, 200, 150]
}
df = pd.DataFrame(data)
df['revenue'] = df['price'] * df['quantity']

# Create AskPandas DataFrame
sales_df = ap.DataFrame(df)

# AI-powered analysis
result = sales_df.chat("What is the total revenue?")
# Output: Total Revenue: $675.00

# More complex queries
result = sales_df.chat("Show me the top 3 products by revenue")
# Output: Product analysis with rankings

result = sales_df.chat("Calculate the average price across all products")
# Output: Average Price: $1.67
```
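The figures quoted in the comments above can be cross-checked with plain pandas. This sketch bypasses AskPandas and the LLM entirely and only confirms the arithmetic on the sample data; the wording a model actually returns will vary.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "product": ["Apple", "Banana", "Orange"],
    "price": [2.50, 1.00, 1.50],
    "quantity": [100, 200, 150],
})
df["revenue"] = df["price"] * df["quantity"]

total_revenue = df["revenue"].sum()    # 250 + 200 + 225
average_price = df["price"].mean()     # (2.50 + 1.00 + 1.50) / 3
top_products = df.nlargest(3, "revenue")[["product", "revenue"]]

print(f"Total Revenue: ${total_revenue:,.2f}")   # Total Revenue: $675.00
print(f"Average Price: ${average_price:.2f}")    # Average Price: $1.67
print(top_products.to_string(index=False))
```

Keeping a hand-computed reference like this next to a natural-language query is a cheap way to spot a wrong answer from a small model.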

## 🎯 **What Can You Do? (Everything!)**

### 📊 **Data Analysis - Just Ask!**

```python
# Basic questions
df.chat("What is the average price?")
# Output: Average Price: $1.67

df.chat("Show me the top 5 customers by revenue")
# Output: Customer rankings with revenue amounts

df.chat("How many sales were made in each region?")
# Output: Regional sales breakdown

# Complex analysis
df.chat("""
    Analyze our sales performance:
    1. Calculate total revenue by month
    2. Show the trend over time
    3. Identify the best performing products
    4. Create a visualization
""")
# Output: Comprehensive analysis with insights
```

### 🎨 **Beautiful Visualizations - Automatically!**

```python
# Charts are created automatically
df.chat("Create a bar chart of sales by region")
df.chat("Plot revenue trends over time")
df.chat("Show correlation between price and quantity")
df.chat("Display distribution of customer ages")
```

### 🔍 **Data Quality & Cleaning**

```python
# Automatic data assessment
df.chat("Check for missing values and duplicates")
df.chat("Identify outliers in numeric columns")
df.chat("Clean column names and standardize formats")
df.chat("Validate data types and suggest improvements")
```
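For reference, the checks requested above correspond roughly to the following pandas operations. This is a minimal sketch on made-up data, assuming the 1.5 × IQR rule for outliers; it is not the code AskPandas generates, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data for illustration only
raw = pd.DataFrame({
    " Customer Name ": ["Ann", "Bob", "Bob", "Cy", "Dee", "Eve"],
    "Order Total": [20.0, 22.0, 22.0, np.nan, 19.0, 500.0],
})

# Clean column names: trim whitespace, lowercase, snake_case
clean = raw.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))

# Missing values per column and fully duplicated rows
missing_per_column = clean.isna().sum()
duplicate_rows = int(clean.duplicated().sum())

# Flag outliers with the 1.5 * IQR rule (one common convention among several)
totals = clean["order_total"].dropna()
q1, q3 = totals.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = totals[(totals < q1 - 1.5 * iqr) | (totals > q3 + 1.5 * iqr)]

print(missing_per_column)
print("Duplicate rows:", duplicate_rows)
print("Outliers:", outliers.tolist())
print(clean.dtypes)
```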

### 🌐 **Multi-Dataset Analysis**

```python
# Work with multiple files
customers = ap.DataFrame("customers.csv")
orders = ap.DataFrame("orders.csv")
products = ap.DataFrame("products.csv")

# Cross-dataset insights
ap.chat("""
    Customer analysis:
    1. Join customers with their orders
    2. Calculate lifetime value by segment
    3. Show purchase patterns
    4. Identify high-value customers
""", customers, orders, products)
```
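To make the cross-dataset request concrete, here is roughly what steps 1, 2, and 4 look like in plain pandas. The tables, column names (`customer_id`, `segment`, `amount`), and the high-value threshold are assumptions for illustration, not part of AskPandas.

```python
import pandas as pd

# Hypothetical tables; real files will have different columns
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "retail", "wholesale"],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12, 13],
    "customer_id": [1, 1, 2, 3],
    "amount": [50.0, 25.0, 40.0, 300.0],
})

# 1. Join customers with their orders
joined = customers.merge(orders, on="customer_id", how="left")

# 2. Lifetime value per customer, then averaged within each segment
ltv = joined.groupby(["customer_id", "segment"], as_index=False)["amount"].sum()
ltv_by_segment = ltv.groupby("segment")["amount"].mean()

# 4. High-value customers, using an arbitrary threshold of 100
high_value = ltv[ltv["amount"] >= 100]

print(ltv_by_segment)
print(high_value)
```

Naming the join key in your prompt (for example "join on customer_id") gives the model less to guess.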

## 💡 **Real-World Examples**

### 📈 **Sales Analysis**

```python
import askpandas as ap

# Load sales data
sales = ap.DataFrame("sales_data.csv")

# Comprehensive sales report
sales.chat("What is our total revenue?")
# Output: Total Revenue: $78,586.11

sales.chat("Show me the top 3 products by revenue")
# Output: Product rankings with revenue amounts

sales.chat("Calculate average order value by region")
# Output: Regional performance metrics

sales.chat("How many sales were made in each region?")
# Output: Regional sales breakdown
```

### 👥 **Customer Analytics**

```python
# Customer behavior analysis
customers = ap.DataFrame("customers.csv")
transactions = ap.DataFrame("transactions.csv")

ap.chat("""
    Customer behavior insights:
    1. Customer lifetime value analysis
    2. Purchase frequency patterns
    3. Churn prediction factors
    4. Customer satisfaction metrics
    5. Personalized marketing recommendations
""", customers, transactions)
```

### 📊 **Financial Analysis**

```python
# Financial data processing
financial = ap.DataFrame("financial_data.csv")

financial.chat("""
    Financial performance review:
    1. Profit and loss analysis
    2. Cash flow trends
    3. Expense categorization
    4. Budget vs actual comparison
    5. Financial ratios and KPIs
    6. Risk assessment and recommendations
""")
```
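As a point of comparison, item 4 (budget vs actual) reduces to a small calculation in pandas. The categories and figures below are invented, and the column names are assumptions about how such a file might be laid out.

```python
import pandas as pd

# Hypothetical budget table
finance = pd.DataFrame({
    "category": ["Marketing", "Travel", "Software"],
    "budget": [1000.0, 500.0, 800.0],
    "actual": [1200.0, 400.0, 800.0],
})

# Positive variance means the category overspent
finance["variance"] = finance["actual"] - finance["budget"]
finance["variance_pct"] = finance["variance"] * 100 / finance["budget"]

over_budget = finance.loc[finance["variance"] > 0, "category"].tolist()

print(finance)
print("Over budget:", over_budget)
```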

### 🔬 **Research & Academic**

```python
# Research data analysis
research = ap.DataFrame("research_data.csv")

research.chat("""
    Statistical analysis:
    1. Descriptive statistics for all variables
    2. Correlation analysis between key factors
    3. Hypothesis testing results
    4. Outlier detection and treatment
    5. Data distribution visualizations
    6. Statistical significance testing
""")
```

## 🛠️ **Advanced Features**

### 🔧 **Custom Configuration**

```python
import askpandas as ap

# Set your preferences
ap.set_config(
    verbose=True,                    # See what's happening
    plot_style="seaborn",           # Beautiful charts
    output_dir="my_analysis",       # Save results here
    max_execution_time=120,         # Allow longer analysis
    enable_history=True             # Track all queries
)
```

### 🎨 **Custom Visualizations**

```python
# Create custom charts
from askpandas.visualization.charts import create_bar_chart, save_plot

# Custom bar chart
fig = create_bar_chart(
    df.df,
    x_col="category",
    y_col="value",
    title="My Custom Chart",
    figsize=(12, 8)
)

# Save with high quality
save_plot(fig, "custom_chart.png", dpi=300)
```
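If you would rather not depend on the helper functions, a similar chart can be produced with matplotlib directly. This is a sketch under that assumption; the README does not say whether `create_bar_chart` does the same thing internally, and the data here is made up.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data with the same column names as the example above
data = pd.DataFrame({"category": ["A", "B", "C"], "value": [10, 25, 15]})

fig, ax = plt.subplots(figsize=(12, 8))
ax.bar(data["category"], data["value"])
ax.set_xlabel("category")
ax.set_ylabel("value")
ax.set_title("My Custom Chart")
fig.tight_layout()

# Save at print quality, then release the figure's memory
out = Path("custom_chart.png")
fig.savefig(out, dpi=300)
plt.close(fig)
```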

### 🔍 **Query Intelligence**

```python
# Get help with your queries
query = "Show me sales trends"
analysis = ap.analyze_query(query)
print(f"Query type: {analysis['primary_category']}")

# Get suggestions
suggestions = ap.get_query_examples('visualization')
print("Try these:", suggestions[:3])

# Validate your query
validation = ap.validate_query(query, df.columns())  # columns() is called as a method, as in the Quick Test
if validation['is_valid']:
    print("✅ Query is valid!")
```
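To give a feel for what query categorization and validation can involve, here is a toy keyword-based version. Everything in it (`CATEGORY_KEYWORDS`, `categorize`, `mentioned_columns`) is hypothetical and written for illustration; it is not how AskPandas implements `analyze_query` or `validate_query`.

```python
# Hypothetical keyword map; AskPandas' own categories may differ
CATEGORY_KEYWORDS = {
    "visualization": ("plot", "chart", "graph", "trend"),
    "aggregation": ("total", "sum", "average", "count"),
    "cleaning": ("missing", "duplicate", "clean"),
}


def categorize(query):
    """Pick the category whose keywords appear most often in the query."""
    text = query.lower()
    scores = {
        category: sum(keyword in text for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def mentioned_columns(query, columns):
    """Return the column names that the query refers to by name."""
    text = query.lower()
    return [c for c in columns if c.lower().replace("_", " ") in text]


query = "Show me sales trends"
print(categorize(query))                              # visualization
print(mentioned_columns(query, ["sales", "region"]))  # ['sales']
```

Substring matching is crude (it would also match "sum" inside "consumer"), which is one reason mentioning exact column names in a query helps.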

## 🚀 **Performance Tips for Best Results**

### 💪 **Optimize Your Queries**

```python
# ✅ Good - Specific and clear
df.chat("Calculate total revenue by month for 2024, excluding returns")

# ❌ Avoid - Too vague
df.chat("Analyze this data")

# ✅ Good - Step-by-step analysis
df.chat("""
    1. Filter data for Q4 2024
    2. Group by product category
    3. Calculate sum of revenue
    4. Sort by revenue descending
    5. Show top 10 results
""")

# ✅ Good - Include context
df.chat("Show customer retention rate, considering customers who made purchases in both 2023 and 2024")
```
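Step-by-step prompts work well because each step maps onto a single pandas operation. The sketch below shows that mapping for the five-step query above, using invented data and assumed column names (`date`, `category`, `revenue`).

```python
import pandas as pd

# Hypothetical sales records
sales = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-09-30", "2024-10-05", "2024-11-20", "2024-12-31", "2025-01-02"]
    ),
    "category": ["Toys", "Toys", "Books", "Toys", "Books"],
    "revenue": [100.0, 200.0, 150.0, 50.0, 999.0],
})

# 1. Filter data for Q4 2024 (half-open interval keeps all of 31 December)
in_q4 = (sales["date"] >= pd.Timestamp("2024-10-01")) & (
    sales["date"] < pd.Timestamp("2025-01-01")
)
q4 = sales[in_q4]

top = (
    q4.groupby("category")["revenue"]   # 2. Group by product category
      .sum()                            # 3. Calculate sum of revenue
      .sort_values(ascending=False)     # 4. Sort by revenue descending
      .head(10)                         # 5. Show top 10 results
)
print(top)
```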

### 🎯 **Choose the Right Model**

```python
# For speed and basic analysis
llm = ap.OllamaLLM(model_name="phi3:mini")      # Fastest

# For better quality and complex queries
llm = ap.OllamaLLM(model_name="mistral:7b")     # Balanced

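# For higher quality (slower); confirm the tag exists in the Ollama model library
llm = ap.OllamaLLM(model_name="llama3.1:8b")    # Higher quality

ap.set_llm(llm)  # Activate whichever model you chose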
```

### 📊 **Data Preparation Tips**

```python
# Clean your data first
df = ap.DataFrame("messy_data.csv")

# Ask AskPandas to help clean it
df.chat("""
    Help me clean this data:
    1. Identify and handle missing values
    2. Remove duplicates
    3. Fix data type issues
    4. Standardize column names
    5. Show me what was cleaned
""")

# Then analyze the clean data
df.chat("Now analyze the cleaned data for insights")
```
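If you want to check what a cleaning request did, it helps to know the equivalent pandas steps. The following sketch covers steps 1 to 3 and 5 on made-up data; filling gaps with the median is just one possible choice, and none of this is the code AskPandas itself produces.

```python
import pandas as pd

# Hypothetical messy input: numbers stored as text, one duplicate row
messy = pd.DataFrame({
    "order_id": ["1", "2", "2", "3"],
    "unit_price": ["9.99", "n/a", "n/a", "4.50"],
})

cleaned = (
    messy.drop_duplicates()                      # 2. Remove duplicates
         .assign(                                # 3. Fix data type issues
             order_id=lambda d: d["order_id"].astype(int),
             unit_price=lambda d: pd.to_numeric(d["unit_price"], errors="coerce"),
         )
)

# 1. Handle missing values (here: fill with the column median)
filled = int(cleaned["unit_price"].isna().sum())
cleaned["unit_price"] = cleaned["unit_price"].fillna(cleaned["unit_price"].median())

# 5. Report what changed
report = {
    "rows_before": len(messy),
    "rows_after": len(cleaned),
    "values_filled": filled,
}
print(report)
```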

## 🔧 **Installation Options**

### **Basic Installation**

```bash
pip install askpandas
```

### **Full Installation (Recommended)**

```bash
pip install "askpandas[full]"
```

### **Development Installation**

```bash
git clone https://github.com/irfanalidv/AskPandas
cd AskPandas
pip install -e ".[dev]"
```

## 📱 **Platform Support**

- ✅ **macOS** - Native support with Apple Silicon optimization
- ✅ **Linux** - Full compatibility with all distributions
- ✅ **Windows** - Complete support with WSL2 recommended
- ✅ **Cloud** - Works on Google Colab, AWS, Azure, etc.

## 🆘 **Troubleshooting**

### **Common Issues & Solutions**

**"No LLM configured" error?**

```bash
# Make sure Ollama is running
ollama serve

# Check if model is downloaded
ollama list

# Also confirm your script calls ap.set_llm(llm) before any chat() call
```
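The same check can be done from Python before you call `ap.set_llm`. This sketch queries Ollama's local HTTP API at its default address (`http://localhost:11434`, endpoint `/api/tags`); the helper name `list_ollama_models` is made up for this example and is not part of AskPandas.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return installed model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a response that is not JSON
        return None
    if not isinstance(payload, dict):
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


models = list_ollama_models()
if models is None:
    print("Ollama is not reachable; start it with `ollama serve`.")
elif not models:
    print("Ollama is running but has no models; try `ollama pull phi3:mini`.")
else:
    print("Installed models:", ", ".join(models))
```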

**Slow responses?**

```bash
# Try a smaller model
ollama pull phi3:mini

# Close other applications to free memory
```

**Installation issues?**

```bash
# Update pip
pip install --upgrade pip

# Install with specific Python version
python3.9 -m pip install askpandas
```

## 📚 **Learning Resources**

### **Interactive Examples**

```bash
# Run the interactive demo
python simple_demo.py

# Try the configuration setup
python simple_config.py
```

### **Sample Datasets**

- `fake_sample.csv` - Small sample for testing
- `comprehensive_sample.csv` - Larger dataset for practice
- Create your own CSV files and start analyzing!

## 🎉 **Success Stories**

### **Data Scientists**

> "AskPandas reduced my data exploration time from hours to minutes. I can now focus on insights instead of coding."

### **Business Analysts**

> "I can analyze complex datasets without learning Python syntax. Natural language queries are a game-changer!"

### **Researchers**

> "Perfect for exploratory data analysis. I can quickly test hypotheses and generate visualizations for papers."

### **Students**

> "Learning data analysis has never been easier. AskPandas makes complex concepts accessible."

## 🚀 **What's Next?**

### **Version 0.2.0 (Coming Soon)**

- [ ] Jupyter notebook integration
- [ ] More visualization options (Plotly, Bokeh)
- [ ] SQL query generation
- [ ] Data pipeline automation

### **Version 1.0.0 (Future)**

- [ ] Enterprise features
- [ ] Advanced ML integration
- [ ] Real-time data streaming
- [ ] Community plugins

## 🤝 **Get Help & Contribute**

- **📖 Documentation**: [GitHub Wiki](https://github.com/irfanalidv/AskPandas/wiki)
- **🐛 Bug Reports**: [GitHub Issues](https://github.com/irfanalidv/AskPandas/issues)
- **💬 Discussions**: [GitHub Discussions](https://github.com/irfanalidv/AskPandas/discussions)
- **⭐ Star**: [GitHub Repository](https://github.com/irfanalidv/AskPandas)

## 📄 **License**

MIT License - Use freely for personal and commercial projects!

## 🙏 **Acknowledgments**

- **Ollama Team** - Making local AI accessible
- **HuggingFace** - Open-source AI models
- **Pandas Community** - Amazing data tools
- **Open Source Contributors** - Building the future together

## 🎯 **Complete Working Demonstration**

Want to see everything in action? Run our comprehensive demo:

```bash
# Clone the repository
git clone https://github.com/irfanalidv/AskPandas.git
cd AskPandas

# Run the complete demonstration
python final_working_demo.py
```

This demo showcases:

- ✅ **DataFrame Creation & Analysis** - Real data processing
- ✅ **Data Quality & Cleaning** - Automatic column standardization
- ✅ **AI-Powered Queries** - Natural language analysis
- ✅ **Multi-Dataset Analysis** - Joining and complex queries
- ✅ **Configuration Management** - Customizable settings
- ✅ **Query Intelligence** - Automatic query categorization

### 🚀 **Quick Test**

```python
import askpandas as ap
import pandas as pd

# Create test data
data = {'name': ['Alice', 'Bob'], 'age': [25, 30], 'salary': [50000, 60000]}
df = pd.DataFrame(data)
ap_df = ap.DataFrame(df)

# Test basic methods
print(f"Shape: {ap_df.shape()}")  # Output: Shape: (2, 3)
print(f"Columns: {ap_df.columns()}")  # Output: Columns: ['name', 'age', 'salary']

# Get comprehensive info
print(ap_df.info())  # Output: Detailed DataFrame information

# Statistical description
print(ap_df.describe())  # Output: Statistical summary
```
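Note that the wrapper above exposes `shape()` and `columns()` as method calls, whereas plain pandas exposes them as attributes. For comparison, the same checks on the underlying pandas DataFrame look like this:

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Alice", "Bob"], "age": [25, 30], "salary": [50000, 60000]}
)

# In pandas, shape and columns are attributes, not methods
print(df.shape)              # (2, 3)
print(list(df.columns))      # ['name', 'age', 'salary']
print(df["salary"].mean())   # 55000.0
```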

---

**🚀 Ready to transform your data analysis? Install AskPandas today!**

```bash
pip install askpandas
```

**Made with ❤️ by Md Irfan Ali**

Raw data

{
    "_id": null,
    "home_page": "https://github.com/irfanalidv/AskPandas",
    "name": "askpandas",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "data-analysis, pandas, ai, natural-language, csv, data-science, machine-learning, llm, ollama, huggingface",
    "author": "Md Irfan Ali",
    "author_email": "irfanali29@hotmail.com",
    "download_url": "https://files.pythonhosted.org/packages/67/43/dbe230f54f55fc48f929e96f56df8a28677d0ffd6eb28ec2c5860af86a6e/askpandas-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "AI-powered data engineering and analytics assistant for querying CSV data using natural language\u2014locally and intelligently",
    "version": "0.1.1",
    "project_urls": {
        "Bug Reports": "https://github.com/irfanalidv/AskPandas/issues",
        "Documentation": "https://github.com/irfanalidv/AskPandas#readme",
        "Homepage": "https://github.com/irfanalidv/AskPandas",
        "Source": "https://github.com/irfanalidv/AskPandas"
    },
    "split_keywords": [
        "data-analysis",
        " pandas",
        " ai",
        " natural-language",
        " csv",
        " data-science",
        " machine-learning",
        " llm",
        " ollama",
        " huggingface"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f83e572d27fa76cde079fb52d5fb7168967ee0b534cf50ffa478e16e678a98af",
                "md5": "753bc7b0627dda6836d84fb994ac0a25",
                "sha256": "e77f2e7c9bd763bcf8d13e7acc3cda2952550495c4e4f6a3f0538de322dbb1ea"
            },
            "downloads": -1,
            "filename": "askpandas-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "753bc7b0627dda6836d84fb994ac0a25",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 51090,
            "upload_time": "2025-08-16T09:06:11",
            "upload_time_iso_8601": "2025-08-16T09:06:11.467133Z",
            "url": "https://files.pythonhosted.org/packages/f8/3e/572d27fa76cde079fb52d5fb7168967ee0b534cf50ffa478e16e678a98af/askpandas-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6743dbe230f54f55fc48f929e96f56df8a28677d0ffd6eb28ec2c5860af86a6e",
                "md5": "52e8d8a0653233e36bb3a353ec7ffa14",
                "sha256": "9703d3fcb852faec2687ab7a1befa34bca2de5fcd507b31f09c755bbda21b961"
            },
            "downloads": -1,
            "filename": "askpandas-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "52e8d8a0653233e36bb3a353ec7ffa14",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 54067,
            "upload_time": "2025-08-16T09:06:13",
            "upload_time_iso_8601": "2025-08-16T09:06:13.561756Z",
            "url": "https://files.pythonhosted.org/packages/67/43/dbe230f54f55fc48f929e96f56df8a28677d0ffd6eb28ec2c5860af86a6e/askpandas-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-16 09:06:13",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "irfanalidv",
    "github_project": "AskPandas",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.5.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.23.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    ">=",
                    "3.5.0"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    ">=",
                    "0.12.0"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.28.0"
                ]
            ]
        },
        {
            "name": "transformers",
            "specs": [
                [
                    ">=",
                    "4.30.0"
                ]
            ]
        },
        {
            "name": "torch",
            "specs": [
                [
                    ">=",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "faker",
            "specs": [
                [
                    ">=",
                    "18.0.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    ">=",
                    "7.0.0"
                ]
            ]
        },
        {
            "name": "pytest-cov",
            "specs": [
                [
                    ">=",
                    "4.0.0"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.9.0"
                ]
            ]
        },
        {
            "name": "psutil",
            "specs": [
                [
                    ">=",
                    "5.8.0"
                ]
            ]
        }
    ],
    "lcname": "askpandas"
}
