gpt4pandas


Name: gpt4pandas
Version: 0.2
Summary: A tool that uses the GPT4ALL language model and the Pandas library to answer questions about dataframes
Upload time: 2023-05-03 22:05:16
Author: ParisNeo (Saifeddine ALOUI)
License: Apache License 2.0
Keywords: pandas, gpt4all, qa
Requirements: No requirements were recorded.

# GPT4Pandas

GPT4Pandas is a tool that uses the GPT4ALL language model and the Pandas library to answer questions about dataframes. With this tool, you can easily get answers to questions about your dataframes without needing to write any code.

## Installation

To install GPT4Pandas, use pip:
```bash
pip install gpt4pandas
```

## Usage

To use GPT4Pandas, import the `GPT4Pandas` class and create an instance with the path to a model file and your dataframe:
```python
import pandas as pd
from gpt4pandas import GPT4Pandas
# Load a sample dataframe
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Paris", "London"],
    "Salary": [50000, 60000, 70000],
}
df = pd.DataFrame(data)

# Initialize the GPT4Pandas model
model_path = "path/to/model.bin"  # placeholder: replace with the path to your model file
gpt = GPT4Pandas(model_path, df, verbose=False)
```

Then ask a question about your dataframe:

```python
# Ask a question about the dataframe
question = "What is the average salary?"
print(question)
answer = gpt.ask(question)
print(answer)  # Sample answer (may vary): "The average salary is $60,000."
```
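
Because the answer is generated by a language model, it may be worth cross-checking it against a direct pandas computation. The following is a minimal sketch of such a check. It uses only standard pandas calls and does not involve GPT4Pandas itself; the youngest-person lookup is included as a second illustrative question on the same dataframe.

```python
import pandas as pd

# Same sample dataframe as in the usage example
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Paris", "London"],
    "Salary": [50000, 60000, 70000],
})

# "What is the average salary?"
average_salary = df["Salary"].mean()
print(average_salary)  # 60000.0

# "Which person is youngest?" -> name in the row with the smallest age
youngest = df.loc[df["Age"].idxmin(), "Name"]
print(youngest)  # Alice
```

If the model's answer disagrees with these values, the pandas result is the one to trust.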

Here is a complete example, which you can also find in the examples folder:

```python
import pandas as pd
from gpt4pandas import GPT4Pandas
from pathlib import Path
from tqdm import tqdm
import urllib.request
import sys

# If there is no model, then download one 
# These models can be automatically downloaded, uncomment the model you want to use
# url = "https://huggingface.co/ParisNeo/GPT4All/resolve/main/gpt4all-lora-quantized-ggml.bin"
# url = "https://huggingface.co/ParisNeo/GPT4All/resolve/main/gpt4all-lora-unfiltered-quantized.new.bin"
# url = "https://huggingface.co/eachadea/legacy-ggml-vicuna-7b-4bit/resolve/main/ggml-vicuna-7b-4bit-rev1.bin"
url = "https://huggingface.co/eachadea/ggml-vicuna-13b-4bit/resolve/main/ggml-vicuna-13b-4bit-rev1.bin"
model_name  = url.split("/")[-1]
folder_path = Path("models/")

model_full_path = (folder_path / model_name)

# ++++++++++++++++++++ Model downloading +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Check if file already exists in folder
if model_full_path.exists():
    print("File already exists in folder")
else:
    # Create folder if it doesn't exist
    folder_path.mkdir(parents=True, exist_ok=True)
    progress_bar = tqdm(total=None, unit="B", unit_scale=True, desc=f"Downloading {model_name}")
    # Define callback function for urlretrieve
    def report_progress(block_num, block_size, total_size):
        progress_bar.total = total_size
        progress_bar.update(block_size)
    # Download file from URL to folder
    try:
        urllib.request.urlretrieve(url, model_full_path, reporthook=report_progress)
        print("File downloaded successfully!")
    except Exception as e:
        print("Error downloading file:", e)
        sys.exit(1)
    finally:
        # Always release the progress bar, even if the download fails
        progress_bar.close()
# ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

# Load a sample dataframe
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Paris", "London"],
    "Salary": [50000, 60000, 70000],
}
df = pd.DataFrame(data)

# Initialize the GPT4Pandas model
model_path = str(model_full_path)
gpt = GPT4Pandas(model_path, df, verbose=False)

print("Dataframe")
print(df)
# Ask a question about the dataframe
question = "What is the average salary?"
print(question)
answer = gpt.ask(question)
print(answer)  # Sample answer (may vary): "The average salary is $60,000."

# Ask another question
question = "Which person is youngest?"
print(question)
answer = gpt.ask(question)
print(answer)  # Sample answer (may vary): "Alice is the youngest."

# Set a new dataframe and ask a question
new_data = {
    "Name": ["David", "Emily"],
    "Age": [40, 45],
    "City": ["Berlin", "Tokyo"],
    "Salary": [80000, 90000],
}
new_df = pd.DataFrame(new_data)
print("Dataframe")
print(new_df)

gpt.set_dataframe(new_df)
question = "What is salary in Tokyo?"
print(question)
answer = gpt.ask(question)
print(answer)  # Sample answer (may vary): "The salary in Tokyo is $90,000."
```

Running the complete example prints each dataframe, each question, and the model's answer. Here is the output of one run:

```
Dataframe
      Name  Age      City  Salary
0    Alice   25  New York   50000
1      Bob   30     Paris   60000
2  Charlie   35    London   70000
What is the average salary?
The average salary is $60,000.
Which person is youngest?
Alice is the youngest.
Dataframe
    Name  Age    City  Salary
0  David   40  Berlin   80000
1  Emily   45   Tokyo   90000
What is salary in Tokyo?
The salary in Tokyo is $90,000.
```
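
The download step in the complete example checks only whether the model file exists, so an interrupted download could leave a partial file that later runs treat as complete. One way to guard against this, sketched below using only the standard library, is to download to a temporary name and rename it once the transfer finishes. The helper name `download_if_missing` and the `.part` suffix are assumptions made for illustration; they are not part of gpt4pandas.

```python
import urllib.request
from pathlib import Path


def download_if_missing(url: str, folder: Path) -> Path:
    """Download `url` into `folder` unless the target file already exists.

    Hypothetical helper for illustration; not part of gpt4pandas.
    """
    target = folder / url.split("/")[-1]
    if target.exists():
        return target
    folder.mkdir(parents=True, exist_ok=True)
    # Download under a temporary name first, so an interrupted transfer
    # never leaves a partial file under the final model name.
    partial = target.with_name(target.name + ".part")
    urllib.request.urlretrieve(url, partial)
    partial.replace(target)
    return target
```

The returned path can then be converted with `str(...)` and passed as the model path, as in the complete example above.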

## License
GPT4Pandas is licensed under the Apache License, Version 2.0. See the LICENSE file for more information.

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "gpt4pandas",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "pandas GPT4ALL QA",
    "author": "ParisNeo (Saifeddine ALOUI)",
    "author_email": "aloui.seifeddine@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/ac/17/2332b0c408b42cab9311808a4348624ff4ccbe54ee948481bf07faa2911e/gpt4pandas-0.2.tar.gz",
    "platform": null,
    "description": "# GPT4Pandas\r\n\r\nGPT4Pandas is a tool that uses the GPT4ALL language model and the Pandas library to answer questions about dataframes. With this tool, you can easily get answers to questions about your dataframes without needing to write any code.\r\n\r\n## Installation\r\n\r\nTo install GPT4ALL Pandas Q&A, you can use pip:\r\n```bash\r\npip install gpt4all-pandasqa\r\n```\r\n\r\n## Usage\r\n\r\nTo use GPT4ALL Pandas Q&A, you can import the `GPT4Pandas` class and create an instance of it with your dataframe:\r\n```python\r\nimport pandas as pd\r\nfrom gpt4pandas import GPT4Pandas\r\n# Load a sample dataframe\r\ndata = {\r\n    \"Name\": [\"Alice\", \"Bob\", \"Charlie\"],\r\n    \"Age\": [25, 30, 35],\r\n    \"City\": [\"New York\", \"Paris\", \"London\"],\r\n    \"Salary\": [50000, 60000, 70000],\r\n}\r\ndf = pd.DataFrame(data)\r\n\r\n# Initialize the GPT4Pandas model\r\nmodel_path = <the path to the model file>\r\ngpt = GPT4Pandas(model_path, df, verbose=False)\r\n```\r\n\r\nThen ask a question about your dataframe:\r\n\r\n```python\r\n# Ask a question about the dataframe\r\nquestion = \"What is the average salary?\"\r\nprint(question)\r\nanswer = gpt.ask(question)\r\nprint(answer)  # Output: \"mean(Salary)\"\r\n```\r\n\r\nHere is a complete example that you can also find in examples folder :\r\n\r\n```python\r\nimport pandas as pd\r\nfrom gpt4pandas import GPT4Pandas\r\nfrom pathlib import Path\r\nfrom tqdm import tqdm\r\nimport urllib\r\nimport sys\r\n\r\n# If there is no model, then download one \r\n# These models can be automatically downloaded, uncomment the model you want to use\r\n# url = \"https://huggingface.co/ParisNeo/GPT4All/resolve/main/gpt4all-lora-quantized-ggml.bin\"\r\n# url = \"https://huggingface.co/ParisNeo/GPT4All/resolve/main/gpt4all-lora-unfiltered-quantized.new.bin\"\r\n# url = \"https://huggingface.co/eachadea/legacy-ggml-vicuna-7b-4bit/resolve/main/ggml-vicuna-7b-4bit-rev1.bin\"\r\nurl = 
\"https://huggingface.co/eachadea/ggml-vicuna-13b-4bit/resolve/main/ggml-vicuna-13b-4bit-rev1.bin\"\r\nmodel_name  = url.split(\"/\")[-1]\r\nfolder_path = Path(\"models/\")\r\n\r\nmodel_full_path = (folder_path / model_name)\r\n\r\n# ++++++++++++++++++++ Model downloading +++++++++++++++++++++++++++++++++++++++++++++++++++++++++\r\n# Check if file already exists in folder\r\nif model_full_path.exists():\r\n    print(\"File already exists in folder\")\r\nelse:\r\n    # Create folder if it doesn't exist\r\n    folder_path.mkdir(parents=True, exist_ok=True)\r\n    progress_bar = tqdm(total=None, unit=\"B\", unit_scale=True, desc=f\"Downloading {url.split('/')[-1]}\")\r\n    # Define callback function for urlretrieve\r\n    def report_progress(block_num, block_size, total_size):\r\n        progress_bar.total=total_size\r\n        progress_bar.update(block_size)\r\n    # Download file from URL to folder\r\n    try:\r\n        urllib.request.urlretrieve(url, folder_path / url.split(\"/\")[-1], reporthook=report_progress)\r\n        print(\"File downloaded successfully!\")\r\n    except Exception as e:\r\n        print(\"Error downloading file:\", e)\r\n        sys.exit(1)\r\n# ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++\r\n\r\n# Load a sample dataframe\r\ndata = {\r\n    \"Name\": [\"Alice\", \"Bob\", \"Charlie\"],\r\n    \"Age\": [25, 30, 35],\r\n    \"City\": [\"New York\", \"Paris\", \"London\"],\r\n    \"Salary\": [50000, 60000, 70000],\r\n}\r\ndf = pd.DataFrame(data)\r\n\r\n# Initialize the GPT4Pandas model\r\nmodel_path = \"models/\"+model_name\r\ngpt = GPT4Pandas(model_path, df, verbose=False)\r\n\r\nprint(\"Dataframe\")\r\nprint(df)\r\n# Ask a question about the dataframe\r\nquestion = \"What is the average salary?\"\r\nprint(question)\r\nanswer = gpt.ask(question)\r\nprint(answer)  # Output: \"mean(Salary)\"\r\n\r\n# Ask another question\r\nquestion = \"Which person is youngest?\"\r\nprint(question)\r\nanswer 
= gpt.ask(question)\r\nprint(answer)  # Output: \"max(Age)\"\r\n\r\n# Set a new dataframe and ask a question\r\nnew_data = {\r\n    \"Name\": [\"David\", \"Emily\"],\r\n    \"Age\": [40, 45],\r\n    \"City\": [\"Berlin\", \"Tokyo\"],\r\n    \"Salary\": [80000, 90000],\r\n}\r\nnew_df = pd.DataFrame(new_data)\r\nprint(\"Dataframe\")\r\nprint(new_df)\r\n\r\ngpt.set_dataframe(new_df)\r\nquestion = \"What is salary in Tokyo?\"\r\nprint(question)\r\nanswer = gpt.ask(question)\r\nprint(answer)  # Output: \"min(Salary) where City is Tokyo\"\r\n```\r\n\r\nThis will output the answer to your question.\r\nHere is one of the answers :\r\n\r\n```\r\nDataframe\r\n      Name  Age      City  Salary\r\n0    Alice   25  New York   50000\r\n1      Bob   30     Paris   60000\r\n2  Charlie   35    London   70000\r\nWhat is the average salary?\r\nThe average salary is $60,000.\r\nWhich person is youngest?\r\nAlice is the youngest.\r\nDataframe\r\n    Name  Age    City  Salary\r\n0  David   40  Berlin   80000\r\n1  Emily   45   Tokyo   90000\r\nWhat is salary in Tokyo?\r\nThe salary in Tokyo is $90,000.\r\n```\r\n\r\n# License\r\nGPT4ALL Pandas Q&A is licensed under the Apache License, Version 2.0. See the LICENSE file for more information.\r\n",
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "A tool that uses the GPT4ALL language model and the Pandas library to answer questions about dataframes",
    "version": "0.2",
    "project_urls": null,
    "split_keywords": [
        "pandas",
        "gpt4all",
        "qa"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b095d21f0926fca6d1495e97fec03948299952472618d6c8351b84ff44e22bd6",
                "md5": "b1fcf0565630dc491b25de204181e115",
                "sha256": "c930488f87a7ea4206fadf75985be07a50e4343d6f688245f8b12c9a1e3d4cf2"
            },
            "downloads": -1,
            "filename": "gpt4pandas-0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b1fcf0565630dc491b25de204181e115",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 7614,
            "upload_time": "2023-05-03T22:05:12",
            "upload_time_iso_8601": "2023-05-03T22:05:12.813231Z",
            "url": "https://files.pythonhosted.org/packages/b0/95/d21f0926fca6d1495e97fec03948299952472618d6c8351b84ff44e22bd6/gpt4pandas-0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ac172332b0c408b42cab9311808a4348624ff4ccbe54ee948481bf07faa2911e",
                "md5": "29cc40e49935b3c8857aea77753a99ba",
                "sha256": "e0c5758b39539f9e668ccbb8411836aedc2a259fa0031fd3b204417ad23e1b0f"
            },
            "downloads": -1,
            "filename": "gpt4pandas-0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "29cc40e49935b3c8857aea77753a99ba",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 7364,
            "upload_time": "2023-05-03T22:05:16",
            "upload_time_iso_8601": "2023-05-03T22:05:16.501715Z",
            "url": "https://files.pythonhosted.org/packages/ac/17/2332b0c408b42cab9311808a4348624ff4ccbe54ee948481bf07faa2911e/gpt4pandas-0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-03 22:05:16",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "gpt4pandas"
}
        