hfttai


- Name: hfttai
- Version: 0.1.5
- Summary: Pandas AI is a Python library that integrates generative artificial intelligence capabilities into Pandas, making dataframes conversational.
- Upload time: 2023-07-07 13:41:55
- Author: Gabriele Venturi
- Requires Python: `>=3.9, !=2.7.*, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, !=3.7.*, !=3.8.*`
- License: MIT

# PandasAI 🐼

[![Release](https://img.shields.io/pypi/v/pandasai?label=Release&style=flat-square)](https://pypi.org/project/pandasai/)
[![CI](https://github.com/gventuri/pandas-ai/actions/workflows/ci.yml/badge.svg)](https://github.com/gventuri/pandas-ai/actions/workflows/ci.yml/badge.svg)
[![CD](https://github.com/gventuri/pandas-ai/actions/workflows/cd.yml/badge.svg)](https://github.com/gventuri/pandas-ai/actions/workflows/cd.yml/badge.svg)
[![Documentation Status](https://readthedocs.org/projects/pandas-ai/badge/?version=latest)](https://pandas-ai.readthedocs.io/en/latest/?badge=latest)
[![](https://dcbadge.vercel.app/api/server/kF7FqH2FwS?style=flat&compact=true)](https://discord.gg/kF7FqH2FwS)
[![Downloads](https://static.pepy.tech/badge/pandasai)](https://pepy.tech/project/pandasai) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Open in Colab](https://camo.githubusercontent.com/84f0493939e0c4de4e6dbe113251b4bfb5353e57134ffd9fcab6b8714514d4d1/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667)](https://colab.research.google.com/drive/1rKz7TudOeCeKGHekw7JFNL4sagN9hon-?usp=sharing)

PandasAI is a Python library that adds Generative AI capabilities to [pandas](https://github.com/pandas-dev/pandas), the popular data analysis and manipulation tool. It is designed to be used in conjunction with pandas, and is not a replacement for it.

<!-- Add images/pandas-ai.png -->

![PandasAI](images/pandas-ai.png?raw=true)

## 🔧 Quick install

```bash
pip install pandasai
```

## 🔍 Demo

Try out PandasAI in your browser:

[![Open in Colab](https://camo.githubusercontent.com/84f0493939e0c4de4e6dbe113251b4bfb5353e57134ffd9fcab6b8714514d4d1/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667)](https://colab.research.google.com/drive/1rKz7TudOeCeKGHekw7JFNL4sagN9hon-?usp=sharing)

## 📖 Documentation

The documentation for PandasAI can be found [here](https://pandas-ai.readthedocs.io/en/latest/).

## 💻 Usage

> Disclaimer: GDP data was collected from [this source](https://ourworldindata.org/grapher/gross-domestic-product?tab=table), published by World Development Indicators - World Bank (2022.05.26) and collected at National accounts data - World Bank / OECD. It relates to the year of 2020. Happiness indexes were extracted from [the World Happiness Report](https://ftnnews.com/images/stories/documents/2020/WHR20.pdf). Another useful [link](https://data.world/makeovermonday/2020w19-world-happiness-report-2020).

PandasAI is designed to be used in conjunction with pandas. It makes pandas conversational, allowing you to ask questions of your data in natural language.

### Queries

For example, you can ask PandasAI to rank the rows of a DataFrame by a column and return the top entries. The example below asks for the five happiest countries:

```python
import pandas as pd
from pandasai import PandasAI

# Sample DataFrame
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})

# Instantiate an LLM
from pandasai.llm.openai import OpenAI
llm = OpenAI(api_token="YOUR_API_TOKEN")

pandas_ai = PandasAI(llm)
pandas_ai(df, prompt='Which are the 5 happiest countries?')
```

The above code will return the following:

```
6            Canada
7         Australia
1    United Kingdom
3           Germany
0     United States
Name: country, dtype: object
```

Of course, you can also ask PandasAI to perform more complex queries. For example, you can ask PandasAI to find the sum of the GDPs of the 2 unhappiest countries:

```python
pandas_ai(df, prompt='What is the sum of the GDPs of the 2 unhappiest countries?')
```

The above code will return the following:

```
19012600725504
```
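
For comparison, both prompts above correspond to ordinary pandas operations. The following is a sketch of equivalent hand-written code, assuming the same sample data; it is not the code PandasAI generates, which may differ from run to run.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})

# "Which are the 5 happiest countries?"
top5 = df.nlargest(5, "happiness_index")["country"]
print(top5)

# "What is the sum of the GDPs of the 2 unhappiest countries?"
gdp_sum = df.nsmallest(2, "happiness_index")["gdp"].sum()
print(gdp_sum)  # 19012600725504 (China + Japan)
```

Writing the query by hand is a useful way to check an answer returned by the LLM.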

### Charts

You can also ask PandasAI to draw a graph:

```python
pandas_ai(
    df,
    "Plot the histogram of countries showing for each the gdp, using different colors for each bar",
)
```

![Chart](images/histogram-chart.png?raw=true)

You can save any charts generated by PandasAI by setting the `save_charts` parameter to `True` in the `PandasAI` constructor, for example `PandasAI(llm, save_charts=True)`. Charts are saved in `./pandasai/exports/charts`.

### Multiple DataFrames

You can also pass multiple DataFrames to PandasAI and ask questions that relate them.

```python
import pandas as pd
from pandasai import PandasAI
from pandasai.llm.openai import OpenAI

employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}

salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)


llm = OpenAI()
pandas_ai = PandasAI(llm)
pandas_ai([employees_df, salaries_df], "Who gets paid the most?")
```

The above code will return the following:

```
Oh, Olivia gets paid the most.
```
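
Answering a question across two DataFrames amounts to joining them on a shared key. As a rough sketch of the equivalent plain pandas code (written by hand here, not produced by PandasAI), the two tables can be merged on `EmployeeID` and the row with the highest salary looked up:

```python
import pandas as pd

employees_df = pd.DataFrame({
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
})

salaries_df = pd.DataFrame({
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
})

# Join the two tables on their shared key
merged = employees_df.merge(salaries_df, on='EmployeeID')

# Look up the name in the row with the highest salary
top_earner = merged.loc[merged['Salary'].idxmax(), 'Name']
print(top_earner)  # Olivia
```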

You can find more examples in the [examples](examples) directory.

### ⚡️ Shortcuts

PandasAI also provides a number of shortcuts (beta) to make it easier to ask questions of your data. For example, you can ask PandasAI to `clean_data`, `impute_missing_values`, `generate_features`, `plot_histogram`, and many more.

```python
# Clean data
pandas_ai.clean_data(df)

# Impute missing values
pandas_ai.impute_missing_values(df)

# Generate features
pandas_ai.generate_features(df)

# Plot histogram
pandas_ai.plot_histogram(df, column="gdp")
```

Learn more about the shortcuts [here](https://pandas-ai.readthedocs.io/en/latest/shortcuts/).

## 🔒 Privacy & Security

To generate the Python code to run, PandasAI takes the head of the DataFrame, randomizes it (using random generation for sensitive data and shuffling for non-sensitive data), and sends only that randomized head to the LLM.

If you want to enforce your privacy further, you can instantiate PandasAI with `enforce_privacy=True`, which sends only the column names to the LLM and not the head.
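
To make the idea concrete, here is a minimal standard-library sketch of that kind of randomization. It is an illustration only and not PandasAI's actual implementation: the function name `anonymize_head`, the `sensitive` parameter, and the way sensitive values are replaced are all assumptions made for this example.

```python
import random
import string

def anonymize_head(columns, sensitive, n=5, seed=None):
    """Return a randomized copy of the first n rows (hypothetical helper).

    columns:   dict mapping column name -> list of values
    sensitive: set of column names whose values must not be sent as-is
    """
    rng = random.Random(seed)
    head = {}
    for name, values in columns.items():
        sample = list(values[:n])
        if name in sensitive:
            # Sensitive data: replace each value with random lowercase letters
            sample = [
                "".join(rng.choices(string.ascii_lowercase, k=len(str(v))))
                for v in sample
            ]
        else:
            # Non-sensitive data: keep the values but shuffle their order
            rng.shuffle(sample)
        head[name] = sample
    return head

data = {
    "email": ["ann@example.com", "bob@example.com", "cy@example.com"],
    "department": ["HR", "Sales", "IT"],
}

head = anonymize_head(data, sensitive={"email"}, seed=0)
print(head)

# The stricter option described above shares only the column names
column_names_only = list(data)
print(column_names_only)
```

Shuffling each column independently keeps realistic values in the sample while breaking the link between values in the same row.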

## ⚙️ Command-Line Tool

Pai is the command-line tool that provides a convenient way to interact with PandasAI through a command-line interface (CLI). To access the CLI tool, create a virtualenv for testing purposes and install the project dependencies in that local virtual environment using `pip`.

Read more about how to use the CLI [here](https://pandas-ai.readthedocs.io/en/latest/pai_cli/).

## 🤝 Contributing

Contributions are welcome! Feel free to open a pull request.
For more information, please see the [contributing guidelines](CONTRIBUTING.md).

After setting up the virtual environment, remember to install `pre-commit` to comply with our standards:

```bash
pre-commit install
```

## 📜 License

PandasAI is licensed under the MIT License. See the LICENSE file for more details.

## Acknowledgements

- This project is based on the [pandas](https://github.com/pandas-dev/pandas) library by independent contributors, but it's in no way affiliated with the pandas project.
- This project is meant to be used as a tool for data exploration and analysis, and it's not meant to be used for production purposes. Please use it responsibly.

            
