# A Quick Library for Llama Text2SQL Accuracy Evaluation
This library provides a simple interface for evaluating the accuracy of Llama models on the Text2SQL task. It runs the evaluation pipeline against the BIRD DEV dataset, calling the models through the Llama API.
## Quick Start
1. Run `pip install llama-text2sql-eval` to install the library.
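To confirm the install, you can import the package (the module name `llama_text2sql_eval` is the one used in Option B below):
```python
# Quick install check: the import name matches the Option B example below.
import llama_text2sql_eval
print("llama-text2sql-eval is installed")
```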
2. Download the [BIRD](https://bird-bench.github.io/) DEV dataset by running the following commands:
```bash
mkdir -p llama-text2sql-eval/data
cd llama-text2sql-eval/data
wget https://bird-bench.oss-cn-beijing.aliyuncs.com/dev.zip
unzip dev.zip
rm dev.zip
rm -rf __MACOSX
cd dev_20240627
unzip dev_databases.zip
rm dev_databases.zip
rm -rf __MACOSX
cd ../..
```
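Before moving on, you can sanity-check the download with a short script like the one below. It assumes the standard BIRD DEV layout (a `dev.json` file plus a `dev_databases/` directory inside `dev_20240627`), which may differ if the upstream archive changes:
```python
# Sanity check for the BIRD DEV download. Assumes the standard layout:
# llama-text2sql-eval/data/dev_20240627/{dev.json, dev_databases/}.
from pathlib import Path

data_dir = Path("llama-text2sql-eval/data/dev_20240627")
assert (data_dir / "dev.json").is_file(), "dev.json missing; re-run the download steps"
db_dir = data_dir / "dev_databases"
assert db_dir.is_dir(), "dev_databases/ missing; re-run the download steps"
print(f"Found {sum(1 for p in db_dir.iterdir() if p.is_dir())} databases")
```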
3. Get your Llama API key [here](https://llama.developer.meta.com/) and set up an environment variable:
```bash
export LLAMA_API_KEY="your_key_here"
```
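Option B below reads the key with `os.getenv`, so you can verify it is visible to Python before starting a long run:
```python
# Fail fast if the key is not exported in the current shell.
import os

if not os.getenv("LLAMA_API_KEY"):
    raise RuntimeError("LLAMA_API_KEY is not set; export it before running the eval")
```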
4. Run the eval with one of the two options:
Option A:
```bash
llama-text2sql-eval --model Llama-3.3-8B-Instruct
```
Option B:
Save the following code to a file named `run.py`, then run `python run.py`:
```python
import os

from llama_text2sql_eval import LlamaText2SQLEval

evaluator = LlamaText2SQLEval()

results = evaluator.run(
    model="Llama-3.3-70B-Instruct",  # or any other Llama model supported by the Llama API
    api_key=os.getenv("LLAMA_API_KEY"),
)

if results:
    print(f"Overall Accuracy: {results['overall_accuracy']:.2f}%")
    print(f"Simple: {results['simple_accuracy']:.2f}%")
    print(f"Moderate: {results['moderate_accuracy']:.2f}%")
    print(f"Challenging: {results['challenging_accuracy']:.2f}%")
```
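Since the scores are read from `results` by key, the returned object behaves like a dict; assuming it is JSON-serializable, you can persist each run for later comparison, for example:
```python
# Append to run.py: save the metrics to a timestamped file (the file
# name is just an example) so runs of different models can be compared.
import json
import time

if results:
    out_path = f"text2sql_results_{int(time.time())}.json"
    with open(out_path, "w") as f:
        json.dump(results, f, indent=2)
    print(f"Saved metrics to {out_path}")
```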
Running the eval takes about 40 minutes to complete. At the end of the run you should see something like:
```
Overall Accuracy: 57.95%
Simple: 65.30%
Moderate: 47.63%
Challenging: 44.14%
```
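To benchmark several models in one go, you can loop over the same `run()` call from Option B. Keep in mind that each run takes roughly 40 minutes, and the model names below are examples that should match whatever the Llama API currently offers:
```python
# Compare multiple models with the same evaluator. The model names are
# examples; use any models the Llama API supports.
import os

from llama_text2sql_eval import LlamaText2SQLEval

evaluator = LlamaText2SQLEval()
for model in ["Llama-3.3-8B-Instruct", "Llama-3.3-70B-Instruct"]:
    results = evaluator.run(model=model, api_key=os.getenv("LLAMA_API_KEY"))
    if results:
        print(f"{model}: {results['overall_accuracy']:.2f}% overall")
```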