stark-qa

Name: stark-qa
Version: 0.1.2
Summary: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases
Homepage: https://stark.stanford.edu/
Repository: https://github.com/snap-stanford/stark.git
Upload time: 2024-06-18 06:33:58
Requires Python: <3.12,>=3.8
License: MIT License
Keywords: nlp, information-retrieval, graph, knowledge-base, semi-structured-data, multimodal, llm
Requirements: anthropic, bs4, gdown, huggingface_hub, langchain, langdetect, multiprocess, nltk, numpy, ogb, openai, pandas, PyTDC, torch, torch_geometric, torchmetrics, tqdm, transformers
            
<h1 align="left">
    STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases
</h1>

<div align="left">


**Dataset website:** [STaRK Website](https://stark.stanford.edu/)


## What is STaRK?
STaRK is a large-scale semi-structured retrieval benchmark on Textual and Relational Knowledge Bases. Given a user query, the task is to retrieve the nodes in the knowledge base that are relevant to the query.


## Why STaRK?
- **Novel Task**: Large language models have recently demonstrated significant potential on information retrieval tasks. Nevertheless, it remains an open question how effectively LLMs can handle the complex interplay between the textual and relational requirements in queries.

- **Large-scale and Diverse KBs**: We provide three large-scale knowledge bases covering three different areas, all constructed from public sources.

- **Natural-sounding and Practical Queries**: The queries in our benchmark are crafted to incorporate rich relational information and complex textual properties, and closely mirror questions in real-life scenarios, e.g., with flexible query formats and possibly with extra context.


# Access benchmark data

## 1) Package installation
```bash
pip install stark_qa
```

## 2) Data loading 

```python
from stark_qa import load_qa, load_skb

dataset_name = 'amazon'

# Load the retrieval dataset
qa_dataset = load_qa(dataset_name)
idx_split = qa_dataset.get_idx_split()

# Load the semi-structured knowledge base
skb = load_skb(dataset_name, download_processed=True, root=None)
```
The `root` argument of `load_skb` specifies where the SKB data is stored. With the default value `None`, the data is stored in the [Hugging Face cache](https://huggingface.co/docs/datasets/en/cache).
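The split returned by `get_idx_split` can then be used to select the queries for training or evaluation. The sketch below illustrates the idea with a plain list and dict standing in for `qa_dataset` and `idx_split`; the exact item type and split layout of the real stark_qa objects are assumptions here, not documented API.

```python
# Illustrative sketch only: `toy_qa` stands in for qa_dataset, and the
# split layout (a dict mapping split name -> index list) is an assumption.
toy_qa = [f"query-{i}" for i in range(10)]
toy_split = {"train": [0, 1, 2, 3, 4, 5], "val": [6, 7], "test": [8, 9]}

def iter_split(dataset, idx_split, split_name):
    """Yield the items of one split, given a name -> indices mapping."""
    for idx in idx_split[split_name]:
        yield dataset[idx]

test_queries = list(iter_split(toy_qa, toy_split, "test"))
```

The same loop shape applies whatever the dataset's item type is, since only indexing is assumed.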

### Data of the Retrieval Task

Question-answer pairs for the retrieval task are automatically downloaded to `data/{dataset}/stark_qa` by default. The official split is provided in `data/{dataset}/split`.

### Data of the Knowledge Bases

There are two ways to load the knowledge base data:
- (Recommended) Instant downloading: The processed knowledge base data for all three benchmarks is **automatically** downloaded and loaded when `download_processed=True`.
- Processing from raw data: We also provide all of our preprocessing code for transparency, so you can process the raw data from scratch by setting `download_processed=False`. In this case, STaRK-PrimeKG takes around 5 minutes to download and load the processed data, while STaRK-Amazon and STaRK-MAG may take around an hour to process from the raw data.
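The two loading modes differ only in the `download_processed` flag. A hypothetical convenience wrapper is sketched below; the names `load_kb`, `loader`, and `prefer_processed` are ours, not part of stark_qa, and the loader is injected so the sketch runs without the package installed.

```python
def load_kb(dataset_name, loader, prefer_processed=True):
    """Load a knowledge base, defaulting to the recommended fast path.

    `loader` is injected (e.g. stark_qa.load_skb); `prefer_processed=True`
    selects the instant pre-processed download, False reprocesses from raw.
    """
    return loader(dataset_name, download_processed=prefer_processed, root=None)

# Stand-in loader so the sketch is self-contained.
def fake_loader(name, download_processed, root):
    return {"name": name, "processed": download_processed, "root": root}

skb = load_kb("amazon", fake_loader)
```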

## 3) LLM API usage

### Specify under the config/ directory
Please place your API keys in `config/openai_api_key.txt` for OpenAI models or `config/claude_api_key.txt` for Claude models.
### Specify in the command line
```bash
export ANTHROPIC_API_KEY=YOUR_API_KEY
```
or
```bash
export OPENAI_API_KEY=YOUR_API_KEY
export OPENAI_ORG=YOUR_ORGANIZATION
```

## 4) More usage
Please refer to the [documentation](https://stark.stanford.edu/doc.html) for more details.

            
