grasp-rdf

Name	grasp-rdf
Version	0.1.1
home_page	None
Summary	GRASP: Generic Reasoning and SPARQL generation across knowledge graphs
upload_time	2025-09-09 17:36:31
maintainer	None
docs_url	None
author	None
requires_python	>=3.12
license	None
keywords	rdf llm sparql question answering knowledge graph

# GRASP - Generic Reasoning and SPARQL generation across Knowledge Graphs

## News

- August 28th 2025:
  - Demo paper of GRASP has also been accepted to [ISWC 2025](https://iswc2025.semanticweb.org/)
  - Preview of camera-ready version coming soon

- July 31st 2025:
  - GRASP has been accepted to [ISWC 2025](https://iswc2025.semanticweb.org/)
  - Preview of camera-ready version available [here](https://ad-publications.cs.uni-freiburg.de/ISWC_grasp_WB_2025.pdf)

- July 14th 2025:
  - arXiv preprint available at [arxiv.org/abs/2507.08107](https://arxiv.org/abs/2507.08107)

- July 10th 2025:
  - Code release
  - Data release

## Overview and directory structure

Links:

- Public demo available at [grasp.cs.uni-freiburg.de](https://grasp.cs.uni-freiburg.de)
- Data available at [ad-publications.cs.uni-freiburg.de/grasp](https://ad-publications.cs.uni-freiburg.de/grasp)

```
apps/
  evaluation/                     # Streamlit app for evaluation
  grasp/                          # Web app compatible with GRASP server
bash/                             # Bash scripts to run and evaluate GRASP
configs/
  run.yaml                        # Config to run GRASP with a single KG
  serve.yaml                      # Config to run GRASP with all available KGs
queries/                          # Custom index data and info SPARQL queries
                                    for various knowledge graphs
scripts/                          # Various helper scripts
data/                          
  benchmark/                      # Benchmarks grouped by knowledge graph
    [knowledge-graph]/
      [benchmark]/                   
        test.jsonl                # Test set with input and ground truth
        train.example-index/      # Index based on train set for few-shot learning
                                    (needs to be downloaded)
        outputs/
          [model].jsonl           # Model output
          [model].config.json     # Model config
          [model].evaluation.json # Evaluation against ground truth
  kg-index/                       # KG indices (need to be downloaded)
    wikidata/
    freebase/
    ...
src/                              # Source code for GRASP
Makefile                          # Makefile for building benchmarks
```

## Quickstart

Follow these steps to run GRASP.

### Run GRASP

> Note: We recommend using conda, which simplifies installing Faiss and helps avoid
> dependency issues.

1. Create and activate a conda environment:
`conda create -n grasp python=3.12 && conda activate grasp`

2. Install Faiss (installation via pip is not supported):
`conda install -c pytorch -c nvidia faiss-gpu=1.11.0`

> You might have to install the CPU version of Faiss instead, since
> the GPU version causes issues on some systems.

3. Install GRASP:

```bash
# From PyPI (recommended)
pip install grasp-rdf

# Or via git
pip install git+https://github.com/ad-freiburg/grasp.git@main
```

4. Set the `GRASP_INDEX_DIR` env variable. It defaults to `$HOME/.grasp/index` if not
set. We set it to `$PWD/data/kg-index`, but you can choose any directory you like.

> We recommend setting it with conda, so that it is set automatically whenever you activate
> the conda environment: `conda env config vars set GRASP_INDEX_DIR=/path/to/dir`

5. Get indices for the knowledge graphs you want to use. All indices are available
[publicly](https://ad-publications.cs.uni-freiburg.de/grasp/kg-index).
For example, to get the indices for Wikidata:

```bash
# Change to index directory
cd $GRASP_INDEX_DIR
# Download Wikidata index
wget https://ad-publications.cs.uni-freiburg.de/grasp/kg-index/wikidata.tar.gz
# Extract index
tar -xzf wikidata.tar.gz
```

Optionally, you can also download example indices for few-shot learning.
Example indices are always built from the train set of a benchmark
and called `train.example-index`.
For example, to get the example index for QALD-10 on Wikidata:

```bash
# Change to benchmark directory
cd data/benchmark/wikidata/qald10
# Download example index
wget https://ad-publications.cs.uni-freiburg.de/grasp/benchmark/wikidata/qald10/train.example-index.tar.gz
# Extract example index
tar -xzf train.example-index.tar.gz
```

6. Run GRASP:

```bash
# Note that if you run, e.g., OpenAI models, you also need to set the
# OPENAI_API_KEY env variable (see section about supported models below).

# Tip: Set --log-level DEBUG to show the individual steps of GRASP
# (reasoning and function calls) in a nicely formatted way.

# Run GRASP on an input and output the result to stdout as JSON with metadata.
# Actual output for the task is in the "output" field of that JSON object.

# Input from stdin:
echo "Where was Angela Merkel born?" | grasp run configs/run.yaml

# Input via CLI argument:
grasp run configs/run.yaml --input "Where was Angela Merkel born?"

# You can run different tasks with GRASP (default is sparql-qa). 
# Depending on the task, the expected input format and output format
# will differ. For general-qa, the input is also a natural language
# question, same as for sparql-qa, but the output will be just a natural
# language answer instead of a SPARQL query.
echo "Where was Angela Merkel born?" | grasp run configs/run.yaml --task general-qa

# Show all available options:
grasp run -h

# You can also run GRASP on multiple inputs (in JSONL format).
# In the following, we show an example to run GRASP on the QALD-10 
# test set over Wikidata.

# Input from stdin:
cat data/benchmark/wikidata/qald10/test.jsonl | grasp file configs/run.yaml

# Input via CLI argument:
grasp file configs/run.yaml --input-file data/benchmark/wikidata/qald10/test.jsonl

# Save output to a file instead of stdout and show progress bar:
grasp file configs/run.yaml \
  --input-file data/benchmark/wikidata/qald10/test.jsonl \
  --output-file data/benchmark/wikidata/qald10/outputs/test.jsonl \
  --progress

# Show all available options:
grasp file -h

# You can also run GRASP in a client-server setup. This is also the server
# that powers the corresponding web app.
# To start a GRASP server, by default on port 8000, just run:
grasp serve configs/run.yaml

# For convenience, we also provide a config to run the server with all
# available knowledge graphs (make sure to download all indices first):
grasp serve configs/serve.yaml

# Show all available options:
grasp serve -h
```
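
When GRASP is driven from another program, the JSON printed by `grasp run` can be parsed directly. The following is a minimal Python sketch, assuming that stdout contains exactly one JSON object and that log messages go elsewhere. The helper names `extract_output` and `run_grasp` are hypothetical and not part of GRASP; only the `grasp run <config> --input <text>` call and the `"output"` field are taken from the description above.

```python
import json
import subprocess


def extract_output(stdout: str):
    """Return the task result from the JSON object printed by `grasp run`."""
    result = json.loads(stdout)
    # The actual task output lives in the "output" field; every other field
    # is treated as opaque metadata in this sketch.
    return result["output"]


def run_grasp(question: str, config: str = "configs/run.yaml"):
    """Call the GRASP CLI for a single question (needs GRASP installed)."""
    completed = subprocess.run(
        ["grasp", "run", config, "--input", question],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return extract_output(completed.stdout)
```

With GRASP installed and the required indices downloaded, `run_grasp("Where was Angela Merkel born?")` would return whatever the configured task puts into the `"output"` field.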

### Configure GRASP

GRASP can be configured via a single YAML config file, which is passed
to `grasp run`, `grasp file`, or `grasp serve` as the first argument (see above).
You can use env variable placeholders in the config file of the form
`env(VAR_NAME:default_value)`, which will be replaced at runtime by the value of
the env variable `VAR_NAME` if it is set, or by `default_value` otherwise.
If no default value is given and the env variable is not set, an error
is raised. If you omit an entire config option, we also use a default value
as specified in the config code.

The configuration options and the use of env variable placeholders are
mostly self-explanatory, so we refer you to the [example config files](configs)
and the [config code](src/grasp/configs.py) for details.

### Build your own knowledge graph indices

Using GRASP with your own knowledge graph requires two steps:

- Getting the index data from a SPARQL endpoint for the knowledge graph
- Building the indices

#### Get index data

We get the index data by issuing two SPARQL queries to a SPARQL endpoint,
one for entities and one for properties. Both queries are expected to
return three columns in their results:

1. The IRI of the entity/property (required, must be unique)
2. The main label of the entity/property (optional)
3. All other labels/aliases of the entity/property, separated by `;;;` (optional)

A typical SPARQL query for that looks like this:

```sparql
SELECT
  # unique identifier of the entity/property
  ?id
  # main label of the entity/property, typically in English via rdfs:label
  (SAMPLE(?label) AS ?main_label)
  # all other labels/aliases, separated by ;;;
  (GROUP_CONCAT(DISTINCT ?alias; SEPARATOR=";;;") AS ?aliases)
WHERE {
  ...
}
# group by the identifier to ensure uniqueness
GROUP BY ?id
```

The query body determines which entities/properties are included
in the index, and how their labels and aliases are retrieved.

> Notes:
>
> - If you do not provide custom index data SPARQL queries, we use the generic
> default queries from [here](src/grasp/sparql/queries)
> - Our custom index data queries for various knowledge graphs
> are [here](queries)
> - If there is neither a label nor an alias for an entity/property, we use
> its IRI as fallback label
> - For properties, we always add the IRI as alias, to make them searchable by
> their IRI as well

With the CLI, you can use the `grasp data` command as follows:

```bash
# By default, if you just specify the knowledge graph name,
# we use https://qlever.cs.uni-freiburg.de/api/<kg_name> as SPARQL endpoint.
# The data will be saved to $GRASP_INDEX_DIR/<kg_name>/entities/data.tsv
# and $GRASP_INDEX_DIR/<kg_name>/properties/data.tsv.
# For example, to get the index data for IMDB:
grasp data imdb

# You can also set a custom SPARQL endpoint:
grasp data my-imdb --endpoint https://my-imdb-sparql-endpoint.com/sparql

# To download the index data, we use generic queries for both
# entities and properties by default. You can also provide your own queries,
# which is recommended, especially for larger knowledge graphs or
# knowledge graphs with an unusual schema.
grasp data imdb \
  --entity-sparql <path/to/entity.sparql> \
  --property-sparql <path/to/property.sparql>

# Show all available options:
grasp data -h
```
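
To make the three result columns and the fallback rules from the notes above concrete, here is a small Python sketch of how a single result row could be normalized. It is not GRASP's code: the function name `normalize_row` is hypothetical, promoting the first alias when the main label is missing is an assumption, and the IRI is used verbatim as fallback label, whereas GRASP may derive it differently.

```python
SEPARATOR = ";;;"


def normalize_row(
    iri: str, label: str, aliases: str, is_property: bool = False
) -> tuple[str, list[str]]:
    """Return (main_label, other_labels) for one row of index data."""
    # Column 3 is optional: split on ";;;" and drop empty entries.
    others = [a.strip() for a in aliases.split(SEPARATOR) if a.strip()]
    main = label.strip()
    if not main and others:
        # Assumption: without a main label, promote the first alias.
        main = others.pop(0)
    if not main:
        # Neither label nor alias: fall back to the IRI.
        main = iri
    if is_property and iri not in others:
        # Properties also get their IRI as alias so they can be found by it.
        others.append(iri)
    return main, others
```

Calling `normalize_row("<p>", "born in", "", is_property=True)` returns `("born in", ["<p>"])`, and an entity row with empty label and alias columns falls back to its IRI.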

#### Build indices

After getting the index data, you can build the indices for the knowledge graph.
You probably do not need to change any parameters here.

With the CLI, you can use the `grasp index` command as follows:

```bash
# The indices will be saved to $GRASP_INDEX_DIR/<kg_name>/entities/<index_type>
# and $GRASP_INDEX_DIR/<kg_name>/properties/<index_type>.
# For example, to build the indices for IMDB:
grasp index imdb

# You can also change the types of indices that are built. By default, we build a
# prefix index for entities and a similarity index for properties.
grasp index imdb \
  --entities-type <prefix|similarity> \
  --properties-type <prefix|similarity>

# Show all available options:
grasp index -h
```

After this step is done, you can use the knowledge graph with GRASP by
including it in your config file (see above).

#### Customizing prefixes and info SPARQL queries

There are two more optional steps you can perform to customize the behavior
of GRASP related to your knowledge graph.

**Prefixes**

First, you can customize the prefixes that GRASP uses for a
knowledge graph at build time and runtime.
For that, create a file `$GRASP_INDEX_DIR/<kg_name>/prefixes.json`
in the following format (example for Wikidata):

```jsonc
{
  "wd": "<http://www.wikidata.org/entity/",
  "wdt": "<http://www.wikidata.org/prop/direct/",
  // other prefixes ...
}
```

During build time, the prefixes are used for the fallback label generation
if an entity/property has neither a label nor an alias. During runtime, the
prefixes are used to shorten IRIs in function call results, and allow GRASP
to use prefixed IRIs instead of full IRIs in function call arguments.
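
The runtime use of prefixes can be illustrated with a short Python sketch. It assumes that full IRIs are written in angle brackets, matching the leading `<` in the example file above; the functions `shorten_iri` and `expand_iri` are hypothetical and only mirror the described behavior.

```python
def shorten_iri(iri: str, prefixes: dict[str, str]) -> str:
    """Turn '<namespace...local>' into 'prefix:local' if a prefix matches."""
    # Try longer namespaces first so that the most specific prefix wins.
    by_length = sorted(prefixes.items(), key=lambda kv: len(kv[1]), reverse=True)
    for short, namespace in by_length:
        if iri.startswith(namespace) and iri.endswith(">"):
            return f"{short}:{iri[len(namespace):-1]}"
    return iri  # no known prefix: keep the full IRI


def expand_iri(term: str, prefixes: dict[str, str]) -> str:
    """Turn 'prefix:local' back into '<namespace...local>'."""
    short, sep, local = term.partition(":")
    if sep and short in prefixes:
        return f"{prefixes[short]}{local}>"
    return term  # already a full IRI or an unknown prefix


prefixes = {
    "wd": "<http://www.wikidata.org/entity/",
    "wdt": "<http://www.wikidata.org/prop/direct/",
}
print(shorten_iri("<http://www.wikidata.org/entity/Q42>", prefixes))  # wd:Q42
print(expand_iri("wdt:P19", prefixes))  # <http://www.wikidata.org/prop/direct/P19>
```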

> Note: For QLever endpoints, we automatically retrieve prefixes via the API at
> `https://qlever.cs.uni-freiburg.de/api/prefixes/<kg_name>`, so you do not
> need to create a `prefixes.json` file in that case

**Info SPARQL queries**

Second, you can customize the SPARQL queries that GRASP uses to fetch additional
information about entities and properties for enriching search results.
For that, create a file `$GRASP_INDEX_DIR/<kg_name>/entities/info.sparql`
for entities or `$GRASP_INDEX_DIR/<kg_name>/properties/info.sparql` for properties.
The file should contain a SPARQL query that returns two columns in its results:

1. The IRI of the entity/property (required, must be unique)
2. All additional information about the entity/property, separated by `;;;` (optional)

A typical SPARQL query for that looks like this:

```sparql
SELECT
  # unique identifier of the entity/property
  ?id
  # all additional information, separated by ;;;
  (GROUP_CONCAT(DISTINCT ?info; SEPARATOR=";;;") AS ?infos)
WHERE {
  {
    VALUES ?id { {IDS} }
    ...
  } UNION {
    VALUES ?id { {IDS} }
    ...
  }
  ...
}
# group by the identifier to ensure uniqueness
GROUP BY ?id
```

At runtime, all places where `{IDS}` appears in the query will be
replaced by the list of entity/property IRIs to get information for.
Typically, this will be within a `VALUES ?id { ... }` clause as
shown above.
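
The `{IDS}` substitution amounts to a plain string replacement, as in the following Python sketch. How GRASP formats the list internally is not specified here, so joining the IRIs with spaces is an assumption (it is what a `VALUES` clause accepts), and `fill_ids` is a hypothetical helper name.

```python
def fill_ids(query_template: str, iris: list[str]) -> str:
    """Replace every {IDS} placeholder with a space-separated list of IRIs."""
    # Assumption: the IRIs are already valid SPARQL terms, e.g. <http://...>
    # or prefixed names such as wd:Q42.
    return query_template.replace("{IDS}", " ".join(iris))


template = "SELECT ?id WHERE { VALUES ?id { {IDS} } }"
print(fill_ids(template, ["<http://a>", "<http://b>"]))
# SELECT ?id WHERE { VALUES ?id { <http://a> <http://b> } }
```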

See our [info SPARQL query for Wikidata entities](queries/wikidata.entity.info.sparql) as an example.

> Note: If no custom info SPARQL query is found, we use the
> default ones from [here](src/grasp/sparql/queries)

## Run GRASP web app

Make sure to start a GRASP server first (see above).
Then follow [these instructions](apps/grasp/README.md) to run the GRASP web app.

## Run evaluation app

Follow [these instructions](apps/evaluation/README.md) to run the
evaluation app for the SPARQL QA task.

## Supported models

GRASP supports both commercial and open-source models.

### OpenAI

1. Set the `OPENAI_API_KEY` env variable
2. Set the model to `openai/<model_name>` in the config file or with the
`MODEL` env variable. We tested:

- `openai/gpt-4.1`
- `openai/gpt-4.1-mini`
- `openai/o4-mini`
- `openai/gpt-5-mini`
- `openai/gpt-5`

### Google Gemini

1. Set the `GEMINI_API_KEY` env variable
2. Set the model to `gemini/<model_name>` in the config file or with the
`MODEL` env variable. We tested:

- `gemini/gemini-2.0-flash`
- `gemini/gemini-2.5-flash-preview-04-17`

### Local server with vLLM

1. Install vLLM with `pip install vllm`
2. Run a vLLM server with a model of your choice (see below)
3. Set the model to `hosted_vllm/<model_name>` in the config file or with the
`MODEL` env variable. We tested:

- `hosted_vllm/Qwen/Qwen2.5-72B-Instruct` (and other sizes)
- `hosted_vllm/Qwen/Qwen3-32B` (and other sizes)

4. Set `model_endpoint` in the config file or the `MODEL_ENDPOINT` env variable
to your vLLM server endpoint. By default, this is `http://localhost:8000/v1`

#### Run Qwen2.5

Change 72B to 7B, 14B, or 32B to run other sizes. Adapt the tensor parallel size
to your GPU setup; we used two H100 GPUs for Qwen2.5 72B.

```bash
vllm serve Qwen/Qwen2.5-72B-Instruct --tool-call-parser hermes \
--enable-auto-tool-choice --tensor-parallel-size 2
```

#### Run Qwen3

Change 32B to 4B, 8B, or 14B to run other sizes.

```bash
vllm serve Qwen/Qwen3-32B --reasoning-parser qwen3 \
--tool-call-parser hermes --enable-auto-tool-choice
```
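
Once a vLLM server is up, a quick way to confirm that the endpoint is reachable is to list the models it serves through its OpenAI-compatible `/v1/models` route. The Python sketch below uses only the standard library; the helper names `models_url` and `served_models` are hypothetical, and the default endpoint is the one mentioned above.

```python
import json
import urllib.request


def models_url(endpoint: str) -> str:
    """Build the model listing URL for an OpenAI-compatible endpoint."""
    return endpoint.rstrip("/") + "/models"


def served_models(
    endpoint: str = "http://localhost:8000/v1", timeout: float = 5.0
) -> list[str]:
    """Return the ids of all models served at `endpoint`."""
    with urllib.request.urlopen(models_url(endpoint), timeout=timeout) as response:
        payload = json.load(response)
    # The response is expected to look like {"object": "list", "data": [{"id": ...}]}.
    return [model["id"] for model in payload["data"]]
```

An id such as `Qwen/Qwen3-32B` returned by `served_models()` corresponds to the GRASP model name `hosted_vllm/Qwen/Qwen3-32B`. Since `grasp serve` also defaults to port 8000, one of the two servers presumably needs a different port when both run on the same machine.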

## Misc

To prepare some benchmark datasets with the [Makefile](Makefile),
e.g. using `make wikidata-benchmarks`, you first need to clone
[github.com/KGQA/KGQA-datasets](https://github.com/KGQA/KGQA-datasets) into `third_party`:

```bash
mkdir -p third_party
git clone https://github.com/KGQA/KGQA-datasets.git third_party/KGQA-datasets
```

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "grasp-rdf",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.12",
    "maintainer_email": null,
    "keywords": "rdf, llm, sparql, question answering, knowledge graph",
    "author": null,
    "author_email": "Sebastian Walter <swalter@cs.uni-freiburg.de>",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "GRASP: Generic Reasoning and SPARQL generation across knowledge graphs",
    "version": "0.1.1",
    "project_urls": {
        "Github": "https://github.com/bastiscode/grasp"
    },
    "split_keywords": [
        "rdf",
        " llm",
        " sparql",
        " question answering",
        " knowledge graph"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6b4417fdf0e927e1d0943b38192f7b958516a3995043dad104e0e0bb7d911f8f",
                "md5": "749f877514f0dea79cbbd114dae939f4",
                "sha256": "b77de250b44e919d675629aaed4c4bbd190f8f49115703d8d4e514d72e6d30e3"
            },
            "downloads": -1,
            "filename": "grasp_rdf-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "749f877514f0dea79cbbd114dae939f4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.12",
            "size": 84616,
            "upload_time": "2025-09-09T17:36:31",
            "upload_time_iso_8601": "2025-09-09T17:36:31.170820Z",
            "url": "https://files.pythonhosted.org/packages/6b/44/17fdf0e927e1d0943b38192f7b958516a3995043dad104e0e0bb7d911f8f/grasp_rdf-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-09 17:36:31",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "bastiscode",
    "github_project": "grasp",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "grasp-rdf"
}
