| Name | minkarr |
| Version | 0.1.1 |
| home_page | None |
| Summary | A minimal implementation of KaRR knowledge assessment method for Large Language Models (LLMs) |
| upload_time | 2024-10-25 16:10:26 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.8 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Statistical Knowledge Assessment for Large Language Models
A minimal implementation of the KaRR knowledge assessment method from the following paper:
> [**Statistical Knowledge Assessment for Large Language Models**](https://arxiv.org/abs/2305.10519),
> Qingxiu Dong, Jingjing Xu, Lingpeng Kong, Zhifang Sui, Lei Li
> *arXiv preprint ([arxiv_version](https://arxiv.org/abs/2305.10519))*
This is a fork of the [official implementation](https://github.com/dqxiu/KAssess) released by the authors.
## How to use?
First, install the package with pip:
```bash
pip install minkarr
```
Here is a simple example of **how to quantify an LLM's knowledge of a fact using KaRR**:
```python
from karr import KaRR
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'gpt2'
device = 'cuda'  # assumes a CUDA-capable GPU
model = AutoModelForCausalLM.from_pretrained(model_name, device_map=device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

karr = KaRR(model, tokenizer, device)

# Testing the fact: (France, capital, Paris)
# Facts are given as Wikidata IDs (subject, relation, object);
# you can find other facts by looking into Wikidata
fact = ('Q142', 'P36', 'Q90')

# Use a separate name for the score so the KaRR object is not overwritten
score, does_know = karr.compute(fact)
print('Fact %s' % str(fact))
print('KaRR = %s' % score)
ans = 'Yes' if does_know else 'No'
print('According to KaRR, does the model know this fact? Answer: %s' % ans)
# Output:
# Fact ('Q142', 'P36', 'Q90')
# KaRR = 3.338972442145268
# According to KaRR, does the model know this fact? Answer: No
```
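The fact in the example is written with Wikidata identifiers (`Q142` is France, `P36` is capital, `Q90` is Paris). As a rough sketch that is not part of minkarr, such identifiers can be looked up by label through Wikidata's public `wbsearchentities` API. The helper names `build_search_url` and `search_wikidata` are hypothetical, and the lookup itself needs network access.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(label, entity_type="item", language="en", limit=5):
    """Build a wbsearchentities URL.

    entity_type is "item" for Q-identifiers or "property" for P-identifiers.
    """
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": entity_type,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def search_wikidata(label, entity_type="item", language="en", limit=5):
    """Return (id, label, description) candidates for a label (needs network)."""
    request = urllib.request.Request(
        build_search_url(label, entity_type, language, limit),
        # Hypothetical User-Agent string; pick one that identifies your script.
        headers={"User-Agent": "karr-fact-lookup-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]
```

Calling `search_wikidata('France')` lists candidate items, and `search_wikidata('capital', entity_type='property')` lists candidate relations; pick the matching IDs by reading the descriptions before building a fact tuple.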
## Differences from the original repo
- Easy-to-use
- Clean code
- Minimalistic implementation: I kept only the portion of the code needed to compute KaRR and removed the rest
- This implementation can compute KaRR on a single fact (the original implementation went through all facts)
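Because scoring works one fact at a time, assessing several facts is just a loop over single-fact calls. The sketch below assumes only what the usage example shows, namely that `compute` takes a `(subject, relation, object)` tuple of Wikidata IDs and returns a score and a boolean. The helper `assess_facts` and the stand-in scorer `fake_compute` are hypothetical and exist only so the snippet runs without a model.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Fact = Tuple[str, str, str]  # (subject ID, relation ID, object ID)


def assess_facts(
    compute: Callable[[Fact], Tuple[float, bool]],
    facts: Iterable[Fact],
) -> List[Dict[str, Any]]:
    """Call compute() once per fact and collect the results."""
    results = []
    for fact in facts:
        score, does_know = compute(fact)
        results.append({"fact": fact, "karr": score, "known": does_know})
    return results


def fake_compute(fact: Fact) -> Tuple[float, bool]:
    """Stand-in scorer for demonstration; returns placeholder values, not KaRR scores."""
    return 0.0, False


facts = [
    ("Q142", "P36", "Q90"),  # (France, capital, Paris)
    ("Q183", "P36", "Q64"),  # (Germany, capital, Berlin)
]
results = assess_facts(fake_compute, facts)
for row in results:
    print(row)
```

To score real facts, pass the `compute` method of a `KaRR` instance in place of `fake_compute`.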
## Citation
Cite the original authors using:
```
@misc{dong2023statistical,
title={Statistical Knowledge Assessment for Large Language Models},
author={Qingxiu Dong and Jingjing Xu and Lingpeng Kong and Zhifang Sui and Lei Li},
year={2023},
journal = {Proceedings of NeurIPS},
}
```
Raw data
{
"_id": null,
"home_page": null,
"name": "minkarr",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Hichem Ammar Khodja <hichem.ammarkhodja@orange.com>",
"download_url": "https://files.pythonhosted.org/packages/46/82/ba40e5e1a3fc777bf30fb81b54a23469040b7d58912194352b6880b5b2b9/minkarr-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A minimal implementation of KaRR knowledge assessment method for Large Language Models (LLMs)",
"version": "0.1.1",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "055c9b6ef439e158be4aa7d0966a8c5cb1a055e0282e75ec0dab9c40862488ed",
"md5": "a39cd073918308dacfefdc1f3346c16d",
"sha256": "007bb9afc0016e3e2027085d5ad8edb573d8aa7654a48377f8131f87bd17ee6a"
},
"downloads": -1,
"filename": "minkarr-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a39cd073918308dacfefdc1f3346c16d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 15598,
"upload_time": "2024-10-25T16:10:24",
"upload_time_iso_8601": "2024-10-25T16:10:24.479119Z",
"url": "https://files.pythonhosted.org/packages/05/5c/9b6ef439e158be4aa7d0966a8c5cb1a055e0282e75ec0dab9c40862488ed/minkarr-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4682ba40e5e1a3fc777bf30fb81b54a23469040b7d58912194352b6880b5b2b9",
"md5": "07587389e981600c1b3ce8b27616f30e",
"sha256": "74f80e075c9d9ceb7fbc7c152adc3b7c3cda1f3b0b43afdd02cb9ca8831744ed"
},
"downloads": -1,
"filename": "minkarr-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "07587389e981600c1b3ce8b27616f30e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 11420,
"upload_time": "2024-10-25T16:10:26",
"upload_time_iso_8601": "2024-10-25T16:10:26.124153Z",
"url": "https://files.pythonhosted.org/packages/46/82/ba40e5e1a3fc777bf30fb81b54a23469040b7d58912194352b6880b5b2b9/minkarr-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-25 16:10:26",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "minkarr"
}
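The `sha256` values listed under `urls` can be used to check a manually downloaded archive. This is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_digest` are hypothetical, and the expected value is copied from the sdist entry in the raw data.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_sha256):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_sha256.lower()


# Copied from the "sdist" entry (minkarr-0.1.1.tar.gz) in the raw data.
EXPECTED_SDIST_SHA256 = (
    "74f80e075c9d9ceb7fbc7c152adc3b7c3cda1f3b0b43afdd02cb9ca8831744ed"
)
```

After downloading the archive, `matches_digest("minkarr-0.1.1.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact file.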