| Name | VulnTrain |
| Version | 2.0.0 |
| home_page | None |
| Summary | Generate datasets and models based on vulnerability data from Vulnerability-Lookup. |
| upload_time | 2025-09-05 12:36:07 |
| maintainer | None |
| docs_url | None |
| author | Cédric Bonhomme |
| requires_python | <4.0,>=3.11 |
| license | GPL-3.0-or-later |
| keywords | None |
| VCS | https://github.com/vulnerability-lookup/VulnTrain |
| bugtrack_url | None |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# VulnTrain
[Latest release](https://github.com/vulnerability-lookup/VulnTrain/releases/latest)
[License: GPL-3.0](https://www.gnu.org/licenses/gpl-3.0.html)
[PyPI package](https://pypi.org/project/VulnTrain)
VulnTrain offers a suite of commands to generate diverse AI datasets and train models using
comprehensive vulnerability data from [Vulnerability-Lookup](https://github.com/vulnerability-lookup/vulnerability-lookup).
It harnesses over one million JSON records from all supported advisory sources to build high-quality, domain-specific models.
Additionally, data from the ``vulnerability-lookup:meta`` container, including enrichment sources such as vulnrichment and Fraunhofer FKIE,
is incorporated to enhance model quality.
Check out the datasets and models on Hugging Face:
[CIRCL on Hugging Face](https://huggingface.co/CIRCL)
For more information about the use of AI in Vulnerability-Lookup, please refer to the
[user manual](https://www.vulnerability-lookup.org/user-manual/ai/).
## Usage
Install VulnTrain:
```bash
$ pipx install VulnTrain
```
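The package metadata declares `requires_python` as `<4.0,>=3.11`, so older interpreters are not supported. As a minimal sketch, the constraint can be checked before installing. The function name `python_is_supported` is hypothetical and is not part of VulnTrain.

```python
import sys
from typing import Optional, Tuple

# Bounds copied from the package metadata: requires_python = "<4.0,>=3.11".
MIN_VERSION = (3, 11)
MAX_VERSION_EXCLUSIVE = (4, 0)


def python_is_supported(version: Optional[Tuple[int, int]] = None) -> bool:
    """Return True if a (major, minor) version satisfies >=3.11,<4.0.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = (sys.version_info.major, sys.version_info.minor)
    # Tuple comparison is lexicographic, which matches version ordering here.
    return MIN_VERSION <= version < MAX_VERSION_EXCLUSIVE
```

Calling `python_is_supported()` with no argument checks the interpreter that runs the script.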
Three types of commands are available:
- **Dataset generation**: Create and prepare datasets.
- **Model training**: Train models using the prepared datasets.
  - Train a model to **classify** vulnerabilities by severity: [vulnerability-severity-classification-roberta-base](https://huggingface.co/CIRCL/vulnerability-severity-classification-roberta-base)
  - Train a model for **text generation** to assist in writing vulnerability descriptions: [vulnerability-description-generation-gpt2](https://huggingface.co/CIRCL/vulnerability-description-generation-gpt2#how-to-get-started-with-the-model)
- **Model validation**: Assess the performance of trained models (validations, benchmarks, etc.).
Check out the [documentation](docs/) for more information.
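The published severity classifier can also be tried without running any VulnTrain command. The sketch below is not taken from the VulnTrain documentation: it assumes the Hugging Face `transformers` package (with a backend such as PyTorch) is installed and uses its generic `pipeline` API. The model id comes from the link above. The helper names `classify_severity` and `format_prediction` are hypothetical, and the label names returned depend on the model's own configuration.

```python
# Model id taken from the Hugging Face link in the list above.
MODEL_ID = "CIRCL/vulnerability-severity-classification-roberta-base"


def classify_severity(description: str, model_id: str = MODEL_ID) -> dict:
    """Classify one vulnerability description (hypothetical helper).

    Assumes `transformers` is installed; the model is downloaded on first use.
    """
    # Imported lazily so this module loads even without transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # For a single string, the pipeline returns a one-element list of
    # {"label": ..., "score": ...} dictionaries.
    return classifier(description)[0]


def format_prediction(prediction: dict) -> str:
    """Render a prediction dictionary as 'label (score)'."""
    return f"{prediction['label']} ({prediction['score']:.2f})"
```

For example, `format_prediction(classify_severity("Buffer overflow in the HTTP parser allows remote code execution."))` yields a single line with the predicted label and its score. The description string here is made up for illustration.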
## How to cite
Bonhomme, C., & Dulaunoy, A. (2025). VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification (Version 1.4.0) [Computer software]. https://doi.org/10.48550/arXiv.2507.03607
```bibtex
@misc{bonhomme2025vlai,
title={VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification},
author={Cédric Bonhomme and Alexandre Dulaunoy},
year={2025},
eprint={2507.03607},
archivePrefix={arXiv},
primaryClass={cs.CR}
}
```
## License
[VulnTrain](https://github.com/vulnerability-lookup/VulnTrain) is licensed under
[GNU General Public License version 3](https://www.gnu.org/licenses/gpl-3.0.html).
~~~
Copyright (c) 2025 Computer Incident Response Center Luxembourg (CIRCL)
Copyright (C) 2025 Cédric Bonhomme - https://github.com/cedricbonhomme
Copyright (C) 2025 Léa Ulusan - https://github.com/3LS3-1F
~~~
Raw data
{
"_id": null,
"home_page": null,
"name": "VulnTrain",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.11",
"maintainer_email": null,
"keywords": null,
"author": "C\u00e9dric Bonhomme",
"author_email": "cedric.bonhomme@circl.lu",
"download_url": "https://files.pythonhosted.org/packages/d0/b3/6080a789fb273cd3e2ffef8867d5e26606c4c88aa5f40c37eb870e45ae16/vulntrain-2.0.0.tar.gz",
"platform": null,
"description": "# VulnTrain\n\n[](https://github.com/vulnerability-lookup/VulnTrain/releases/latest)\n[](https://www.gnu.org/licenses/gpl-3.0.html)\n[](https://pypi.org/project/VulnTrain)\n\n\nVulnTrain offers a suite of commands to generate diverse AI datasets and train models using\ncomprehensive vulnerability data from [Vulnerability-Lookup](https://github.com/vulnerability-lookup/vulnerability-lookup).\nIt harnesses over one million JSON records from all supported advisory sources to build high-quality, domain-specific models.\n \nAdditionally, data from the ``vulnerability-lookup:meta`` container, including enrichment sources such as vulnrichment and Fraunhofer FKIE,\nis incorporated to enhance model quality.\n\nCheck out the datasets and models on Hugging Face: \n\n[](https://huggingface.co/CIRCL)\n\nFor more information about the use of AI in Vulnerability-Lookup, please refer to the\n[user manual](https://www.vulnerability-lookup.org/user-manual/ai/).\n\n\n## Usage\n\nInstall VulnTrain:\n\n```bash\n$ pipx install VulnTrain\n```\n\nThree types of commands are available:\n\n- **Dataset generation**: Create and prepare datasets.\n- **Model training**: Train models using the prepared datasets.\n - Train a model to **classify** vulnerabilities by severity. [](https://huggingface.co/CIRCL/vulnerability-severity-classification-roberta-base)\n - Train a model for **text generation** to assist in writing vulnerability descriptions [](https://huggingface.co/CIRCL/vulnerability-description-generation-gpt2#how-to-get-started-with-the-model)\n- **Model validation**: Assess the performance of trained models (validations, benchmarks, etc.).\n\n\nCheck out the [documentation](docs/) for more information.\n\n\n## How to cite\n\nBonhomme, C., & Dulaunoy, A. (2025). VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification (Version 1.4.0) [Computer software]. 
https://doi.org/10.48550/arXiv.2507.03607\n\n```bibtex\n@misc{bonhomme2025vlai,\n title={VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification},\n author={C\u00e9dric Bonhomme and Alexandre Dulaunoy},\n year={2025},\n eprint={2507.03607},\n archivePrefix={arXiv},\n primaryClass={cs.CR}\n}\n```\n\n\n## License\n\n[VulnTrain](https://github.com/vulnerability-lookup/VulnTrain) is licensed under\n[GNU General Public License version 3](https://www.gnu.org/licenses/gpl-3.0.html)\n\n~~~\nCopyright (c) 2025 Computer Incident Response Center Luxembourg (CIRCL)\nCopyright (C) 2025 C\u00e9dric Bonhomme - https://github.com/cedricbonhomme\nCopyright (C) 2025 L\u00e9a Ulusan - https://github.com/3LS3-1F\n~~~\n\n\n",
"bugtrack_url": null,
"license": "GPL-3.0-or-later",
"summary": "Generate datasets amd models based on vulnerabilities data from Vulnerability-Lookup.",
"version": "2.0.0",
"project_urls": {
"Changelog": "https://github.com/vulnerability-lookup/VulnTrain/blob/main/CHANGELOG.md",
"Homepage": "https://github.com/vulnerability-lookup/VulnTrain",
"Repository": "https://github.com/vulnerability-lookup/VulnTrain"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "fb4482bc219e486dfe704a52fdee67c11f692ae8cd1f2110d7ee4fe7695272c5",
"md5": "7cb2c56bcb29d30df10499a7f43a03cc",
"sha256": "593195e2b691ab44b169ada30e19a5f5a0b020a037cbee529c737e24a12e59fb"
},
"downloads": -1,
"filename": "vulntrain-2.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7cb2c56bcb29d30df10499a7f43a03cc",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.11",
"size": 266423,
"upload_time": "2025-09-05T12:36:05",
"upload_time_iso_8601": "2025-09-05T12:36:05.973668Z",
"url": "https://files.pythonhosted.org/packages/fb/44/82bc219e486dfe704a52fdee67c11f692ae8cd1f2110d7ee4fe7695272c5/vulntrain-2.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d0b36080a789fb273cd3e2ffef8867d5e26606c4c88aa5f40c37eb870e45ae16",
"md5": "a3af31cad8829216649b48c16cba28a7",
"sha256": "e54ba27db1fc4411d334eefbad389a7e4a6d1d5c58da9d31715d2429064e5f06"
},
"downloads": -1,
"filename": "vulntrain-2.0.0.tar.gz",
"has_sig": false,
"md5_digest": "a3af31cad8829216649b48c16cba28a7",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.11",
"size": 257400,
"upload_time": "2025-09-05T12:36:07",
"upload_time_iso_8601": "2025-09-05T12:36:07.636695Z",
"url": "https://files.pythonhosted.org/packages/d0/b3/6080a789fb273cd3e2ffef8867d5e26606c4c88aa5f40c37eb870e45ae16/vulntrain-2.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-05 12:36:07",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "vulnerability-lookup",
"github_project": "VulnTrain",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "vulntrain"
}
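Each entry under `urls` in the record above carries a `packagetype` and a `digests.sha256` value, which is enough to pick a release file and verify a downloaded copy. The following is a minimal sketch using only the standard library. The helper names `pick_file` and `sha256_matches` are hypothetical, and `SAMPLE` is a trimmed copy of the record that keeps only the fields the sketch reads.

```python
import hashlib

# Trimmed copy of the "urls" entries from the raw record above.
SAMPLE = {
    "urls": [
        {
            "filename": "vulntrain-2.0.0-py3-none-any.whl",
            "packagetype": "bdist_wheel",
            "digests": {
                "sha256": "593195e2b691ab44b169ada30e19a5f5a0b020a037cbee529c737e24a12e59fb"
            },
        },
        {
            "filename": "vulntrain-2.0.0.tar.gz",
            "packagetype": "sdist",
            "digests": {
                "sha256": "e54ba27db1fc4411d334eefbad389a7e4a6d1d5c58da9d31715d2429064e5f06"
            },
        },
    ]
}


def pick_file(metadata: dict, packagetype: str) -> dict:
    """Return the first release file of the given type (hypothetical helper)."""
    for entry in metadata["urls"]:
        if entry["packagetype"] == packagetype:
            return entry
    raise LookupError(f"no file of type {packagetype!r}")


def sha256_matches(payload: bytes, entry: dict) -> bool:
    """Compare the SHA-256 of downloaded bytes with the recorded digest."""
    return hashlib.sha256(payload).hexdigest() == entry["digests"]["sha256"]
```

In practice `payload` would be the bytes read from the downloaded archive, for example `sha256_matches(open(path, "rb").read(), pick_file(SAMPLE, "sdist"))`, where `path` points at a local copy of the file.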