# SIMRE
SimRE is a Python-based tool that automatically identifies similar requirements in Software Product Line (SPL) projects.
## Parameters description
* The tool takes seven parameters:
  * nameNewReq: name of the CSV file that contains the list of new requirements.
  * nameFeatures: name of the XML or UVL file that contains the existing requirements.
  * nameReqDescription: name of the JSON file that contains the requirement descriptions.
  * language: 'en' for English or 'es' for Spanish.
  * listModels: optional. Array with the models to use. The default model is 1. The available models are:
    * 1: Multilingual model MiniLM-L12-v2
    * 2: Multilingual model distiluse-cased-v2
    * 3: Multilingual model mpnet-base-v2
    * 4: word2vec model
    * 5: fastText model
  * threshold: optional. The default value is 0.7.
  * preprocess: optional. True enables pre-processing (default value); False disables it.
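
The allowed values listed above can be checked before a run. The following is a minimal sketch, not part of SimRE: `validate_params` is a hypothetical helper, and the assumption that the threshold lies between 0 and 1 is not stated in this README.

```python
from typing import List, Optional

# Values taken from the parameter list above.
SUPPORTED_LANGUAGES = {"en", "es"}
MODEL_IDS = {1, 2, 3, 4, 5}


def validate_params(language: str,
                    list_models: Optional[List[int]] = None,
                    threshold: float = 0.7,
                    preprocess: bool = True) -> dict:
    """Hypothetical helper: check the arguments and fill in the documented defaults."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"language must be 'en' or 'es', got {language!r}")
    # The documented default is model 1.
    models = [1] if list_models is None else list(list_models)
    unknown = [m for m in models if m not in MODEL_IDS]
    if unknown:
        raise ValueError(f"unknown model ids: {unknown}")
    # Assumption: the threshold is a similarity score in [0, 1].
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return {"language": language, "listModels": models,
            "threshold": threshold, "preprocess": bool(preprocess)}
```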
## Installation by pip
* You need at least Python 3.8.10 and pip 23.1.2 installed.
* The tool can be used by installing the library or by downloading the code.
* After installing the library with pip, run the following code to download several pre-trained models and cache them.
```
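# Download the pre-trained models and cache them locally.
from simre import init_models
init_models.main()

# Load the cached models so they can be passed to similarity_process.
from simre import similarity
models = similarity.load_models()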
```
* This process may take several minutes, depending on processor speed and available memory. We recommend at least 16 GB of RAM.
* When the process finishes, you should confirm that a "fileserver" folder has been created and contains four files.
### Usage
* The `similarity_process` function performs the similarity analysis.
```
similarity.similarity_process(nameNewReq, nameFeatures, nameReqDescription, 'en', models)
```
* The examples folder contains some files you can use for a test run.
```
similarity.similarity_process('newRequirements.csv', 'featureModel.xml', 'descRequirements_en.json', 'en', models)
```
* With all the parameters:
```
similarity.similarity_process('newRequirements.csv', 'featureModel.xml', 'descRequirements_en.json', 'en', models, [1, 2, 3], 0.6, False)
```
* The results will be provided in a CSV file ('Similarity List.csv').
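
This README does not say how the score behind `threshold` is computed. As a rough illustration only, the sketch below assumes cosine similarity between embedding vectors, a common choice with sentence-embedding models, and keeps the pairs whose score reaches the threshold. The function names and the toy vectors are hypothetical and are not part of SimRE.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_pairs(new_reqs: Dict[str, Sequence[float]],
                  existing: Dict[str, Sequence[float]],
                  threshold: float = 0.7) -> List[Tuple[str, str, float]]:
    """Return (new, existing, score) triples with score >= threshold, best first."""
    pairs = []
    for new_name, new_vec in new_reqs.items():
        for old_name, old_vec in existing.items():
            score = cosine_similarity(new_vec, old_vec)
            if score >= threshold:
                pairs.append((new_name, old_name, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

Under this assumption, lowering the threshold (for example from 0.7 to 0.6) reports more candidate pairs, and raising it reports fewer.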
## Installation by code
* Download the code
* Install the required libraries. All necessary libraries are listed in the requirements.txt file. To install them, execute the following command: `pip install -r ./requirements.txt`
* It is necessary to download several pre-trained models. This can be done automatically or manually. To download the models automatically, execute the following command: `python process.py init`. To download the models manually, follow these steps:
* a. Download the spaCy models for Spanish and English: `python -m spacy download es_core_news_sm` and `python -m spacy download en_core_web_sm`
* b. Download the fastText models for Spanish and English: `cc.es.300.bin` and `cc.en.300.bin` from https://fasttext.cc/docs/en/crawl-vectors.html
* c. Download the word2vec-based models for Spanish and English: `SBW-vectors-300-min5.bin.gz` from https://crscardellino.github.io/SBWCE/ and `GoogleNews-vectors-negative300.bin.gz` from https://code.google.com/archive/p/word2vec/
* d. At the same directory level as the src folder, create a new folder named fileserver. Place all the pre-trained models into this folder.
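
After a manual download, a quick check can confirm that the fileserver folder holds the expected files. This is a minimal sketch under stated assumptions: `missing_models` is a hypothetical helper, the English fastText file is assumed to be `cc.en.300.bin` (the naming used on the fastText download page), and the tool is assumed to read the files under the names they are downloaded with. The spaCy models are installed as Python packages, so they are not checked here.

```python
from pathlib import Path
from typing import Iterable, List

# File names taken from steps b and c; whether SimRE expects the .gz archives
# or the unpacked files is an assumption.
EXPECTED_FILES = [
    "cc.es.300.bin",
    "cc.en.300.bin",
    "SBW-vectors-300-min5.bin.gz",
    "GoogleNews-vectors-negative300.bin.gz",
]


def missing_models(folder: str, expected: Iterable[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are not present in `folder`."""
    root = Path(folder)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_models("fileserver")` from the directory that contains the src folder returns an empty list when every file is in place.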
### Usage
* To execute the tool without optional parameters, use the following command: `python process.py nameNewReq nameFeatures nameReqDescription language`. Example: `python process.py newRequirements.csv featureModel.xml descRequirements_en.json en`.
* This is an example using all the parameters: `python process.py newRequirements.csv featureModel.xml descRequirements_en.json en 1,2,3,5 0.7 False`
* The results will be provided in a CSV file in the fileserver folder ('Similarity List.csv').
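
Because the optional arguments are positional, a later one can only be given if the earlier ones are present too. The sketch below builds the argument list for the command shown above; `build_command` is a hypothetical helper, and filling skipped positions with the documented defaults (model 1, threshold 0.7) is an assumption about how to handle gaps, not documented behaviour.

```python
import sys
from typing import List, Optional, Sequence


def build_command(new_req: str, features: str, req_description: str, language: str,
                  models: Optional[Sequence[int]] = None,
                  threshold: Optional[float] = None,
                  preprocess: Optional[bool] = None) -> List[str]:
    """Build the argv list for `python process.py ...` in the documented order."""
    cmd = [sys.executable, "process.py", new_req, features, req_description, language]
    optional = [
        None if models is None else ",".join(str(m) for m in models),
        None if threshold is None else str(threshold),
        None if preprocess is None else str(bool(preprocess)),
    ]
    # Documented defaults, used to fill a gap before a later optional argument.
    defaults = ["1", "0.7", "True"]
    # Index of the last optional argument that was actually supplied.
    last = max((i for i, v in enumerate(optional) if v is not None), default=-1)
    for i in range(last + 1):
        cmd.append(optional[i] if optional[i] is not None else defaults[i])
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains process.py.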
## Formats
* XML file:
```
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<featureModel chosenLayoutAlgorithm="4">
  <struct>
    <and mandatory="true" name="GEMA_SPL">
      <and name="UserManagement">
        <or name="UM_Registration">
          <feature mandatory="true" name="UM_R_ByAdmin"/>
          <feature mandatory="true" name="UM_R_Anonymous"/>
        </or>
      </and>
    </and>
  </struct>
  <featureOrder userDefined="false"/>
</featureModel>
```
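
To see which feature names a file in this XML format declares, the `name` attributes under `<struct>` can be collected with the standard library. This is a minimal sketch for inspection only; `feature_names` is a hypothetical helper and does not reflect how SimRE itself reads the file.

```python
import xml.etree.ElementTree as ET
from typing import List


def feature_names(xml_text: str) -> List[str]:
    """Return the name attribute of every element under <struct>, in document order."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    struct = root.find("struct")
    if struct is None:
        return []
    return [el.get("name") for el in struct.iter() if el.get("name")]


SAMPLE = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<featureModel chosenLayoutAlgorithm="4">
  <struct>
    <and mandatory="true" name="GEMA_SPL">
      <and name="UserManagement">
        <or name="UM_Registration">
          <feature mandatory="true" name="UM_R_ByAdmin"/>
          <feature mandatory="true" name="UM_R_Anonymous"/>
        </or>
      </and>
    </and>
  </struct>
  <featureOrder userDefined="false"/>
</featureModel>"""

print(feature_names(SAMPLE))
```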
* UVL file:
```
features
	UserManagement
		optional
			UM_Registration
				or
					UM_R_ByAdmin
					UM_R_Anonymous
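
UVL expresses the feature tree through indentation, with group keywords between a parent and its children. The sketch below lists the feature names in such a file; it is a simplified reader written for illustration, not the uvlparser library that SimRE depends on. `uvl_feature_names` is a hypothetical helper that ignores cardinality groups and other advanced syntax.

```python
from typing import List

# Group keywords that sit between a parent feature and its children.
GROUP_KEYWORDS = {"mandatory", "optional", "or", "alternative"}


def uvl_feature_names(uvl_text: str) -> List[str]:
    """Return the feature names found in the `features` section, in file order."""
    names = []
    in_features = False
    for raw in uvl_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not raw[0].isspace():
            # An unindented line starts a new top-level section.
            in_features = (line == "features")
            continue
        if in_features and line not in GROUP_KEYWORDS:
            # Keep only the name; drop any attribute block that follows it.
            names.append(line.split()[0])
    return names


SAMPLE_UVL = (
    "features\n"
    "\tUserManagement\n"
    "\t\toptional\n"
    "\t\t\tUM_Registration\n"
    "\t\t\t\tor\n"
    "\t\t\t\t\tUM_R_ByAdmin\n"
    "\t\t\t\t\tUM_R_Anonymous\n"
)

print(uvl_feature_names(SAMPLE_UVL))
```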
* JSON file:
```
{
  "UserManagement": {
    "label": "User Management",
    "desc": "User Management"
  },
  "UM_Registration": {
    "label": "User Registration",
    "desc": "User Registration"
  }
}
```
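
The description file maps each feature name to a `label` and a `desc`. The sketch below reports features that have no complete entry; `missing_descriptions` is a hypothetical helper, and the idea that every feature should have a description is an assumption rather than a documented requirement. Note that Python's `json` module rejects a trailing comma after the last entry, so the file must be strict JSON.

```python
import json
from typing import Iterable, List


def missing_descriptions(features: Iterable[str], descriptions_json: str) -> List[str]:
    """Return the feature names that lack an entry with both 'label' and 'desc'."""
    descriptions = json.loads(descriptions_json)
    missing = []
    for name in features:
        entry = descriptions.get(name)
        if not isinstance(entry, dict) or not entry.get("label") or not entry.get("desc"):
            missing.append(name)
    return missing


SAMPLE_JSON = """
{
  "UserManagement": {"label": "User Management", "desc": "User Management"},
  "UM_Registration": {"label": "User Registration", "desc": "User Registration"}
}
"""

features = ["UserManagement", "UM_Registration", "UM_R_ByAdmin", "UM_R_Anonymous"]
print(missing_descriptions(features, SAMPLE_JSON))
```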
Raw data
{
"_id": null,
"home_page": null,
"name": "simre",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "Software Product Lines, Similar Requirements",
"author": null,
"author_email": "Mar\u00eda Isabel Limaylla <mlimaylla@udc.es>",
"download_url": "https://files.pythonhosted.org/packages/82/07/40020cadb6e43a51ede6c691deb08d3f3334e5ed97f4377ab13b4601568e/simre-0.0.3.tar.gz",
"platform": null,
"description": "# SIMRE\r\n\r\n*SimRE is a Python-based tool for automatically identifying requirements' similarity in SPL projects.\r\n\r\n## Parameters description\r\n\r\n* You need to enter seven parameters. These parameters are the following:\r\n * nameNewReq: name of the CSV file that contains the list of new requirements. \r\n * nameFeatures: name of the XML or UVL file that contains the existing requirements. \r\n * nameReqDescription: name of the JSON file that contains the requirements description. \r\n * language: 'en' for English and 'es' for Spanish\r\n * listModels: array with the models. optional. The default model is 1. The models are the following:\r\n * 1:Model multilingual MiniLM-L12-v2\r\n * 2:Model multilingual distiluse-cased-v2\r\n * 3:Model multilingual mpnet-base-v2\r\n * 4:'Model word2vec\r\n * 5:'Model fastText \r\n * threshold: optional. The default value is 0.7.\r\n * preprocess: optional. The values are: True to allow the pre-processing (default value), and False for without pre-processing\r\n\r\n## Installation by pip\r\n\r\n* Its necessary to have installed al least Python 3.8.10 and pip 23.1.2\r\n* The tool can be used by installing the library or downloading the code.\r\n* After download the library using pip, use the following code to download several pre-trained models and put them on cach\u00e9.\r\n\r\n```\r\nfrom simre import init_models\r\ninit_models.main()\r\n\r\nfrom simre import similarity\r\nmodels = similarity.load_models()\r\n```\r\n\r\n* This process may take several minutes depending on the processor's capacity and memory. 
We recommend a RAM of at least of 16 GB.\r\n* When the process finishes, you should confirm that a \"fileserver\" folder has been created and contains four files.\r\n\r\n### Usage\r\n\r\n* The method similarity_process perform the similarity process.\r\n\r\n```\r\nsimilarity.similarity_process(nameNewReq, nameFeatures, nameReqDescription, 'en',models) \r\n```\r\n\r\n* In the examples folder are some files to make a test.\r\n\r\n```\r\nsimilarity.similarity_process('newRequirements.csv', 'featureModel.xml', 'descRequirements_en.json', 'en',models) \r\n```\r\n\r\n* With all the parameters.\r\n\r\n```\r\nsimilarity.similarity_process('newRequirements.csv', 'featureModel.xml', 'descRequirements_en.json', 'en',models,[1,2,3],0.6,False) \r\n```\r\n\r\n* The results will be provided in a CSV file ('Similarity List.csv'). \r\n\r\n\r\n## Installation by code\r\n\r\n* Download the code \r\n* Install the required libraries. All necessary libraries are listed in the requirements.txt file. To install them, execute the following command `pip install -r ./requirements.txt`\r\n* It is necessary to download several pre-trained models. This can be done automatically or manually. To download the models automatically, execute the following command: `python process.py init`. To download the models manually, follow these steps:\r\n\r\n* a. Download the models of spacy for Spanish and English: `python -m spacy download es_core_news_sm` and `python -m spacy download en_core_web_sm`\r\n* b. Download the models of fastText for Spanish and English: `cc.es.300.bin` and `cc.es.300.bin` from https://fasttext.cc/docs/en/crawl-vectors.html\r\n* c. Download the word2vec-based models for Spanish and English: `SBW-vectors-300-min5.bin.gz` from https://crscardellino.github.io/SBWCE/ and `GoogleNews-vectors-negative300.bin.gz` from https://code.google.com/archive/p/word2vec/ \r\n* d. At the same directory level as the src folder, create a new folder named fileserver. 
Place all the pre-trained models into this folder.\r\n\r\n### Usage\r\n\r\n* To execute the tool without optional parameters, use the following command: `python process.py nameNewReq nameFeatures nameReqDescription language`. Example:: `python process.py newRequirements.csv featureModel.xml descRequirements_en.json en`. \r\n* This is an example using all the parameters: `python process.py newRequirements.csv featureModel.xml descRequirements_en.json en 1,2,3,5 0.7 False`\r\n* The results will be provided in a CSV file on the fileserver folder ('Similarity List.csv'). \r\n \r\n## Formats\r\n\r\n*XML file:\r\n\r\n```\r\n<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\r\n<featureModel chosenLayoutAlgorithm=\"4\">\r\n <struct>\r\n <and mandatory=\"true\" name=\"GEMA_SPL\"> \r\n <and name=\"UserManagement\">\r\n <or name=\"UM_Registration\">\r\n <feature mandatory=\"true\" name=\"UM_R_ByAdmin\"/>\r\n <feature mandatory=\"true\" name=\"UM_R_Anonymous\"/>\r\n </or> \r\n </and> \r\n </and>\r\n </struct>\r\n <featureOrder userDefined=\"false\"/>\r\n</featureModel>\r\n```\r\n\r\n*UVL file: \r\n\r\n```\r\nfeatures\r\n\tUserManagement \r\n\t\toptional\r\n\t\t\tUM_Registration \r\n\t\t\t\tor\r\n\t\t\t\t\tUM_R_ByAdmin \r\n\t\t\t\t\tUM_R_Anonymous\r\n```\r\n\r\n*JSON file:\r\n\r\n```\r\n{\r\n \"UserManagement\": { \r\n \"label\": \"User Management\",\r\n \"desc\": \"User Management\" \r\n },\r\n \"UM_Registration\": { \r\n \"label\": \"User Registration\",\r\n \"desc\": \"User Registration\" \r\n },\r\n \r\n}\r\n```\r\n",
"bugtrack_url": null,
"license": null,
"summary": "Requirements Similarity tool for Software Product Lines",
"version": "0.0.3",
"project_urls": {
"Homepage": "https://github.com/lbdudc/simre",
"Issues": "https://github.com/lbdudc/simre/issues"
},
"split_keywords": [
"software product lines",
" similar requirements"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1ee663d6c95981fdc18963dbf030b868507af9545b177c7cba428f8e75db8cb7",
"md5": "3cd309bf1804f427b457afcffa6d8407",
"sha256": "d72bf0f162089e8b4db20cdef1cb816e7d5e77ae47e5980109b31e972e3a8e38"
},
"downloads": -1,
"filename": "simre-0.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3cd309bf1804f427b457afcffa6d8407",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 9761,
"upload_time": "2024-06-05T11:56:27",
"upload_time_iso_8601": "2024-06-05T11:56:27.910388Z",
"url": "https://files.pythonhosted.org/packages/1e/e6/63d6c95981fdc18963dbf030b868507af9545b177c7cba428f8e75db8cb7/simre-0.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "820740020cadb6e43a51ede6c691deb08d3f3334e5ed97f4377ab13b4601568e",
"md5": "ddb72db3c353cf233138c186b756aeee",
"sha256": "d39cd960869a064870156191dff75a32e9e6b95bac773114a6559f0fe663392b"
},
"downloads": -1,
"filename": "simre-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "ddb72db3c353cf233138c186b756aeee",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 10844,
"upload_time": "2024-06-05T11:56:29",
"upload_time_iso_8601": "2024-06-05T11:56:29.005475Z",
"url": "https://files.pythonhosted.org/packages/82/07/40020cadb6e43a51ede6c691deb08d3f3334e5ed97f4377ab13b4601568e/simre-0.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-05 11:56:29",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lbdudc",
"github_project": "simre",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "nltk",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "numpy",
"specs": []
},
{
"name": "gensim",
"specs": []
},
{
"name": "spacy",
"specs": []
},
{
"name": "torch",
"specs": []
},
{
"name": "sentence-transformers",
"specs": []
},
{
"name": "bs4",
"specs": []
},
{
"name": "uvlparser",
"specs": [
[
"==",
"2.0.1"
]
]
},
{
"name": "lxml",
"specs": []
},
{
"name": "scikit-learn",
"specs": []
},
{
"name": "fasttext-wheel",
"specs": []
},
{
"name": "gdown",
"specs": []
}
],
"lcname": "simre"
}