# italian-ats-evaluator
This is an open-source project for evaluating the performance of an Italian ATS (Automatic Text Simplifier) on a set of texts.
You can analyze a single text and extract the following features:
- Overall:
  - Number of tokens
  - Number of tokens (including punctuation)
  - Number of characters
  - Number of characters (including punctuation)
  - Number of words
  - Number of syllables
  - Number of unique lemmas
  - Number of sentences
- Readability:
  - Type-Token Ratio (TTR)
  - Gulpease Index
  - Flesch-Vacca Index
  - Lexical Density
- Part of Speech (POS) distribution
- Verbs distribution
  - Active Verbs
  - Passive Verbs
- Italian Basic Vocabulary (NVdB) from [Il Nuovo vocabolario di base della lingua italiana, Tullio De Mauro](https://dizionario.internazionale.it/)
  - All
  - FO (Fundamentals)
  - AU (High Usage)
  - AD (High Availability)
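
As a rough illustration of two of the readability metrics listed above, the sketch below computes the Gulpease Index and the Type-Token Ratio from plain counts. It is not the package's implementation: the function names and the regex-based tokenizer are assumptions made for this example, and the library may count letters, words, and sentences differently (for instance with a proper NLP tokenizer and lemmas).

```python
import re


def gulpease_index(n_letters: int, n_words: int, n_sentences: int) -> float:
    """Gulpease = 89 + (300 * sentences - 10 * letters) / words."""
    if n_words == 0:
        raise ValueError("the text must contain at least one word")
    return 89 + (300 * n_sentences - 10 * n_letters) / n_words


def type_token_ratio(tokens: list) -> float:
    """Ratio of distinct (lowercased) tokens to total tokens."""
    if not tokens:
        return 0.0
    return len({t.lower() for t in tokens}) / len(tokens)


def naive_counts(text: str):
    """Hypothetical helper: a naive regex tokenizer, for illustration only."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters, accents included
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_letters = sum(len(w) for w in words)
    return words, n_letters, len(sentences)


words, n_letters, n_sentences = naive_counts("Il gatto mangia il topo.")
print(gulpease_index(n_letters, len(words), n_sentences))
print(type_token_ratio(words))
```

Gulpease is conventionally read on a 0-100 scale where higher means easier to read; very short texts like the one above can score outside that range.
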
You can also compare two texts and get the following metrics:
- Semantic:
  - Semantic Similarity
- Character diff:
  - Edit Distance
- Token diff:
  - Number of tokens added
  - Number of tokens removed
  - Number of VdB tokens removed
  - Number of VdB tokens added
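
To make the character and token diff metrics above more concrete, here is a minimal sketch of a Levenshtein edit distance and a multiset-based token diff. The function names are hypothetical and the whitespace tokenization is an assumption; the package may tokenize, normalize, and count differently.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


def token_diff(reference_tokens, simplified_tokens):
    """Return (added, removed) token counts, treating texts as multisets."""
    ref = Counter(reference_tokens)
    simp = Counter(simplified_tokens)
    added = sum((simp - ref).values())    # tokens only in the simplified text
    removed = sum((ref - simp).values())  # tokens only in the reference text
    return added, removed


reference = "Il felino mangia il roditore".lower().split()
simplified = "Il gatto mangia il topo".lower().split()
print(edit_distance("Il felino mangia il roditore", "Il gatto mangia il topo"))
print(token_diff(reference, simplified))  # (2, 2)
```

The VdB variants of these counts could be obtained the same way by keeping only the added or removed tokens that appear in the basic vocabulary list.
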
## Installation
```bash
pip install italian-ats-evaluator
```
## Usage
```python
from italian_ats_evaluator import TextAnalyzer

# Analyze a single text
result = TextAnalyzer(
    text="Il gatto mangia il topo"
)
```
```python
from italian_ats_evaluator import SimplificationAnalyzer

# Compare a reference text with its simplified version
result = SimplificationAnalyzer(
    reference_text="Il felino mangia il roditore",
    simplified_text="Il gatto mangia il topo"
)
```
## Development
Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate
```
Install the package in editable mode:
```bash
pip install -e .
```
## Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
## Acknowledgements
This project is part of the research project "VerbACxSS: su verbi analitici, complessità, verbi sintetici, e semplificazione. Per l’accessibilità." funded by the Italian Ministry of University and Research (MUR) under the PRIN 2020 program.
## License
[MIT](https://choosealicense.com/licenses/mit/)