italian-ats-evaluator


- Name: italian-ats-evaluator
- Version: 2.0.6
- Summary: Italian ATS Evaluator
- Upload time: 2024-07-13 15:18:22
- Requires Python: >=3.8
- License: MIT License
- Keywords: ats, text, simplification, italian, nlp
# italian-ats-evaluator
This is an open-source project for evaluating the performance of an Italian ATS (Automatic Text Simplifier) on a set of texts.

You can analyze a single text and extract the following features:
- Overall:
  - Number of tokens
  - Number of tokens (including punctuation)
  - Number of characters
  - Number of characters (including punctuation)
  - Number of words
  - Number of syllables
  - Number of unique lemmas
  - Number of sentences
- Readability:
  - Type-Token Ratio (TTR)
  - Gulpease Index
  - Flesch-Vacca Index
  - Lexical Density
- Part of Speech (POS) distribution
- Verbs distribution
  - Active Verbs
  - Passive Verbs
- Italian Basic Vocabulary (NVdB) from [Il Nuovo vocabolario di base della lingua italiana, Tullio De Mauro](https://dizionario.internazionale.it/)
  - All
  - FO (Fundamentals)
  - AU (High Usage)
  - AD (High Availability)
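
To make two of the readability features above concrete, here is a minimal sketch of the Gulpease Index and the Type-Token Ratio. It is independent of this package: the helper names `gulpease_index` and `type_token_ratio` and the regex-based tokenization are assumptions made for illustration, and the package itself may count letters, words, and sentences differently (for example through spaCy).

```python
import re
from typing import Sequence


def gulpease_index(n_letters: int, n_words: int, n_sentences: int) -> float:
    """Gulpease = 89 + (300 * sentences - 10 * letters) / words."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return 89 + (300 * n_sentences - 10 * n_letters) / n_words


def type_token_ratio(tokens: Sequence[str]) -> float:
    """Number of distinct tokens divided by the total number of tokens."""
    if not tokens:
        return 0.0
    return len({t.lower() for t in tokens}) / len(tokens)


# Naive counting, for illustration only (hypothetical, not the package's logic)
text = "Il gatto mangia il topo."
words = re.findall(r"\w+", text)
n_letters = sum(len(w) for w in words)
n_sentences = len([s for s in re.split(r"[.!?]+", text) if s.strip()])

gulpease = gulpease_index(n_letters, len(words), n_sentences)
ttr = type_token_ratio(words)
print(gulpease, ttr)
```

Higher Gulpease values indicate easier text. Very short inputs like this one can score above 100, so some implementations clip the result to the 0-100 range.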


You can also compare two texts and get the following metrics:
- Semantic:
  - Semantic Similarity 
- Character diff:
  - Edit Distance
- Token diff:
  - Number of tokens added
  - Number of tokens removed
  - Number of VdB tokens removed
  - Number of VdB tokens added
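
As a rough illustration of the character and token diff metrics listed above, the sketch below computes a Levenshtein edit distance and counts added and removed tokens with a multiset difference. The function names `edit_distance` and `token_diff` and the whitespace tokenization are assumptions for this example; the package may tokenize with spaCy and define the counts differently.

```python
from collections import Counter
from typing import Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def token_diff(reference: Sequence[str], simplified: Sequence[str]) -> Tuple[int, int]:
    """Return (added, removed) token counts, treating each text as a multiset."""
    ref_counts = Counter(reference)
    simp_counts = Counter(simplified)
    added = sum((simp_counts - ref_counts).values())
    removed = sum((ref_counts - simp_counts).values())
    return added, removed


# Whitespace tokenization, for illustration only
reference = "Il felino mangia il roditore".split()
simplified = "Il gatto mangia il topo".split()
added, removed = token_diff(reference, simplified)
print(added, removed, edit_distance("gatto", "gatti"))
```

The multiset difference ignores token order, so a reordered sentence counts as zero tokens added and zero removed.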


## Installation
```bash
pip install italian-ats-evaluator
```

## Usage

```python
from italian_ats_evaluator import TextAnalyzer

# Analyze a single Italian text with the given spaCy pipeline
result = TextAnalyzer(
  text="Il gatto mangia il topo",
  spacy_model_name="it_core_news_lg"
)
```

```python
from italian_ats_evaluator import SimplificationAnalyzer

# Compare a reference text with its simplified version
result = SimplificationAnalyzer(
  reference_text="Il felino mangia il roditore",
  simplified_text="Il gatto mangia il topo",
  spacy_model_name="it_core_news_lg",
  sentence_transformers_model_name="intfloat/multilingual-e5-base"
)
```
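
Both examples refer to the spaCy pipeline `it_core_news_lg`. spaCy pipelines are installed as regular Python packages, so a quick pre-flight check can tell you whether one is available. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical, and it assumes the package does not download the pipeline for you, which this README does not state.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["it_core_news_lg"])
for name in missing:
    # spaCy's own CLI can install a pipeline by name
    print("Not installed; try: python -m spacy download " + name)
```

The sentence-transformers model name, by contrast, is a Hugging Face Hub identifier rather than a Python package, so this check does not apply to it.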

## Development
Create a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate
```
Install the package in editable mode:
```bash
pip install -e .
```

## Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.


## Acknowledgements
This contribution is a result of the research conducted within the framework of the PRIN 2020 (Progetti di Rilevante Interesse Nazionale) “VerbACxSS: on analytic verbs, complexity, synthetic verbs, and simplification. For accessibility” (Prot. 2020BJKB9M), funded by the Italian Ministero dell’Università e della Ricerca.

## License
[MIT](https://choosealicense.com/licenses/mit/)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "italian-ats-evaluator",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "RedHitMark <russodivito.marco@gmail.com>",
    "keywords": "ats, text, simplification, italian, nlp",
    "author": null,
    "author_email": "RedHitMark <russodivito.marco@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/17/5a/c694957e2c77e3a4d0d4aceaa12b7960439dc353d6fb95874a9f99e358c9/italian_ats_evaluator-2.0.6.tar.gz",
    "platform": null,
    "description": "# italian-ats-evalautor\nThis is an open source project to evaluate the performance of an italian ATS (Automatic Text Simplifier) on a set of texts.\n\nYou can analyze a single text extracting the following features:\n- Overall:\n  - Number of tokens\n  - Number of tokens (including punctuation)\n  - Number of characters\n  - Number of characters (including punctuation)\n  - Number of words\n  - Number of syllables\n  - Number of unique lemmas\n  - Number of sentences\n- Readability:\n  - Type-Token Ratio (TTR)\n  - Gulpease Index\n  - Flesch-Vacca Index\n  - Lexical Density\n- Part of Speech (POS) distribution\n- Verbs distribution\n  - Active Verbs\n  - Passive Verbs\n- Italian Basic Vocabulary (NVdB) from [Il Nuovo vocabolario di base della lingua italiana, Tullio De Mauro](https://dizionario.internazionale.it/)\n  - All\n  - FO (Fundamentals)\n  - AU (High Usage)\n  - AD (High Availability)\n\n\nYou can also compare two texts and get the following metrics:\n- Semantic:\n  - Semantic Similarity \n- Character diff:\n  - Edit Distance\n- Token diff:\n  - Amount of tokens added\n  - Amount of tokens removed\n  - Amount of VdB tokens removed\n  - Amount of VdB tokens added\n\n\n## Installation\n```bash\npip install italian-ats-evaluator\n```\n\n## Usage\n\n```python\nfrom italian_ats_evaluator import TextAnalyzer\n\nresult = TextAnalyzer(\n  text=\"Il gatto mangia il topo\",\n  spacy_model_name=\"it_core_news_lg\"\n)\n```\n\n```python\nfrom italian_ats_evaluator import SimplificationAnalyzer\n\nresult =  SimplificationAnalyzer(\n  reference_text=\"Il felino mangia il roditore\",\n  simplified_text=\"Il gatto mangia il topo\",\n  spacy_model_name=\"it_core_news_lg\",\n  sentence_transformers_model_name=\"intfloat/multilingual-e5-base\"\n)\n```\n\n## Development\nCreate a virtual environment\n```bash\npython3 -m venv venv\nsource venv/bin/activate\n```\nInstall the package in editable mode\n```bash\npip install -e .\n```\n\n## Contributing\nPull 
requests are welcome. For major changes, please open an issue first to discuss what you would like to change.\n\n\n## Acknowledgements\nThis contribution is a result of the research conducted within the framework of the PRIN 2020 (Progetti di Rilevante Interesse Nazionale) \u201cVerbACxSS: on analytic verbs, complexity, synthetic verbs, and simplification. For accessibility\u201d (Prot. 2020BJKB9M), funded by the Italian Ministero dell\u2019Universit\u00e0 e della Ricerca.\n\n## License\n[MIT](https://choosealicense.com/licenses/mit/)\n",
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "Italian ATS Evaluator",
    "version": "2.0.6",
    "project_urls": {
        "Issues": "https://github.com/RedHitMark/italian-ats-evaluator/issues",
        "Repository": "https://github.com/RedHitMark/italian-ats-evaluator"
    },
    "split_keywords": [
        "ats",
        " text",
        " simplification",
        " italian",
        " nlp"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bf8112b59becfee7e8709809d26f9451189777f01400526feec1814276a2dba8",
                "md5": "1758480cf8fb134ab3cf8bf0d95b2321",
                "sha256": "edaceadfeaa512b742cd7eca057cf8be5bedae4c53d9e5f74d9aea36cd589532"
            },
            "downloads": -1,
            "filename": "italian_ats_evaluator-2.0.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1758480cf8fb134ab3cf8bf0d95b2321",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 35408,
            "upload_time": "2024-07-13T15:18:20",
            "upload_time_iso_8601": "2024-07-13T15:18:20.631498Z",
            "url": "https://files.pythonhosted.org/packages/bf/81/12b59becfee7e8709809d26f9451189777f01400526feec1814276a2dba8/italian_ats_evaluator-2.0.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "175ac694957e2c77e3a4d0d4aceaa12b7960439dc353d6fb95874a9f99e358c9",
                "md5": "51b8167420509b526e32da0c0417ee5d",
                "sha256": "167d9799b7a26fb335d8fe709f881d18cb91e833873229331ed586f22062fbbe"
            },
            "downloads": -1,
            "filename": "italian_ats_evaluator-2.0.6.tar.gz",
            "has_sig": false,
            "md5_digest": "51b8167420509b526e32da0c0417ee5d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 34445,
            "upload_time": "2024-07-13T15:18:22",
            "upload_time_iso_8601": "2024-07-13T15:18:22.250182Z",
            "url": "https://files.pythonhosted.org/packages/17/5a/c694957e2c77e3a4d0d4aceaa12b7960439dc353d6fb95874a9f99e358c9/italian_ats_evaluator-2.0.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-13 15:18:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "RedHitMark",
    "github_project": "italian-ats-evaluator",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "italian-ats-evaluator"
}
        