# glem
GLEM is a lemmatizer for Ancient Greek.
It was created in the project [Unraveling the Language of Perspective](http://ncs.ruhosting.nl/perspective/), which is supported by the EU under FP7, ERC Starting Grant 338421-Perspective.
The paper 'A memory-based lemmatizer for Ancient Greek' describes how it works, what material it uses, and how accurate it is. It can be found in the repository and at http://dl.acm.org/citation.cfm?id=3078100.
A webservice where you can upload texts to be lemmatized is available at https://webservices.cls.ru.nl/. You can also host your own.
## Dependencies
The simple word-list-based lemmatizer requires only **Python 3**.
To add machine-learning-based lemmatization that also takes context into account, glem uses [Frog](https://languagemachines.github.io/frog/) via its [python binding](https://github.com/proycon/python-frog).
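Whether the optional Frog dependency is present can be checked before relying on context-aware lemmatization. The following is a minimal sketch, assuming the python-frog binding is importable as the top-level module `frog`; the helper name `frog_available` is hypothetical, and how glem itself detects Frog is not documented here.

```python
import importlib.util


def frog_available() -> bool:
    """Return True if a top-level module named `frog` can be found.

    Assumption: the python-frog binding installs a module called `frog`.
    This only looks the module up; it does not import or initialise it.
    """
    return importlib.util.find_spec("frog") is not None


if __name__ == "__main__":
    if frog_available():
        print("Frog binding found: context-aware lemmatization should be possible.")
    else:
        print("Frog binding not found: only the word-list lemmatizer is available.")
```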
## Installation
Run ``pip install .`` from the repository root.
We recommend installing into a Python virtual environment. For a global installation instead,
prepend ``sudo``.
## Example usage
Glem comes with a pretrained model for Herodotus, based on lemmas chosen by human annotators in the UiO PROIEL project (PI: Dag Haug). You can use it (with or without Frog) as follows:
```
glem -f input.txt
```
The files for this model can be found in ``glem/pretrained_models/herodotus``.
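The command above can also be driven from a Python script. This is a minimal sketch, assuming the `glem` executable is on the `PATH` after installation; the helper names `glem_command` and `run_glem` are hypothetical, and only the `-f` flag shown above is used. Where glem writes its results is not specified here, so the sketch does not try to read any output file.

```python
import shutil
import subprocess
from pathlib import Path


def glem_command(input_path):
    """Build the argument list for `glem -f <input_path>`.

    Only the -f flag from the usage example is used; no other options are assumed.
    """
    return ["glem", "-f", str(Path(input_path))]


def run_glem(input_path):
    """Run glem on a text file and raise if the executable is missing or fails."""
    if shutil.which("glem") is None:
        raise FileNotFoundError("glem executable not found on PATH; install it first")
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(glem_command(input_path), check=True)
```

Calling `run_glem("input.txt")` is then equivalent to typing `glem -f input.txt` in a shell.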
## Webservice
A ``Dockerfile`` is provided for deployment of the GLEM webservice in production environments.
From the repository root, build as follows:
```
$ docker build -t webglem .
```
Consult the [Dockerfile](Dockerfile) for various build-time parameters that you may want to set for your own production environment.
When running the container, mount the host path where you want the user data stored; a directory `webglem-userdata` will be created there. The example below also maps port 80 of the container to port 8080 on the host:
```
$ docker run -p 8080:80 -v /path/to/data/dir:/data webglem
```
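If the container is started from a script, the `docker run` arguments can be assembled programmatically. This is a minimal sketch under stated assumptions: the helper name `docker_run_args` is hypothetical, the image tag `webglem`, the container port 80, and the mount point `/data` are taken from the commands above, and the host directory is resolved to an absolute path because Docker treats a non-absolute `-v` source as a named volume rather than a bind mount.

```python
from pathlib import Path


def docker_run_args(data_dir, host_port=8080, image="webglem"):
    """Build the argument list for running the webservice container.

    data_dir  -- host directory that will hold the user data (bind-mounted at /data)
    host_port -- host port mapped to port 80 inside the container
    image     -- image tag used in the `docker build` step
    """
    host_path = Path(data_dir).expanduser().resolve()
    return [
        "docker", "run",
        "-p", f"{host_port}:80",
        "-v", f"{host_path}:/data",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(docker_run_args("/path/to/data/dir")))
```

The resulting list can be passed to `subprocess.run` once Docker is installed and the image has been built.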
## Package metadata
- Name: Glem
- Version: 1.3.1
- Summary: GLEM is a lemmatizer for Ancient Greek.
- Authors: Corien Bary, Iris Hendrickx, Peter Berck, Wessel Stoop (c.bary@let.ru.nl)
- License: GPL-3.0-only
- Homepage: https://github.com/GreekPerspective/glem
- Keywords: nlp, computational_linguistics, entities, linguistics, ancient_greek, lemmatizer, lemmatization, frog, clam, webservice, rest
- Source distribution: `Glem-1.3.1.tar.gz` (6514808 bytes), uploaded 2023-10-05T12:43:45Z
- Download: https://files.pythonhosted.org/packages/4c/6f/8007038ddb98373cfc249421d284d5e58f042b3840c74dca55cdaea61611/Glem-1.3.1.tar.gz
- SHA-256: `9830c874caec401af36a43231c41de79a63c0e05424d17a9b344ad9a3709357f`