<div id="top"></div>
<!-- PROJECT SHIELDS -->
<!--
*** I'm using markdown "reference style" links for readability.
*** Reference links are enclosed in brackets [ ] instead of parentheses ( ).
*** See the bottom of this document for the declaration of the reference variables
*** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use.
*** https://www.markdownguide.org/basic-syntax/#reference-style-links
-->
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]
[![Downloads](https://static.pepy.tech/personalized-badge/morphemes?period=total&units=international_system&left_color=brightgreen&right_color=blue&left_text=Downloads)](https://pepy.tech/project/morphemes)
<!-- PROJECT LOGO -->
<br />
<div align="center">
<a href="https://github.com/ecscstatsconsulting/morphemes">
<img src="https://raw.githubusercontent.com/ecscstatsconsulting/morphemes/main/images/morphemes-logo.png" alt="Logo" width="200" height="200">
</a>
<h3 align="center">morphemes</h3>
<p align="center">
A practical Python library for identifying morphemes in the English language.
<br />
[//]: # ( <a href="https://github.com/ecscstatsconsulting/morphemes"><strong>Explore the docs »</strong></a>)
[//]: # ( <br />)
<br />
[//]: # ( <a href="https://github.com/github_username/repo_name">View Demo</a>)
[//]: # ( ·)
<a href="https://github.com/ecscstatsconsulting/morphemes/issues">Report Bug</a>
·
<a href="https://github.com/ecscstatsconsulting/morphemes/issues">Request Feature</a>
</p>
</div>
<!-- TABLE OF CONTENTS -->
<details>
<summary>Table of Contents</summary>
<ol>
<li>
<a href="#about-the-project">About The Project</a>
<ul>
<li><a href="#built-with">Built With</a></li>
</ul>
</li>
<li>
<a href="#getting-started">Getting Started</a>
<ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#installation">Installation</a></li>
</ul>
</li>
<li><a href="#usage">Usage</a></li>
<li><a href="#roadmap">Roadmap</a></li>
<li><a href="#contributing">Contributing</a></li>
<li><a href="#license">License</a></li>
<li><a href="#contact">Contact</a></li>
<li><a href="#acknowledgments">Acknowledgments</a></li>
</ol>
</details>
<!-- ABOUT THE PROJECT -->
## About The Project
A simple and practical solution for obtaining morpheme information
for a word. Most of the logic uses a simple lookup strategy
based on the [MorphoLex-en](https://github.com/hugomailhot/MorphoLex-en)
project. Unknown words, such as names of people and places, are each counted as 1 morpheme.
This is a non-contextual solution intended to feed more complex logic for NLP.
<p align="right">(<a href="#top">back to top</a>)</p>
### Built With
* [MorphoLex-en](https://github.com/hugomailhot/MorphoLex-en)
* [tinydb](https://tinydb.readthedocs.io/en/latest/)
* [pandas](https://pandas.pydata.org/)
<p align="right">(<a href="#top">back to top</a>)</p>
<!-- GETTING STARTED -->
## Getting Started
Using this library is fairly routine and easy. More detail will be added
to this section as we get closer to the first release.
### Prerequisites
This project was developed with Python 3.9; other versions of Python 3
*should* work.
### Installation
```sh
pip install morphemes
```
<p align="right">(<a href="#top">back to top</a>)</p>
<!-- USAGE EXAMPLES -->
## Usage
Using the morphemes library is very simple.
1. Import the library
2. Create an instance of the `Morphemes` class
   1. Optional: specify a data path where the morphemes database will be stored. If no data path is specified, local app storage will be used.
3. Use the library by calling the `parse` function.
Example:
```python
from morphemes import Morphemes

# Directory where the morphemes database is stored (optional).
path = "./data"

# If no path is given, local app storage is used instead.
m = Morphemes(path)
print(m.parse("organizationally"))
```
Output:
```json
{
  "word": "organizationally",
  "status": "FOUND_IN_DATABASE",
  "morpheme_count": 5,
  "tree": [
    {
      "children": [
        {
          "text": "organ",
          "type": "root"
        },
        {
          "text": "ize",
          "type": "bound"
        }
      ],
      "type": "free"
    },
    {
      "text": "ion",
      "type": "bound"
    },
    {
      "text": "al",
      "type": "bound"
    },
    {
      "text": "ly",
      "type": "bound"
    }
  ]
}
```
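The nested `tree` can be flattened into an ordered list of morpheme strings. The following is a minimal sketch, assuming `parse` returns a plain dict shaped like the output above. The helper name `leaf_texts` is hypothetical and not part of the library, and the result is hard-coded so the snippet runs without the database.

```python
# Result dict copied from the example output above (trimmed to the fields used).
result = {
    "word": "organizationally",
    "morpheme_count": 5,
    "tree": [
        {
            "children": [
                {"text": "organ", "type": "root"},
                {"text": "ize", "type": "bound"},
            ],
            "type": "free",
        },
        {"text": "ion", "type": "bound"},
        {"text": "al", "type": "bound"},
        {"text": "ly", "type": "bound"},
    ],
}


def leaf_texts(nodes):
    """Yield the text of every leaf node, left to right (hypothetical helper)."""
    for node in nodes:
        if "children" in node:
            # Nodes with children group other morphemes; recurse into them.
            yield from leaf_texts(node["children"])
        else:
            yield node["text"]


morphemes_in_word = list(leaf_texts(result["tree"]))
print(morphemes_in_word)
```

In this example the number of leaves matches `morpheme_count`.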
Types definition:
- root: The root of the word. Some words have multiple roots (example: milkshake).
- bound: Attaches to a root morpheme. Does not carry meaning on its own.
- free: A word that can be used on its own. A single word can contain multiple free morphemes (example: milkshake).
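To see how the three types relate, the sketch below tallies node types in the first node of the example tree above, where the free word "organize" groups a root and a bound morpheme. It assumes every node carries a `type` key and that grouping nodes hold their parts under `children`, as in the output shown. `count_types` is a hypothetical helper, not a library function.

```python
from collections import Counter

# First node of the example tree: the free word "organize".
organize_node = {
    "children": [
        {"text": "organ", "type": "root"},
        {"text": "ize", "type": "bound"},
    ],
    "type": "free",
}


def count_types(nodes, counts=None):
    """Count how often each node type occurs, including nested children."""
    counts = Counter() if counts is None else counts
    for node in nodes:
        counts[node["type"]] += 1
        # Leaf nodes have no "children" key, so default to an empty list.
        count_types(node.get("children", []), counts)
    return counts


type_counts = count_types([organize_node])
print(dict(type_counts))
```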
Words which are not found are marked with status `NOT_FOUND` and will default
to 1 morpheme. This will be improved in future releases.
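Because unknown words still count as 1 morpheme, it can be useful to track them separately when totaling a text. This is a minimal sketch, assuming each result is a dict with `word`, `status`, and `morpheme_count` keys as in the output above, and assuming a `NOT_FOUND` result carries the same keys (this README does not show one). `summarize` and `fake_parse` are hypothetical names; `fake_parse` stands in for the real `parse` so the snippet runs without the database.

```python
def summarize(words, parse):
    """Total the morpheme counts for words and collect those not found."""
    total = 0
    unknown = []
    for word in words:
        result = parse(word)
        # Fall back to 1 if the key is absent, matching the documented default.
        total += result.get("morpheme_count", 1)
        if result.get("status") == "NOT_FOUND":
            unknown.append(word)
    return total, unknown


def fake_parse(word):
    """Hypothetical stand-in for the library's parse, for illustration only."""
    known = {"organizationally": 5}
    if word in known:
        return {
            "word": word,
            "status": "FOUND_IN_DATABASE",
            "morpheme_count": known[word],
        }
    return {"word": word, "status": "NOT_FOUND", "morpheme_count": 1}


total, unknown = summarize(["organizationally", "zorblax"], fake_parse)
print(total, unknown)
```

With the real library, `m.parse` from the example above could be passed in place of `fake_parse`.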
NOTE: the `data` path specified is where the morphemes library will
store a database containing morphemes from [MorphoLex-en](https://github.com/hugomailhot/MorphoLex-en)
along with other lookups to help properly detect morphemes.
<p align="right">(<a href="#top">back to top</a>)</p>
<!-- ROADMAP -->
## Roadmap
- [X] Morpheme detection of known words
- [X] Handling of common names and places (counted as 1 morpheme)
- [ ] Handling of unknown words
See the [open issues](https://github.com/ecscstatsconsulting/morphemes/issues) for a full list of
proposed features (and known issues).
<p align="right">(<a href="#top">back to top</a>)</p>
## Developers
Clone the repo and use the Makefile to build a local version:
`make install`
<!-- CONTRIBUTING -->
## Contributing
Contributions are what make the open source community such an amazing
place to learn, inspire, and create. Any contributions you make are
**greatly appreciated**.
Do you want other languages supported? Are you a fluent speaker of the
language you want? Help contribute and grow this project into a more
universal morpheme solution!
If you have a suggestion that would make this better, please fork the repo
and create a pull request. You can also simply open an issue with the tag
"enhancement". Don't forget to give the project a star! Thanks again!
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
<p align="right">(<a href="#top">back to top</a>)</p>
<!-- LICENSE -->
## License
Distributed under the MIT License. See `LICENSE.txt` for more information.
<p align="right">(<a href="#top">back to top</a>)</p>
<!-- CONTACT -->
## Contact
ECSC, ltd - ecsctechdepartment@gmail.com
Project Link: [https://github.com/ecscstatsconsulting/morphemes](https://github.com/ecscstatsconsulting/morphemes)
<p align="right">(<a href="#top">back to top</a>)</p>
<!-- ACKNOWLEDGMENTS -->
## Acknowledgments
* Enkeleda Cuko
* [Paul Warren](https://github.com/paul0warren)
<p align="right">(<a href="#top">back to top</a>)</p>
<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[contributors-url]: https://github.com/ecscstatsconsulting/morphemes/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[forks-url]: https://github.com/ecscstatsconsulting/morphemes/network/members
[stars-shield]: https://img.shields.io/github/stars/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[stars-url]: https://github.com/ecscstatsconsulting/morphemes/stargazers
[issues-shield]: https://img.shields.io/github/issues/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[issues-url]: https://github.com/ecscstatsconsulting/morphemes/issues
[license-shield]: https://img.shields.io/github/license/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[license-url]: https://github.com/ecscstatsconsulting/morphemes/blob/master/LICENSE.txt
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[//]: # ([linkedin-url]: https://linkedin.com/in/linkedin_username)
[//]: # ([product-screenshot]: images/screenshot.png)
Raw data
{
"_id": null,
"home_page": "https://github.com/ecscstatsconsulting/morphemes",
"name": "morphemes",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "morpheme,morphology,nlp",
"author": "Enkeleda Cuko & Paul Warren",
"author_email": "ecsctechdepartment@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/e4/15/baf404685806e358dcb8b1658f13dcd03fda8045a81393c234de9d124edd/morphemes-1.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A practical Python Library for identifying morphemes in the english language.",
"version": "1.2.0",
"split_keywords": [
"morpheme",
"morphology",
"nlp"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "bc6dd687412c3e1d4e7d63d995cf94268786b3bed4a12aad0b3c5e7e37940a34",
"md5": "2efe68d13e2efdb8e5b24f7bdddb155d",
"sha256": "170898e90b72997d16b11406e54e736cb2cc3302a7f5c4c06811b0abe43ab947"
},
"downloads": -1,
"filename": "morphemes-1.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2efe68d13e2efdb8e5b24f7bdddb155d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 10170,
"upload_time": "2023-03-11T15:14:16",
"upload_time_iso_8601": "2023-03-11T15:14:16.739617Z",
"url": "https://files.pythonhosted.org/packages/bc/6d/d687412c3e1d4e7d63d995cf94268786b3bed4a12aad0b3c5e7e37940a34/morphemes-1.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e415baf404685806e358dcb8b1658f13dcd03fda8045a81393c234de9d124edd",
"md5": "f89571552274b92c52da536b0de67079",
"sha256": "14637571ea020c3c4ce1b4483ed9a3d817471d09fba96304781959bf27e022da"
},
"downloads": -1,
"filename": "morphemes-1.2.0.tar.gz",
"has_sig": false,
"md5_digest": "f89571552274b92c52da536b0de67079",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 11429,
"upload_time": "2023-03-11T15:14:18",
"upload_time_iso_8601": "2023-03-11T15:14:18.073139Z",
"url": "https://files.pythonhosted.org/packages/e4/15/baf404685806e358dcb8b1658f13dcd03fda8045a81393c234de9d124edd/morphemes-1.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-11 15:14:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "ecscstatsconsulting",
"github_project": "morphemes",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "morphemes"
}