morphemes


- Name: morphemes
- Version: 1.2.0
- Home page: https://github.com/ecscstatsconsulting/morphemes
- Summary: A practical Python library for identifying morphemes in the English language.
- Upload time: 2023-03-11 15:14:18
- Author: Enkeleda Cuko & Paul Warren
- License: MIT
- Keywords: morpheme, morphology, nlp
- Requirements: No requirements were recorded.

<div id="top"></div>



<!-- PROJECT SHIELDS -->
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]

[![Downloads](https://static.pepy.tech/personalized-badge/morphemes?period=total&units=international_system&left_color=brightgreen&right_color=blue&left_text=Downloads)](https://pepy.tech/project/morphemes)



<!-- PROJECT LOGO -->
<br />
<div align="center">
  <a href="https://github.com/ecscstatsconsulting/morphemes">
    <img src="https://raw.githubusercontent.com/ecscstatsconsulting/morphemes/main/images/morphemes-logo.png" alt="Logo" width="200" height="200">
  </a>

<h3 align="center">morphemes</h3>

  <p align="center">
    A practical Python library for identifying morphemes in the English language.
    <br />

[//]: # (    <a href="https://github.com/ecscstatsconsulting/morphemes"><strong>Explore the docs »</strong></a>)

[//]: # (    <br />)
    <br />

[//]: # (    <a href="https://github.com/github_username/repo_name">View Demo</a>)

[//]: # (    ·)
    <a href="https://github.com/ecscstatsconsulting/morphemes/issues">Report Bug</a>
    ·
    <a href="https://github.com/ecscstatsconsulting/morphemes/issues">Request Feature</a>
  </p>
</div>



<!-- TABLE OF CONTENTS -->
<details>
  <summary>Table of Contents</summary>
  <ol>
    <li>
      <a href="#about-the-project">About The Project</a>
      <ul>
        <li><a href="#built-with">Built With</a></li>
      </ul>
    </li>
    <li>
      <a href="#getting-started">Getting Started</a>
      <ul>
        <li><a href="#prerequisites">Prerequisites</a></li>
        <li><a href="#installation">Installation</a></li>
      </ul>
    </li>
    <li><a href="#usage">Usage</a></li>
    <li><a href="#roadmap">Roadmap</a></li>
    <li><a href="#contributing">Contributing</a></li>
    <li><a href="#license">License</a></li>
    <li><a href="#contact">Contact</a></li>
    <li><a href="#acknowledgments">Acknowledgments</a></li>
  </ol>
</details>



<!-- ABOUT THE PROJECT -->
## About The Project

A simple and practical solution for obtaining morpheme information
for a word. Most of the logic uses a simple lookup strategy
based on the [MorphoLex-en](https://github.com/hugomailhot/MorphoLex-en)
project. Unknown words, e.g. names of people and places, are each counted as 1 morpheme.
This is a non-contextual solution intended to feed more complex NLP logic.

<p align="right">(<a href="#top">back to top</a>)</p>



### Built With

* [MorphoLex-en](https://github.com/hugomailhot/MorphoLex-en)
* [tinydb](https://tinydb.readthedocs.io/en/latest/)
* [pandas](https://pandas.pydata.org/)

<p align="right">(<a href="#top">back to top</a>)</p>



<!-- GETTING STARTED -->
## Getting Started

Using this library is straightforward. More detail will be added
to this section as we get closer to the first release.

### Prerequisites

This project was developed with Python 3.9; other versions of Python 3
*should* work.

### Installation

  ```sh
  pip install morphemes
  ```

<p align="right">(<a href="#top">back to top</a>)</p>



<!-- USAGE EXAMPLES -->
## Usage
Using the morphemes library takes three steps:
1. Import the library.
2. Create an instance of the `Morphemes` class.
   1. Optional: specify a data path where the morphemes database will be stored. If no data path is specified, local app storage will be used.
3. Call the `parse` method with the word to analyze.

Example:
```python
from morphemes import Morphemes

path = "./data"

m = Morphemes(path)  # Data path is optional; local storage will be used if left out.
print(m.parse("organizationally"))
```
Output:
```json
{
  "word": "organizationally",
  "status": "FOUND_IN_DATABASE",
  "morpheme_count": 5,
  "tree": [
    {
      "children": [
        {
          "text": "organ",
          "type": "root"
        },
        {
          "text": "ize",
          "type": "bound"
        }
      ],
      "type": "free"
    },
    {
      "text": "ion",
      "type": "bound"
    },
    {
      "text": "al",
      "type": "bound"
    },
    {
      "text": "ly",
      "type": "bound"
    }
  ]
}
```
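
The nested `tree` can be flattened into an ordered list of morphemes with a short recursive walk. The sketch below is illustrative only: it assumes the result is available as a Python dict shaped like the output above (if `parse` returns a JSON string, it would need `json.loads` first), and the helper name `leaf_morphemes` is hypothetical, not part of the library.

```python
from typing import Any, Dict, List

# Copied by hand from the output above so the sketch runs without the library.
result: Dict[str, Any] = {
    "word": "organizationally",
    "status": "FOUND_IN_DATABASE",
    "morpheme_count": 5,
    "tree": [
        {
            "children": [
                {"text": "organ", "type": "root"},
                {"text": "ize", "type": "bound"},
            ],
            "type": "free",
        },
        {"text": "ion", "type": "bound"},
        {"text": "al", "type": "bound"},
        {"text": "ly", "type": "bound"},
    ],
}


def leaf_morphemes(nodes: List[Dict[str, Any]]) -> List[str]:
    """Return the text of every leaf node, left to right (hypothetical helper)."""
    leaves: List[str] = []
    for node in nodes:
        if "children" in node:
            # Inner node: descend into its children instead of reading text.
            leaves.extend(leaf_morphemes(node["children"]))
        elif "text" in node:
            leaves.append(node["text"])
    return leaves


print(leaf_morphemes(result["tree"]))
# prints ['organ', 'ize', 'ion', 'al', 'ly']
```

For this example the number of leaves matches `morpheme_count`.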

Type definitions:
 - root: The root of the word. Some words have multiple roots (example: milkshake).
 - bound: Adds to the root morpheme. Does not contribute meaning on its own.
 - free: A word which can be used on its own. A single word can contain multiple free morphemes (example: milkshake).
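
To see how these types are distributed in a parsed word, the nodes of the tree can be tallied by their `type` field. This is a minimal sketch that assumes every node carries a `type` key, as in the example output; `count_types` is a hypothetical helper and the tree is copied by hand rather than produced by a live call.

```python
from collections import Counter
from typing import Any, Dict, List

# Tree for "organizationally", copied from the example output.
tree: List[Dict[str, Any]] = [
    {
        "type": "free",
        "children": [
            {"text": "organ", "type": "root"},
            {"text": "ize", "type": "bound"},
        ],
    },
    {"text": "ion", "type": "bound"},
    {"text": "al", "type": "bound"},
    {"text": "ly", "type": "bound"},
]


def count_types(nodes: List[Dict[str, Any]]) -> Counter:
    """Count nodes by type, including nested children (hypothetical helper)."""
    counts: Counter = Counter()
    for node in nodes:
        counts[node["type"]] += 1
        # Add the counts of any nested children to the running total.
        counts.update(count_types(node.get("children", [])))
    return counts


print(dict(count_types(tree)))
# prints {'free': 1, 'root': 1, 'bound': 4}
```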

Words which are not found are marked with status `NOT_FOUND` and will default
to 1 morpheme.  This will be improved in future releases.
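
Downstream code may want to tell a real count from this default. The sketch below checks the `status` field before trusting the count. The shape of the `NOT_FOUND` result is an assumption (only the status value and the default of 1 morpheme are documented here), and `count_with_confidence` is a hypothetical helper.

```python
from typing import Any, Dict, Tuple


def count_with_confidence(result: Dict[str, Any]) -> Tuple[int, bool]:
    """Return (morpheme_count, was_found) for a parse result (hypothetical helper)."""
    was_found = result.get("status") != "NOT_FOUND"
    # Fall back to 1, the documented default for words that are not found.
    count = int(result.get("morpheme_count", 1))
    return count, was_found


found = {"word": "organizationally", "status": "FOUND_IN_DATABASE", "morpheme_count": 5}
# Assumed shape for an unknown word; not taken from real library output.
missing = {"word": "zyxwv", "status": "NOT_FOUND", "morpheme_count": 1}

print(count_with_confidence(found))    # prints (5, True)
print(count_with_confidence(missing))  # prints (1, False)
```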

NOTE: the `data` path specified is where the morphemes library will
store a database containing morphemes from [MorphoLex-en](https://github.com/hugomailhot/MorphoLex-en)
along with other lookups to help properly detect morphemes.

<p align="right">(<a href="#top">back to top</a>)</p>



<!-- ROADMAP -->
## Roadmap

- [X] Morpheme detection of known words
- [X] Handling of common names and places (counted as 1 morpheme)
- [ ] Handling of unknown words


See the [open issues](https://github.com/ecscstatsconsulting/morphemes/issues) for a full list of 
proposed features (and known issues).

<p align="right">(<a href="#top">back to top</a>)</p>

## Developers

Clone the repo and use the Makefile to build a local version:
`make install`

<!-- CONTRIBUTING -->
## Contributing

Contributions are what make the open source community such an amazing 
place to learn, inspire, and create. Any contributions you make are 
**greatly appreciated**.

Do you want other languages supported? Are you a fluent speaker of the
language you want? Help contribute and grow this project into a more
universal morpheme solution!

If you have a suggestion that would make this better, please fork the repo 
and create a pull request. You can also simply open an issue with the tag 
"enhancement".  Don't forget to give the project a star! Thanks again!

1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

<p align="right">(<a href="#top">back to top</a>)</p>



<!-- LICENSE -->
## License

Distributed under the MIT License. See `LICENSE.txt` for more information.

<p align="right">(<a href="#top">back to top</a>)</p>



<!-- CONTACT -->
## Contact

ECSC, ltd - ecsctechdepartment@gmail.com

Project Link: [https://github.com/ecscstatsconsulting/morphemes](https://github.com/ecscstatsconsulting/morphemes)

<p align="right">(<a href="#top">back to top</a>)</p>



<!-- ACKNOWLEDGMENTS -->
## Acknowledgments

* Enkeleda Cuko
* [Paul Warren](https://github.com/paul0warren)

<p align="right">(<a href="#top">back to top</a>)</p>



<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[contributors-url]: https://github.com/ecscstatsconsulting/morphemes/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[forks-url]: https://github.com/ecscstatsconsulting/morphemes/network/members
[stars-shield]: https://img.shields.io/github/stars/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[stars-url]: https://github.com/ecscstatsconsulting/morphemes/stargazers
[issues-shield]: https://img.shields.io/github/issues/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[issues-url]: https://github.com/ecscstatsconsulting/morphemes/issues
[license-shield]: https://img.shields.io/github/license/ecscstatsconsulting/morphemes.svg?style=for-the-badge
[license-url]: https://github.com/ecscstatsconsulting/morphemes/blob/master/LICENSE.txt
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555

[//]: # ([linkedin-url]: https://linkedin.com/in/linkedin_username)
[//]: # ([product-screenshot]: images/screenshot.png)

            
