nlp-primitives


- Name: nlp-primitives
- Version: 2.12.0
- Summary: natural language processing primitives for Featuretools
- Upload time: 2024-02-26 18:56:24
- Requires Python: <4,>=3.9
- License: BSD 3-clause
- Keywords: feature engineering, data science, machine learning, natural language processing

# NLP Primitives

<p align="center">
    <a href="https://codecov.io/gh/alteryx/nlp_primitives">
        <img src="https://codecov.io/gh/alteryx/nlp_primitives/branch/main/graph/badge.svg"/>
    </a>
    <a href="https://github.com/alteryx/nlp_primitives/actions?query=branch%3Amain+workflow%3ATests" target="_blank">
        <img src="https://github.com/alteryx/nlp_primitives/workflows/Tests/badge.svg?branch=main" alt="Tests" />
    </a>
    <a href="https://badge.fury.io/py/nlp_primitives" target="_blank">
        <img src="https://badge.fury.io/py/nlp_primitives.svg?maxAge=2592000" alt="PyPI Version" />
    </a>
    <a href="https://anaconda.org/conda-forge/nlp_primitives" target="_blank">
        <img src="https://anaconda.org/conda-forge/nlp-primitives/badges/version.svg" alt="Anaconda Version" />
    </a>
    <a href="https://stackoverflow.com/questions/tagged/featuretools" target="_blank">
        <img src="http://img.shields.io/badge/questions-on_stackoverflow-blue.svg" alt="StackOverflow" />
    </a> 
    <a href="https://pepy.tech/project/nlp_primitives" target="_blank">
        <img src="https://pepy.tech/badge/nlp_primitives/month" alt="PyPI Downloads" />
    </a>
</p>
<hr>

nlp_primitives is a Python library of natural language processing primitives, intended for use with [Featuretools](https://github.com/Featuretools/featuretools).

nlp_primitives lets you use text data in the same machine learning pipeline as the rest of your data.

## Installation

There are two options for installing nlp_primitives. Both options also install Featuretools if it is not already installed.

The first option installs a version of nlp_primitives that does not include TensorFlow. With this option, primitives that depend on TensorFlow cannot be used. Currently, the only primitive that cannot be used with this install option is ``UniversalSentenceEncoder``.

#### PyPI
nlp_primitives without TensorFlow can be installed with pip:
```shell
python -m pip install nlp_primitives
```

#### conda-forge
nlp_primitives without TensorFlow can also be installed from the conda-forge channel with conda:
```shell
conda install -c conda-forge nlp-primitives
```

The second option installs the complete version of nlp_primitives, which also installs TensorFlow and allows use of all primitives.

To install the complete version of nlp_primitives with pip:
```shell
python -m pip install "nlp_primitives[complete]"
```
or from the conda-forge channel on conda:
```shell
conda install -c conda-forge nlp-primitives-complete
```
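
After installing, it can help to confirm which of the two options is in effect. The following is a minimal sketch using only the standard library. It assumes that the complete install exposes TensorFlow under the module name `tensorflow`, and the helper names `has_module` and `python_supported` are hypothetical, not part of nlp_primitives.

```python
import importlib.util
import sys


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


def python_supported(version=sys.version_info) -> bool:
    """Check an interpreter version against the package's requires_python (<4,>=3.9)."""
    return (3, 9) <= tuple(version[:2]) and version[0] < 4


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("nlp_primitives installed:", has_module("nlp_primitives"))
    # Assumption: the complete install provides TensorFlow as the `tensorflow` module.
    print("TensorFlow available:", has_module("tensorflow"))
```

If the last line reports `False`, primitives that depend on TensorFlow, such as ``UniversalSentenceEncoder``, should be expected to be unavailable.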

### Demos

* [Blog Post](https://blog.featurelabs.com/natural-language-processing-featuretools/)
* [Predict restaurant review ratings](https://github.com/FeatureLabs/predict-restaurant-rating)

## Calculating Features
With nlp_primitives installed, its primitives are available through `featuretools` and can be called directly on text data to calculate a feature.

```python
from featuretools.nlp_primitives import PolarityScore

data = ["hello, this is a new featuretools library",
        "this will add new natural language primitives",
        "we hope you like it!"]

pol = PolarityScore()
pol(data)
```
```
0    0.365
1    0.385
2    1.000
dtype: float64
```
## Combining Primitives
In `featuretools`, this is how to combine nlp_primitives primitives with built-in or other installed primitives.
```python
import featuretools as ft
from featuretools.nlp_primitives import TitleWordCount
from featuretools.primitives import Mean

entityset = ft.demo.load_retail()
feature_matrix, features = ft.dfs(
    entityset=entityset,
    target_dataframe_name='products',
    agg_primitives=[Mean],
    trans_primitives=[TitleWordCount],
)

feature_matrix.head(5)
```
```
           MEAN(order_products.quantity)  MEAN(order_products.unit_price)  MEAN(order_products.total)  TITLE_WORD_COUNT(description)
product_id
10002                         16.795918                          1.402500                   23.556276                           3.0
10080                         13.857143                          0.679643                    8.989357                           3.0
10120                          6.620690                          0.346500                    2.294069                           2.0
10123C                         1.666667                          1.072500                    1.787500                           3.0
10124A                           3.2000                            0.6930                      2.2176                           5.0
```
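
To give a feel for what a column such as `TITLE_WORD_COUNT(description)` represents, here is a rough standard-library sketch. It is not the library's implementation. It assumes that a "title word" is any whitespace-delimited token that starts with an uppercase ASCII letter, and the function name `title_word_count` is hypothetical.

```python
import re

# Assumption: a "title word" is a whitespace-delimited token whose first
# character is an uppercase ASCII letter. The lookbehind anchors the match
# to the start of a token, so "iPhone" is not counted.
TITLE_WORD = re.compile(r"(?<!\S)[A-Z]\S*")


def title_word_count(text):
    """Count title words in a string; return None for missing text."""
    if text is None:
        return None
    return len(TITLE_WORD.findall(text))


if __name__ == "__main__":
    for sample in ["Hello World from featuretools", "all lowercase text", None]:
        print(repr(sample), "->", title_word_count(sample))
```

The actual primitive operates on whole columns inside `ft.dfs`, so this sketch only illustrates the kind of per-row value it produces.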

## Development
To install from source, clone this repo and run:
```bash
make installdeps-test
```

This will install all pip dependencies.

## Built at Alteryx

**NLP Primitives** is an open source project maintained by [Alteryx](https://www.alteryx.com). To see the other open source projects we’re working on, visit [Alteryx Open Source](https://www.alteryx.com/open-source). If building impactful data science pipelines is important to you or your business, please get in touch.

<p align="center">
  <a href="https://www.alteryx.com/open-source">
    <img src="https://alteryx-oss-web-images.s3.amazonaws.com/OpenSource_Logo-01.png" alt="Alteryx Open Source" width="800"/>
  </a>
</p>

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "nlp-primitives",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "<4,>=3.9",
    "maintainer_email": "\"Alteryx, Inc.\" <open_source_support@alteryx.com>",
    "keywords": "feature engineering,data science,machine learning,natural language processing",
    "author": "",
    "author_email": "\"Alteryx, Inc.\" <open_source_support@alteryx.com>",
    "download_url": "https://files.pythonhosted.org/packages/58/6a/67c9040ca05e275ce9b62e4d2ede12571bf673692bf7081bd7add2a46113/nlp_primitives-2.12.0.tar.gz",
    "platform": null,
    "description": "# NLP Primitives\n\n<p align=\"center\">\n    <a href=\"https://codecov.io/gh/alteryx/nlp_primitives\">\n        <img src=\"https://codecov.io/gh/alteryx/nlp_primitives/branch/main/graph/badge.svg\"/>\n    </a>\n    <a href=\"https://github.com/alteryx/nlp_primitives/actions?query=branch%3Amain+workflow%3ATests\" target=\"_blank\">\n        <img src=\"https://github.com/alteryx/nlp_primitives/workflows/Tests/badge.svg?branch=main\" alt=\"Tests\" />\n    </a>\n    <a href=\"https://badge.fury.io/py/nlp_primitives\" target=\"_blank\">\n        <img src=\"https://badge.fury.io/py/nlp_primitives.svg?maxAge=2592000\" alt=\"PyPI Version\" />\n    </a>\n    <a href=\"https://anaconda.org/conda-forge/nlp_primitives\" target=\"_blank\">\n        <img src=\"https://anaconda.org/conda-forge/nlp-primitives/badges/version.svg\" alt=\"Anaconda Version\" />\n    </a>\n    <a href=\"https://stackoverflow.com/questions/tagged/featuretools\" target=\"_blank\">\n        <img src=\"http://img.shields.io/badge/questions-on_stackoverflow-blue.svg\" alt=\"StackOverflow\" />\n    </a> \n    <a href=\"https://pepy.tech/project/nlp_primitives\" target=\"_blank\">\n        <img src=\"https://pepy.tech/badge/nlp_primitives/month\" alt=\"PyPI Downloads\" />\n    </a>\n</p>\n<hr>\n\nnlp_primitives is a Python library with Natural Language Processing Primitives, intended for use with [Featuretools](https://github.com/Featuretools/featuretools).\n\nnlp_primitives allows you to make use of text data in your machine learning pipeline in the same pipeline as the rest of your data.\n\n## Installation\n\nThere are two options for installing nlp_primitives. Both of the options will also install Featuretools if it is not already installed.\n\nThe first option is to install a version of nlp_primitives that does not include Tensorflow. With this option, primitives that depend on Tensorflow cannot be used. 
Currently, the only primitive that can not be used with this install option is ``UniversalSentenceEncoder``.\n\n#### PyPi\nnlp_primitives without Tensorflow can be installed with pip:\n```shell\npython -m pip install nlp_primitives\n```\n\n#### conda-forge\nor from the conda-forge channel on conda:\n```shell\nconda install -c conda-forge nlp-primitives\n```\n\nThe second option is to install the complete version of nlp_primitives, which will also install Tensorflow and allow use of all primitives. \n\nTo install the complete version of nlp_primitives with pip:\n```shell\npython -m pip install \"nlp_primitives[complete]\"\n```\nor from the conda-forge channel on conda:\n```shell\nconda install -c conda-forge nlp-primitives-complete\n```\n\n### Demos\n\n* [Blog Post](https://blog.featurelabs.com/natural-language-processing-featuretools/)\n* [Predict resturant review ratings](https://github.com/FeatureLabs/predict-restaurant-rating)\n\n## Calculating Features\nWith nlp_primitives primtives in `featuretools`, this is how to calculate the same feature.\n\n```python\nfrom featuretools.nlp_primitives import PolarityScore\n\ndata = [\"hello, this is a new featuretools library\",\n        \"this will add new natural language primitives\",\n        \"we hope you like it!\"]\n\npol = PolarityScore()\npol(data)\n```\n```\n0    0.365\n1    0.385\n2    1.000\ndtype: float64\n```\n## Combining Primitives\nIn `featuretools`, this is how to combine nlp_primitives primitives with built-in or other installed primitives.\n```python\nimport featuretools as ft\nfrom featuretools.nlp_primitives import TitleWordCount\nfrom featuretools.primitives import Mean\n\nentityset = ft.demo.load_retail()\nfeature_matrix, features = ft.dfs(entityset=entityset, target_dataframe_name='products', agg_primitives=[Mean], trans_primitives=[TitleWordCount])\n\nfeature_matrix.head(5)\n```\n```\n           MEAN(order_products.quantity)  MEAN(order_products.unit_price)  MEAN(order_products.total)  
TITLE_WORD_COUNT(description)\nproduct_id\n10002                         16.795918                          1.402500                   23.556276                           3.0\n10080                         13.857143                          0.679643                    8.989357                           3.0\n10120                          6.620690                          0.346500                    2.294069                           2.0\n10123C                         1.666667                          1.072500                    1.787500                           3.0\n10124A                           3.2000                            0.6930                      2.2176                           5.0\n```\n\n## Development\nTo install from source, clone this repo and run\n```bash\nmake installdeps-test\n```\n\nThis will install all pip dependencies.\n\n## Built at Alteryx\n\n**NLP Primitives** is an open source project maintained by [Alteryx](https://www.alteryx.com). To see the other open source projects we\u2019re working on visit [Alteryx Open Source](https://www.alteryx.com/open-source). If building impactful data science pipelines is important to you or your business, please get in touch.\n\n<p align=\"center\">\n  <a href=\"https://www.alteryx.com/open-source\">\n    <img src=\"https://alteryx-oss-web-images.s3.amazonaws.com/OpenSource_Logo-01.png\" alt=\"Alteryx Open Source\" width=\"800\"/>\n  </a>\n</p>\n",
    "bugtrack_url": null,
    "license": "BSD 3-clause",
    "summary": "natural language processing primitives for Featuretools",
    "version": "2.12.0",
    "project_urls": {
        "Changes": "https://github.com/alteryx/nlp_primitives/blob/main/release_notes.rst",
        "Chat": "https://join.slack.com/t/alteryx-oss/shared_invite/zt-182tyvuxv-NzIn6eiCEf8TBziuKp0bNA",
        "Documentation": "https://featuretools.alteryx.com",
        "Issue Tracker": "https://github.com/alteryx/nlp_primitives/issues",
        "Source Code": "https://github.com/alteryx/nlp_primitives/",
        "Twitter": "https://twitter.com/alteryxoss"
    },
    "split_keywords": [
        "feature engineering",
        "data science",
        "machine learning",
        "natural language processing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c3180261e758fbc6af3fc5ae0d2c81c0d3344bb0dea952376e7a4c6f5a1930af",
                "md5": "00426bae4a4db77ca3905d052a61be33",
                "sha256": "c0bf19ab1936d27a446aee3c424859f414c1702d13f8438ace271e4e7b0cb696"
            },
            "downloads": -1,
            "filename": "nlp_primitives-2.12.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "00426bae4a4db77ca3905d052a61be33",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4,>=3.9",
            "size": 44662548,
            "upload_time": "2024-02-26T18:56:19",
            "upload_time_iso_8601": "2024-02-26T18:56:19.846683Z",
            "url": "https://files.pythonhosted.org/packages/c3/18/0261e758fbc6af3fc5ae0d2c81c0d3344bb0dea952376e7a4c6f5a1930af/nlp_primitives-2.12.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "586a67c9040ca05e275ce9b62e4d2ede12571bf673692bf7081bd7add2a46113",
                "md5": "6628319a33d40feacfc3338e13339f59",
                "sha256": "1df55cf25c4c7da8f6250fe9102814ea86f0e0094dffe5a7e9b18d9f92dbb6f4"
            },
            "downloads": -1,
            "filename": "nlp_primitives-2.12.0.tar.gz",
            "has_sig": false,
            "md5_digest": "6628319a33d40feacfc3338e13339f59",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4,>=3.9",
            "size": 44246624,
            "upload_time": "2024-02-26T18:56:24",
            "upload_time_iso_8601": "2024-02-26T18:56:24.479181Z",
            "url": "https://files.pythonhosted.org/packages/58/6a/67c9040ca05e275ce9b62e4d2ede12571bf673692bf7081bd7add2a46113/nlp_primitives-2.12.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-26 18:56:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "alteryx",
    "github_project": "nlp_primitives",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "nlp-primitives"
}
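
The `digests` entries in the metadata above can be used to check a downloaded wheel or source archive. The sketch below uses only the standard library. The helper names `sha256_of` and `matches_metadata` are hypothetical, and `url_entry` is assumed to be one item of the `"urls"` list after the JSON has been parsed into a dictionary.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(path, url_entry):
    """Compare a local file against the sha256 digest recorded in one "urls" entry."""
    return sha256_of(path) == url_entry["digests"]["sha256"]
```

For example, after downloading `nlp_primitives-2.12.0.tar.gz`, calling `matches_metadata` with that file's path and the second `"urls"` entry should return `True` if the file is intact.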
        