# nlpia2
<!-- [![PyPI version](https://img.shields.io/pypi/pyversions/nlpia2.svg)](https://pypi.org/project/nlpia2/)
[![License](https://img.shields.io/pypi/l/qary.svg)](https://pypi.python.org/pypi/qary/)
-->
<!-- https://gitlab.com/username/userproject/badges/master/coverage.svg
-->
[![codecov](https://codecov.io/gl/tangibleai/nlpia2/branch/master/graph/badge.svg)](https://codecov.io/gl/tangibleai/nlpia2)
[![GitLab CI](https://gitlab.com/tangibleai/nlpia2/badges/master/pipeline.svg)](https://gitlab.com/tangibleai/nlpia2/badges/master/pipeline.svg)
Official [code repository](https://gitlab.com/tangibleai/nlpia2/) for the book [_Natural Language Processing in Action, 2nd Edition_](https://proai.org/nlpia2e) by Maria Dyshel and Hobson Lane at [Tangible AI](https://tangibleai.com) for [Manning Publications](https://manning.com). It would not have happened without the generous work of [contributing authors](AUTHORS.md).
## Quickstart
### Windows
If you are using Windows, you will first have to install `git-bash` so you have the same `bash` environment used in the vast majority of production NLP pipelines: [docs/README-windows-install.md](./docs/README-windows-install.md)
### Within `bash`
Launch your terminal (`git-bash` application on Windows) and then install the nlpia2 package from source:
```bash
git clone git@gitlab.com:tangibleai/nlpia2
cd nlpia2
pip install --upgrade pip virtualenv
python -m virtualenv .venv
source .venv/bin/activate || source .venv/Scripts/activate  # Linux/macOS or Windows (git-bash)
pip install -e .
```
Then you can check that everything is working by running the Chapter 3 FAQ chatbot example.
```python
from nlpia2.ch03.faqbot import run_bot
run_bot()
```
## Install
To get the most out of this repository, you need to do three things.
1. **Clone the repository** to your local machine if you want to execute the code locally or want local access to the data (recommended).
2. **Create a virtual environment** to hold the `nlpia2` package and its dependencies.
3. **Install nlpia2** as an `--editable` package so you can contribute to it if you find bugs or things you'd like to add.
### Clone the Repository
If you're currently viewing this file on GitLab and you'd rather have the data and code local to your machine, clone this repository. Navigate to your preferred directory to house the local clone (for example, your local _git_ directory) and execute:
`git clone git@gitlab.com:tangibleai/nlpia2`
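If you haven't set up SSH keys with your GitLab account, you can clone the same repository over HTTPS instead (no key required):

```bash
# HTTPS clone of the same repository
git clone https://gitlab.com/tangibleai/nlpia2.git
cd nlpia2
```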
### Create a Virtual Environment
The NLPIA 2nd Edition book references several packages central to today's advanced NLP, such as PyTorch and spaCy, and you will install them in a conda environment. To avoid potential conflicts between those packages (and their dependencies) and your other Python projects, it is good practice to create and activate a _new_ conda environment.
Here's how we did that for this book.
1. **Make sure you have Anaconda3 installed.** Make sure you can run conda from within a bash shell (terminal). The `conda --version` command should print something like `4.10.3`.
2. **Update conda itself.** Keep the `conda` package itself up to date, since it manages all other packages. Your base environment is most likely called _base_, so you can execute `conda update -n base -c defaults conda` to bring that package up to date. Even if _base_ is not the currently activated environment, this command updates the `conda` package in the _base_ environment, so the next time you use the `conda` command, in any environment, it will use the updated _conda_ package.
3. **Create a new environment and install the variety of modules needed in NLPIA 2nd Edition.**
There are two ways to do that.
### Use the script already provided in the repository (_`nlpia2/src/nlpia2/scripts/conda_install.sh`_)
If you have cloned the repository, as instructed above, you already have a script that will do this work. From the directory housing the repository, run `cd nlpia2/src/nlpia2/scripts/` and then run `bash conda_install.sh`.
### Or manually execute portions of the script as follows
First, create a new environment (or activate it if it already exists):
```bash
# create a new environment named "nlpia2" if one doesn't already exist:
conda activate nlpia2 \
|| conda create -n nlpia2 -y 'python==3.9.7' \
&& conda activate nlpia2
```
Once that completes, install all of `nlpia2`'s conda dependencies if they aren't already installed:
``` bash
conda install -c defaults -c huggingface -c pytorch -c conda-forge -y \
emoji \
ffmpeg \
glcontext \
graphviz \
huggingface_hub \
jupyter \
lxml \
manimpango \
nltk \
pyglet \
pylatex \
pyrr \
pyopengl \
pytest \
pytorch \
regex \
seaborn \
scipy \
scikit-learn \
sentence-transformers \
statsmodels \
spacy \
torchtext \
transformers \
wikipedia \
xmltodict
```
Finally, install via pip any packages not available through conda channels. It is generally better practice to apply all pip installs after _all_ conda installs. Also, to ensure pip installs into the Python used by the conda environment, activate the environment and invoke pip with `python -m pip` rather than `pip` or `pip3`.
``` bash
conda activate nlpia2
python -m pip install manim manimgl
```
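To complete step 3 from the Install list above (installing `nlpia2` itself as an `--editable` package) and to sanity-check the environment, something like the following should work; this is a minimal sketch that assumes you run it from the root of your cloned repository:

```bash
conda activate nlpia2
python -m pip install -e .   # editable install of nlpia2 from the repo root
# quick import check of a few of the key dependencies installed above
python -c "import torch, spacy, nlpia2; print(torch.__version__, spacy.__version__)"
```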
## Ready, Set, Go!
Congratulations! You now have the nlpia2 repository cloned, giving you local access to all the data and scripts needed in the NLPIA Second Edition book, and you have created a powerful environment to use them. When you're ready to type or execute code, check that this environment is activated. If not, activate it by executing:
`conda activate nlpia2`
And off you go to tackle some serious Natural Language Processing and make the world a better place for all.
Run a Jupyter notebook server within Docker:
`jupyter-repo2docker --editable .`
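The command above assumes you have Docker and the `jupyter-repo2docker` package available. If you don't, a minimal way to add the tool to the environment is a sketch like this:

```bash
conda activate nlpia2
python -m pip install jupyter-repo2docker   # requires a working Docker installation
jupyter-repo2docker --editable .            # build an image and launch a notebook server for this repo
```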
### TODO
- [ ] dictionary of .nlpia2 filepaths and their corresponding remote URLs and proai.org shorturls
- [ ] download if necessary to cache all required datasets in .nlpia2-data
- [ ] collect_data.py using