# Lingua-SDK
A user toolkit for analyzing and interfacing with Large Language Models (LLMs)
## Overview
``lingua-sdk`` is a Python module for interacting with large language models
hosted via the Lingua service (available at https://github.com/VectorInstitute/lingua).
It provides a simple interface to launch LLMs on an HPC cluster, request basic
capabilities such as text generation, and retrieve intermediate information
from inside the model, such as log probabilities and activations.
These features are exposed via a few high-level APIs, illustrated in the sketch
after this list:
* `generate_text` - Returns text generated by the LLM for a given prompt
* `module_names` - Returns the names of all modules in the LLM neural network
* `instances` - Returns all active LLMs instantiated by the model service
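A minimal sketch of how these APIs might fit together (the `instances` call is assumed here to be a property of the client object, which is not shown explicitly in the sample workflow further below):
```python
import lingua

# Connect to the Lingua service (host/port values as in the sample workflow below)
client = lingua.Client(gateway_host="llm.cluster.local", gateway_port=3001)

# List the LLMs currently instantiated by the model service
# (assumption: `instances` is exposed on the client)
print(client.instances)

# Load a model handle, inspect its module names, and generate text
model = client.load_model("OPT")
print(model.module_names)
print(model.generate_text("Hello, world!", max_tokens=10).text)
```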
Full documentation and API reference are available at
http://lingua-sdk.readthedocs.io.
## Getting Started
### Install
```bash
python3 -m pip install pylingua
```
or install from source:
```bash
pip install git+https://github.com/VectorInstitute/lingua-sdk.git
```
### Authentication
A Vector Institute cluster account is required to submit text generation jobs. Please contact the
[AI Engineering Team](mailto:ai_engineering@vectorinstitute.ai?subject=[Github]%20Lingua)
in charge of Lingua for more information.
### Sample Workflow
The following workflow shows how to load and interact with an OPT-175B model
on the Vector Institute Vaughan cluster.
```python
import lingua

# Establish a client connection to the Lingua service.
# If you have not previously authenticated with the service, you will be prompted to do so now.
client = lingua.Client(gateway_host="llm.cluster.local", gateway_port=3001)

# Get a handle to a model. If the model is not already running, it is launched in the background.
# In this example we use the OPT model.
opt_model = client.load_model("OPT")

# Show the list of modules in the neural network
print(opt_model.module_names)

# Sample text generation with input parameters
text_gen = opt_model.generate_text(
    "What is the answer to life, the universe, and everything?",
    max_tokens=5,
    top_k=4,
    top_p=3,
    rep_penalty=1,
    temperature=0.5,
)

dir(text_gen)      # display the attributes of the generated-text object
text_gen.text      # the generated text
text_gen.logprobs  # per-token log probabilities
text_gen.tokens    # the generated tokens
```
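The attributes of the generated-text object can be combined for simple analyses. As a minimal sketch (assuming `tokens` and `logprobs` are parallel sequences of equal length, which the snippet above does not guarantee), each generated token can be printed alongside its log probability:
```python
# Pair each generated token with its log probability
# (assumption: text_gen.tokens and text_gen.logprobs are parallel sequences)
for token, logprob in zip(text_gen.tokens, text_gen.logprobs):
    print(f"{token!r}: {logprob:.4f}")
```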
## [Documentation](https://lingua-sdk.readthedocs.io/)
More information can be found on the Lingua documentation site.
## Contributing
Contributions to lingua are welcome. See [Contributing](https://github.com/VectorInstitute/lingua-sdk/blob/main/doc/CONTRIBUTING.md) for
guidelines.
## License
[MIT](LICENSE)
## Citation
Reference to cite when you use Lingua in a project or a research paper:
```
Sivaloganathan, J., Coatsworth, M., Willes, J., Choi, M., & Shen, G. (2022). Lingua [Computer software]. Vector Institute for Artificial Intelligence. http://VectorInstitute.github.io/lingua. Retrieved from https://github.com/VectorInstitute/lingua-sdk.git.
```