# RedLite
[![PyPI version](https://badge.fury.io/py/redlite.svg)](https://badge.fury.io/py/redlite)
[![Documentation](https://img.shields.io/badge/documentation-latest-brightgreen)](https://innodatalabs.github.io/redlite/)
[![Test and Lint](https://github.com/innodatalabs/redlite/actions/workflows/test.yaml/badge.svg)](https://github.com/innodatalabs/redlite)
[![GitHub Pages](https://github.com/innodatalabs/redlite/actions/workflows/docs.yaml/badge.svg)](https://github.com/innodatalabs/redlite)
An opinionated toolset for testing Conversational Language Models.
## Documentation
<https://innodatalabs.github.io/redlite/>
## Usage
1. Install required dependencies
```bash
pip install "redlite[all]"
```
2. Generate several runs with a Python script (see the [examples](https://github.com/innodatalabs/redlite/tree/master/samples) and the Python API section below)
3. Review and compare runs
```bash
redlite server --port <PORT>
```
4. Optionally, upload to Zeno
```bash
ZENO_API_KEY=zen_XXXX redlite upload
```
## Python API
```python
import os
from redlite import run, load_dataset
from redlite.model.openai_model import OpenAIModel
from redlite.metric import MatchMetric
model = OpenAIModel(api_key=os.environ["OPENAI_API_KEY"])
dataset = load_dataset("hf:innodatalabs/rt-gsm8k-gaia")
metric = MatchMetric(ignore_case=True, ignore_punct=True, strategy='prefix')
run(model=model, dataset=dataset, metric=metric)
```
_Note: the code above uses an OpenAI model via the OpenAI API.
You will need to register with OpenAI, get an API access key, and set it in the environment as `OPENAI_API_KEY`._
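The `MatchMetric` options used above (`ignore_case`, `ignore_punct`, `strategy='prefix'`) suggest a normalized prefix comparison between the expected answer and the model response. The following is a minimal sketch of that idea under this assumption. It is not redlite's implementation, and the helper names `normalize` and `prefix_match` are hypothetical.

```python
import string


def normalize(text: str, ignore_case: bool = True, ignore_punct: bool = True) -> str:
    """Normalize text before comparison (hypothetical helper, not redlite code)."""
    if ignore_case:
        text = text.lower()
    if ignore_punct:
        # Drop ASCII punctuation characters.
        text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace so spacing differences do not matter.
    return " ".join(text.split())


def prefix_match(expected: str, actual: str) -> float:
    """Score 1.0 if the normalized response starts with the normalized expected answer."""
    return 1.0 if normalize(actual).startswith(normalize(expected)) else 0.0


print(prefix_match("42", "42. The farmer has 42 apples."))  # 1.0
print(prefix_match("42", "The answer is 42"))               # 0.0
```

A plain string prefix check like this one also accepts `"420"` for an expected `"42"`, so a real metric may need to compare on word boundaries.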
## Goals
* simple, easy-to-learn API
* lightweight
* only necessary dependencies
* framework-agnostic (PyTorch, TensorFlow, Keras, Flax, JAX)
* basic analytic tools included
## Develop
```bash
python -m venv .venv
. .venv/bin/activate
pip install -e ".[dev,all]"
```
Available `make` targets:
* test
* test-server
* lint
* wheel
* docs
* docs-server
* black
## Zeno (<https://zenoml.com>) integration
Benchmarks can be uploaded to the Zeno interactive AI evaluation platform (<https://hub.zenoml.com>):
```bash
redlite upload --project my-cool-project
```
All tasks will be concatenated and uploaded as a single dataset, with extra fields:
* `task_id`
* `dataset`
* `metric`
All models will be uploaded. If a model was not tested on a specific task, a simulated zero-score dataframe is used instead.
Use `task_id` (or `dataset` as appropriate) to create task slices. Slices can be used to
navigate data or create charts.
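To make the upload layout concrete, here is a rough sketch of how tasks could be concatenated into one table with the extra `task_id`, `dataset`, and `metric` fields, with zero scores filled in for models that skipped a task. The record layout, the `combine` and `mean_score_by_slice` helpers, and the sample values are all assumptions for illustration. They do not reflect redlite's internal data structures or the Zeno API.

```python
from collections import defaultdict

# Hypothetical per-run records; the layout is assumed for illustration only.
runs = [
    {"model": "model-a", "task_id": "task-1", "dataset": "ds-1", "metric": "match", "scores": [1.0, 0.0]},
    {"model": "model-b", "task_id": "task-1", "dataset": "ds-1", "metric": "match", "scores": [1.0, 1.0]},
    {"model": "model-a", "task_id": "task-2", "dataset": "ds-2", "metric": "match", "scores": [1.0, 1.0, 0.0]},
]


def combine(runs):
    """Concatenate all tasks into one list of rows, one row per (model, item)."""
    tasks = {}   # task_id -> (dataset, metric, number of items)
    scores = {}  # (model, task_id) -> list of per-item scores
    for run in runs:
        tasks[run["task_id"]] = (run["dataset"], run["metric"], len(run["scores"]))
        scores[(run["model"], run["task_id"])] = run["scores"]

    rows = []
    for model in sorted({run["model"] for run in runs}):
        for task_id, (dataset, metric, size) in tasks.items():
            # A model that never ran this task gets simulated zero scores.
            task_scores = scores.get((model, task_id), [0.0] * size)
            for index, score in enumerate(task_scores):
                rows.append({
                    "model": model, "task_id": task_id, "dataset": dataset,
                    "metric": metric, "item": index, "score": score,
                })
    return rows


def mean_score_by_slice(rows, key="task_id"):
    """Average score per (model, slice), where a slice is a distinct value of `key`."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["model"], row[key])].append(row["score"])
    return {k: sum(v) / len(v) for k, v in grouped.items()}


rows = combine(runs)
print(mean_score_by_slice(rows))
```

In this sample `model-b` has no run for `task-2`, so its slice average there is `0.0`, which mirrors the zero-score fill described above.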
## Serving as a static website
The UI server's data and code can be exported to a local directory that can then be served statically.
This is useful for publishing as a static website on cloud storage (S3, Google Cloud Storage).
```bash
redlite server-freeze /tmp/my-server
gsutil -m rsync -R /tmp/my-server gs://{your GS bucket}
```
Note that the bucket must be configured so that the cloud provider serves it as a website. How to do this depends on
the cloud provider.
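Before uploading, it can help to preview the exported directory locally. The sketch below uses only the Python standard library to serve a directory over HTTP. The `serve_directory` helper is hypothetical and not part of redlite, and `/tmp/my-server` is simply the path from the example above.

```python
import functools
import http.server
import threading


def serve_directory(directory: str, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Serve `directory` on localhost in a background thread.

    With port=0 the operating system picks a free port; read the actual
    port from `server.server_address[1]`.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Example usage (kept as comments so importing this file does not block):
#   server = serve_directory("/tmp/my-server", port=8000)
#   ... open http://127.0.0.1:8000/ in a browser ...
#   server.shutdown()
```

This is only a local preview aid. A cloud bucket still needs its own website configuration, as noted above.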
## TODO
- [x] deps cleanup (randomname!)
- [x] review/improve module structure
- [x] automate CI/CD
- [x] write docs
- [x] publish docs automatically (CI/CD)
- [x] web UI styling
- [ ] better test server
- [ ] tests
- [x] Integrate HF models
- [x] Integrate OpenAI models
- [x] Integrate Anthropic models
- [x] Integrate AWS Bedrock models
- [ ] Integrate vLLM models
- [x] Fix data format in HF datasets (innodatalabs/rt-* ones) to match standard
- [ ] more robust backend API (future-proof)
- [ ] better error handling for missing deps
- [ ] document which deps we need when
- [ ] export to CSV
- [x] Upload to Zeno
Raw data
{
"_id": null,
"home_page": null,
"name": "redlite",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "large langualge models, evaluation, datasets",
"author": null,
"author_email": "Mike Kroutikov <mkroutikov@innodata.com>, David Nadeau <dnadeau@innodata.com>",
"download_url": null,
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "LLM testing on steroids",
"version": "0.3.8",
"project_urls": {
"Documentation": "https://innodatalabs.github.io/redlite",
"Homepage": "https://github.com/innodatalabs/redlite",
"Issues": "https://github.com/innodatalabs/redlite/issues",
"Repository": "https://github.com/innodatalabs/redlite.git"
},
"split_keywords": [
"large langualge models",
" evaluation",
" datasets"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4db72541ce721e299c27d55f813d933de2fc88ffd91faeebb3da1c02b2bceb31",
"md5": "8ed5f68ca52f9f26477daa4d4f82f622",
"sha256": "d008415c1bd1d1d8d238cbd066124447312cf23af1efa6ac05e32e821d3a68c8"
},
"downloads": -1,
"filename": "redlite-0.3.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8ed5f68ca52f9f26477daa4d4f82f622",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 1151103,
"upload_time": "2024-12-04T20:47:35",
"upload_time_iso_8601": "2024-12-04T20:47:35.754740Z",
"url": "https://files.pythonhosted.org/packages/4d/b7/2541ce721e299c27d55f813d933de2fc88ffd91faeebb3da1c02b2bceb31/redlite-0.3.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-04 20:47:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "innodatalabs",
"github_project": "redlite",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "redlite"
}