# Renumics Spotlight
> Spotlight helps you **identify critical data segments and model failure modes**. It enables you to build and maintain reliable machine learning models by **curating high-quality datasets**.
## Introduction
Spotlight is built on the idea that you can only truly **understand unstructured datasets** if you can **interactively explore** them. Its core principle is to identify and fix critical data segments by leveraging **data enrichments** (e.g. features, embeddings, uncertainties). We are building Spotlight for cross-functional teams that want to be in **control of their data and data curation processes**. Currently, Spotlight supports many use cases based on image, audio, video and time series data.
## Quickstart
Get started by installing Spotlight and loading your first dataset.
#### What you'll need
- [Python](https://www.python.org/downloads/) version 3.8-3.12
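
If you are unsure whether your interpreter falls within that range, a small check like the following can help. This is only a sketch: the helper name `is_supported` and the two constants are made up for illustration and are not part of Spotlight.

```python
import sys

# Supported range taken from the requirement above: 3.8 up to and including 3.12.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version is within the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is in the supported range.")
    else:
        print(f"Python {current} is outside the supported range 3.8-3.12.")
```
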
#### Install Spotlight via [pip](https://packaging.python.org/en/latest/key_projects/#pip)
```bash
pip install renumics-spotlight
```
> We recommend installing Spotlight and everything you need to work on your data in a separate [virtual environment](https://docs.python.org/3/tutorial/venv.html).
#### Load a dataset and start exploring
```python
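import pandas as pd
from renumics import spotlight

# Load the sample MNIST table into a pandas DataFrame.
df = pd.read_csv("https://spotlight.renumics.com/data/mnist/mnist-tiny.csv")
# Open Spotlight and tell it which columns hold images and embeddings.
spotlight.show(df, dtype={"image": spotlight.Image, "embedding": spotlight.Embedding})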
```
> `pd.read_csv` loads a sample CSV file as a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).
>
> `spotlight.show` opens Spotlight in the browser with the pandas DataFrame ready for you to explore. The `dtype` argument specifies custom column types for the browser viewer.
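
The same `dtype` mapping can be tried on a DataFrame built in memory rather than loaded from a CSV file. The sketch below is illustrative only: the column names, the values, and the helper functions `build_demo_frame` and `explore` are made up, and it assumes Spotlight accepts plain Python lists of floats in a column marked as `spotlight.Embedding`.

```python
import pandas as pd


def build_demo_frame():
    """Build a tiny in-memory DataFrame with a label and a 2-D embedding per row."""
    return pd.DataFrame(
        {
            "label": ["a", "b", "c"],
            # Assumption: list-valued cells are accepted for an Embedding column.
            "embedding": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
        }
    )


def explore(df):
    """Open Spotlight on df, marking the embedding column explicitly."""
    # Imported lazily so the DataFrame helper works without Spotlight installed.
    from renumics import spotlight

    spotlight.show(df, dtype={"embedding": spotlight.Embedding})


if __name__ == "__main__":
    explore(build_demo_frame())
```

Columns that are not listed in `dtype`, such as `label` here, are left for Spotlight to handle without an explicit type.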
#### Load a [Hugging Face](https://huggingface.co/) dataset
```python
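import datasets
from renumics import spotlight

# Download the training split and convert it to a pandas DataFrame.
dataset = datasets.load_dataset("olivierdehaene/xkcd", split="train")
df = dataset.to_pandas()
# Treat the "image_url" column as images in the viewer.
spotlight.show(df, dtype={"image_url": spotlight.Image})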
```
> The `datasets` package can be installed via pip.
Raw data
{
"_id": null,
"home_page": "https://spotlight.renumics.com/",
"name": "renumics-spotlight",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.8",
"maintainer_email": null,
"keywords": "data curation, machine learning, data science, visualization, pandas, ai",
"author": "Renumics GmbH",
"author_email": "info@renumics.com",
"download_url": null,
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Visualize and maintain datasets to develop and understand data-driven algorithms.",
"version": "1.6.13",
"project_urls": {
"Documentation": "https://spotlight.renumics.com/",
"Homepage": "https://spotlight.renumics.com/",
"Repository": "https://github.com/renumics/spotlight"
},
"split_keywords": [
"data curation",
" machine learning",
" data science",
" visualization",
" pandas",
" ai"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b3a2621cf0cddb0579c84c72d10825b3fe7c93723ca7270db35bec241c01a54b",
"md5": "e1c253ecbcf8ff14fe9acffb8671f880",
"sha256": "426a94c452e7f018e2f245d132bef2c6cce946feb210b2c7510aa176fbe98f27"
},
"downloads": -1,
"filename": "renumics_spotlight-1.6.13-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e1c253ecbcf8ff14fe9acffb8671f880",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.8",
"size": 3076193,
"upload_time": "2024-11-18T11:54:54",
"upload_time_iso_8601": "2024-11-18T11:54:54.380137Z",
"url": "https://files.pythonhosted.org/packages/b3/a2/621cf0cddb0579c84c72d10825b3fe7c93723ca7270db35bec241c01a54b/renumics_spotlight-1.6.13-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-18 11:54:54",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "renumics",
"github_project": "spotlight",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "renumics-spotlight"
}