renumics-spotlight


Name: renumics-spotlight
Version: 1.6.13
Home page: https://spotlight.renumics.com/
Summary: Visualize and maintain datasets to develop and understand data-driven algorithms.
Upload time: 2024-11-18 11:54:54
Author: Renumics GmbH
Requires Python: <3.13,>=3.8
License: MIT
Keywords: data curation, machine learning, data science, visualization, pandas, ai

# Renumics Spotlight

> Spotlight helps you to **identify critical data segments and model failure modes**. It enables you to build and maintain reliable machine learning models by **curating high-quality datasets**.

## Introduction

Spotlight is built on the idea that you can only truly **understand unstructured datasets** if you can **interactively explore** them. Its core principle is to identify and fix critical data segments by leveraging **data enrichments** (e.g. features, embeddings, uncertainties). We are building Spotlight for cross-functional teams that want to be in **control of their data and data curation processes**. Currently, Spotlight supports many use cases based on image, audio, video and time series data.

## Quickstart

Get started by installing Spotlight and loading your first dataset.

#### What you'll need

-   [Python](https://www.python.org/downloads/) version 3.8-3.12
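
If you are unsure whether your interpreter falls into the supported range, a quick check along these lines may help. This is a minimal sketch that assumes the bounds stated above (3.8 up to and including 3.12); the helper name `is_supported` is hypothetical and not part of Spotlight.

```python
import sys

# Supported range as stated above: Python 3.8 up to and including 3.12.
MIN_SUPPORTED = (3, 8)
MAX_EXCLUSIVE = (3, 13)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range.

    `version` can be any sequence starting with (major, minor);
    it defaults to the running interpreter's version.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor < MAX_EXCLUSIVE


# Report on the interpreter that runs this script.
current = ".".join(str(part) for part in sys.version_info[:3])
if is_supported():
    print(f"Python {current} is within the supported range.")
else:
    print(f"Python {current} is outside the supported range (3.8-3.12).")
```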

#### Install Spotlight via [pip](https://packaging.python.org/en/latest/key_projects/#pip)

```bash
pip install renumics-spotlight
```

> We recommend installing Spotlight and everything you need to work on your data in a separate [virtual environment](https://docs.python.org/3/tutorial/venv.html).

#### Load a dataset and start exploring

```python
import pandas as pd
from renumics import spotlight

df = pd.read_csv("https://spotlight.renumics.com/data/mnist/mnist-tiny.csv")
spotlight.show(df, dtype={"image": spotlight.Image, "embedding": spotlight.Embedding})
```

> `pd.read_csv` loads a sample CSV file as a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).

> `spotlight.show` opens Spotlight in the browser with the pandas DataFrame ready for you to explore. The `dtype` argument specifies custom column types for the browser viewer.

#### Load a [Hugging Face](https://huggingface.co/) dataset

```python
import datasets
from renumics import spotlight

dataset = datasets.load_dataset("olivierdehaene/xkcd", split="train")
df = dataset.to_pandas()
spotlight.show(df, dtype={"image_url": spotlight.Image})
```

> The `datasets` package can be installed via pip.
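
Before running the Hugging Face example, it can be useful to confirm that the packages it imports are actually installed. The following is a rough sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the list of import names is an assumption based on the imports and calls used in the examples above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by the examples above (assumption: `renumics` is the
# top-level import name provided by the `renumics-spotlight` package).
required = ["pandas", "datasets", "renumics"]

missing = missing_packages(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Only top-level names are checked here, because `find_spec` on a dotted name imports the parent package first and raises an error if that parent is absent.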


            

Raw data

{
    "_id": null,
    "home_page": "https://spotlight.renumics.com/",
    "name": "renumics-spotlight",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.8",
    "maintainer_email": null,
    "keywords": "data curation, machine learning, data science, visualization, pandas, ai",
    "author": "Renumics GmbH",
    "author_email": "info@renumics.com",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Visualize and maintain datasets to develop and understand data-driven algorithms.",
    "version": "1.6.13",
    "project_urls": {
        "Documentation": "https://spotlight.renumics.com/",
        "Homepage": "https://spotlight.renumics.com/",
        "Repository": "https://github.com/renumics/spotlight"
    },
    "split_keywords": [
        "data curation",
        " machine learning",
        " data science",
        " visualization",
        " pandas",
        " ai"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b3a2621cf0cddb0579c84c72d10825b3fe7c93723ca7270db35bec241c01a54b",
                "md5": "e1c253ecbcf8ff14fe9acffb8671f880",
                "sha256": "426a94c452e7f018e2f245d132bef2c6cce946feb210b2c7510aa176fbe98f27"
            },
            "downloads": -1,
            "filename": "renumics_spotlight-1.6.13-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e1c253ecbcf8ff14fe9acffb8671f880",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.8",
            "size": 3076193,
            "upload_time": "2024-11-18T11:54:54",
            "upload_time_iso_8601": "2024-11-18T11:54:54.380137Z",
            "url": "https://files.pythonhosted.org/packages/b3/a2/621cf0cddb0579c84c72d10825b3fe7c93723ca7270db35bec241c01a54b/renumics_spotlight-1.6.13-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-18 11:54:54",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "renumics",
    "github_project": "spotlight",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "renumics-spotlight"
}
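
The `digests` entry in the raw data can be used to check a downloaded wheel against the recorded SHA-256 value. The sketch below is one possible way to do that with the standard library; the function names `stream_sha256` and `wheel_matches` are hypothetical, and the local path passed to `wheel_matches` is whatever location you saved the file to.

```python
import hashlib
import io

# SHA-256 digest recorded for renumics_spotlight-1.6.13-py3-none-any.whl
# in the raw data above.
EXPECTED_SHA256 = "426a94c452e7f018e2f245d132bef2c6cce946feb210b2c7510aa176fbe98f27"


def stream_sha256(fileobj, chunk_size=65536):
    """Hash a binary file-like object in chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def wheel_matches(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` has the expected SHA-256 digest."""
    with open(path, "rb") as handle:
        return stream_sha256(handle) == expected


# Small in-memory demonstration of the hashing helper.
print(stream_sha256(io.BytesIO(b"abc")))
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a wheel of about 3 MB but costs nothing.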
        