embedding-atlas


- Name: embedding-atlas
- Version: 0.9.0
- Summary: A tool for visualizing embeddings
- Upload time: 2025-09-03 22:29:23
- Requires Python: >=3.10
- Keywords: embedding, visualization

# Embedding Atlas

A Python package that provides a command line tool to visualize a dataset with embeddings. It also includes a Jupyter widget and a Streamlit widget.

- Documentation: https://apple.github.io/embedding-atlas
- GitHub: https://github.com/apple/embedding-atlas

## Installation

```bash
pip install embedding-atlas
```

Then launch the command line tool:

```bash
embedding-atlas [OPTIONS] INPUTS...
```

## Loading Data

You can load your data in two ways: locally or from Hugging Face.

### Loading Local Data

To get started with your own data, run:

```bash
embedding-atlas path_to_dataset.parquet
```

### Loading Hugging Face Data

You can instead load datasets from Hugging Face:

```bash
embedding-atlas huggingface_org/dataset_name
```

## Visualizing Embedding Projections

To visualize embedding projections, pre-compute the X and Y coordinates and specify their column names with `--x` and `--y`, for example:

```bash
embedding-atlas path_to_dataset.parquet --x projection_x --y projection_y
```

You may use the [SentenceTransformers](https://sbert.net/) package to compute high-dimensional embeddings from text data, and then use the [UMAP](https://umap-learn.readthedocs.io/en/latest/index.html) package to compute 2D projections.
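
The pipeline described above can be sketched in Python. This is a minimal sketch under assumptions, not the tool's own implementation: the function names (`embed_and_project`, `attach_projection`, `write_parquet`), the model name `all-MiniLM-L6-v2`, and the default UMAP settings are illustrative choices. Only the column names `projection_x` and `projection_y` come from the command above. The third-party imports sit inside the functions so the file loads even when SentenceTransformers, UMAP, or pandas is not installed.

```python
from typing import Sequence


def embed_and_project(texts: Sequence[str], model_name: str = "all-MiniLM-L6-v2"):
    """Return an (n, 2) array of UMAP coordinates for the given texts."""
    # Lazy imports: these packages are optional and heavy.
    from sentence_transformers import SentenceTransformer
    import umap

    # Assumption: any SentenceTransformers model works here; this one is
    # only an example.
    model = SentenceTransformer(model_name)
    embeddings = model.encode(list(texts))  # shape: (n, embedding_dim)
    return umap.UMAP(n_components=2).fit_transform(embeddings)


def attach_projection(rows, coords, x_col="projection_x", y_col="projection_y"):
    """Return copies of `rows` (dicts) with one x/y pair added per row."""
    coords = [(float(x), float(y)) for x, y in coords]
    if len(rows) != len(coords):
        raise ValueError("need exactly one coordinate pair per row")
    return [{**row, x_col: x, y_col: y} for row, (x, y) in zip(rows, coords)]


def write_parquet(rows, path):
    """Write a list of row dicts to a Parquet file (needs pandas and pyarrow)."""
    import pandas as pd

    pd.DataFrame(rows).to_parquet(path, index=False)
```

Chaining the three helpers, for example `write_parquet(attach_projection(rows, embed_and_project(texts)), "path_to_dataset.parquet")`, would produce a file that matches the `--x projection_x --y projection_y` command shown earlier.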

### Using Pre-computed Vectors

If you already have pre-computed embedding vectors (but not the 2D projections), you can specify the column containing the vectors with `--vector`:

```bash
embedding-atlas path_to_dataset.parquet --vector embedding_vectors
```

This applies UMAP dimensionality reduction to your existing vectors without recomputing embeddings. Store the vectors as lists or NumPy arrays in your dataset.
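
To illustrate the storage format, the sketch below normalizes rows of numbers into plain lists of floats before they are written to the dataset. `to_vector_lists` is a hypothetical helper, not part of embedding-atlas. The equal-length check is an assumption: dimensionality reduction needs a fixed-width input matrix, but how the tool itself reacts to ragged vectors is not documented here.

```python
from typing import Iterable, List, Sequence


def to_vector_lists(vectors: Iterable[Sequence[float]]) -> List[List[float]]:
    """Convert rows of numbers (lists, tuples, or NumPy rows) to lists of floats."""
    result = [[float(value) for value in vector] for vector in vectors]
    lengths = {len(vector) for vector in result}
    # Assumption: every row must share one dimensionality.
    if len(lengths) > 1:
        raise ValueError(f"inconsistent vector lengths: {sorted(lengths)}")
    return result
```

The returned list can be assigned to a column (named `embedding_vectors` in the command above) before the dataset is saved.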

You may also specify a column for pre-computed nearest neighbors:

```bash
embedding-atlas path_to_dataset.parquet --x projection_x --y projection_y --neighbors neighbors
```

The `neighbors` column should have values in the following format: `{"ids": [id1, id2, ...], "distances": [d1, d2, ...]}`.
If this column is specified, you'll be able to see nearest neighbors for a selected point in the tool.
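
As a rough illustration of the format above, the following sketch builds one `{"ids": ..., "distances": ...}` value per row with a brute-force search. Several details are assumptions, since the text does not specify them: ids are taken to be zero-based row indices, each point is excluded from its own neighbor list, distances are Euclidean, and neighbors are sorted closest first. `brute_force_neighbors` is a hypothetical name, and the quadratic search is only practical for small datasets.

```python
import math
from typing import Dict, List, Sequence


def brute_force_neighbors(
    vectors: Sequence[Sequence[float]], k: int
) -> List[Dict[str, list]]:
    """For each vector, return the ids and distances of its k nearest others."""
    results = []
    for i, query in enumerate(vectors):
        # Euclidean distance to every other row; O(n^2) comparisons overall.
        candidates = [
            (math.dist(query, other), j)
            for j, other in enumerate(vectors)
            if j != i  # assumption: a point is not its own neighbor
        ]
        candidates.sort()  # closest first; ties broken by the smaller index
        nearest = candidates[:k]
        results.append(
            {
                "ids": [j for _, j in nearest],
                "distances": [d for d, _ in nearest],
            }
        )
    return results
```

Each element of the returned list would be stored in the `neighbors` column of the corresponding row.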

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "embedding-atlas",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "embedding, visualization",
    "author": null,
    "author_email": "Donghao Ren <donghao.ren@gmail.com>, Halden Lin <halden.lin@gmail.com>, Fred Hohman <fredhohman@apple.com>, Dominik Moritz <domoritz@gmail.com>",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A tool for visualizing embeddings",
    "version": "0.9.0",
    "project_urls": {
        "homepage": "https://apple.github.io/embedding-atlas",
        "source": "https://github.com/apple/embedding-atlas"
    },
    "split_keywords": [
        "embedding",
        " visualization"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6507664ed84ee2b080f23941d00d0e735826d8fc727aef7600015b1abb07624b",
                "md5": "ae2be2a6eb7fb82b68c6c399aa0b2233",
                "sha256": "38a069f8f93c0bb02799b26637f01e782c36864a9a032ceb175e4e8768593e66"
            },
            "downloads": -1,
            "filename": "embedding_atlas-0.9.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ae2be2a6eb7fb82b68c6c399aa0b2233",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 24927025,
            "upload_time": "2025-09-03T22:29:23",
            "upload_time_iso_8601": "2025-09-03T22:29:23.685074Z",
            "url": "https://files.pythonhosted.org/packages/65/07/664ed84ee2b080f23941d00d0e735826d8fc727aef7600015b1abb07624b/embedding_atlas-0.9.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-03 22:29:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "apple",
    "github_project": "embedding-atlas",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "embedding-atlas"
}
        