tfsim-nightly


- Name: tfsim-nightly
- Version: 0.18.0.dev13
- Home page: https://github.com/tensorflow/similarity
- Summary: Metric Learning for Humans
- Upload time: 2023-10-24 03:12:21
- Author: Tensorflow Similarity authors
- License: Apache License 2.0
- Requirements: No requirements were recorded.

# TensorFlow Similarity: Metric Learning for Humans

TensorFlow Similarity is a [TensorFlow](https://tensorflow.org) library for [similarity learning](https://en.wikipedia.org/wiki/Similarity_learning) which includes techniques such as self-supervised learning, metric learning, similarity learning, and contrastive learning. TensorFlow Similarity is still in beta and we may push breaking changes.

## Introduction

TensorFlow Similarity offers state-of-the-art algorithms for metric learning along with all the necessary components to research, train, evaluate, and serve similarity- and contrastive-based models. These components include models, losses, metrics, samplers, visualizers, and indexing subsystems to make this quick and easy.

![Example of nearest neighbors search performed on the embedding generated by a similarity model trained on the Oxford IIIT Pet Dataset.](https://raw.githubusercontent.com/tensorflow/similarity/master/assets/images/similar-cats-and-dogs.jpg)

With TensorFlow Similarity you can train two main types of models:

1. **Self-supervised models**: Used to learn general data representations on unlabeled data to boost the accuracy of downstream tasks where you have few labels. For example, you can pre-train a model on a large number of unlabeled images using one of the contrastive methods supported by TensorFlow Similarity, and then fine-tune it on a small labeled dataset to achieve higher accuracy. To get started training your own self-supervised model, see this [notebook](examples/unsupervised_hello_world.ipynb).

2. **Similarity models**: Output embeddings that allow you to find and cluster similar examples, such as images representing the same object, within a large corpus of examples. For instance, as shown above, you can train a similarity model to find and cluster similar-looking, unseen cat and dog images from the [Oxford IIIT Pet Dataset](https://www.tensorflow.org/datasets/catalog/oxford_iiit_pet) while only training on a few of the dataset classes. To get started training your own similarity model, see this [notebook](examples/supervised/visualization.ipynb).

## What's new

- [Mar 2023]: 0.17 adds more losses and metrics, plus a massive refactoring
   * Added VicReg Loss to contrastive losses.
   * Added metrics used in retrieval papers, such as Precision@K.
   * Native support for distributed training, e.g., SimCLR now works correctly with distributed training.
   * Initial support for multi-modal embeddings (CLIP).

For more details and information on previous releases, see [the changelog](./releases.md).

## Getting Started

### Installation

Use pip to install the library.

**NOTE**: The `[tensorflow]` extra (an `extras_require` key) can be omitted if you already have tensorflow>=2.4 installed.

```shell
pip install --upgrade-strategy=only-if-needed "tensorflow_similarity[tensorflow]"
```
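
As a quick sanity check after installing, the snippet below is a minimal sketch that asks Python's standard `importlib.metadata` whether a distribution is present. The helper name `installed_version` is hypothetical, and the two distribution names it tries are assumptions about how you installed the package (the stable release versus the nightly build).

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Both names are assumptions about how the package was installed:
# the stable release and the nightly build use different distribution names.
for name in ("tensorflow_similarity", "tfsim-nightly"):
    print(name, installed_version(name))
```

A result of `None` for both names suggests the library is not installed in the active environment.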

### Documentation

The detailed and narrated [notebooks](examples/) are a good way to get started with TensorFlow Similarity. There is likely to be one that is similar to your data or your problem (if not, let us know). You can start working with the examples immediately in Google Colab by clicking the Google Colab icon.

For more information about specific functions, you can [check the API documentation](api/).

To contribute to the project, please check out the [contribution guidelines](CONTRIBUTING.md).

### Minimal Example: MNIST similarity
<details>
   <summary> Click to expand and see how to train a supervised similarity model on MNIST using TF.Similarity</summary>

Here is a bare-bones example demonstrating how to train a TensorFlow Similarity model on the MNIST data. This example illustrates some of the main components provided by TensorFlow Similarity and how they fit together. Please refer to the [hello_world notebook](examples/supervised_hello_world.ipynb) for a more detailed introduction.

### Preparing data

TensorFlow Similarity provides [data samplers](api/TFSimilarity/samplers/) for various dataset types that balance the batches to ensure smoother training.
In this example, we use the multi-shot sampler that integrates directly with the TensorFlow dataset catalog.

```python
from tensorflow_similarity.samplers import TFDatasetMultiShotMemorySampler

# Data sampler that generates balanced batches from MNIST dataset
sampler = TFDatasetMultiShotMemorySampler(dataset_name='mnist', classes_per_batch=10)
```
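
To make "balanced batches" concrete, here is a minimal pure-Python sketch of the idea: pick a fixed number of classes, then draw the same number of examples from each. The `balanced_batch` helper and its parameters are hypothetical and are not the library's sampler; the real samplers may select and cache examples differently.

```python
import random
from collections import defaultdict


def balanced_batch(labels, classes_per_batch, examples_per_class, rng=None):
    """Return example indices for one class-balanced batch.

    Hypothetical helper for illustration; not part of TensorFlow Similarity.
    """
    rng = rng or random.Random()

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Pick distinct classes, then draw the same number of examples from each.
    chosen = rng.sample(sorted(by_class), classes_per_batch)
    batch = []
    for label in chosen:
        # Sample with replacement so small classes can still fill their quota.
        batch.extend(rng.choices(by_class[label], k=examples_per_class))
    return batch


labels = [0, 0, 0, 1, 1, 2, 2, 2, 2]
batch = balanced_batch(labels, classes_per_batch=2, examples_per_class=3,
                       rng=random.Random(0))
print([labels[i] for i in batch])
```

Every class that appears in the batch contributes the same number of examples, which gives a contrastive loss both positive and negative pairs in each step.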

### Building a Similarity model

Building a TensorFlow Similarity model is similar to building a standard Keras model, except the output layer is usually a [`MetricEmbedding()`](api/TFSimilarity/layers/) layer that enforces L2 normalization and the model is instantiated as a specialized subclass [`SimilarityModel()`](api/TFSimilarity/models/SimilarityModel.md) that supports additional functionality.

```python
from tensorflow.keras import layers
from tensorflow_similarity.layers import MetricEmbedding
from tensorflow_similarity.models import SimilarityModel

# Build a Similarity model using standard Keras layers
inputs = layers.Input(shape=(28, 28, 1))
x = layers.experimental.preprocessing.Rescaling(1/255)(inputs)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
outputs = MetricEmbedding(64)(x)

# Build a specialized Similarity model
model = SimilarityModel(inputs, outputs)
```
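
The L2 normalization mentioned above can be illustrated without TensorFlow. The sketch below, which uses hypothetical helper names and made-up vectors, shows that once embeddings have unit length, their dot product equals their cosine similarity. This is one reason normalized embeddings are convenient for nearest neighbor search.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)  # Leave the zero vector unchanged.
    return [v / norm for v in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# For unit vectors the dot product is the cosine similarity,
# and the cosine distance is 1 minus that value.
cosine_similarity = dot(a, b)
cosine_distance = 1.0 - cosine_similarity
print(a, cosine_similarity, cosine_distance)
```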

### Training model via contrastive learning

To output metric embeddings that are searchable via approximate nearest neighbor search, the model needs to be trained using a similarity loss. Here we use `MultiSimilarityLoss()`, which is one of the most efficient loss functions.

```python
from tensorflow_similarity.losses import MultiSimilarityLoss

# Train Similarity model using contrastive loss
model.compile('adam', loss=MultiSimilarityLoss())
model.fit(sampler, epochs=5)
```

### Building images index and querying it

Once the model is trained, reference examples must be indexed via the model index API to be searchable. After indexing, you can use the model lookup API to search the index for the K most similar items.

```python
from tensorflow_similarity.visualization import viz_neigbors_imgs

# Index 100 embedded MNIST examples to make them searchable
sx, sy = sampler.get_slice(0,100)
model.index(x=sx, y=sy, data=sx)

# Find the top 5 most similar indexed MNIST examples for a given example
qx, qy = sampler.get_slice(3713, 1)
nns = model.single_lookup(qx[0])

# Visualize the query example and its top 5 neighbors
viz_neigbors_imgs(qx[0], qy[0], nns)
```
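
Conceptually, a lookup ranks the indexed embeddings by their distance to the query embedding and returns the closest K. The sketch below is an exact brute-force version of that idea in pure Python, not the approximate search the library performs; the `lookup` helper, the toy embeddings, and the labels are all made up for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def lookup(index, query, k=5):
    """Return the k indexed (label, distance) pairs closest to the query."""
    scored = [(label, cosine_distance(embedding, query))
              for embedding, label in index]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy index of (embedding, label) pairs; the values are made up.
index = [
    ([1.0, 0.0], "cat"),
    ([0.9, 0.1], "cat"),
    ([0.0, 1.0], "dog"),
    ([0.1, 0.9], "dog"),
]

neighbors = lookup(index, query=[0.8, 0.2], k=2)
print(neighbors)
```

Brute force scans every indexed item, so it scales linearly with the index size. Approximate nearest neighbor search trades a small amount of accuracy for much faster queries on large indexes.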
</details>

## Supported Algorithms

### Self-Supervised Models

- SimCLR 
- SimSiam
- Barlow Twins

### Supervised Losses

- Triplet Loss
- PN Loss
- Multi-Similarity Loss
- Circle Loss
- Soft Nearest Neighbor Loss
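
As an illustration of what a supervised similarity loss optimizes, here is the textbook form of the triplet loss for a single triplet, written in pure Python. This is a sketch under stated assumptions: the Euclidean distance, the margin value, and the helper names are choices made for the example, and the library's implementation may use different distances and triplet mining strategies.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the anchor
far = [3.0, 4.0]    # distance 5 from the anchor

well_separated = triplet_loss(anchor, positive=near, negative=far)
violating = triplet_loss(anchor, positive=far, negative=near)
print(well_separated, violating)
```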

### Metrics

TensorFlow Similarity offers many of the most common metrics used for [classification](api/TFSimilarity/classification_metrics/) and [retrieval](api/TFSimilarity/retrieval_metrics/) evaluation, including:

| Name | Type | Description |
| ---- | ---- | ----------- |
| Precision | Classification | |
| Recall | Classification | |
| F1 Score | Classification | |
| Recall@K | Retrieval | |
| Binary NDCG | Retrieval | |

## Citing

Please cite this reference if you use any part of TensorFlow Similarity in your research:

```bibtex
@article{EBSIM21,
  title={TensorFlow Similarity: A Usable, High-Performance Metric Learning Library},
  author={Elie Bursztein and James Long and Shun Lin and Owen Vallis and Francois Chollet},
  journal={Fixme},
  year={2021}
}
```

## Disclaimer

This is not an official Google product.



            
