| Field | Value |
|-------|-------|
| Name | AK-SSL |
| Version | 0.2.0 |
| Summary | A Self-Supervised Learning Library |
| Home page | None |
| Author | Audrina Ebrahimi & Kian Majlessi |
| Maintainer | None |
| Docs URL | None |
| Upload time | 2024-08-20 18:24:00 |
| Requires Python | >=3.10 |
| License | MIT |
| Keywords | None |
| Requirements | No requirements were recorded. |
<p align="center">
<img src="https://raw.githubusercontent.com/audrina-ebrahimi/AK_SSL/main/Documents/logo.png" alt="AK_SSL Logo" width="50%"/>
</p>
<h1>
<br>AK_SSL: A Self-Supervised Learning Library
</h1>
[Downloads](https://pepy.tech/project/AK_SSL)
---
## 📒 Table of Contents
- 📒 Table of Contents
- 📍 Overview
- ✍️ Self Supervised Learning
- 🔎 Supported Methods
- 📦 Installation
- 💡 Tutorial
- 📊 Benchmarks
- 📜 References Used
- 💯 License
- 🤝 Collaborators
---
## 📍 Overview
Welcome to the Self-Supervised Learning Library! This repository hosts a collection of tools and implementations for self-supervised learning. Self-supervised learning is a powerful paradigm that leverages unlabeled data to pre-train models, which can then be fine-tuned on specific tasks with smaller labeled datasets. This library aims to provide researchers and practitioners with a comprehensive set of tools to experiment with, learn, and apply self-supervised learning techniques effectively.
This project began as our summer apprenticeship assignment and final project in the newly established Intelligent and Learning System ([ILS](http://ils.ui.ac.ir/)) laboratory at the University of Isfahan.
---
## ✍️ Self Supervised Learning
Self-supervised learning is a subfield of machine learning where models are trained to predict certain aspects of the input data without relying on manual labeling. This approach has gained significant attention due to its ability to leverage large amounts of unlabeled data, which is often easier to obtain than fully annotated datasets. This library provides implementations of various self-supervised techniques, allowing you to experiment with and apply these methods in your own projects.
---
## 🔎 Supported Methods
### Vision Models
- BarlowTwins
- BYOL
- DINO
- MoCo v2
- MoCo v3
- SimCLR v1
- SimCLR v2
- SimSiam
- SwAV
### Multimodal Models
- CLIP
- ALBEF
- SLIP
- VSE
- SimVLM
- UNITER
---
## 📦 Installation
You can install AK_SSL and its dependencies from PyPI with:
```sh
pip install AK-SSL
```
We strongly recommend installing AK_SSL in a dedicated virtual environment to avoid conflicts with your system packages.
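For example, using Python's built-in `venv` module:
```sh
python -m venv .venv        # create a dedicated virtual environment
source .venv/bin/activate   # activate it (on Windows: .venv\Scripts\activate)
pip install AK-SSL
```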
---
## 💡 Tutorial
Using AK_SSL, you can seamlessly leverage recent self-supervised learning techniques while harnessing the full capabilities of PyTorch. You can explore diverse backbones, models, and optimizers within a user-friendly framework purposefully crafted for ease of use.
### Initializing the Trainer for Vision Models
You can easily import the Trainer module from the AK_SSL library and start using it right away.
```python
from AK_SSL.vision import Trainer
```
Now, let's initialize the self-supervised trainer with our chosen method, backbone, dataset, and other configurations.
```python
trainer = Trainer(
    method="barlowtwins",           # training method as string (BarlowTwins, BYOL, DINO, MoCov2, MoCov3, SimCLR, SimSiam, SwAV)
    backbone=backbone,              # backbone architecture as torch.nn.Module
    feature_size=feature_size,      # size of the extracted features as integer
    image_size=32,                  # dataset image size as integer
    save_dir="./save_for_report/",  # directory to save training checkpoints and TensorBoard logs as string
    checkpoint_interval=50,         # interval (in epochs) for saving checkpoints as integer
    reload_checkpoint=False,        # reload a previously saved checkpoint as boolean
    verbose=True,                   # enable verbose output for training progress as boolean
    **kwargs                        # other method-specific arguments
)
```
Note: The accepted `**kwargs` differ between methods, depending on the specific method, loss function, transformation, and other factors. If you use any of the objectives listed below, you must provide their arguments when initializing the Trainer class (a concrete example follows this list).
- <details><summary>SimCLR Transformation</summary>
```
color_jitter_strength # a float to set the strength of color jitter
use_blur # a boolean to specify whether to apply blur augmentation
mean # the mean value for each channel, used for normalization
std # the standard deviation for each channel, used for normalization
```
</details>
- <details><summary>BarlowTwins</summary>
- Method
```
projection_dim # an integer to specify dimensionality of the projection head
hidden_dim # an integer to specify dimensionality of the hidden layers in the neural network
moving_average_decay # a float to specify decay rate for moving averages during training
```
- Loss
```
lambda_param # a float controlling the balance between the main loss and the orthogonality loss
```
</details>
- <details><summary>DINO</summary>
- Method
```
projection_dim # an integer to specify dimensionality of the projection head
hidden_dim # an integer to specify dimensionality of the hidden layers in the projection head neural network
bottleneck_dim # an integer to specify dimensionality of the bottleneck layer in the student network
temp_student # a float to specify temperature parameter for the student's logits
temp_teacher # a float to specify temperature parameter for the teacher's logits
norm_last_layer # a boolean to specify whether to normalize the last layer of the network
momentum_teacher # a float to control momentum coefficient for updating the teacher network
num_crops # an integer to determine the number of augmentations applied to each input image
use_bn_in_head # a boolean to specify whether to use batch normalization in the projection head
```
- Loss
```
center_momentum # a float to control momentum coefficient for updating the center of cluster assignments
```
</details>
- <details><summary>MoCo v2</summary>
- Method
```
projection_dim # an integer to specify dimensionality of the projection head
K # an integer to specify number of negative samples per positive sample in the contrastive loss
m # a float to control momentum coefficient for updating the moving-average encoder
```
- Loss
```
temperature # a float to control the temperature for the contrastive loss function
```
</details>
- <details><summary>MoCo v3</summary>
- Method
```
projection_dim # an integer to specify dimensionality of the projection head
hidden_dim # an integer to specify dimensionality of the hidden layers in the projection head neural network
moving_average_decay # a float to specify decay rate for moving averages during training
```
- Loss
```
temperature # a float to control the temperature for the contrastive loss function
```
</details>
- <details><summary>SimCLR</summary>
- Method
```
projection_dim # an integer to specify dimensionality of the projection head
projection_num_layers # an integer to specify the number of layers in the projection head (1: SimCLR v1, 2: SimCLR v2)
projection_batch_norm # a boolean to indicate whether to use batch normalization in the projection head
```
- Loss
```
temperature # a float to control the temperature for the contrastive loss function
```
</details>
- <details><summary>SimSiam</summary>
- Method
```
projection_dim # an integer to specify dimensionality of the projection head
```
- Loss
```
eps # a float to control the stability of the loss function
```
</details>
- <details><summary>SwAV</summary>
- Method
```
projection_dim # an integer to specify dimensionality of the projection head
hidden_dim # an integer to specify dimensionality of the hidden layers in the projection head neural network
epsilon # a float to control numerical stability in the algorithm
sinkhorn_iterations # an integer to specify the number of iterations in the Sinkhorn-Knopp algorithm
num_prototypes # an integer to specify the number of prototypes or clusters for contrastive learning
queue_length # an integer to specify the length of the queue for maintaining negative samples
use_the_queue # a boolean to indicate whether to use the queue for negative samples
num_crops # an integer to determine the number of augmentations applied to each input image
```
- Loss
```
temperature # a float to control the temperature for the contrastive loss function
```
</details>
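As a concrete illustration, here is a hedged sketch of initializing a SimCLR trainer and passing its method- and loss-specific arguments through `**kwargs`. The torchvision ResNet-18 backbone and the lowercase method string are assumptions modeled on the `"barlowtwins"` example above, not documented requirements.
```python
import torch.nn as nn
import torchvision.models as models
from AK_SSL.vision import Trainer

# assumption: any feature extractor can serve as the backbone; here, ResNet-18
backbone = models.resnet18(weights=None)
feature_size = backbone.fc.in_features   # 512 for ResNet-18
backbone.fc = nn.Identity()              # expose raw features to the trainer

trainer = Trainer(
    method="simclr",              # lowercase string, by analogy with "barlowtwins"
    backbone=backbone,
    feature_size=feature_size,
    image_size=32,
    save_dir="./save_for_report/",
    projection_dim=128,           # SimCLR method argument (see the list above)
    projection_num_layers=1,      # 1: SimCLR v1
    projection_batch_norm=False,  # no batch norm in the projection head
    temperature=0.5,              # SimCLR loss argument
)
```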
### Initializing the Trainer for Multimodal Models
You can easily import the Trainer module from the AK_SSL library and start using it right away.
```python
from AK_SSL.multimodal import Trainer
```
Now, let's initialize the self-supervised trainer with our chosen method, backbone, dataset, and other configurations.
```python
trainer = Trainer(
    method="clip",                  # training method as string (CLIP, ALBEF, SLIP, SimVLM, UNITER, VSE)
    image_encoder=img_encoder,      # vision model to extract image features as nn.Module
    text_encoder=txt_encoder,       # text model to extract text features as nn.Module
    mixed_precision_training=True,  # whether to use mixed precision training as boolean
    save_dir="./save_for_report/",  # directory to save training checkpoints and TensorBoard logs as string
    checkpoint_interval=50,         # interval (in epochs) for saving checkpoints as integer
    reload_checkpoint=False,        # reload a previously saved checkpoint as boolean
    verbose=True,                   # enable verbose output for training progress as boolean
    **kwargs                        # other method-specific arguments
)
```
Note: The accepted `**kwargs` differ between methods, depending on the specific method, loss function, transformation, and other factors. If you use any of the objectives listed below, you must provide their arguments when initializing the Trainer class (a concrete example follows this list).
- <details><summary>CLIP</summary>
```
image_feature_dim # Dimension of the image features as integer
text_feature_dim # Dimension of the text features as integer
embed_dim # Dimension of the embeddings as integer
init_tau # Initial value of tau as float
init_b # Initial value of b as float
```
</details>
- <details><summary>ALBEF</summary>
```
mlm_probability # Masked language modeling probability as float
embed_dim # Dimension of the embeddings as integer
vision_width # Vision encoder output width as integer
temp # Temperature parameter as float
queue_size # Queue size as integer
momentum # Momentum parameter as float
```
</details>
- <details><summary>SimVLM</summary>
```
transformer_encoder # Transformer encoder for vision and text embeddings as nn.Module
transformer_decoder # Transformer decoder for embeddings as nn.Module
vocab_size # Size of the vocabulary as integer
feature_dim # Dimension of the features as integer
max_seq_len # Maximum sequence length as integer
max_trunc_txt_len # Maximum truncated text length as integer
prefix_txt_len # Prefix text length as integer
target_txt_len # Target text length as integer
pad_idx # Padding index as integer
image_resolution # Image resolution as integer
patch_size # Patch size as integer
num_channels # Number of channels as integer
```
</details>
- <details><summary>SLIP</summary>
```
mlp_dim # Dimension of the MLP as integer
vision_feature_dim # Dimension of the vision features as integer
transformer_feature_dim # Dimension of the transformer features as integer
embed_dim # Dimension of the embeddings as integer
```
</details>
- <details><summary>UNITER</summary>
```
pooler # pooler as nn.Module
encoder # transformer encoder as nn.Module
num_answer # number of answer classes as integer
hidden_size # hidden size as integer
attention_probs_dropout_prob # dropout rate as float
initializer_range # initializer range as float
```
</details>
- <details><summary>VSE</summary>
```
margin # Margin for contrastive loss as float
```
</details>
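As a concrete illustration, here is a hedged sketch of a CLIP-style initialization. The toy encoders and the specific dimensions and values are illustrative assumptions, not prescribed by AK_SSL.
```python
import torch.nn as nn
import torchvision.models as models
from AK_SSL.multimodal import Trainer

# assumption: any feature extractors can serve as encoders; toy examples below
img_encoder = models.resnet18(weights=None)
img_encoder.fc = nn.Identity()              # 512-d image features

class ToyTextEncoder(nn.Module):            # hypothetical bag-of-words encoder
    def __init__(self, vocab_size=10000, dim=256):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)

    def forward(self, tokens):
        return self.embed(tokens)           # 256-d text features

trainer = Trainer(
    method="clip",
    image_encoder=img_encoder,
    text_encoder=ToyTextEncoder(),
    mixed_precision_training=True,
    save_dir="./save_for_report/",
    image_feature_dim=512,    # CLIP arguments from the list above
    text_feature_dim=256,
    embed_dim=128,
    init_tau=0.07,            # illustrative initial temperature
    init_b=0.0,               # illustrative initial bias
)
```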
### Training the Self-Supervised Model for Vision Models
Then, we'll train the self-supervised model using the specified parameters.
```python
trainer.train(
    dataset=train_dataset,  # training dataset as torch.utils.data.Dataset
    batch_size=256,         # number of training examples used in each iteration as integer
    start_epoch=1,          # starting epoch as integer (if 'reload_checkpoint' was True, the start epoch equals the latest checkpoint epoch)
    epochs=100,             # total number of training epochs as integer
    optimizer="Adam",       # optimization algorithm as string (Adam, SGD, or AdamW)
    weight_decay=1e-6,      # regularization term to prevent overfitting by penalizing large weights as float
    learning_rate=1e-3,     # learning rate for the optimizer as float
)
```
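For instance, `train_dataset` could be plain CIFAR-10 from torchvision. Since the Trainer exposes its own transformation arguments, we assume it applies the method's augmentations internally and only convert images to tensors here; `test_dataset` is prepared for the evaluation step below.
```python
import torchvision
import torchvision.transforms as T

# unlabeled training images; labels are ignored during self-supervised training
train_dataset = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor()
)
test_dataset = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=T.ToTensor()
)
```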
### Training the Self-Supervised Model for Multimodal Models
Then, we'll train the self-supervised model using the specified parameters.
```python
trainer.train(
    dataset=train_dataset,  # training dataset as torch.utils.data.Dataset
    batch_size=256,         # number of training examples used in each iteration as integer
    start_epoch=1,          # starting epoch as integer (if 'reload_checkpoint' was True, the start epoch equals the latest checkpoint epoch)
    epochs=100,             # total number of training epochs as integer
    optimizer="Adam",       # optimization algorithm as string (Adam, SGD, or AdamW)
    weight_decay=1e-6,      # regularization term to prevent overfitting by penalizing large weights as float
    learning_rate=1e-3,     # learning rate for the optimizer as float
)
```
### Evaluating the Vision Self-Supervised Models
This evaluation assesses how well the pre-trained model performs on a downstream classification task, using either linear evaluation or fine-tuning.
```python
trainer.evaluate(
    train_dataset=train_dataset,    # training dataset as torch.utils.data.Dataset
    test_dataset=test_dataset,      # testing dataset as torch.utils.data.Dataset
    eval_method="linear",           # evaluation method as string (linear or finetune)
    top_k=1,                        # number of top-k predictions to consider during evaluation as integer
    epochs=100,                     # number of evaluation epochs as integer
    optimizer='Adam',               # optimization algorithm used during evaluation as string (Adam, SGD, or AdamW)
    weight_decay=1e-6,              # regularization term applied during evaluation to prevent overfitting as float
    learning_rate=1e-3,             # learning rate for the optimizer during evaluation as float
    batch_size=256,                 # batch size used for evaluation as integer
    fine_tuning_data_proportion=1,  # proportion of training data used during evaluation as a float in the range (0.0, 1]
)
```
### Getting the Vision Self-Supervised Model Backbone
In case you want to use the pre-trained network in your own downstream task, you need to define a downstream task model. This model should include the self-supervised model backbone as one of its components. Here's an example of how to define a simple downstream model class:
```python
import torch.nn as nn

class DownstreamNet(nn.Module):
    def __init__(self, backbone, **kwargs):
        super().__init__()
        self.backbone = backbone
        # You can define your downstream task model here

    def forward(self, x):
        x = self.backbone(x)
        # ...
        return x


downstream_model = DownstreamNet(trainer.get_backbone())
```
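A quick, hypothetical sanity check of the wrapped backbone; freezing it is optional and mirrors the linear evaluation protocol:
```python
import torch

for p in downstream_model.backbone.parameters():
    p.requires_grad = False          # freeze backbone for linear probing (optional)

dummy = torch.randn(1, 3, 32, 32)    # a CIFAR-10-sized input, as an assumption
features = downstream_model(dummy)   # forward pass through the frozen backbone
print(features.shape)
```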
### Loading a Self-Supervised Model Checkpoint
To load a previously saved checkpoint into the network, do the following:
```python
path = 'YOUR CHECKPOINT PATH'
trainer.load_checkpoint(path)
```
### Saving the Self-Supervised Model Backbone
To save the model backbone, do the following:
```python
trainer.save_backbone()
```
That's it! You've successfully trained and evaluated a self-supervised model using the AK_SSL Python library. You can further customize and experiment with different self-supervised methods, backbones, and hyperparameters to suit your specific tasks.
You can view the documentation of the Trainer class and its functions using Python's built-in `help` function.
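For example:
```python
from AK_SSL.vision import Trainer

help(Trainer)        # class docstring and constructor arguments
help(Trainer.train)  # training arguments
```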
---
## 📊 Benchmarks
We trained these models on the CIFAR-10 dataset and report the results below, with plans to expand our experiments to other datasets. Note that the hyperparameters were not tuned for maximum accuracy.
| Method | Backbone | Batch Size | Epoch | Optimizer | Learning Rate | Weight Decay | Linear Top1 | Fine-tune Top1 | Download Backbone | Download Full Checkpoint |
|--------------|----------|------------|-------|-----------|---------------|--------------|-------------|----------------|-------------------|--------------------------|
| BarlowTwins | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 70.92% | 79.50% |[Link](https://www.dropbox.com/scl/fi/ok7vojezit6p3v9vonvox/backbone.pth?rlkey=xddpc9bkqnc38xx2viivnem3n&dl=0)|[Link](https://www.dropbox.com/scl/fi/1d32t8hdlkqxbfokrqlq4/barlowtwins_model_20230905_054800_epoch800?rlkey=1i4xe7k5g9i79vaq18uufhanl&dl=0)|
| BYOL | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 71.06% | 71.04% | | |
| DINO | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 9.91% | 9.76% | | |
| MoCo v2 | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 70.08% | 78.71% |[Link](https://www.dropbox.com/scl/fi/b29krbcej64chpif0tztq/backbone.pth?rlkey=n9c8z3nnpdgovml6wjdgo0txp&dl=0)|[Link](https://www.dropbox.com/scl/fi/ewcatz0yuors9z327jjix/mocov2_model_20230906_162610_epoch800.pth?rlkey=fh5myjhgsn59rulx10t0g3hl8&dl=0)|
| MoCo v3 | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 59.98% | 74.20% |[Link](https://www.dropbox.com/scl/fi/3q787003vr4xa8gy5ozeu/backbone.pth?rlkey=qqy16a8tuyxvcgg7t0gi88ysq&dl=0)|[Link](https://www.dropbox.com/scl/fi/d1icqzui08ey1u1xpao4i/MoCov3_model_20230905_154626_epoch800?rlkey=o4zuo5fisi067n45yl76yc152&dl=0)|
| SimCLR v1 | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 73.09% | 72.75% |[Link](https://www.dropbox.com/scl/fi/r0j23uv3krbcq2k7i6ynn/backbone-simclr1.pth?rlkey=tzdsjj0mucge377qwjqg961bs&dl=0)|[Link](https://www.dropbox.com/scl/fi/kognvkgbvzblpmx6ia1h1/simclrv1_model_20230906_065315_epoch800?rlkey=kzq1nuf305gx17hveokt1o6on&dl=0)|
| SimCLR v2 | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 73.07% | 81.52% | | |
| SimSiam | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 19.77% | 70.77% |[Link](https://www.dropbox.com/scl/fi/nlpqjijho9vqigub2ibho/backbone.pth?rlkey=7otvzznf1qf0xvskqnp8wii9k&dl=0)|[Link](https://www.dropbox.com/scl/fi/5c1un6jjec01aphxzkv5d/simsiam_model_20230906_101310_epoch800?rlkey=teilbfj6wbi1wytg1mcgx0bcw&dl=0)|
| SwAV | Resnet18 | 256 | 800 | Adam | 1e-3 | 1e-6 | 33.36% | 74.14% | | |
---
## 📜 References Used
In the development of this project, we have drawn inspiration and utilized code, libraries, and resources from various sources. We would like to acknowledge and express our gratitude to the following references and their respective authors:
- [Lightly Library](https://github.com/lightly-ai/lightly)
- [PYSSL Library](https://github.com/giakou4/pyssl)
- [SimCLR Implementation](https://github.com/Spijkervet/SimCLR)
- The original implementations of all supported methods
These references have played a crucial role in enhancing the functionality and quality of our project. We extend our thanks to the authors and contributors of these resources for their valuable work.
---
## 💯 License
This project is licensed under the [MIT License](./LICENSE).
---
## 🤝 Collaborators
By:
- [Kian Majlessi](https://github.com/kianmajl)
- [Audrina Ebrahimi](https://github.com/audrina-ebrahimi)
Thanks to [Dr. Peyman Adibi](https://scholar.google.com/citations?user=u-FQZMkAAAAJ) and [Dr. Hossein Karshenas](https://scholar.google.com/citations?user=BjMFkWEAAAAJ), for their invaluable guidance and support throughout this project.