# ViT Image Retrieval



A content-based image retrieval (CBIR) system written in Python that uses Vision Transformer (ViT) features and FAISS indexing. It provides both a graphical user interface (GUI) and a programmatic API for indexing images and searching for similar ones.
## Features
- **Vision Transformer Features**: Uses a ViT-B/16 model pre-trained on ImageNet for robust feature extraction
- **Fast Similarity Search**: Uses FAISS IVF indexing for efficient similarity search
- **Cross-Platform Support**: Works on Windows, macOS, and Linux
- **User-Friendly GUI**:
  - Interactive interface for feature extraction and image search
  - Double-click or right-click to open images and containing folders
  - Progress tracking for batch operations
- **Multiple Image Format Support**: Handles PNG, JPG, JPEG, and WebP formats
- **GPU Acceleration**: Optional GPU support for faster processing when available
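
To make the similarity-search feature more concrete, here is a minimal pure-Python sketch of exact top-k search over feature vectors. It is an illustration only, not the package's implementation: it assumes cosine similarity as the metric, the function names are hypothetical, and the 3-dimensional vectors are toy stand-ins for real ViT embeddings. An IVF index speeds up this kind of exhaustive comparison by searching only a subset of clusters.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    features: Dict[str, Sequence[float]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Compare the query against every stored vector and keep the k best."""
    scored = [(path, cosine_similarity(query, vec)) for path, vec in features.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy feature vectors (hypothetical); real ViT embeddings are much longer.
toy_features = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
matches = top_k([1.0, 0.0, 0.0], toy_features, k=2)
for path, score in matches:
    print(f"{path}: {score:.3f}")
```

Exhaustive search like this scales linearly with the number of indexed images, which is the cost an approximate index is meant to reduce.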
## Installation
### Prerequisites
- Python 3.8 or higher
- pip package manager
### Install from Source
1. Clone the repository:
```bash
git clone https://github.com/bnsreenu/vit-image-retrieval.git
cd vit-image-retrieval
```
2. Install the package:
```bash
pip install -e .
```
### Install from PyPI
```bash
pip install vit-image-retrieval
```
## Usage
### GUI Application
Launch the application using either command:
```bash
vit
# or
vit-image-retrieval-gui
```
The GUI has two main tabs:
1. **Feature Extraction Tab**:
   - Select a directory of images to index
   - Optionally provide an index name
   - Monitor progress through the progress bar
   - Save index for later use
2. **Image Retrieval Tab**:
   - Load previously created index
   - Select query image
   - Set number of similar images to retrieve
   - View results with similarity scores
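
Before indexing a large directory, it can help to preview which files would be picked up. The sketch below is a hypothetical helper, not part of the package: it assumes a recursive scan and uses the extensions listed under Features. Whether the application itself descends into subfolders is not stated here, so treat that as an assumption.

```python
from pathlib import Path
from typing import List

# Extensions taken from the formats listed in this README.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def find_images(directory: str) -> List[Path]:
    """Recursively collect files whose extension is a supported image format."""
    root = Path(directory)
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_images("path/to/image/directory")` returns a sorted list of `Path` objects; the extension check is case-insensitive, so `photo.JPG` is included.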
### Python API
```python
from vit_image_retrieval import ImageRetrievalSystem

# Initialize the system
retrieval = ImageRetrievalSystem()

# Index a directory of images
retrieval.index_images("path/to/image/directory")

# Save the index for later use
retrieval.save("my_index.faiss", "my_metadata.json")

# Load existing index
retrieval = ImageRetrievalSystem(
    index_path="my_index.faiss",
    metadata_path="my_metadata.json"
)

# Search for similar images
results = retrieval.search("path/to/query/image.jpg", k=5)

# Process results
for path, similarity, metadata in results:
    print(f"Similar image: {path} (similarity: {similarity:.3f})")
```
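
As a follow-up to the loop above, the sketch below post-processes results by keeping only matches above a threshold. It assumes `results` is a list of `(path, similarity, metadata)` tuples, as the loop suggests, and that a higher score means a closer match. The `filter_results` helper and the sample data are hypothetical stand-ins so the snippet runs without an index.

```python
from typing import Any, Dict, List, Tuple

# Assumed shape of one search result: (path, similarity, metadata).
Result = Tuple[str, float, Dict[str, Any]]


def filter_results(results: List[Result], min_similarity: float) -> List[Result]:
    """Keep results at or above the threshold, best match first."""
    kept = [r for r in results if r[1] >= min_similarity]
    return sorted(kept, key=lambda r: r[1], reverse=True)


# Stand-in data shaped like the tuples unpacked in the API example.
sample_results: List[Result] = [
    ("images/a.jpg", 0.91, {}),
    ("images/b.png", 0.42, {}),
    ("images/c.webp", 0.77, {}),
]
strong_matches = filter_results(sample_results, min_similarity=0.5)
for path, similarity, _metadata in strong_matches:
    print(f"{path}: {similarity:.3f}")
```

A suitable threshold depends on the dataset, so it is worth inspecting a few queries before fixing a value.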
## Platform-Specific Notes
### Windows
- Supports direct file and folder opening through Windows Explorer
- Uses native file selection dialogs
### macOS
- Uses native macOS commands for file operations
- Finder integration for viewing files and folders
- Requires no additional configuration
### Linux
- Automatically handles Qt platform plugin configurations
- Uses system's default applications for file operations
- Requires X11 or Wayland display server
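
The platform differences above can be sketched as a small dispatcher that picks an opener command per operating system. This is an assumption about one common approach, not the application's actual code: `explorer`, `open`, and `xdg-open` are the usual system commands on Windows, macOS, and Linux desktops respectively, and the helper names are hypothetical.

```python
import subprocess
import sys
from typing import List


def build_open_command(path: str, platform: str = sys.platform) -> List[str]:
    """Return the command that opens `path` with the system's default handler."""
    if platform.startswith("win"):
        return ["explorer", path]  # Windows Explorer
    if platform == "darwin":
        return ["open", path]  # macOS, opens files and folders via Finder
    return ["xdg-open", path]  # most Linux desktop environments


def open_path(path: str) -> None:
    """Open a file or folder; the exit code is ignored because it varies by tool."""
    subprocess.run(build_open_command(path), check=False)
```

Keeping command construction separate from execution makes the platform logic easy to test without launching anything.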
## Requirements
- torch >= 2.0.0
- torchvision >= 0.15.0
- faiss-cpu >= 1.7.4 (or faiss-gpu for GPU support)
- PyQt5 >= 5.15.0
- Pillow >= 9.0.0
- numpy >= 1.20.0
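
To check which of these packages are present in the current environment, a short script using the standard library's `importlib.metadata` (available from Python 3.8) can help. This is a convenience sketch rather than something shipped with the package; it reports installed versions only and does not compare them against the minimums listed above.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names as listed in the requirements above.
REQUIRED_PACKAGES = ["torch", "torchvision", "faiss-cpu", "PyQt5", "Pillow", "numpy"]


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED_PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```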
## Development
To contribute to or modify the project:
1. Fork the repository
2. Create a new branch:
```bash
git checkout -b feature-name
```
3. Make changes and test
4. Submit a pull request
## Common Issues and Solutions
1. **Linux Qt Plugin Error**: The application handles this automatically by removing conflicting Qt plugin paths
2. **GPU Memory Issues**: If you run into GPU memory problems, use the CPU version instead
3. **File Permission Errors**: Check your user permissions for the working directory
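
For readers who hit the Qt plugin error in their own scripts, the sketch below shows one way to remove plugin-path overrides before PyQt5 is imported. It is an assumption about the general technique, not the package's actual code: `QT_QPA_PLATFORM_PLUGIN_PATH` and `QT_PLUGIN_PATH` are standard Qt environment variables, but which ones the application clears is not stated here, and the helper name is hypothetical.

```python
import os
from typing import List, MutableMapping

# Standard Qt environment variables that can point Qt at the wrong plugins.
QT_PLUGIN_VARS = ("QT_QPA_PLATFORM_PLUGIN_PATH", "QT_PLUGIN_PATH")


def clear_qt_plugin_paths(env: MutableMapping[str, str] = os.environ) -> List[str]:
    """Delete Qt plugin-path variables from `env` and return the names removed."""
    removed = []
    for name in QT_PLUGIN_VARS:
        if name in env:
            del env[name]
            removed.append(name)
    return removed
```

Call `clear_qt_plugin_paths()` before the first PyQt5 import so that Qt falls back to the plugins bundled with the installed PyQt5.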
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- Vision Transformer implementation from torchvision
- FAISS library from Facebook Research
- PyQt5 for the graphical interface
## Citation
If you use this software in your research, please cite:
```bibtex
@software{vit_image_retrieval,
  title = {ViT Image Retrieval},
  author = {Dr. Sreenivas Bhattiprolu},
  year = {2024},
  url = {https://github.com/bnsreenu/vit-image-retrieval}
}
```
## Support
For support, please:
1. Check the issues page for existing solutions
2. Create a new issue with:
   - Your operating system
   - Python version
   - Complete error message
   - Steps to reproduce the problem
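
The operating system and Python version requested above can be collected with a few standard-library calls. This is a small convenience sketch, not a tool provided by the package; the function name is hypothetical.

```python
import platform
import sys


def environment_report() -> str:
    """Return a short, paste-ready summary of the OS and Python environment."""
    lines = [
        f"Operating system: {platform.system()} {platform.release()}",
        f"Python version: {platform.python_version()}",
        f"Python executable: {sys.executable}",
    ]
    return "\n".join(lines)


print(environment_report())
```

Paste the output into the issue together with the complete error message and the steps to reproduce.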