<h1 align="center">VectorMass</h1>
<p align="center">
<img width="200" src="./VectorMass/utils/vectormass_logo.png" alt="VectorMass vector database">
</p>
Vector databases store vector embeddings for fast retrieval and similarity search, and they also support the usual CRUD operations. Put simply, an embedding is a numerical array that encodes a large number of features. A vector database lets us perform many useful operations on that numerical representation.
In traditional databases like <b>MySQL</b>, <b>PostgreSQL</b>, and <b>SQL Server</b>, we usually query for rows whose values match our input query. In vector databases, we instead apply a similarity metric to find the vectors that are most similar to our query. There are many dedicated vector databases, such as <b>VectorMass</b>, <b>Pinecone</b>, <b>Qdrant</b>, and <b>Chroma DB</b>.
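To make the idea of a similarity metric concrete, here is a minimal sketch that uses cosine similarity with NumPy. It is not VectorMass's implementation; the function name `cosine_similarity` and the three-dimensional vectors are made up for illustration, and real embeddings have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(query, vectors):
    """Cosine similarity between one query vector and each row of `vectors`."""
    query = np.asarray(query, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    # Dot product of every row with the query, divided by the product of norms
    return (vectors @ query) / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))

# Hypothetical stored embeddings (one row per stored item)
stored = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
# Hypothetical query embedding
query = np.array([0.9, 0.1, 0.0])

scores = cosine_similarity(query, stored)
best_index = int(np.argmax(scores))  # index of the most similar stored vector
print(best_index, scores)
```

A brute-force scan like this compares the query against every stored vector, which is fine for small collections; larger stores typically rely on approximate indexes instead.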
Let's see how to use the <b>VectorMass</b> vector database:
```bash
# install the VectorMass library
pip install VectorMass
```
```python
import VectorMass
import numpy as np
from sentence_transformers import SentenceTransformer
# Load a Sentence Transformers model to compute embeddings
model = SentenceTransformer('all-mpnet-base-v2')
# Create a VectorMass client
vector_store = VectorMass.Client()
# Define your sentences
sentences = [
    "I eat mango",
    "mango is my favorite fruit",
    "mango, apple, oranges are fruits",
    "fruits are good for health",
]
ids = ['id1', 'id2', 'id3', 'id4']
# create a collection
collection = vector_store.create_or_get_collection("test_collection")
# add ids, documents and embeddings to the collection
collection.add(
    ids=ids,
    documents=sentences,
    embedding_model=model
)
# retrieve all data from the collection (uncomment to use)
# result = collection.get_all()
# print(result)
# query the collection with the embeddings of two example sentences
res = model.encode(['healthy foods', 'I eat mango'])
result = collection.query(query_embeddings=res)
print(result)
```
### Embeddings
Embeddings, in the context of machine learning and natural language processing (NLP), refer to numerical representations of words, sentences, or documents in a high-dimensional space.
The <b>VectorMass</b> database uses [<b>Sentence Transformer</b>](https://www.sbert.net/) embeddings by default. So far, it supports only the embedding models available in [<b>Sentence Transformer</b>](https://www.sbert.net/).
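As a rough illustration of what "a numerical representation of a sentence" means, the sketch below maps sentences to fixed-length arrays with a toy hashed bag-of-words scheme. This is only a stand-in: the `toy_embed` function and its `dim` parameter are hypothetical, and the result captures none of the semantic meaning that a Sentence Transformer model provides.

```python
import zlib
import numpy as np

def toy_embed(sentence, dim=8):
    """Map a sentence to a fixed-length vector by counting hashed words.

    Toy stand-in for a real embedding model, for illustration only.
    """
    vec = np.zeros(dim)
    for word in sentence.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec

sentences = ["I eat mango", "fruits are good for health"]
# One row per sentence, one column per feature
embeddings = np.stack([toy_embed(s) for s in sentences])
print(embeddings.shape)  # (2, 8)
```

Whatever the sentence length, every embedding has the same number of dimensions, which is what lets a vector store compare them with a single similarity metric.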
### License
[Apache 2.0](https://en.wikipedia.org/wiki/Apache_License)
Raw data
{
"_id": null,
"home_page": "https://github.com/dineshpiyasamara/VectorMass",
"name": "VectorMass",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "vector database,vector store",
"author": "Dinesh Piyasamara",
"author_email": "dineshpiyasamara@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/08/77/35bc3ba4e6fa3189d018b32088bfe6a8757d76e66cee24883f91fd008bbb/VectorMass-0.0.10.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Highly flexible vector store",
"version": "0.0.10",
"project_urls": {
"Homepage": "https://github.com/dineshpiyasamara/VectorMass"
},
"split_keywords": [
"vector database",
"vector store"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "087735bc3ba4e6fa3189d018b32088bfe6a8757d76e66cee24883f91fd008bbb",
"md5": "e0040024c6569fc27de5ad980b044762",
"sha256": "80bdb84a177c7cdcce7ed0d00a65127d8b47542e56bfa7a2502cc5d08b94bba3"
},
"downloads": -1,
"filename": "VectorMass-0.0.10.tar.gz",
"has_sig": false,
"md5_digest": "e0040024c6569fc27de5ad980b044762",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 11615,
"upload_time": "2023-12-31T11:54:20",
"upload_time_iso_8601": "2023-12-31T11:54:20.945465Z",
"url": "https://files.pythonhosted.org/packages/08/77/35bc3ba4e6fa3189d018b32088bfe6a8757d76e66cee24883f91fd008bbb/VectorMass-0.0.10.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-31 11:54:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dineshpiyasamara",
"github_project": "VectorMass",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "vectormass"
}