Name | pgvector |
Version | 0.3.6 |
home_page | None |
Summary | pgvector support for Python |
upload_time | 2024-10-27 00:15:09 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.8 |
license | MIT |
requirements | No requirements were recorded. |
# pgvector-python
[pgvector](https://github.com/pgvector/pgvector) support for Python
Supports [Django](https://github.com/django/django), [SQLAlchemy](https://github.com/sqlalchemy/sqlalchemy), [SQLModel](https://github.com/tiangolo/sqlmodel), [Psycopg 3](https://github.com/psycopg/psycopg), [Psycopg 2](https://github.com/psycopg/psycopg2), [asyncpg](https://github.com/MagicStack/asyncpg), and [Peewee](https://github.com/coleifer/peewee)
[![Build Status](https://github.com/pgvector/pgvector-python/actions/workflows/build.yml/badge.svg)](https://github.com/pgvector/pgvector-python/actions)
## Installation
Run:
```sh
pip install pgvector
```
And follow the instructions for your database library:
- [Django](#django)
- [SQLAlchemy](#sqlalchemy)
- [SQLModel](#sqlmodel)
- [Psycopg 3](#psycopg-3)
- [Psycopg 2](#psycopg-2)
- [asyncpg](#asyncpg)
- [Peewee](#peewee)
Or check out some examples:
- [Embeddings](https://github.com/pgvector/pgvector-python/blob/master/examples/openai/example.py) with OpenAI
- [Binary embeddings](https://github.com/pgvector/pgvector-python/blob/master/examples/cohere/example.py) with Cohere
- [Sentence embeddings](https://github.com/pgvector/pgvector-python/blob/master/examples/sentence_transformers/example.py) with SentenceTransformers
- [Hybrid search](https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search/rrf.py) with SentenceTransformers (Reciprocal Rank Fusion)
- [Hybrid search](https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search/cross_encoder.py) with SentenceTransformers (cross-encoder)
- [Sparse search](https://github.com/pgvector/pgvector-python/blob/master/examples/sparse_search/example.py) with Transformers
- [Late interaction search](https://github.com/pgvector/pgvector-python/blob/master/examples/colbert/exact.py) with ColBERT
- [Image search](https://github.com/pgvector/pgvector-python/blob/master/examples/image_search/example.py) with PyTorch
- [Image search](https://github.com/pgvector/pgvector-python/blob/master/examples/imagehash/example.py) with perceptual hashing
- [Morgan fingerprints](https://github.com/pgvector/pgvector-python/blob/master/examples/rdkit/example.py) with RDKit
- [Topic modeling](https://github.com/pgvector/pgvector-python/blob/master/examples/gensim/example.py) with Gensim
- [Implicit feedback recommendations](https://github.com/pgvector/pgvector-python/blob/master/examples/implicit/example.py) with Implicit
- [Explicit feedback recommendations](https://github.com/pgvector/pgvector-python/blob/master/examples/surprise/example.py) with Surprise
- [Recommendations](https://github.com/pgvector/pgvector-python/blob/master/examples/lightfm/example.py) with LightFM
- [Horizontal scaling](https://github.com/pgvector/pgvector-python/blob/master/examples/citus/example.py) with Citus
- [Bulk loading](https://github.com/pgvector/pgvector-python/blob/master/examples/loading/example.py) with `COPY`
## Django
Create a migration to enable the extension
```python
from django.db import migrations
from pgvector.django import VectorExtension

class Migration(migrations.Migration):
    operations = [
        VectorExtension()
    ]
```
Add a vector field to your model
```python
from django.db import models
from pgvector.django import VectorField

class Item(models.Model):
    embedding = VectorField(dimensions=3)
```
Also supports `HalfVectorField`, `BitField`, and `SparseVectorField`
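For intuition about what a sparse vector field stores: pgvector's `sparsevec` text format lists only the non-zero elements as `{index:value,...}/dimensions`, with indices starting at 1. The helper below is a minimal sketch of that format. `to_sparsevec_text` is a hypothetical name used for illustration and is not part of pgvector-python, which has its own sparse vector support.

```python
def to_sparsevec_text(dense):
    """Render a dense list in pgvector's sparsevec text format.

    Hypothetical helper for illustration only.
    """
    # Keep only the non-zero elements; sparsevec indices are 1-based.
    pairs = [f"{i}:{v}" for i, v in enumerate(dense, start=1) if v != 0]
    # The total number of dimensions follows the slash.
    return "{" + ",".join(pairs) + "}/" + str(len(dense))


print(to_sparsevec_text([1, 0, 2, 0, 3]))  # {1:1,3:2,5:3}/5
```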
Insert a vector
```python
item = Item(embedding=[1, 2, 3])
item.save()
```
Get the nearest neighbors to a vector
```python
from pgvector.django import L2Distance
Item.objects.order_by(L2Distance('embedding', [3, 1, 2]))[:5]
```
Also supports `MaxInnerProduct`, `CosineDistance`, `L1Distance`, `HammingDistance`, and `JaccardDistance`
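To make the ordering concrete, here is a plain-Python sketch of what the main vector distance functions compute. It is not how pgvector implements them (that happens inside Postgres); the function names simply mirror the ones above. Note that pgvector's inner product operator returns the negative inner product, so that smaller values mean closer matches.

```python
import math


def l2_distance(a, b):
    # Euclidean distance (pgvector operator <->)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_inner_product(a, b):
    # Negative inner product (operator <#>): smaller means more similar
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    # 1 - cosine similarity (operator <=>)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def l1_distance(a, b):
    # Taxicab distance (operator <+>)
    return sum(abs(x - y) for x, y in zip(a, b))


stored, query = [1, 2, 3], [3, 1, 2]
print(l2_distance(stored, query))        # sqrt(6), about 2.449
print(max_inner_product(stored, query))  # -11
print(cosine_distance(stored, query))    # 1 - 11/14, about 0.214
print(l1_distance(stored, query))        # 4
```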
Get the distance
```python
Item.objects.annotate(distance=L2Distance('embedding', [3, 1, 2]))
```
Get items within a certain distance
```python
Item.objects.alias(distance=L2Distance('embedding', [3, 1, 2])).filter(distance__lt=5)
```
Average vectors
```python
from django.db.models import Avg
Item.objects.aggregate(Avg('embedding'))
```
Also supports `Sum`
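Both aggregates work element by element. As a rough sketch of the result you can expect (plain Python, with hypothetical helper names, not the database implementation):

```python
def average_vectors(vectors):
    # Element-wise mean across all vectors (what Avg returns)
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sum_vectors(vectors):
    # Element-wise sum across all vectors (what Sum returns)
    return [sum(column) for column in zip(*vectors)]


print(average_vectors([[1, 2, 3], [3, 1, 2]]))  # [2.0, 1.5, 2.5]
print(sum_vectors([[1, 2, 3], [3, 1, 2]]))      # [4, 3, 5]
```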
Add an approximate index
```python
from pgvector.django import HnswIndex, IvfflatIndex

class Item(models.Model):
    class Meta:
        indexes = [
            HnswIndex(
                name='my_index',
                fields=['embedding'],
                m=16,
                ef_construction=64,
                opclasses=['vector_l2_ops']
            ),
            # or
            IvfflatIndex(
                name='my_index',
                fields=['embedding'],
                lists=100,
                opclasses=['vector_l2_ops']
            )
        ]
```
Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
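When picking `lists` for an IVFFlat index, the pgvector documentation suggests starting from `rows / 1000` for up to 1M rows and `sqrt(rows)` beyond that, and starting from `sqrt(lists)` for the query-time `probes` setting. The sketch below encodes those starting points. The function names are hypothetical, and the values are only a first guess to tune against your own recall and speed measurements.

```python
import math


def suggested_lists(rows):
    # Starting point from the pgvector docs: rows / 1000 up to 1M rows,
    # sqrt(rows) above that. Always use at least one list.
    if rows <= 1_000_000:
        return max(1, rows // 1000)
    return math.isqrt(rows)


def suggested_probes(lists):
    # Starting point for the query-time ivfflat.probes setting: sqrt(lists)
    return max(1, math.isqrt(lists))


print(suggested_lists(100_000))    # 100
print(suggested_lists(4_000_000))  # 2000
print(suggested_probes(100))       # 10
```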
## SQLAlchemy
Enable the extension
```python
from sqlalchemy import text

session.execute(text('CREATE EXTENSION IF NOT EXISTS vector'))
```
Add a vector column
```python
from pgvector.sqlalchemy import Vector
from sqlalchemy.orm import mapped_column

class Item(Base):
    embedding = mapped_column(Vector(3))
```
Also supports `HALFVEC`, `BIT`, and `SPARSEVEC`
Insert a vector
```python
item = Item(embedding=[1, 2, 3])
session.add(item)
session.commit()
```
Get the nearest neighbors to a vector
```python
session.scalars(select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5))
```
Also supports `max_inner_product`, `cosine_distance`, `l1_distance`, `hamming_distance`, and `jaccard_distance`
Get the distance
```python
session.scalars(select(Item.embedding.l2_distance([3, 1, 2])))
```
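The values returned are distances, so smaller means closer. If you want similarities instead, cosine similarity is `1 - cosine_distance`, and the inner product is the negation of the `max_inner_product` value. A small sketch with hypothetical helper names:

```python
def cosine_similarity_from_distance(distance):
    # cosine_distance is defined as 1 - cosine similarity
    return 1 - distance


def inner_product_from_max_inner_product(value):
    # max_inner_product returns the negative inner product
    return -value


print(cosine_similarity_from_distance(0.25))       # 0.75
print(inner_product_from_max_inner_product(-11))   # 11
```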
Get items within a certain distance
```python
session.scalars(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))
```
Average vectors
```python
from pgvector.sqlalchemy import avg
session.scalars(select(avg(Item.embedding))).first()
```
Also supports `sum`
Add an approximate index
```python
from sqlalchemy import Index

index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
# or
index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='ivfflat',
    postgresql_with={'lists': 100},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)

index.create(engine)
```
Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
## SQLModel
Enable the extension
```python
from sqlalchemy import text

session.exec(text('CREATE EXTENSION IF NOT EXISTS vector'))
```
Add a vector column
```python
from typing import Any
from pgvector.sqlalchemy import Vector
from sqlalchemy import Column
from sqlmodel import Field, SQLModel

class Item(SQLModel, table=True):
    embedding: Any = Field(sa_column=Column(Vector(3)))
```
Also supports `HALFVEC`, `BIT`, and `SPARSEVEC`
Insert a vector
```python
item = Item(embedding=[1, 2, 3])
session.add(item)
session.commit()
```
Get the nearest neighbors to a vector
```python
session.exec(select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5))
```
Also supports `max_inner_product`, `cosine_distance`, `l1_distance`, `hamming_distance`, and `jaccard_distance`
Get the distance
```python
session.exec(select(Item.embedding.l2_distance([3, 1, 2])))
```
Get items within a certain distance
```python
session.exec(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))
```
Average vectors
```python
from pgvector.sqlalchemy import avg
session.exec(select(avg(Item.embedding))).first()
```
Also supports `sum`
Add an approximate index
```python
from sqlalchemy import Index
index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
# or
index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='ivfflat',
    postgresql_with={'lists': 100},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)

index.create(engine)
```
Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
## Psycopg 3
Enable the extension
```python
conn.execute('CREATE EXTENSION IF NOT EXISTS vector')
```
Register the vector type with your connection
```python
from pgvector.psycopg import register_vector
register_vector(conn)
```
For [async connections](https://www.psycopg.org/psycopg3/docs/advanced/async.html), use
```python
from pgvector.psycopg import register_vector_async
await register_vector_async(conn)
```
Create a table
```python
conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')
```
Insert a vector
```python
import numpy as np

embedding = np.array([1, 2, 3])
conn.execute('INSERT INTO items (embedding) VALUES (%s)', (embedding,))
```
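Registering the type lets the driver convert NumPy arrays for you. Under the hood, pgvector also accepts a vector as text in the form `[1,2,3]`, which can be cast with `::vector` in SQL. The sketch below shows that text format in plain Python. `to_vector_literal` and `from_vector_literal` are hypothetical helpers for illustration, not functions from pgvector-python.

```python
def to_vector_literal(values):
    # pgvector's text format: comma-separated numbers in square brackets
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def from_vector_literal(text):
    # Strip the brackets and parse each element back to a float
    return [float(v) for v in text[1:-1].split(",")]


print(to_vector_literal([1, 2, 3]))    # [1.0,2.0,3.0]
print(from_vector_literal("[1,2,3]"))  # [1.0, 2.0, 3.0]
```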
Get the nearest neighbors to a vector
```python
conn.execute('SELECT * FROM items ORDER BY embedding <-> %s LIMIT 5', (embedding,)).fetchall()
```
Add an approximate index
```python
conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
conn.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
```
Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
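An index only helps queries that order by the operator matching its operator class, so the two need to be chosen together. The sketch below pairs each distance with its pgvector operator and operator class and builds the corresponding SQL strings. The `OPCLASSES` table and both function names are hypothetical, and the table and column names are interpolated directly, so only pass trusted identifiers.

```python
# Distance name -> (query operator, index operator class) for the vector type
OPCLASSES = {
    "l2": ("<->", "vector_l2_ops"),
    "inner_product": ("<#>", "vector_ip_ops"),
    "cosine": ("<=>", "vector_cosine_ops"),
}


def index_sql(table, column, metric, method="hnsw"):
    # Build a CREATE INDEX statement using the opclass for the metric
    _, opclass = OPCLASSES[metric]
    return f"CREATE INDEX ON {table} USING {method} ({column} {opclass})"


def nearest_sql(table, column, metric, limit=5):
    # Build the matching nearest-neighbor query; %s is the driver placeholder
    operator, _ = OPCLASSES[metric]
    return f"SELECT * FROM {table} ORDER BY {column} {operator} %s LIMIT {limit}"


print(index_sql("items", "embedding", "cosine"))
print(nearest_sql("items", "embedding", "cosine"))
```

The resulting strings can be passed to `conn.execute` in the same way as the statements above.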
## Psycopg 2
Enable the extension
```python
cur = conn.cursor()
cur.execute('CREATE EXTENSION IF NOT EXISTS vector')
```
Register the vector type with your connection or cursor
```python
from pgvector.psycopg2 import register_vector
register_vector(conn)
```
Create a table
```python
cur.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')
```
Insert a vector
```python
import numpy as np

embedding = np.array([1, 2, 3])
cur.execute('INSERT INTO items (embedding) VALUES (%s)', (embedding,))
```
Get the nearest neighbors to a vector
```python
cur.execute('SELECT * FROM items ORDER BY embedding <-> %s LIMIT 5', (embedding,))
cur.fetchall()
```
Add an approximate index
```python
cur.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
cur.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
```
Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
## asyncpg
Enable the extension
```python
await conn.execute('CREATE EXTENSION IF NOT EXISTS vector')
```
Register the vector type with your connection
```python
from pgvector.asyncpg import register_vector
await register_vector(conn)
```
or your pool
```python
async def init(conn):
    await register_vector(conn)

pool = await asyncpg.create_pool(..., init=init)
```
Create a table
```python
await conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')
```
Insert a vector
```python
import numpy as np

embedding = np.array([1, 2, 3])
await conn.execute('INSERT INTO items (embedding) VALUES ($1)', embedding)
```
Get the nearest neighbors to a vector
```python
await conn.fetch('SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5', embedding)
```
Add an approximate index
```python
await conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
await conn.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
```
Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
## Peewee
Add a vector column
```python
from pgvector.peewee import VectorField

class Item(BaseModel):
    embedding = VectorField(dimensions=3)
```
Also supports `HalfVectorField`, `FixedBitField`, and `SparseVectorField`
Insert a vector
```python
item = Item.create(embedding=[1, 2, 3])
```
Get the nearest neighbors to a vector
```python
Item.select().order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5)
```
Also supports `max_inner_product`, `cosine_distance`, `l1_distance`, `hamming_distance`, and `jaccard_distance`
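`hamming_distance` and `jaccard_distance` apply to bit vectors rather than float vectors. As a plain-Python sketch of what they measure, using bit strings such as `'101'`: the functions below are illustrative stand-ins, not the database implementation, and returning 1.0 when neither input has a set bit is an assumption of this sketch.

```python
def hamming_distance(a, b):
    # Number of positions where the two bit strings differ
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a, b):
    # 1 - (bits set in both) / (bits set in either)
    both = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    if either == 0:
        return 1.0  # assumption for the all-zero case in this sketch
    return 1 - both / either


print(hamming_distance("101", "111"))  # 1
print(jaccard_distance("101", "111"))  # 1 - 2/3, about 0.333
```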
Get the distance
```python
Item.select(Item.embedding.l2_distance([3, 1, 2]).alias('distance'))
```
Get items within a certain distance
```python
Item.select().where(Item.embedding.l2_distance([3, 1, 2]) < 5)
```
Average vectors
```python
from peewee import fn
Item.select(fn.avg(Item.embedding).coerce(True)).scalar()
```
Also supports `sum`
Add an approximate index
```python
Item.add_index('embedding vector_l2_ops', using='hnsw')
```
Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
## History
View the [changelog](https://github.com/pgvector/pgvector-python/blob/master/CHANGELOG.md)
## Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- [Report bugs](https://github.com/pgvector/pgvector-python/issues)
- Fix bugs and [submit pull requests](https://github.com/pgvector/pgvector-python/pulls)
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
```sh
git clone https://github.com/pgvector/pgvector-python.git
cd pgvector-python
pip install -r requirements.txt
createdb pgvector_python_test
pytest
```
To run an example:
```sh
cd examples/loading
pip install -r requirements.txt
createdb pgvector_example
python3 example.py
```
Raw data
{
"_id": null,
"home_page": null,
"name": "pgvector",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Andrew Kane <andrew@ankane.org>",
"download_url": "https://files.pythonhosted.org/packages/7d/d8/fd6009cee3e03214667df488cdcf9609461d729968da94e4f95d6359d304/pgvector-0.3.6.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "pgvector support for Python",
"version": "0.3.6",
"project_urls": {
"Homepage": "https://github.com/pgvector/pgvector-python"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fb81f457d6d361e04d061bef413749a6e1ab04d98cfeec6d8abcfe40184750f3",
"md5": "11cc5c106b944a189628475315d22986",
"sha256": "f6c269b3c110ccb7496bac87202148ed18f34b390a0189c783e351062400a75a"
},
"downloads": -1,
"filename": "pgvector-0.3.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "11cc5c106b944a189628475315d22986",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 24880,
"upload_time": "2024-10-27T00:15:08",
"upload_time_iso_8601": "2024-10-27T00:15:08.045913Z",
"url": "https://files.pythonhosted.org/packages/fb/81/f457d6d361e04d061bef413749a6e1ab04d98cfeec6d8abcfe40184750f3/pgvector-0.3.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7dd8fd6009cee3e03214667df488cdcf9609461d729968da94e4f95d6359d304",
"md5": "1ebc119e877e9de54cb4c50e0ebc0bf0",
"sha256": "31d01690e6ea26cea8a633cde5f0f55f5b246d9c8292d68efdef8c22ec994ade"
},
"downloads": -1,
"filename": "pgvector-0.3.6.tar.gz",
"has_sig": false,
"md5_digest": "1ebc119e877e9de54cb4c50e0ebc0bf0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 25421,
"upload_time": "2024-10-27T00:15:09",
"upload_time_iso_8601": "2024-10-27T00:15:09.632057Z",
"url": "https://files.pythonhosted.org/packages/7d/d8/fd6009cee3e03214667df488cdcf9609461d729968da94e4f95d6359d304/pgvector-0.3.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-27 00:15:09",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pgvector",
"github_project": "pgvector-python",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "pgvector"
}