lantern-pinecone


| Field | Value |
| --- | --- |
| Name | lantern-pinecone |
| Version | 0.0.8 |
| Home page | https://github.com/lanterndata/lantern-python |
| Summary | Pinecone-compatible client for Lantern |
| Upload time | 2024-11-06 11:58:34 |
| Maintainer | None |
| Docs URL | None |
| Author | Varik Matevosyan |
| Requires Python | `>=3.8` |
| License | MIT |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# Pinecone-API-Compatible Lantern Client

## Install

```sh
pip install lantern-pinecone
```

## Sync from Pinecone to Lantern

```python
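from getpass import getpass

import lantern_pinecone

# Connect to the Postgres database used by Lantern
lantern_pinecone.init('postgres://postgres@localhost:5432')

# IDs of the Pinecone vectors to copy ('0' through '99999')
pinecone_ids = [str(i) for i in range(100000)]

# Copy the vectors into a Lantern table named after the index
index = lantern_pinecone.create_from_pinecone(
    api_key=getpass("Pinecone API Key: "),
    environment="us-east-1-aws",
    index_name="sift100k",
    namespace="",
    pinecone_ids=pinecone_ids,
    recreate=True,
    create_lantern_index=True)

index.describe_index_stats()

# Query the 10 nearest neighbors of the vector stored under ID '45500'
index.query(top_k=10, id='45500', namespace="")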
```

> **_NOTE:_** If you pass `create_lantern_index=False`, only the data is copied, into a table named after your index (`sift100k` in this example), and you can create the index later outside this client. Without the index, most index operations are not available through this client.
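
As a rough sketch of what creating the index "externally" could look like, the helper below builds a `CREATE INDEX` statement that any SQL client could run after the data has been copied. It assumes Lantern's `lantern_hnsw` access method, the `dist_l2sq_ops` and `dist_cos_ops` operator classes, and the `dim` index option, none of which are described in this README. `build_create_index_sql` is a hypothetical helper, not part of `lantern_pinecone`, and whether this client recognizes an index created this way is not confirmed here.

```python
import re

# Assumption: mapping from Pinecone-style metric names to Lantern operator classes.
_OPERATOR_CLASSES = {
    "euclidean": "dist_l2sq_ops",
    "cosine": "dist_cos_ops",
}

# Only plain identifiers are accepted, because names are interpolated into SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_create_index_sql(table, column="embedding", metric="cosine", dim=128):
    """Return a CREATE INDEX statement for a table copied without an index.

    Hypothetical helper: the access method, operator classes, and `dim`
    option are assumptions about Lantern, not taken from this README.
    """
    for name in (table, column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if metric not in _OPERATOR_CLASSES:
        raise ValueError(f"unsupported metric: {metric!r}")
    ops = _OPERATOR_CLASSES[metric]
    return (
        f"CREATE INDEX ON {table} USING lantern_hnsw ({column} {ops}) "
        f"WITH (dim={int(dim)});"
    )


# SIFT descriptors are 128-dimensional, hence dim=128 for the sift100k table.
sql = build_create_index_sql("sift100k", metric="euclidean", dim=128)
```

The resulting string can be passed to the `execute` method of a database cursor. Validating the identifiers up front keeps the string interpolation from turning into an injection risk.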

## Extract Metadata Fields

When copying from Pinecone, the client creates a table with this structure: `(id TEXT, embedding REAL[], metadata jsonb)`.
If you plan to use the index from raw SQL clients, you may want to extract the metadata into separate columns so that queries over your metadata fields are simpler and easier to read.
For example, if the metadata has the shape `{ "title": string, "description": string }`, you can extract it with this query:

```sql
BEGIN;
ALTER TABLE sift100k
ADD COLUMN title TEXT,
ADD COLUMN description TEXT;

-- Update the new columns with data extracted from the JSONB column
UPDATE sift100k
SET
  title = metadata->>'title',
  description = metadata->>'description';


-- Optionally drop the metadata column
ALTER TABLE sift100k DROP COLUMN metadata;

COMMIT;
```

After doing this, your index will most likely be incompatible with this Python client, and you should access it through a raw SQL client such as `psycopg2`.
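
When the metadata has more than a couple of fields, the same statements can be generated rather than written by hand. The following is a minimal sketch; `build_extract_statements` is a hypothetical helper that is not part of `lantern_pinecone`, and it assumes every extracted field should become a `TEXT` column, as in the query above.

```python
import re

# Only plain identifiers are accepted, because names are interpolated into SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_extract_statements(table, fields, drop_metadata=False):
    """Return SQL statements that copy JSONB metadata fields into TEXT columns.

    Hypothetical helper, shown for illustration only.
    """
    if not fields:
        raise ValueError("at least one metadata field is required")
    for name in (table, *fields):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    add_columns = ", ".join(f"ADD COLUMN {field} TEXT" for field in fields)
    assignments = ", ".join(f"{field} = metadata->>'{field}'" for field in fields)
    statements = [
        f"ALTER TABLE {table} {add_columns};",
        f"UPDATE {table} SET {assignments};",
    ]
    # Dropping the metadata column is optional, as in the hand-written query.
    if drop_metadata:
        statements.append(f"ALTER TABLE {table} DROP COLUMN metadata;")
    return statements


statements = build_extract_statements("sift100k", ["title", "description"])
```

The helper leaves out `BEGIN` and `COMMIT` on purpose. With `psycopg2`, running each statement through `cursor.execute` inside a `with conn:` block keeps them in one transaction that commits on success and rolls back on error.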

## Index operations

```python
import os
import lantern_pinecone
import pandas as pd

LANTERN_DB_URL = os.environ.get('LANTERN_DB_URL') or 'postgres://postgres@localhost:5432'
lantern_pinecone.init(LANTERN_DB_URL)

# Giving our index a name
index_name = "hello-lantern"

# Delete the index, if an index of the same name already exists
if index_name in lantern_pinecone.list_indexes():
    lantern_pinecone.delete_index(index_name)


dimensions = 3
lantern_pinecone.create_index(name=index_name, dimension=dimensions, metric="cosine")
index = lantern_pinecone.Index(index_name=index_name)


df = pd.DataFrame(
    data={
        "id": ["A", "B"],
        "vector": [[1., 1., 1.], [1., 2., 3.]]
    })

# Insert vectors
index.upsert(vectors=zip(df.id, df.vector))

index.describe_index_stats()

index.query(
    vector=[2., 2., 2.],
    top_k=5,
    include_values=True) # returns top_k matches


lantern_pinecone.delete_index(index_name)
```
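
To see which match a `cosine` index should rank first for the vectors above, the similarity can be computed by hand. This sketch is independent of the client and uses only the standard library; whether `index.query` reports a similarity or a distance in its scores is not stated in this README, so only the relative order is meaningful here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same vectors and query as in the example above
vectors = {"A": [1.0, 1.0, 1.0], "B": [1.0, 2.0, 3.0]}
query = [2.0, 2.0, 2.0]

scores = {key: cosine_similarity(vec, query) for key, vec in vectors.items()}

# Highest similarity first: "A" is parallel to the query (similarity 1 up to
# rounding), while "B" comes out at roughly 0.926.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because only two vectors were inserted, `top_k=5` can return at most two matches.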

            

        