embcli-jina


- Name: embcli-jina
- Version: 0.1.3
- Summary: jina plugin for embcli
- Upload time: 2025-08-02 06:47:45
- Requires Python: >=3.10
- License: Apache-2.0
- Keywords: cli, embeddings, llm, nlp

# embcli-jina

[![PyPI](https://img.shields.io/pypi/v/embcli-jina?label=PyPI)](https://pypi.org/project/embcli-jina/)
![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/mocobeta/embcli/ci-jina.yml?logo=github&label=tests)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/embcli-jina)

jina plugin for embcli, a command-line interface for embeddings.

## Reference

- [Jina Models](https://jina.ai/models)

## Installation

```bash
pip install embcli-jina
```

## Quick Start

You need a Jina API key to use this plugin. Set the `JINA_API_KEY` environment variable in a `.env` file in the current directory, or pass the path to an env file with the `-e` option.

```bash
cat .env
JINA_API_KEY=<YOUR_JINA_KEY>
```
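Before running `emb`, it can help to confirm that the env file actually defines the key. The following is a minimal sketch using only the Python standard library. It parses simple `KEY=VALUE` lines and is not how embcli itself loads the file; the helper names `read_env_file` and `has_jina_key` are hypothetical.

```python
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_jina_key(path: str = ".env") -> bool:
    """Return True if the file exists and defines a non-empty JINA_API_KEY."""
    if not Path(path).is_file():
        return False
    return bool(read_env_file(path).get("JINA_API_KEY"))
```

Calling `has_jina_key()` from the directory where you run `emb` gives a quick yes/no answer without printing the key itself.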

### Try out the Embedding Models

```bash
# show general usage of emb command.
emb --help

# list all available models.
emb models
JinaEmbeddingModel
    Vendor: jina
    Models:
    * jina-embeddings-v3 (aliases: jina-v3)
    * jina-colbert-v2 (aliases: colbert-v2)
    * jina-embeddings-v2-base-code (aliases: jina-v2-code)
    Model Options:
    * task (str) - Downstream task for which the embeddings are used. Supported tasks: 'text-matching', 'retrieval.query', 'retrieval.passage', 'separation', 'classification'. Only supported in jina-embeddings-v3.
    * late_chunking (bool) - Whether late chunking is applied. Only supported in jina-embeddings-v3.
    * truncate (bool) - When enabled, the model will automatically drop the tail that extends beyond the maximum context length allowed by the model instead of throwing an error. Only supported in jina-embeddings-v3.
    * dimensions (int) - The number of dimensions the resulting output embeddings should have. Only supported in jina-embeddings-v3 and jina-colbert-v2.
    * input_type (str) - The type of input to the model. Supported types: 'query', 'document'. Only supported in jina-colbert-v2.
    * embedding_type (str) - The type of embeddings to return. Options include 'float', 'binary', 'ubinary'. Default is 'float'.
JinaMultiModalModel
    Vendor: jina
    Models:
    * jina-embeddings-v4 (aliases: jina-v4)
    * jina-clip-v2 (aliases: )
    Model Options:
    * task (str) - Downstream task for which the embeddings are used. Supported tasks: 'retrieval.query', 'retrieval.passage', 'text-matching', 'code.query', 'code.passage'.
    * late_chunking (bool) - Whether late chunking is applied. Only supported in jina-embeddings-v4.
    * truncate (bool) - When enabled, the model will automatically drop the tail that extends beyond the maximum context length allowed by the model instead of throwing an error. Only supported in jina-embeddings-v4.
    * dimensions (int) - The number of dimensions the resulting output embeddings should have.
    * embedding_type (str) - The type of embeddings to return. Options include 'float', 'binary', 'ubinary'. Default is 'float'.

# get an embedding for an input text by jina-embeddings-v3 model.
emb embed -m jina-v3 "Embeddings are essential for semantic search and RAG apps."

# get an embedding for an input text by jina-embeddings-v3 model with dimensions=512.
emb embed -m jina-v3 "Embeddings are essential for semantic search and RAG apps." -o dimensions 512

# get an embedding for an input text by jina-embeddings-v3 model with embedding_type=binary.
emb embed -m jina-v3 "Embeddings are essential for semantic search and RAG apps." -o embedding_type binary

# get an embedding for an image input by jina-embeddings-v4 model.
# assume you have an image file named `gingercat.jpeg` in the current directory.
emb embed -m jina-v4 --image gingercat.jpeg

# calculate similarity score between two texts by jina-embeddings-v3 model. the default metric is cosine similarity.
emb simscore -m jina-v3 "The cat drifts toward sleep." "Sleep dances in the cat's eyes."
0.708945856730407
```
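The `simscore` example above reports a cosine similarity. As a reference for what that metric computes, here is a minimal pure-Python sketch; it illustrates the formula only and is not embcli's implementation. The function name `cosine_similarity` is chosen here for illustration.

```python
import math
from collections.abc import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return dot(a, b) / (|a| * |b|) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The result lies between -1 and 1: vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and the score does not depend on vector length.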

### Document Indexing and Search

You can use the `emb` command to index documents and search them. `emb` uses [`LanceDB`](https://github.com/lancedb/lancedb) as the default vector database.

```bash
# index example documents in the current directory.
emb ingest-sample -m jina-v3 -c catcafe --corpus cat-names-en

# or, you can give the path to your documents.
# the documents should be in a CSV file with two columns: id and text. the separator should be a comma.
emb ingest -m jina-v3 -c catcafe -f <path-to-your-documents>

# search for a query in the indexed documents.
emb search -m jina-v3 -c catcafe -q "Who's the naughtiest one?"
Found 5 results:
Score: 0.45097012297560646, Document ID: 12, Text: Leo: Leo, with his magnificent mane-like ruff, carries himself with regal confidence. He is a natural leader, often surveying his domain from the highest point in the room. Affectionate on his own terms, Leo enjoys a good chin scratch and will reward loyalty with his rumbling purr and majestic presence.
Score: 0.4291541094385421, Document ID: 46, Text: Bandit: Bandit is a mischievous cat, often with mask-like markings, always on the lookout for his next playful heist of a toy or treat. He is clever and energetic, loving to chase and pounce. Despite his roguish name, Bandit is a loving companion who enjoys a good cuddle after his adventures.
Score: 0.4137949268906759, Document ID: 20, Text: Pepper: Pepper is a feisty and energetic grey tabby with a spicy personality. She is quick-witted and loves to engage in playful stalking and pouncing games. Pepper is also fiercely independent but will show her affection with sudden bursts of purring and head-butts, keeping her humans on their toes.
Score: 0.40369800611316564, Document ID: 35, Text: Lucy: Lucy is a sweet-natured and playful cat, often a ginger or calico, with a bright personality. She loves attention and will often seek out her humans for cuddles and playtime. Lucy is very expressive, using chirps and meows to communicate her desires, her joyful spirit lighting up the household.
Score: 0.4031877012247693, Document ID: 3, Text: Pippin (Pip): Pippin, or Pip, is a compact dynamo, brimming with mischievous charm and boundless curiosity. He’s an intrepid explorer, always finding new hideouts or investigating forbidden territories with a twinkle in his eye. Quite vocal, Pip will happily chat about his day, his playful antics making him an endearing little rascal.

# multilingual search
emb search -m jina-v3 -c catcafe -q "一番のいたずら者は誰?"
Found 5 results:
Score: 0.41762481997209167, Document ID: 12, Text: Leo: Leo, with his magnificent mane-like ruff, carries himself with regal confidence. He is a natural leader, often surveying his domain from the highest point in the room. Affectionate on his own terms, Leo enjoys a good chin scratch and will reward loyalty with his rumbling purr and majestic presence.
Score: 0.40111028920595193, Document ID: 46, Text: Bandit: Bandit is a mischievous cat, often with mask-like markings, always on the lookout for his next playful heist of a toy or treat. He is clever and energetic, loving to chase and pounce. Despite his roguish name, Bandit is a loving companion who enjoys a good cuddle after his adventures.
Score: 0.37882908929187215, Document ID: 20, Text: Pepper: Pepper is a feisty and energetic grey tabby with a spicy personality. She is quick-witted and loves to engage in playful stalking and pouncing games. Pepper is also fiercely independent but will show her affection with sudden bursts of purring and head-butts, keeping her humans on their toes.
Score: 0.3777527161730029, Document ID: 22, Text: Simba: Simba, true to his namesake, possesses a brave and noble spirit, often seen patrolling his territory. He is a confident and affectionate leader of his household pride. While he enjoys playful roughhousing, Simba is also a gentle giant, offering comforting purrs and loyal companionship to his beloved humans.
Score: 0.37738051225556507, Document ID: 3, Text: Pippin (Pip): Pippin, or Pip, is a compact dynamo, brimming with mischievous charm and boundless curiosity. He’s an intrepid explorer, always finding new hideouts or investigating forbidden territories with a twinkle in his eye. Quite vocal, Pip will happily chat about his day, his playful antics making him an endearing little rascal.
```
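If your documents are not yet in the two-column CSV layout that `emb ingest` expects, Python's `csv` module can produce it. This is a sketch under assumptions: the helper name `write_documents_csv` is hypothetical, and writing an `id,text` header row is an assumption, since the README states only the two columns and the comma separator.

```python
import csv
from collections.abc import Iterable


def write_documents_csv(path: str, documents: Iterable[tuple[int, str]]) -> None:
    """Write (id, text) pairs as a comma-separated file with two columns."""
    # newline="" lets the csv module control line endings; fields that
    # contain commas or quotes are quoted automatically.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)  # the default delimiter is a comma
        writer.writerow(["id", "text"])  # header row: an assumption, see above
        writer.writerows(documents)
```

For example, `write_documents_csv("docs.csv", [(1, "first document"), (2, "second, with a comma")])` writes a file whose path can then be given to the `-f` option of `emb ingest`.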

## Development

See the [main README](https://github.com/mocobeta/embcli/blob/main/README.md) for general development instructions.

### Run Tests

You need a Jina API key to run the tests for the `embcli-jina` package. Pass it as an environment variable, along with `RUN_JINA_TESTS=1`:

```bash
JINA_API_KEY=<YOUR_JINA_KEY> RUN_JINA_TESTS=1 uv run --package embcli-jina pytest packages/embcli-jina/tests/
```

### Run Linter and Formatter

```bash
uv run ruff check --fix packages/embcli-jina
uv run ruff format packages/embcli-jina
```

### Run Type Checker

```bash
uv run --package embcli-jina pyright packages/embcli-jina
```

## Build

```bash
uv build --package embcli-jina
```

## License

Apache License 2.0

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "embcli-jina",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "cli, embeddings, llm, nlp",
    "author": null,
    "author_email": "Tomoko Uchida <tomoko.uchida.1111@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/8c/d3/8f56a1299f6933b9845b2ec54758acc127ec0d4d17ac858544f6be5cdbef/embcli_jina-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "jina plugin for embcli",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://embcli.mocobeta.dev/",
        "Repository": "https://github.com/mocobeta/embcli"
    },
    "split_keywords": [
        "cli",
        " embeddings",
        " llm",
        " nlp"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ea863c6497258764eb372496f7039e3a294afac07527d818a71bbe34f2e69981",
                "md5": "31f8d2a1efc12d5d977ca33dcd97629d",
                "sha256": "492d45ca2c4253eff48c3a1e2e726b123d18ac6626843df851650a295f24c9d7"
            },
            "downloads": -1,
            "filename": "embcli_jina-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "31f8d2a1efc12d5d977ca33dcd97629d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 7998,
            "upload_time": "2025-08-02T06:47:44",
            "upload_time_iso_8601": "2025-08-02T06:47:44.486724Z",
            "url": "https://files.pythonhosted.org/packages/ea/86/3c6497258764eb372496f7039e3a294afac07527d818a71bbe34f2e69981/embcli_jina-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "8cd38f56a1299f6933b9845b2ec54758acc127ec0d4d17ac858544f6be5cdbef",
                "md5": "638af8dfbcd26f3ad2bbff42a9d02ef2",
                "sha256": "a841ada4bac32f67fc0fd8b76c321e8e7c51e9fae928b1a8c70b59d5a0a1b0bf"
            },
            "downloads": -1,
            "filename": "embcli_jina-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "638af8dfbcd26f3ad2bbff42a9d02ef2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 246210,
            "upload_time": "2025-08-02T06:47:45",
            "upload_time_iso_8601": "2025-08-02T06:47:45.590848Z",
            "url": "https://files.pythonhosted.org/packages/8c/d3/8f56a1299f6933b9845b2ec54758acc127ec0d4d17ac858544f6be5cdbef/embcli_jina-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-02 06:47:45",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mocobeta",
    "github_project": "embcli",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "embcli-jina"
}
        