| Field | Value |
| --- | --- |
| Name | mirk |
| Version | 0.1.1 |
| Home page | https://github.com/CuriousDima/mirk |
| Summary | Mirk is a vision-reasoning pipeline designed to interpret scenes only when needed. |
| Upload time | 2024-12-08 01:09:53 |
| Author | Dima Timofeev |
| Requires Python | <4.0,>=3.11 |
| License | MIT |
| Keywords | vision, ai, reasoning |

# Mirk

**Mirk** is a library and a pipeline that combines classical Computer Vision (CV) models with Large Visual Models (LVMs) to provide detailed analysis and understanding of a video. The classical CV model handles initial processing and object detection, while the LVM generates rich, contextual interpretations of the visual content.

## Overview

Mirk works by:

1. Taking an input video.
2. Using a CV model to detect objects of interest. The user specifies which object classes to look for.
3. Triggering an LVM when a specified object is detected. The LVM answers the user's question about the frame, reasoning about the detected object and its context.
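
The gating idea in the steps above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not Mirk's actual API: `Detection`, `Finding`, `first_finding`, and the `detect` / `describe` callables are hypothetical names, with toy stand-ins in place of a real detector and a real vision-language model. Stopping at the first hit and the `min_confidence` threshold are also assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Detection:
    label: str         # class name reported by the detector, e.g. "person"
    confidence: float  # detector score in [0, 1]


@dataclass
class Finding:
    frame_index: int
    detection: Detection
    answer: str


def first_finding(
    frames: Iterable[object],
    detect: Callable[[object], list[Detection]],
    describe: Callable[[object, str], str],
    classes_of_interest: set[str],
    question: str,
    min_confidence: float = 0.5,  # assumed threshold, not taken from Mirk
) -> Optional[Finding]:
    """Run the cheap detector on every frame; call the costly model only on a hit."""
    for index, frame in enumerate(frames):
        for det in detect(frame):
            if det.label in classes_of_interest and det.confidence >= min_confidence:
                # Only now is the expensive vision-language call made.
                return Finding(index, det, describe(frame, question))
    return None


# Toy stand-ins so the sketch runs without any model or network access.
describe_calls = []


def fake_detect(frame):
    return [Detection("person", 0.88)] if frame == "players" else []


def fake_describe(frame, question):
    describe_calls.append(frame)
    return f"(model answer to {question!r} for frame {frame!r})"


frames = ["empty", "empty", "players", "players"]
finding = first_finding(
    frames,
    fake_detect,
    fake_describe,
    {"person"},
    "What are the people doing in the image?",
)
print(finding.frame_index, finding.detection.label, finding.detection.confidence)
```

The expensive call runs once, on the first frame that contains a class of interest. Every other frame costs only a detector pass.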

## Installation

```bash
pip install mirk
```

## Quick Start

Check out the [example](examples/one_shot.ipynb) to see how to use Mirk.

For your convenience, we provide a [bash script](examples/one_shot.sh) that downloads a sample video and runs the one-shot example:

```bash
cd examples
./one_shot.sh
```

The script produces output similar to the following:

```text
[download] Destination: input/selective_attention_test.mp4
...
[download] 100% of    2.63MiB in 00:00:00 at 5.80MiB/s
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to '.../mirk/mirk/models/yolo11n.pt'...
100%|███████| 5.35M/5.35M [00:00<00:00, 7.07MB/s]

video 1/1 (frame 1/2447) .../mirk/examples/input/selective_attention_test.mp4: 480x640 (no detections), 172.3ms
video 1/1 (frame 2/2447) .../mirk/examples/input/selective_attention_test.mp4: 480x640 (no detections), 145.1ms
video 1/1 (frame 3/2447) .../mirk/examples/input/selective_attention_test.mp4: 480x640 (no detections), 134.0ms
...
video 1/1 (frame 361/2447) .../mirk/examples/input/selective_attention_test.mp4: 480x640 5 persons, 160.9ms
Found person in frame 360 with confidence 0.88
Saved frame to: output/detected_person_frame_360.jpg

Question: What are the people doing in the image?
Answer: The people in the image are playing with basketballs, passing them to each other. There is a group of individuals, and some are walking while others are engaged in the activity. It's a scene from a well-known experiment involving selective attention.
```

## Credentials

Mirk uses the following APIs:

- [YOLO](https://docs.ultralytics.com/quickstart/)
- [OpenAI](https://platform.openai.com/docs/api-reference/introduction)

You need to set up your own credentials for the OpenAI API; see the [.env.example](.env.example) file.  
You don't need to set up credentials for YOLO.
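
As a rough illustration of the credential setup, the sketch below resolves an API key from the process environment, falling back to a local `.env` file. It is built on assumptions: the contents of `.env.example` are not shown here, so the variable name `OPENAI_API_KEY` (the name the official OpenAI Python client reads by default), the `.env` file name, and the helpers `load_env_file` and `resolve_api_key` are all hypothetical and not part of Mirk.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(env_file: str = ".env", name: str = "OPENAI_API_KEY") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get(name) or load_env_file(env_file).get(name)
    if not key:
        # Assumed workflow: the user fills in a .env file based on .env.example.
        raise RuntimeError(f"{name} is not set; see .env.example")
    return key
```

Failing early with a clear message avoids running the detector over a whole video only to hit an authentication error at the first LVM call.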
