EmbedMan

- Name: EmbedMan
- Version: 0.0.2
- Home page: https://github.com/chigwell/EmbedMan
- Summary: A tool for managing embeddings for code analysis
- Upload time: 2024-01-27 11:38:23
- Author: Eugene Evstafev
- Requires Python: >=3.6
- Requirements: No requirements were recorded.

[![PyPI version](https://badge.fury.io/py/embedman.svg)](https://badge.fury.io/py/embedman)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://static.pepy.tech/badge/embedman)](https://pepy.tech/project/embedman)

# EmbedMan

`EmbedMan` is a Python package for managing embeddings for code analysis. It generates embeddings from the code files in a specified directory, using a language model and an embedding store, and returns a retriever for querying them.

## Installation

To install `EmbedMan`, you can use pip:

```bash
pip install embedman
```

## Usage

### As a Python Module

After installation, `EmbedMan` can be imported and used in your Python projects.

Example:

```python
from embed_man import EmbedManager

# Initialize the EmbedManager with desired parameters
embed_manager = EmbedManager(
    path="path/to/your/code/directory",
    glob_rule="**/*.py",
    use_cache=True
)

# Run the embedding process and get a retriever for querying embeddings
retriever = embed_manager.run()

# You can now use the retriever to query embeddings
```

### Configurable Parameters

`EmbedMan` exposes several parameters for tailoring the embedding process, including:

- `path`: The directory path to scan for documents.
- `glob_rule`: Glob pattern to match files within the directory.
- `suffixes`: List of file suffixes to include.
- `exclude`: List of patterns to exclude.
- `language`: Programming language of the documents.
- `parser_threshold`: Threshold for the parser to consider a document valid.
- `chunk_size`: Size of chunks to split documents into for embedding.
- `chunk_overlap`: Overlap between consecutive chunks.
- `cache_dir`: Directory path for caching embeddings.
- `namespace_cache`: Namespace for the cache to avoid collisions.
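
To make the file-selection and chunking parameters more concrete, the sketch below shows one plausible reading of them in plain Python. It is not `EmbedMan`'s implementation: the helper names `select_files` and `split_into_chunks` are hypothetical, the character-based splitting is an assumption (the package may split by tokens or by language-aware boundaries), and the way `exclude` patterns are matched against relative paths is likewise assumed.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import List, Sequence


def select_files(
    root: str,
    glob_rule: str = "**/*.py",
    suffixes: Sequence[str] = (".py",),
    exclude: Sequence[str] = (),
) -> List[Path]:
    """Hypothetical helper: files under root matching glob_rule and suffixes, minus exclusions."""
    selected = []
    for path in sorted(Path(root).glob(glob_rule)):
        # Keep regular files whose extension is in the allowed suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Assumption: exclude patterns are matched against the POSIX-style relative path.
        relative = path.relative_to(root).as_posix()
        if any(fnmatch(relative, pattern) for pattern in exclude):
            continue
        selected.append(path)
    return selected


def split_into_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Hypothetical helper: character windows of chunk_size sharing chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

Under these assumptions, a larger `chunk_overlap` repeats more context between neighbouring chunks at the cost of producing more chunks to embed, so the two values are usually tuned together.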

## Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the [issues page](https://github.com/chigwell/embedman/issues).

## License

[MIT](https://choosealicense.com/licenses/mit/)

            
