# EmbedMan
`EmbedMan` is a Python package for managing embeddings for code analysis. It generates embeddings from a specified directory of code files and returns a retriever for querying them, relying on language models and an embedding store.
## Installation
Install `EmbedMan` with pip:
```bash
pip install embedman
```
## Usage
### As a Python Module
After installation, `EmbedMan` can be imported and used in your Python projects.
Example:
```python
from embed_man import EmbedManager
# Initialize the EmbedManager with desired parameters
embed_manager = EmbedManager(
    path="path/to/your/code/directory",
    glob_rule="**/*.py",
    use_cache=True,
)
# Run the embedding process and get a retriever for querying embeddings
retriever = embed_manager.run()
# You can now use the retriever to query embeddings
```
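The `glob_rule` in the example above is a recursive glob pattern. As a rough illustration of which files such a pattern selects, here is a minimal sketch using only Python's standard `pathlib`. It is not EmbedMan's own file-discovery code, and `list_matching_files` is a hypothetical helper introduced for this example.

```python
import tempfile
from pathlib import Path


def list_matching_files(root, glob_rule="**/*.py"):
    """Return sorted paths (relative to root) of files matching glob_rule.

    Hypothetical helper for illustration; not part of EmbedMan.
    """
    root_path = Path(root)
    return sorted(
        p.relative_to(root_path).as_posix()
        for p in root_path.glob(glob_rule)
        if p.is_file()
    )


# Build a throwaway directory tree to show what the pattern picks up.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "pkg").mkdir()
    (base / "main.py").write_text("print('hi')\n")
    (base / "pkg" / "util.py").write_text("x = 1\n")
    (base / "README.md").write_text("# docs\n")

    demo_files = list_matching_files(base)
    top_level_only = list_matching_files(base, "*.py")

print(demo_files)      # ['main.py', 'pkg/util.py']
print(top_level_only)  # ['main.py']
```

With `**/*.py`, Python files at any depth are matched, including those directly under the root; a plain `*.py` matches only the top level.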
### Configurable Parameters
`EmbedMan` exposes several parameters for tailoring the embedding process, including:
- `path`: The directory path to scan for documents.
- `glob_rule`: Glob pattern to match files within the directory.
- `suffixes`: List of file suffixes to include.
- `exclude`: List of patterns to exclude.
- `language`: Programming language of the documents.
- `parser_threshold`: Threshold for the parser to consider a document valid.
- `chunk_size`: Size of chunks to split documents into for embedding.
- `chunk_overlap`: Overlap between consecutive chunks.
- `cache_dir`: Directory path for caching embeddings.
- `namespace_cache`: Namespace for the cache to avoid collisions.
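
To make the relationship between `chunk_size` and `chunk_overlap` concrete, the following sketch splits a string into overlapping chunks. It assumes sizes are measured in characters, which this README does not confirm, and `split_into_chunks` is a hypothetical function written for illustration, not EmbedMan's actual splitter.

```python
def split_into_chunks(text, chunk_size=10, chunk_overlap=3):
    """Split text into chunks of at most chunk_size characters.

    Each chunk starts (chunk_size - chunk_overlap) characters after the
    previous one, so consecutive chunks share chunk_overlap characters.
    Hypothetical helper for illustration; not part of EmbedMan.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_into_chunks(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of producing more chunks to embed.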
## Contributing
Contributions, issues, and feature requests are welcome! Feel free to check the [issues page](https://github.com/chigwell/embedman/issues).
## License
[MIT](https://choosealicense.com/licenses/mit/)
Raw data
{
"_id": null,
"home_page": "https://github.com/chigwell/EmbedMan",
"name": "EmbedMan",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Eugene Evstafev",
"author_email": "chigwel@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/72/47/0a4e43123b0b24346f7cd12ac81c5549132ad23f232edd6571c2d1ea3701/EmbedMan-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "A tool for managing embeddings for code analysis",
"version": "0.0.2",
"project_urls": {
"Homepage": "https://github.com/chigwell/EmbedMan"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4fb6cf4f524ac6e2e933e95b8b7e0707915baeaf81a8c2374d5df59f99467887",
"md5": "fff664474266625e32e53a22b123f874",
"sha256": "f78dd1ddf0996bdbc09edd575d5bef17db5634c0673713762904c8898119eeab"
},
"downloads": -1,
"filename": "EmbedMan-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "fff664474266625e32e53a22b123f874",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 6269,
"upload_time": "2024-01-27T11:38:22",
"upload_time_iso_8601": "2024-01-27T11:38:22.247130Z",
"url": "https://files.pythonhosted.org/packages/4f/b6/cf4f524ac6e2e933e95b8b7e0707915baeaf81a8c2374d5df59f99467887/EmbedMan-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "72470a4e43123b0b24346f7cd12ac81c5549132ad23f232edd6571c2d1ea3701",
"md5": "43f4c40a7ec60d78408efd4a7507c497",
"sha256": "8d8ae80be149aa68d0b0d1df04ed7f17505e1647f8703ecf40a850f0a6b36829"
},
"downloads": -1,
"filename": "EmbedMan-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "43f4c40a7ec60d78408efd4a7507c497",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 5386,
"upload_time": "2024-01-27T11:38:23",
"upload_time_iso_8601": "2024-01-27T11:38:23.715458Z",
"url": "https://files.pythonhosted.org/packages/72/47/0a4e43123b0b24346f7cd12ac81c5549132ad23f232edd6571c2d1ea3701/EmbedMan-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-27 11:38:23",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "chigwell",
"github_project": "EmbedMan",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "embedman"
}