Name | fileseek |
Version | 0.1.3 |
home_page | None |
Summary | FileSeek – AI-Powered Local Document Archive&Search |
upload_time | 2025-02-08 07:13:54 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.11 |
license | MIT License
Copyright (c) 2025 KyrieTangSheng
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
|
keywords |
document-management
semantic-search
vector-search
ocr
document-processing
file-monitoring
text-extraction
pdf-processing
local-search
offline-search
document-indexing
file-archival
text-analysis
document-similarity
|
requirements |
No requirements were recorded.
|
# 📌 FileSeek – AI-Powered Local Document Archivist
🚀 **Fast. Private. Local.** – FileSeek is a lightweight AI-powered file archive and search tool that helps you organize and retrieve documents instantly using natural language.
It runs entirely on your machine, ensuring full privacy while giving you a cyber-style experience.
---
## 🔍 Key Features
- ✅ **Smart Search** – Natural language search with semantic understanding
- ✅ **Similar Document Finding** – Discover related documents automatically
- ✅ **AI-Powered OCR** – Extract text from images and scanned PDFs
- ✅ **Local-First** – Runs fully offline for complete privacy
- ✅ **Zero Config** – Works out of the box with sensible defaults
- ✅ **Real-time Monitoring** – Auto-detects new and modified files
---
## 📖 Why FileSeek?
- ⚡ **Blazing Fast** – Semantic search in milliseconds
- 🔒 **Privacy First** – No cloud, no data sharing, fully local
- 🤖 **AI-Powered** – Advanced OCR and semantic understanding
- 🪶 **Lightweight** – Minimal dependencies, smooth performance
- 💻 **Developer-Friendly** – Clean CLI with rich terminal UI
---
## 🛠 System Requirements
### Required Dependencies
Make sure you have these system packages installed:
**Ubuntu/Debian:**
```bash
sudo apt-get install tesseract-ocr poppler-utils libmagic1
```
**Fedora:**
```bash
sudo dnf install tesseract poppler-utils file-libs
```
**macOS:**
```bash
brew install tesseract poppler libmagic
```
**Windows:**
```powershell
# Using Chocolatey (Run as Administrator)
choco install tesseract poppler libmagic
```
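To confirm the system packages are visible before processing documents, a small check like the one below may help. It is only a sketch: it assumes the `tesseract` binary (from the Tesseract package) and `pdftoppm` (shipped with Poppler's utilities) are the executables that matter, and `missing_tools` is a hypothetical helper, not part of FileSeek. The libmagic library has no dedicated executable, so it is not checked here.

```bash
# Hypothetical helper: print every command name from the arguments that is
# not found on PATH. Prints nothing when all of them are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed executables: tesseract (OCR) and pdftoppm (Poppler utilities).
missing_tools tesseract pdftoppm
```

An empty result means both tools were found; any name printed points at a package that still needs installing.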
---
## 🚀 Quick Start
### 1️⃣ Process Documents
Add documents to the archive:
```bash
fileseek process -r /path/to/documents
```
**Supports:** PDFs, text files, images (with OCR), and scanned documents
### 2️⃣ Search Documents
Find documents using natural language:
```bash
fileseek search "find my notes on machine learning"
```
### 3️⃣ Find Similar Documents
Discover documents similar to a reference file:
```bash
fileseek similar /path/to/reference/file
```
### 4️⃣ Monitor for Changes
Automatically process new, modified, or deleted files:
```bash
fileseek watch /path/to/watch
```
### 5️⃣ List All Processed Documents
View all archived documents:
```bash
fileseek list
```
---
## 📦 Run from Source
```bash
git clone https://github.com/yourusername/fileseek.git
# Enter the cloned repository so that "." refers to the project root
cd fileseek
pip install -e .
```
Now you can use all commands directly:
```bash
fileseek process ~/Documents
fileseek search "find my course note on machine learning"
fileseek similar ~/Documents/project_plan.pdf
fileseek watch ~/Documents
fileseek list
fileseek validate
```
## ⚙️ Configuration
FileSeek is zero-config by default but highly customizable:
```bash
# Set custom storage location
fileseek config set storage_path=~/FileSeekData
# Enable debug logging
fileseek config set logging.level=DEBUG
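# Set OCR languages (ISO 639-2 codes); the single quotes keep the shell
# from stripping the inner double quotes or treating the brackets as a glob
fileseek config set 'ocr.languages=["eng","fra"]'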
```
Key Configuration Options:
- `storage_path`: Where to store the document index
- `ocr.languages`: Languages for OCR processing
- `search.max_results`: Maximum number of search results
- `monitoring.watch_interval`: File monitoring frequency
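
Before adding a code to `ocr.languages`, it can be worth checking that the matching Tesseract language data is installed. The sketch below assumes FileSeek relies on the system Tesseract listed under System Requirements (the README does not spell this out), and `has_tesseract_lang` is a hypothetical helper. It filters the output of `tesseract --list-langs`, which prints one language code per line.

```bash
# Hypothetical helper: succeed if the language code in $1 appears on a line
# of its own in the `tesseract --list-langs` output supplied on stdin.
has_tesseract_lang() {
  grep -qx -- "$1"
}

# Only query Tesseract when it is actually installed.
if command -v tesseract >/dev/null 2>&1; then
  for lang in eng fra; do
    if tesseract --list-langs 2>&1 | has_tesseract_lang "$lang"; then
      echo "$lang: installed"
    else
      echo "$lang: missing"
    fi
  done
fi
```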
---
Raw data
{
"_id": null,
"home_page": null,
"name": "fileseek",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "document-management, semantic-search, vector-search, ocr, document-processing, file-monitoring, text-extraction, pdf-processing, local-search, offline-search, document-indexing, file-archival, text-analysis, document-similarity",
"author": null,
"author_email": "TangSheng <tangsheng001018@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/5c/de/0be2cbac4a34136293e7899b1ac807680d70d32d18a6e8bd9520bedd289f/fileseek-0.1.3.tar.gz",
"platform": null,
"description": "# \ud83d\udccc FileSeek \u2013 AI-Powered Local Document Archivist\n\n\ud83d\ude80 **Fast. Private. Local.** \u2013 FileSeek is a lightweight AI-powered file archive and search tool that helps you organize and retrieve documents instantly using natural language.\n\nIt runs entirely on your machine, ensuring full privacy while giving you a cyber-style experience.\n\n---\n\n## \ud83d\udd0d Key Features\n- \u2705 **Smart Search** \u2013 Natural language search with semantic understanding\n- \u2705 **Similar Document Finding** \u2013 Discover related documents automatically\n- \u2705 **AI-Powered OCR** \u2013 Extract text from images and scanned PDFs\n- \u2705 **Local-First** \u2013 Runs fully offline for complete privacy\n- \u2705 **Zero Config** \u2013 Works out of the box with sensible defaults\n- \u2705 **Real-time Monitoring** \u2013 Auto-detects new and modified files\n\n---\n\n## \ud83d\udcd6 Why FileSeek?\n- \u26a1 **Blazing Fast** \u2013 Semantic search in milliseconds\n- \ud83d\udd12 **Privacy First** \u2013 No cloud, no data sharing, fully local\n- \ud83e\udd16 **AI-Powered** \u2013 Advanced OCR and semantic understanding\n- \ud83e\udeb6 **Lightweight** \u2013 Minimal dependencies, smooth performance\n- \ud83d\udcbb **Developer-Friendly** \u2013 Clean CLI with rich terminal UI\n\n---\n\n## \ud83d\udee0 System Requirements\n\n### Required Dependencies\nMake sure you have these system packages installed:\n\n**Ubuntu/Debian:**\n```bash\nsudo apt-get install tesseract-ocr poppler-utils libmagic1\n```\n\n**Fedora:**\n```bash\nsudo dnf install tesseract poppler-utils file-libs\n```\n\n**macOS:**\n```bash\nbrew install tesseract poppler libmagic\n```\n\n**Windows:**\n```powershell\n# Using Chocolatey (Run as Administrator)\nchoco install tesseract poppler libmagic\n```\n\n---\n\n## \ud83d\ude80 Quick Start\n\n### 1\ufe0f\u20e3 Process Documents\nAdd documents to the archive:\n```bash\nfileseek process -r /path/to/documents\n```\n\n**Supports:** 
PDFs, text files, images (with OCR), and scanned documents\n\n### 2\ufe0f\u20e3 Search Documents\nFind documents using natural language:\n```bash\nfileseek search \"find my notes on machine learning\"\n```\n\n### 3\ufe0f\u20e3 Find Similar Documents\nDiscover documents similar to a reference file:\n```bash\nfileseek similar /path/to/reference/file\n```\n\n### 4\ufe0f\u20e3 Monitor for Changes\nAutomatically process new, modified, or deleted files:\n```bash\nfileseek watch /path/to/watch\n```\n\n### 5\ufe0f\u20e3 List All Processed Documents\nView all archived documents:\n```bash\nfileseek list\n```\n\n---\n\n## \ud83d\udce6 Run from Source\n```bash\ngit clone https://github.com/yourusername/fileseek.git\npip install -e .\n```\n\nNow you can use all commands directly:\n```bash\nfileseek process ~/Documents\nfileseek search \"find my course note on machine learning\"\nfileseek similar ~/Documents/project_plan.pdf\nfileseek watch ~/Documents\nfileseek list\nfileseek validate\n```\n\n\n## \u2699\ufe0f Configuration\nFileSeek is zero-config by default but highly customizable:\n\n```bash\n# Set custom storage location\nfileseek config set storage_path=~/FileSeekData\n\n# Enable debug logging\nfileseek config set logging.level=DEBUG\n\n# Set OCR language (ISO 639-2 codes)\nfileseek config set ocr.languages=[\"eng\",\"fra\"]\n```\n\nKey Configuration Options:\n- `storage_path`: Where to store the document index\n- `ocr.languages`: Languages for OCR processing\n- `search.max_results`: Maximum number of search results\n- `monitoring.watch_interval`: File monitoring frequency\n\n---\n",
"bugtrack_url": null,
"license": "MIT License\n \n Copyright (c) 2025 KyrieTangSheng\n \n Permission is hereby granted, free of charge, to any person obtaining a copy\n of this software and associated documentation files (the \"Software\"), to deal\n in the Software without restriction, including without limitation the rights\n to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n copies of the Software, and to permit persons to whom the Software is\n furnished to do so, subject to the following conditions:\n \n The above copyright notice and this permission notice shall be included in all\n copies or substantial portions of the Software.\n \n THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n SOFTWARE.\n ",
"summary": "FileSeek \u2013 AI-Powered Local Document Archive&Search",
"version": "0.1.3",
"project_urls": null,
"split_keywords": [
"document-management",
" semantic-search",
" vector-search",
" ocr",
" document-processing",
" file-monitoring",
" text-extraction",
" pdf-processing",
" local-search",
" offline-search",
" document-indexing",
" file-archival",
" text-analysis",
" document-similarity"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "b21c9435ccc7ee93c55fdd6fa05f3cd0e0b2987ef96d5f9313bcc752e046f8c5",
"md5": "31885edfbed12feee165746a5e5af518",
"sha256": "6f7af47a84698d5b36c30e3399f5a37093b0a87100f9fcdacfe576f32f72e479"
},
"downloads": -1,
"filename": "fileseek-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "31885edfbed12feee165746a5e5af518",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 44865,
"upload_time": "2025-02-08T07:13:52",
"upload_time_iso_8601": "2025-02-08T07:13:52.375048Z",
"url": "https://files.pythonhosted.org/packages/b2/1c/9435ccc7ee93c55fdd6fa05f3cd0e0b2987ef96d5f9313bcc752e046f8c5/fileseek-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5cde0be2cbac4a34136293e7899b1ac807680d70d32d18a6e8bd9520bedd289f",
"md5": "085299aa9da2d62551f0be73b8c6bcf8",
"sha256": "63ccd892a150fd66c0076b9501d0dcebbe67508bf71d813cb5aeb59c43fee4c3"
},
"downloads": -1,
"filename": "fileseek-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "085299aa9da2d62551f0be73b8c6bcf8",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 39743,
"upload_time": "2025-02-08T07:13:54",
"upload_time_iso_8601": "2025-02-08T07:13:54.413389Z",
"url": "https://files.pythonhosted.org/packages/5c/de/0be2cbac4a34136293e7899b1ac807680d70d32d18a6e8bd9520bedd289f/fileseek-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-08 07:13:54",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "fileseek"
}
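
The `sha256` digests in the record above can be used to check a downloaded archive. The following is a minimal sketch rather than an official procedure: `verify_sha256` is a hypothetical helper, and it assumes GNU `sha256sum` is available (on macOS, `shasum -a 256` prints the same format).

```bash
# Hypothetical helper: succeed if the SHA-256 digest of the file in $1
# equals the hex string in $2.
verify_sha256() {
  actual=$(sha256sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}

# Check the sdist against the digest listed in the record, if it has been
# downloaded into the current directory.
if [ -f fileseek-0.1.3.tar.gz ]; then
  if verify_sha256 fileseek-0.1.3.tar.gz \
    63ccd892a150fd66c0076b9501d0dcebbe67508bf71d813cb5aeb59c43fee4c3; then
    echo "sdist digest matches"
  else
    echo "sdist digest does NOT match"
  fi
fi
```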