fileseek


Name: fileseek
Version: 0.1.3
Summary: FileSeek – AI-Powered Local Document Archive & Search
Upload time: 2025-02-08 07:13:54
Requires Python: >=3.11
Home page: None
Maintainer: None
Docs URL: None
Author: None
License: MIT License. Copyright (c) 2025 KyrieTangSheng. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Keywords: document-management, semantic-search, vector-search, ocr, document-processing, file-monitoring, text-extraction, pdf-processing, local-search, offline-search, document-indexing, file-archival, text-analysis, document-similarity
Requirements: No requirements were recorded.

# 📌 FileSeek – AI-Powered Local Document Archivist

🚀 **Fast. Private. Local.** – FileSeek is a lightweight AI-powered file archive and search tool that helps you organize and retrieve documents instantly using natural language.

It runs entirely on your machine, so your documents never leave it, and presents results in a cyber-style terminal interface.

---

## 🔍 Key Features
- ✅ **Smart Search** – Natural language search with semantic understanding
- ✅ **Similar Document Finding** – Discover related documents automatically
- ✅ **AI-Powered OCR** – Extract text from images and scanned PDFs
- ✅ **Local-First** – Runs fully offline for complete privacy
- ✅ **Zero Config** – Works out of the box with sensible defaults
- ✅ **Real-time Monitoring** – Auto-detects new, modified, and deleted files

---

## 📖 Why FileSeek?
- ⚡ **Blazing Fast** – Semantic search in milliseconds
- 🔒 **Privacy First** – No cloud, no data sharing, fully local
- 🤖 **AI-Powered** – Advanced OCR and semantic understanding
- 🪶 **Lightweight** – Minimal dependencies, smooth performance
- 💻 **Developer-Friendly** – Clean CLI with rich terminal UI

---

## 🛠 System Requirements

### Required Dependencies
Make sure you have these system packages installed:

**Ubuntu/Debian:**
```bash
sudo apt-get install tesseract-ocr poppler-utils libmagic1
```

**Fedora:**
```bash
sudo dnf install tesseract poppler-utils file-libs
```

**macOS:**
```bash
brew install tesseract poppler libmagic
```

**Windows:**
```powershell
# Using Chocolatey (Run as Administrator)
choco install tesseract poppler libmagic
```
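
Before processing documents, it can help to confirm that the system packages are actually visible to Python. The following is a minimal sketch, not part of FileSeek: it assumes the tool relies on the `tesseract` executable and on poppler's `pdftoppm`, and that libmagic is loaded as a shared library. The function names are hypothetical.

```python
import ctypes.util
import shutil

# Executables the packages above normally provide. That FileSeek calls
# exactly these two is an assumption made for this sketch.
REQUIRED_TOOLS = ("tesseract", "pdftoppm")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


def libmagic_available(find_library=ctypes.util.find_library):
    """Report whether the dynamic linker can locate libmagic."""
    return find_library("magic") is not None


if __name__ == "__main__":
    for name in missing_tools():
        print(f"missing executable: {name}")
    if not libmagic_available():
        print("missing library: libmagic")
```

If the script prints nothing, both executables were found on `PATH` and libmagic was located. On Windows, library lookup may behave differently, so treat a libmagic warning there as a hint rather than proof.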

---

## 🚀 Quick Start

### 1️⃣ Process Documents
Add documents to the archive:
```bash
fileseek process -r /path/to/documents
```

**Supports:** PDFs, text files, images (with OCR), and scanned documents
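
How FileSeek decides which extractor to use for each file is not documented here. As a rough illustration of the idea, the sketch below routes files by MIME type using only the standard library's extension-based guess; libmagic, by contrast, inspects file contents. The `classify` function and its route names are hypothetical.

```python
import mimetypes


def classify(path: str) -> str:
    """Map a file name to a coarse processing route based on its extension."""
    mime, _ = mimetypes.guess_type(path)
    if mime == "application/pdf":
        return "pdf"      # assumed route: extract text, OCR if scanned
    if mime is not None and mime.startswith("text/"):
        return "text"     # assumed route: read directly
    if mime is not None and mime.startswith("image/"):
        return "image"    # assumed route: OCR
    return "unsupported"


for name in ("report.pdf", "notes.txt", "scan.png", "archive.zip"):
    print(name, "->", classify(name))
```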

### 2️⃣ Search Documents
Find documents using natural language:
```bash
fileseek search "find my notes on machine learning"
```
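
To give a feel for what vector-based search does, here is a toy sketch that ranks documents by cosine similarity to a query. It is not FileSeek's implementation: the embedding model FileSeek uses is not stated in this README, so plain word counts stand in for real embeddings, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def search(query: str, docs: dict[str, str], max_results: int = 5) -> list[str]:
    """Return document names ordered by similarity to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda name: cosine(q, embed(docs[name])), reverse=True)
    return ranked[:max_results]


docs = {
    "ml_notes.txt": "notes on machine learning and gradient descent",
    "recipe.txt": "pasta recipe with tomato sauce",
    "invoice.pdf": "invoice for laptop purchase",
}
print(search("machine learning notes", docs, max_results=1))
```

Word counts only match exact terms. A real semantic model would also relate phrasings that share no words, which is what makes natural-language queries practical.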

### 3️⃣ Find Similar Documents
Discover documents similar to a reference file:
```bash
fileseek similar /path/to/reference/file
```

### 4️⃣ Monitor for Changes
Automatically process new, modified, or deleted files:
```bash
fileseek watch /path/to/watch
```
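
One simple way to detect new, modified, and deleted files is to compare periodic snapshots of modification times. The sketch below illustrates that approach under the assumption of polling; FileSeek may instead rely on operating-system file notifications, and the function names here are hypothetical.

```python
from pathlib import Path


def snapshot(root: str) -> dict[str, float]:
    """Record the modification time of every file under `root`."""
    return {str(p): p.stat().st_mtime for p in Path(root).rglob("*") if p.is_file()}


def diff_snapshots(old: dict[str, float], new: dict[str, float]):
    """Compare two snapshots and return (added, modified, deleted) path lists."""
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return added, modified, deleted


# Hand-written snapshots keep the example independent of the file system.
before = {"a.pdf": 1.0, "b.txt": 1.0}
after = {"a.pdf": 2.0, "c.png": 1.0}
print(diff_snapshots(before, after))
```

In a polling loop, `snapshot` would be called once per interval and each result compared with the previous one.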

### 5️⃣ List All Processed Documents
View all archived documents:
```bash
fileseek list
```

---

## 📦 Run from Source
```bash
git clone https://github.com/yourusername/fileseek.git
cd fileseek
pip install -e .
```

Now you can use all commands directly:
```bash
fileseek process ~/Documents
fileseek search "find my course note on machine learning"
fileseek similar ~/Documents/project_plan.pdf
fileseek watch ~/Documents
fileseek list
fileseek validate
```


## ⚙️ Configuration
FileSeek is zero-config by default but highly customizable:

```bash
# Set custom storage location
fileseek config set storage_path=~/FileSeekData

# Enable debug logging
fileseek config set logging.level=DEBUG

# Set OCR languages (ISO 639-2 codes); quote the value so the shell keeps the brackets and inner quotes
fileseek config set 'ocr.languages=["eng","fra"]'
```

Key Configuration Options:
- `storage_path`: Where to store the document index
- `ocr.languages`: Languages for OCR processing
- `search.max_results`: Maximum number of search results
- `monitoring.watch_interval`: File monitoring frequency
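
The dotted option names suggest a nested configuration structure. As an illustration only, the sketch below shows one way a `key=value` assignment with a dotted key could be applied to a nested dictionary, parsing the value as JSON when possible. How FileSeek actually stores its configuration is not documented here; `set_option` and the value `20` are hypothetical.

```python
import json


def set_option(config: dict, assignment: str) -> dict:
    """Apply one `dotted.key=value` assignment to a nested dict in place."""
    key, sep, raw = assignment.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {assignment!r}")
    try:
        value = json.loads(raw)      # lists, numbers, booleans
    except json.JSONDecodeError:
        value = raw                  # plain strings such as DEBUG
    *parents, leaf = key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value
    return config


config = {}
set_option(config, "logging.level=DEBUG")
set_option(config, 'ocr.languages=["eng","fra"]')
set_option(config, "search.max_results=20")
print(config)
```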

---

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "fileseek",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "document-management, semantic-search, vector-search, ocr, document-processing, file-monitoring, text-extraction, pdf-processing, local-search, offline-search, document-indexing, file-archival, text-analysis, document-similarity",
    "author": null,
    "author_email": "TangSheng <tangsheng001018@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/5c/de/0be2cbac4a34136293e7899b1ac807680d70d32d18a6e8bd9520bedd289f/fileseek-0.1.3.tar.gz",
    "platform": null,
    "description": "# \ud83d\udccc FileSeek \u2013 AI-Powered Local Document Archivist\n\n\ud83d\ude80 **Fast. Private. Local.** \u2013 FileSeek is a lightweight AI-powered file archive and search tool that helps you organize and retrieve documents instantly using natural language.\n\nIt runs entirely on your machine, ensuring full privacy while giving you a cyber-style experience.\n\n---\n\n## \ud83d\udd0d Key Features\n- \u2705 **Smart Search** \u2013 Natural language search with semantic understanding\n- \u2705 **Similar Document Finding** \u2013 Discover related documents automatically\n- \u2705 **AI-Powered OCR** \u2013 Extract text from images and scanned PDFs\n- \u2705 **Local-First** \u2013 Runs fully offline for complete privacy\n- \u2705 **Zero Config** \u2013 Works out of the box with sensible defaults\n- \u2705 **Real-time Monitoring** \u2013 Auto-detects new and modified files\n\n---\n\n## \ud83d\udcd6 Why FileSeek?\n- \u26a1 **Blazing Fast** \u2013 Semantic search in milliseconds\n- \ud83d\udd12 **Privacy First** \u2013 No cloud, no data sharing, fully local\n- \ud83e\udd16 **AI-Powered** \u2013 Advanced OCR and semantic understanding\n- \ud83e\udeb6 **Lightweight** \u2013 Minimal dependencies, smooth performance\n- \ud83d\udcbb **Developer-Friendly** \u2013 Clean CLI with rich terminal UI\n\n---\n\n## \ud83d\udee0 System Requirements\n\n### Required Dependencies\nMake sure you have these system packages installed:\n\n**Ubuntu/Debian:**\n```bash\nsudo apt-get install tesseract-ocr poppler-utils libmagic1\n```\n\n**Fedora:**\n```bash\nsudo dnf install tesseract poppler-utils file-libs\n```\n\n**macOS:**\n```bash\nbrew install tesseract poppler libmagic\n```\n\n**Windows:**\n```powershell\n# Using Chocolatey (Run as Administrator)\nchoco install tesseract poppler libmagic\n```\n\n---\n\n## \ud83d\ude80 Quick Start\n\n### 1\ufe0f\u20e3 Process Documents\nAdd documents to the archive:\n```bash\nfileseek process -r /path/to/documents\n```\n\n**Supports:** 
PDFs, text files, images (with OCR), and scanned documents\n\n### 2\ufe0f\u20e3 Search Documents\nFind documents using natural language:\n```bash\nfileseek search \"find my notes on machine learning\"\n```\n\n### 3\ufe0f\u20e3 Find Similar Documents\nDiscover documents similar to a reference file:\n```bash\nfileseek similar /path/to/reference/file\n```\n\n### 4\ufe0f\u20e3 Monitor for Changes\nAutomatically process new, modified, or deleted files:\n```bash\nfileseek watch /path/to/watch\n```\n\n### 5\ufe0f\u20e3 List All Processed Documents\nView all archived documents:\n```bash\nfileseek list\n```\n\n---\n\n## \ud83d\udce6 Run from Source\n```bash\ngit clone https://github.com/yourusername/fileseek.git\npip install -e .\n```\n\nNow you can use all commands directly:\n```bash\nfileseek process ~/Documents\nfileseek search \"find my course note on machine learning\"\nfileseek similar ~/Documents/project_plan.pdf\nfileseek watch ~/Documents\nfileseek list\nfileseek validate\n```\n\n\n## \u2699\ufe0f Configuration\nFileSeek is zero-config by default but highly customizable:\n\n```bash\n# Set custom storage location\nfileseek config set storage_path=~/FileSeekData\n\n# Enable debug logging\nfileseek config set logging.level=DEBUG\n\n# Set OCR language (ISO 639-2 codes)\nfileseek config set ocr.languages=[\"eng\",\"fra\"]\n```\n\nKey Configuration Options:\n- `storage_path`: Where to store the document index\n- `ocr.languages`: Languages for OCR processing\n- `search.max_results`: Maximum number of search results\n- `monitoring.watch_interval`: File monitoring frequency\n\n---\n",
    "bugtrack_url": null,
    "license": "MIT License\n        \n        Copyright (c) 2025 KyrieTangSheng\n        \n        Permission is hereby granted, free of charge, to any person obtaining a copy\n        of this software and associated documentation files (the \"Software\"), to deal\n        in the Software without restriction, including without limitation the rights\n        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n        copies of the Software, and to permit persons to whom the Software is\n        furnished to do so, subject to the following conditions:\n        \n        The above copyright notice and this permission notice shall be included in all\n        copies or substantial portions of the Software.\n        \n        THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n        SOFTWARE.\n        ",
    "summary": "FileSeek \u2013 AI-Powered Local Document Archive&Search",
    "version": "0.1.3",
    "project_urls": null,
    "split_keywords": [
        "document-management",
        " semantic-search",
        " vector-search",
        " ocr",
        " document-processing",
        " file-monitoring",
        " text-extraction",
        " pdf-processing",
        " local-search",
        " offline-search",
        " document-indexing",
        " file-archival",
        " text-analysis",
        " document-similarity"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b21c9435ccc7ee93c55fdd6fa05f3cd0e0b2987ef96d5f9313bcc752e046f8c5",
                "md5": "31885edfbed12feee165746a5e5af518",
                "sha256": "6f7af47a84698d5b36c30e3399f5a37093b0a87100f9fcdacfe576f32f72e479"
            },
            "downloads": -1,
            "filename": "fileseek-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "31885edfbed12feee165746a5e5af518",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 44865,
            "upload_time": "2025-02-08T07:13:52",
            "upload_time_iso_8601": "2025-02-08T07:13:52.375048Z",
            "url": "https://files.pythonhosted.org/packages/b2/1c/9435ccc7ee93c55fdd6fa05f3cd0e0b2987ef96d5f9313bcc752e046f8c5/fileseek-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5cde0be2cbac4a34136293e7899b1ac807680d70d32d18a6e8bd9520bedd289f",
                "md5": "085299aa9da2d62551f0be73b8c6bcf8",
                "sha256": "63ccd892a150fd66c0076b9501d0dcebbe67508bf71d813cb5aeb59c43fee4c3"
            },
            "downloads": -1,
            "filename": "fileseek-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "085299aa9da2d62551f0be73b8c6bcf8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 39743,
            "upload_time": "2025-02-08T07:13:54",
            "upload_time_iso_8601": "2025-02-08T07:13:54.413389Z",
            "url": "https://files.pythonhosted.org/packages/5c/de/0be2cbac4a34136293e7899b1ac807680d70d32d18a6e8bd9520bedd289f/fileseek-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-08 07:13:54",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "fileseek"
}
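
The `sha256` digests in the metadata above can be used to check a downloaded archive before installing it. The following is a small sketch using Python's standard `hashlib`; the helper names are hypothetical, and the local file name assumes the sdist was saved under its published name.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with the value published in the metadata."""
    return sha256_of(path) == expected_hex.lower()


# Digest copied from the sdist entry in the metadata above.
EXPECTED = "63ccd892a150fd66c0076b9501d0dcebbe67508bf71d813cb5aeb59c43fee4c3"
# After downloading the archive, run:
# matches("fileseek-0.1.3.tar.gz", EXPECTED)
```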
        