# FolderScanner
`FolderScanner` is a Python package that recursively scans directory structures, skips paths that match ignore rules similar to `.gitignore`, and yields file contents in chunks. Because contents are yielded chunk by chunk, it is designed for large datasets and suits pre-processing tasks in data analysis or machine learning pipelines.
## Features
- Recursively scans specified directories.
- Applies ignore patterns to skip specified files and directories.
- Chunks file contents and yields them with their paths for efficient processing.
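The three features above can be illustrated with a minimal standard-library sketch. This is not the package's implementation: `sketch_scan`, `is_ignored`, and the `chunk_size` parameter are hypothetical names chosen for illustration, and matching patterns with `fnmatch` is an assumption (real `.gitignore` semantics are richer).

```python
import fnmatch
import os
import tempfile
from typing import Iterator, List, Tuple


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Return True if the relative path, or any one of its components, matches a pattern."""
    rel_path = rel_path.replace(os.sep, "/")
    parts = rel_path.split("/")
    return any(
        fnmatch.fnmatchcase(rel_path, pat)
        or any(fnmatch.fnmatchcase(part, pat) for part in parts)
        for pat in patterns
    )


def sketch_scan(root: str, patterns: List[str],
                chunk_size: int = 1024) -> Iterator[Tuple[str, str]]:
    """Yield (relative_path, chunk) pairs for every non-ignored file under root.

    Hypothetical stand-in for illustration only; not FolderScanner's API.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if not is_ignored(os.path.normpath(os.path.join(rel_dir, d)), patterns)
        ]
        for name in sorted(filenames):
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            if is_ignored(rel_path, patterns):
                continue
            full_path = os.path.join(dirpath, name)
            with open(full_path, "r", encoding="utf-8", errors="replace") as fh:
                # Read fixed-size pieces so a large file is never held in memory whole.
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    yield rel_path.replace(os.sep, "/"), chunk


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".git"))
    os.makedirs(os.path.join(root, "tmp"))
    files = {
        "notes.txt": "abcdefghij",
        "app.log": "skipped by *.log",
        ".git/config": "skipped by .git",
        "tmp/cache.txt": "skipped by tmp/*",
    }
    for rel, text in files.items():
        with open(os.path.join(root, rel), "w", encoding="utf-8") as fh:
            fh.write(text)
    results = list(sketch_scan(root, [".git", ".dockerignore", "*.log", "tmp/*"], chunk_size=4))

print(results)
# [('notes.txt', 'abcd'), ('notes.txt', 'efgh'), ('notes.txt', 'ij')]
```

Writing the scanner as a generator keeps memory use bounded by `chunk_size` rather than by file size, which is the usual reason to chunk file contents when walking large directory trees.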
## Installation
To install `FolderScanner` directly from its GitHub repository, use pip:
```bash
pip install git+https://github.com/chigwell/FolderScanner.git
```
## Usage
Import `scan_directory` and iterate over the file chunks it yields:
```python
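from folder_scanner import scan_directory

# Root directory to scan recursively
core_folder = '/path/to/your/projects'
# Files and directories matching these patterns are skipped
ignore_patterns = ['.git', '.dockerignore', '*.log', 'tmp/*']

# Each iteration yields a chunk of file content together with its path
for file_chunk in scan_directory(core_folder, ignore_patterns):
    print(file_chunk)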
```
## Contributing
Contributions are welcome. Submit pull requests to the repository, or report bugs and suggest features on the [GitHub issues page](https://github.com/chigwell/FolderScanner/issues).
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Package metadata
- Name: `FolderScanner`
- Version: 0.1.0
- Summary: Scan directories, apply ignore rules, and chunk file contents.
- Author: Eugene Evstafev
- Requires Python: >=3.6
- Homepage: https://github.com/chigwell/FolderScanner
- Uploaded: 2024-05-25
- Distributions: `FolderScanner-0.1.0-py3-none-any.whl` (wheel, 3727 bytes) and `FolderScanner-0.1.0.tar.gz` (sdist, 3238 bytes)