# job-tqdflex
[License: CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)
[Build status](https://actions-badge.atrox.dev/David-Araripe/job_tqdflex/goto?ref=master)
A Python library for parallel processing with progress bars, built on joblib (*job*) and tqdm (*tq*), with flexible (*flex*) chunked processing for memory efficiency.
## Features
- **Memory efficient** - supports generators and iterators
- **Context manager support** - automatic cleanup of resources
- **Easy parallel processing** with automatic chunking for optimal performance
- **Error handling** - failures are reported with detailed logging
- **Custom logging support** - compatible with loguru and standard python logging
## Installation
```bash
pip install job-tqdflex
```
## Quick Start
```python
from job_tqdflex import ParallelApplier
import time

def slow_square(x):
    time.sleep(0.1)  # simulate a slow function
    return x ** 2

data = range(20)

# Create and run the parallel applier
applier = ParallelApplier(slow_square, data, n_jobs=4)
results = applier()

print(results)  # [0, 1, 4, 9, 16, 25, ...]
```
## Usage Examples
### Basic Usage
```python
from job_tqdflex import ParallelApplier
def process_item(item):
    # Your processing logic here
    return item * 2

data = [1, 2, 3, 4, 5]
applier = ParallelApplier(process_item, data)
results = applier()
```
### With Additional Arguments
```python
def power_function(base, exponent=2):
    return base ** exponent

data = [1, 2, 3, 4, 5]
applier = ParallelApplier(power_function, data)
results = applier(exponent=3)  # [1, 8, 27, 64, 125]
```
### Using functools.partial for Complex Arguments
```python
from functools import partial
def complex_function(item, multiplier, offset=0):
    return item * multiplier + offset

# Pre-configure the function
configured_func = partial(complex_function, multiplier=3, offset=10)

data = [1, 2, 3, 4, 5]
applier = ParallelApplier(configured_func, data)
results = applier()  # [13, 16, 19, 22, 25]
```
### Working with Generators
```python
def data_generator():
    for i in range(1000):
        yield i

def expensive_computation(x):
    return sum(range(x))

# Works seamlessly with generators
applier = ParallelApplier(expensive_computation, data_generator(), n_jobs=8)
results = applier()
```
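The chunked processing behind this memory efficiency can be pictured with a short sketch. Note that `chunked` below is an illustrative stdlib-only helper, not job-tqdflex's actual internals: the point is that only one chunk of the generator is materialized at a time.

```python
from itertools import islice

def chunked(iterable, chunk_size):
    """Yield successive lists of up to chunk_size items from any iterable."""
    it = iter(iterable)
    while chunk := list(islice(it, chunk_size)):
        yield chunk

print(list(chunked(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Because each chunk is built lazily from the iterator, a million-item generator never needs to be held in memory all at once.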
### Context Manager Usage
```python
def process_data(item):
    return item ** 2

data = range(100)

# Automatic resource cleanup
with ParallelApplier(process_data, data, n_jobs=4) as applier:
    results = applier()
```
### Different Backends
```python
# For CPU-bound tasks (default)
applier = ParallelApplier(cpu_intensive_func, data, backend="loky")
# For I/O-bound tasks
applier = ParallelApplier(io_bound_func, data, backend="threading")
# For other use cases
applier = ParallelApplier(some_func, data, backend="multiprocessing")
```
### Custom Progress Bar Settings
```python
# Disable progress bar
applier = ParallelApplier(func, data, show_progress=False)
# Custom chunk size for memory management
applier = ParallelApplier(func, large_dataset, chunk_size=100)
# Custom progress bar description (default: "Applying {func_name} to chunks")
applier = ParallelApplier(func, data, custom_desc="Processing...")
```
### Using the Low-Level `tqdm_joblib` Context Manager
```python
import time

from job_tqdflex import tqdm_joblib
from joblib import Parallel, delayed
from tqdm import tqdm

def slow_function(x):
    time.sleep(0.1)
    return x ** 2

# Direct integration with joblib
with tqdm_joblib(tqdm(total=10, desc="Processing")) as progress_bar:
    results = Parallel(n_jobs=4)(delayed(slow_function)(i) for i in range(10))
```
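The general technique here is to temporarily patch joblib's batch-completion callback so every finished batch advances the bar, restoring the original on exit. A stripped-down, stdlib-only sketch of that patch-and-restore pattern follows; `FakeCallback` and `FakeBar` are illustrative stand-ins, not real joblib or tqdm classes, and the real implementation differs in detail:

```python
import contextlib

class FakeCallback:
    """Stand-in for joblib's internal batch-completion callback."""
    def __call__(self):
        pass  # the real callback schedules the next batch here

@contextlib.contextmanager
def patched_progress(callback_cls, bar):
    original = callback_cls.__call__
    def wrapped(self):
        bar.update(1)           # advance the bar on each completed batch
        return original(self)   # then defer to the original behaviour
    callback_cls.__call__ = wrapped
    try:
        yield bar
    finally:
        callback_cls.__call__ = original  # always restore the class

class FakeBar:
    """Stand-in for a tqdm bar; just counts updates."""
    def __init__(self):
        self.n = 0
    def update(self, k):
        self.n += k

bar = FakeBar()
with patched_progress(FakeCallback, bar):
    for _ in range(5):
        FakeCallback()()        # simulate five completed batches
print(bar.n)  # 5
```

The `try`/`finally` guarantees the patch is undone even if the parallel run raises, which is why the context-manager form is the safe way to use it.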
## Configuration Options
### ParallelApplier Parameters
- **`func`**: The function to apply to each item
- **`iterable`**: Input data (list, generator, or any iterable)
- **`show_progress`**: Whether to show progress bars (default: `True`)
- **`n_jobs`**: Number of parallel jobs (default: `8`, use `-1` for all cores)
- **`backend`**: Parallelization backend (`"loky"`, `"threading"`, or `"multiprocessing"`)
- **`chunk_size`**: Size of chunks to process (default: auto-calculated)
- **`custom_desc`**: Custom description for the progress bar (default: `None`, uses `"Applying {func_name} to chunks"`)
- **`logger`**: Optional custom logger instance (supports standard logging and loguru)
### Performance Tips
1. **Choose the right backend**:
- `"loky"` (default): Best for CPU-bound tasks
- `"threading"`: Good for I/O-bound tasks
- `"multiprocessing"`: For CPU-bound tasks with shared memory concerns
2. **Optimize chunk size**:
- Larger chunks reduce overhead but increase memory usage
- Smaller chunks provide better load balancing
- Auto-calculation usually works well
3. **Use generators for large datasets**:
```python
def large_data_generator():
    for i in range(1_000_000):
        yield expensive_data_loader(i)  # expensive_data_loader is illustrative

applier = ParallelApplier(process_func, large_data_generator())
```
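As a rough illustration of the chunk-size trade-off, a heuristic like the one below aims for a few chunks per worker: enough for load balancing and smooth progress updates, without excessive scheduling overhead. This is an assumption for illustration only, not necessarily the formula job-tqdflex uses for its auto-calculation:

```python
def suggest_chunk_size(n_items, n_jobs, chunks_per_job=4):
    """Target a few chunks per worker; never go below one item per chunk."""
    return max(1, n_items // (n_jobs * chunks_per_job))

print(suggest_chunk_size(10_000, n_jobs=8))  # 312
print(suggest_chunk_size(10, n_jobs=8))      # 1
```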
## Error Handling
The library provides comprehensive error handling:
```python
def potentially_failing_function(x):
    if x == 42:
        raise ValueError("The answer to everything!")
    return x * 2

try:
    applier = ParallelApplier(potentially_failing_function, range(100))
    results = applier()
except RuntimeError as e:
    print(f"Parallel processing failed: {e}")
```
## Logging
### Standard Python Logging
Enable debug logging to monitor performance:
```python
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("joblib_tqdm")
# Your parallel processing code here
```
### Custom Logger Support (including Loguru)
The library supports custom logger instances, including loguru:
```python
# With loguru (if installed)
from loguru import logger as loguru_logger

from job_tqdflex import ParallelApplier, tqdm_joblib
from joblib import Parallel, delayed
from tqdm import tqdm

def process_item(x):
    return x ** 2

data = range(100)

# Use loguru for all internal logging
applier = ParallelApplier(process_item, data, logger=loguru_logger)
results = applier()

# Or with the tqdm_joblib context manager
with tqdm_joblib(tqdm(total=100, desc="Processing"), logger=loguru_logger) as pbar:
    results = Parallel(n_jobs=4)(delayed(process_item)(i) for i in data)
```
```python
# With standard logging custom logger
import logging
custom_logger = logging.getLogger("my_custom_logger")
custom_logger.setLevel(logging.INFO)
applier = ParallelApplier(process_item, data, logger=custom_logger)
results = applier()
```
**Note**: Loguru is not a required dependency. It's included in the `[dev]` optional dependencies for testing purposes. You can use any logger object that has `debug()` and `error()` methods.
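Since any object with `debug()` and `error()` methods is accepted, a minimal custom logger can be as small as the sketch below. `ListLogger` is hypothetical, not part of the library:

```python
class ListLogger:
    """Collects log records in memory; satisfies the debug()/error() interface."""
    def __init__(self):
        self.records = []
    def debug(self, msg):
        self.records.append(("DEBUG", msg))
    def error(self, msg):
        self.records.append(("ERROR", msg))

log = ListLogger()
log.debug("starting chunked run")
log.error("worker failed")
print(log.records)  # [('DEBUG', 'starting chunked run'), ('ERROR', 'worker failed')]

# Hypothetical usage with the library:
# applier = ParallelApplier(process_item, data, logger=ListLogger())
```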
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the CC BY-SA 4.0 License - see the [LICENSE](LICENSE) file for details.
## Attribution
This project includes code based on the [tqdm_joblib](https://github.com/louisabraham/tqdm_joblib) implementation by Louis Abraham, which is distributed under CC BY-SA 4.0. The original implementation was inspired by a Stack Overflow solution for integrating tqdm with joblib's parallel processing.
## Acknowledgments
- Built on top of the excellent [joblib](https://joblib.readthedocs.io/) library
- Progress bars provided by [tqdm](https://tqdm.github.io/)
- Based on the original [tqdm_joblib](https://github.com/louisabraham/tqdm_joblib) by Louis Abraham
- Inspired by the need for simple parallel processing with progress tracking and custom logging support
## Changelog
### 0.1.0 (2025)
- Initial release
- Basic parallel processing with progress bars
- Support for multiple backends (loky, threading, multiprocessing)
- Generator and iterator support
- Context manager support
- Custom logger support (compatible with loguru and standard logging)
- Comprehensive test suite including loguru integration tests
- Memory efficient chunking with auto-calculated chunk sizes