# websnap
<div>
<img alt="Supported Versions" src="https://img.shields.io/pypi/pyversions/websnap.svg">
<a href="https://pypi.org/project/websnap" target="_blank">
<img alt="PyPI - Version" src="https://img.shields.io/pypi/v/websnap">
</a>
<a href="https://pepy.tech/projects/websnap" target="_blank">
<img alt="PyPI - Downloads" src="https://static.pepy.tech/badge/websnap">
</a>
<a href="https://github.com/EnviDat/websnap/blob/main/LICENSE" target="_blank">
<img alt="License" src="https://img.shields.io/pypi/l/websnap?color=%232780C1">
</a>
<a href="https://black.readthedocs.io" target="_blank">
<img alt="Code Style - Black" src="https://img.shields.io/badge/code%20style-black-000000.svg">
</a>
</div>
### Copies files retrieved from an API to an S3 bucket or a local machine.
###
---
## Installation
```bash
pip install websnap
```
## Quickstart
### Websnap can be used as a function or as a CLI.
<p>
<a href="https://github.com/EnviDat/websnap/blob/main/overview_diagram.png"
target="_blank">Click here to view a websnap overview diagram.</a>
</p>
###
#### Function
```python
from websnap import websnap
# Execute websnap using default arguments
websnap()
# Execute websnap passing arguments
websnap(file_logs=True, s3_uploader=True, backup_s3_count=7, early_exit=True)
```
###
#### CLI
To access CLI documentation in terminal execute:
```bash
websnap_cli --help
```
## Function Parameters / CLI Options
<details>
<summary>
Click to unfold
</summary>
### Function Parameters
| Parameter | Type | Default |
|-------------------|---------------|----------------|
| `config` | `str` | `"config.ini"` |
| `log_level` | `str` | `"INFO"` |
| `file_logs` | `bool` | `False` |
| `s3_uploader` | `bool` | `False` |
| `backup_s3_count` | `int \| None` | `None` |
| `timeout` | `int` | `32` |
| `early_exit` | `bool` | `False` |
| `repeat_minutes` | `int \| None` | `None` |
| `section_config` | `str \| None` | `None` |
### CLI Options
| Option | Shortcut | Default |
|---------------------|----------|--------------|
| `--config` | `-c` | `config.ini` |
| `--log_level` | `-l` | `INFO` |
| `--file_logs` | `-f` | `False` |
| `--s3_uploader` | `-s` | `False` |
| `--backup_s3_count` | `-b` | `None` |
| `--timeout` | `-t` | `32` |
| `--early_exit` | `-e` | `False` |
| `--repeat_minutes` | `-r` | `None` |
| `--section_config` | `-n` | `None` |
### Description
| Function parameter /<br/> CLI option | Description |
|--------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `config` _(str)_ | <ul><li>Path to the `.ini` or `.json` configuration file</li><li>Default value expects a file called `config.ini` in the directory that websnap is executed from</li></ul> |
| `log_level` _(str)_ | <ul><li>Level to use for logging</li><li>Default value is `INFO`</li><li>Valid logging levels are `DEBUG`, `INFO`, `WARNING`, `ERROR`, or `CRITICAL`</li><li><a href="https://docs.python.org/3/library/logging.html#levels" target="_blank">Click here to learn more about logging levels</a></li></ul> |
| `file_logs` _(bool)_ | <ul><li>Enable rotating file logs</li></ul> |
| `s3_uploader` _(bool)_ | <ul><li>Enable uploading of files as objects to an S3 bucket</li></ul> |
| `backup_s3_count` _(int \| None)_ | <ul><li>Copy and backup object in each config section to the configured S3 bucket a maximum of `backup_s3_count` times</li><li>Remove object with the oldest last modified timestamp</li><li>If omitted then objects are not copied or removed</li><li>If enabled then backup objects are copied and assigned the original object's key name with the last modified timestamp appended</li></ul> |
| `timeout` _(int)_ | <ul><li>Number of seconds to wait for response for each HTTP request before timing out</li><li>Default value is `32` seconds</li></ul> |
| `early_exit` _(bool)_ | <ul><li>Enable early program termination after an error occurs</li><li>If omitted then errors are logged and program execution continues</li></ul> |
| `repeat_minutes` _(int \| None)_ | <ul><li>Run websnap continuously every `repeat_minutes` minutes</li><li>If omitted then websnap does not repeat</li></ul> |
| `section_config` _(str \| None)_ | <ul><li>File or URL to obtain additional configuration sections</li><li>If omitted then default value is `None` and only config specified in `config` argument is used</li><li>Cannot be used to assign "DEFAULT" values in config</li><li>Currently only supports JSON config and can only be used if `config` argument is also a JSON file</li><li>Duplicate sections will overwrite values with the same section passed in the `config` argument</li></ul> |
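
The `section_config` rules above can be pictured with a small sketch. It is not websnap's implementation: `merge_sections` is a hypothetical helper, the JSON config is assumed to be an object that maps section names to key/value objects, duplicate sections are assumed to be overwritten key by key, and a `DEFAULT` entry in the additional sections is assumed to be rejected with an error.

```python
def merge_sections(config: dict, extra_sections: dict) -> dict:
    """Combine sections from `config` with sections from `section_config`.

    Hypothetical helper: it illustrates the documented rules only.
    """
    # Copy the base config so the caller's dictionaries are not modified
    merged = {name: dict(values) for name, values in config.items()}
    for name, values in extra_sections.items():
        if name == "DEFAULT":
            # Assumption: DEFAULT values may only come from the main config
            raise ValueError("section_config cannot assign DEFAULT values")
        # Assumption: values of a duplicate section are overwritten per key
        merged[name] = {**merged.get(name, {}), **values}
    return merged


base = {
    "DEFAULT": {"min_size_kb": 1},
    "project": {"url": "https://www.example.com/api/project", "file_name": "project.json"},
}
extra = {
    "project": {"file_name": "project_latest.json"},
    "resource": {"url": "https://www.example.com/api/resource", "file_name": "resource.xml"},
}
combined = merge_sections(base, extra)
```

Here `combined` keeps the `DEFAULT` section from `base`, takes the new `file_name` for `project`, and gains the `resource` section.
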
</details>
## Usage: S3 Bucket
<details>
<summary>
Click to unfold
</summary>
### **Copy files retrieved from an API to an S3 bucket.**
Uses the AWS SDK for Python (Boto3) to add and back up API files as objects in an S3 bucket.
### Examples
#### Function
```python
# The s3_uploader argument must be passed as True to copy files as objects to an S3 bucket
# Copies objects to an S3 bucket using default argument values
websnap(s3_uploader=True)
# Copies objects to an S3 bucket, repeats every 1440 minutes (24 hours),
# and at maximum 4 backup objects are allowed for each config section
websnap(s3_uploader=True, repeat_minutes=1440, backup_s3_count=4)
```
#### CLI
- The following CLI option **must** be used to enable websnap to upload files as objects in an S3 bucket: `--s3_uploader`
- Copies objects to an S3 bucket using default argument values:
```bash
websnap_cli --s3_uploader
```
- Copies objects to an S3 bucket, repeats every 1440 minutes (24 hours),
and at maximum 4 backup objects are allowed for each config section:
```bash
websnap_cli --s3_uploader --repeat_minutes 1440 --backup_s3_count 4
```
### Configuration
- The following environment variables are **required**: `ENDPOINT_URL`,
`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
- A valid `.ini` or `.json` configuration file is **required**.
- Websnap expects the config to be `config.ini` in the directory that websnap is
executed from.
- However, this can be changed using the `config` function argument (or CLI
`--config` option).
- All keys in tables below are **mandatory**.
#### S3 Configuration Example Files
| Format | Example Configuration File |
|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `.ini` | <a href="https://github.com/EnviDat/websnap/blob/main/src/websnap/config_templates/s3_config_template.ini" target="_blank">src/websnap/config_templates/s3_config_template.ini</a> |
| `.json` | <a href="https://github.com/EnviDat/websnap/blob/main/src/websnap/config_templates/s3_config_template.json" target="_blank">src/websnap/config_templates/s3_config_template.json</a> |
#### Environment Variables
Supports setting environment variables in a `.env` file.
Example `.env` file:
```
ENDPOINT_URL=https://dreamycloud.com
AWS_ACCESS_KEY_ID=1234567abcdefg
AWS_SECRET_ACCESS_KEY=hijklmn1234567
```
| Environment Variable | Description |
|-------------------------|------------------------------------------|
| `ENDPOINT_URL` | URL to use for the constructed S3 client |
| `AWS_ACCESS_KEY_ID` | AWS access key ID |
| `AWS_SECRET_ACCESS_KEY` | AWS secret access key |
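
Before enabling `s3_uploader` it can help to confirm that the three required variables are set. The check below is a minimal sketch that uses only the standard library; `missing_env_vars` is a hypothetical helper and not part of websnap, and it inspects the current process environment only (it does not read a `.env` file itself).

```python
import os

REQUIRED_ENV_VARS = ("ENDPOINT_URL", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_env_vars(env=None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    # Fall back to the process environment when no mapping is passed
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```
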
#### Sections (one per API URL endpoint)
- _Each file retrieved from an API requires its **own config section!**_
- The section name can be anything; a name that relates to the copied file is
suggested.
Example S3 config section configuration with key prefix:
```
[resource]
url=https://www.example.com/api/resource
bucket=exampledata
key=subdirectory_resource/resource.xml
```
Example S3 config section configuration without key prefix:
```
[project]
url=https://www.example.com/api/project
bucket=exampledata
key=project.json
```
| Key | Value Description |
|----------|---------------------------------------------------------------|
| `url` | API URL endpoint that file will be retrieved from |
| `bucket` | Bucket that file (as an object) will be written in |
| `key` | Object key name with extension, can optionally include prefix |
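
Because all three keys are mandatory, an `.ini` file can be checked before it is passed to websnap. The following is a rough sketch built on the standard library `configparser` module; `find_incomplete_sections` is a hypothetical helper, and websnap's own validation may differ.

```python
import configparser

REQUIRED_S3_KEYS = ("url", "bucket", "key")


def find_incomplete_sections(ini_text: str) -> dict[str, list[str]]:
    """Map each section name to the mandatory S3 keys it is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    problems = {}
    for name in parser.sections():
        # A key counts as missing when it is absent or has an empty value
        missing = [key for key in REQUIRED_S3_KEYS if not parser[name].get(key)]
        if missing:
            problems[name] = missing
    return problems


EXAMPLE = """
[resource]
url=https://www.example.com/api/resource
bucket=exampledata
key=subdirectory_resource/resource.xml

[project]
url=https://www.example.com/api/project
bucket=exampledata
"""

print(find_incomplete_sections(EXAMPLE))  # {'project': ['key']}
```
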
</details>
## Usage: Local Machine
<details>
<summary>
Click to unfold
</summary>
### **Copy files retrieved from an API to a local machine.**
### Examples
#### Function
```python
# Write files retrieved from an API to local machine using default argument values
websnap()
# Write files retrieved from an API locally, repeating every 60 minutes (1 hour),
# with file logs enabled
websnap(file_logs=True, repeat_minutes=60)
```
#### CLI
- Write copied files to local machine using default argument values:
```bash
websnap_cli
```
- Write copied files locally, repeating every 60 minutes (1 hour), with file logs
enabled:
```bash
websnap_cli --file_logs --repeat_minutes 60
```
### Configuration
- A valid `.ini` or `.json` configuration file is **required** for both function and
CLI usage.
- Websnap expects the config to be `config.ini` in the directory that websnap is
executed from.
- However, this can be changed using the `config` function argument (or CLI
`--config` option).
- Each file that will be retrieved from an API requires its _own section_.
- If the optional `directory` key/value pair is omitted then the file will be written in the directory that the program is executed from.
#### Configuration Example Files
| Format | Example Configuration File |
|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `.ini` | <a href="https://github.com/EnviDat/websnap/blob/main/src/websnap/config_templates/config_template.ini" target="_blank">src/websnap/config_templates/config_template.ini</a> |
| `.json` | <a href="https://github.com/EnviDat/websnap/blob/main/src/websnap/config_templates/config_template.json" target="_blank">src/websnap/config_templates/config_template.json</a> |
#### Sections (one per API URL endpoint)
Example local machine configuration section:
```
[project]
url=https://www.example.com/api/project
file_name=project.json
directory=projectdata
```
| Key | Value Description |
|--------------------------|---------------------------------------------------|
| `url` | API URL endpoint that file will be retrieved from |
| `file_name` | File name with extension |
| `directory` (_optional_) | Local directory name that file will be written in |
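
The effect of the optional `directory` key can be sketched as follows. `target_path` is a hypothetical helper, not websnap's code, and it assumes that `directory` is resolved relative to the directory the program is executed from.

```python
import configparser
from pathlib import Path


def target_path(section, base_dir=None) -> Path:
    """Return the path a section's file would be written to."""
    # Without `directory` the file lands in the execution directory
    base = Path.cwd() if base_dir is None else Path(base_dir)
    directory = section.get("directory")
    if directory:
        return base / directory / section["file_name"]
    return base / section["file_name"]


parser = configparser.ConfigParser()
parser.read_string(
    """
[project]
url=https://www.example.com/api/project
file_name=project.json
directory=projectdata

[resource]
url=https://www.example.com/api/resource
file_name=resource.xml
"""
)

with_directory = target_path(parser["project"], base_dir="run")
without_directory = target_path(parser["resource"], base_dir="run")
```

With `base_dir="run"`, `with_directory` is `run/projectdata/project.json` and `without_directory` is `run/resource.xml`.
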
</details>
## Logs
<details>
<summary>
Click to unfold
</summary>
Websnap supports optional rotating file logs.
- The following CLI option **must** be used to enable websnap to support rotating file logs: `--file_logs`
- In function usage the following argument must be passed to support rotating file
logs: `file_logs=True`
- If log keys are not specified in the configuration `[DEFAULT]` section then default values in the table below will be used.
- `log_when` expects a value used by logging module TimedRotatingFileHandler.
- <a href="https://docs.python.org/3/library/logging.handlers.html#timedrotatingfilehandler" target="_blank">Click here for more information about how to use TimedRotatingFileHandler.</a>
- The default values result in the file logs being rotated once every day and no removal of backup log files.
### Configuration
Example log configuration:
```
[DEFAULT]
log_when=midnight
log_interval=1
log_backup_count=7
```
#### `[DEFAULT]` Section
| Key | Default | Value Description |
|--------------------|---------|--------------------------------------------------------------------------------------------------------------------------------|
| `log_when` | `D` | Specifies type of interval |
| `log_interval` | `1` | Duration of interval (must be positive integer) |
| `log_backup_count` | `0` | If nonzero then at most `log_backup_count` files will be kept,<br/>oldest log file is deleted (must be non-negative integer) |
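
To show how the three log keys relate to `TimedRotatingFileHandler`, the sketch below reads them from a `[DEFAULT]` section and passes them to the handler's `when`, `interval` and `backupCount` arguments. `build_handler` and the log file name `websnap.log` are assumptions made for illustration; this is not necessarily how websnap wires up its logging.

```python
import configparser
import tempfile
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path

# Defaults taken from the table above
LOG_DEFAULTS = {"log_when": "D", "log_interval": "1", "log_backup_count": "0"}


def build_handler(ini_text: str, log_path) -> TimedRotatingFileHandler:
    """Create a rotating file handler from the log keys in `[DEFAULT]`."""
    parser = configparser.ConfigParser(defaults=LOG_DEFAULTS)
    parser.read_string(ini_text)
    defaults = parser["DEFAULT"]
    return TimedRotatingFileHandler(
        log_path,
        when=defaults["log_when"],
        interval=defaults.getint("log_interval"),
        backupCount=defaults.getint("log_backup_count"),
        delay=True,  # do not create the log file until the first record
    )


log_path = Path(tempfile.mkdtemp()) / "websnap.log"  # hypothetical file name
handler = build_handler(
    "[DEFAULT]\nlog_when=midnight\nlog_interval=1\nlog_backup_count=7\n", log_path
)
fallback = build_handler("", log_path)  # no log keys: table defaults apply
```

In this sketch `handler` rotates at midnight and keeps at most 7 backup files, while `fallback` uses the defaults (`D`, `1`, `0`).
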
</details>
## Minimum Download Size
<details>
<summary>
Click to unfold
</summary>
Websnap supports optionally specifying the minimum download size (in kilobytes) a
file must be to copy it from the configured API URL endpoint.
- **By default the minimum download size is 0 kB.**
- Unless specified in the configuration this means that a file of any size can be downloaded by websnap.
- Configured minimum download size must be a non-negative integer.
- If the content from the API URL endpoint is less than the configured size:
- An error will be logged and the program continues to the next config section.
- If the CLI option `--early_exit` (or function argument `early_exit=True`) is
enabled then the program will terminate early.
### Configuration
Example minimum download size configuration:
```
[DEFAULT]
min_size_kb=1
```
#### `[DEFAULT]` Section
| Key | Default | Value Description |
|---------------|---------|-------------------------------------------------------------------|
| `min_size_kb` | `0` | Minimum download size in kilobytes (must be non-negative integer) |
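
The size check described above can be approximated in a few lines. `is_large_enough` is a hypothetical helper, and the sketch assumes that one kilobyte equals 1024 bytes; websnap may measure the size differently.

```python
def is_large_enough(content: bytes, min_size_kb: int = 0) -> bool:
    """Return True when the content meets the configured minimum size."""
    if min_size_kb < 0:
        raise ValueError("min_size_kb must be a non-negative integer")
    # Assumption: 1 kilobyte is treated as 1024 bytes
    return len(content) >= min_size_kb * 1024


# With the default of 0 any download passes, including an empty one
default_ok = is_large_enough(b"")
# A 500 byte response is rejected once min_size_kb=1 is configured
small_ok = is_large_enough(b"x" * 500, min_size_kb=1)
```
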
</details>
## Author
Rebecca Buchholz, EnviDat Software Engineer
## Purpose
This project was developed to facilitate EnviDat resiliency and support continuous
operation during server maintenance.
<a href="https://www.envidat.ch" target="_blank">EnviDat</a> is the environmental data
portal of the Swiss Federal Institute for Forest, Snow and Landscape Research WSL.
## License
<a href="https://github.com/EnviDat/websnap/blob/main/LICENSE" target="_blank">MIT License</a>