# websnap
<div>
<img alt="PyPI - Version" src="https://img.shields.io/pypi/v/websnap">
<img alt="PyPI - Downloads" src="https://static.pepy.tech/badge/websnap">
<img alt="PyPI - License" src="https://img.shields.io/pypi/l/websnap?color=%232780C1">
<img alt="Coverage" src="https://gitlabext.wsl.ch/EnviDat/websnap/badges/main/coverage.svg?job=test&min_good=90">
<img alt="Code Style - Black" src="https://img.shields.io/badge/code%20style-black-000000.svg">
</div>
### Copies files retrieved from an API to an S3 bucket or a local machine.
###
---
## Installation
```bash
pip install websnap
```
## Quickstart
### Websnap can be used as a function or as a CLI.
<p>
<a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/overview_diagram.png"
target="_blank">Click here to view a websnap overview diagram.</a>
</p>
###
#### Function
```python
from websnap import websnap
# Execute websnap using default arguments
websnap()
# Execute websnap passing arguments
websnap(file_logs=True, s3_uploader=True, backup_s3_count=7, early_exit=True)
```
###
#### CLI
To access CLI documentation in terminal execute:
```bash
websnap_cli --help
```
## Function Parameters / CLI Options
<details>
<summary>
Click to unfold function parameters / CLI options
</summary>
### Function Parameters
| Parameter | Type | Default |
|-------------------|---------------|----------------|
| `config` | `str` | `"config.ini"` |
| `log_level` | `str` | `"INFO"` |
| `file_logs` | `bool` | `False` |
| `s3_uploader` | `bool` | `False` |
| `backup_s3_count` | `int \| None` | `None` |
| `timeout` | `int` | `32` |
| `early_exit` | `bool` | `False` |
| `repeat_minutes` | `int \| None` | `None` |
| `section_config` | `str \| None` | `None` |
### CLI Options
| Option | Shortcut | Default |
|---------------------|----------|--------------|
| `--config` | `-c` | `config.ini` |
| `--log_level` | `-l` | `INFO` |
| `--file_logs` | `-f` | `False` |
| `--s3_uploader` | `-s` | `False` |
| `--backup_s3_count` | `-b` | `None` |
| `--timeout` | `-t` | `32` |
| `--early_exit` | `-e` | `False` |
| `--repeat_minutes` | `-r` | `None` |
| `--section_config` | `-n` | `None` |
### Description
| Function parameter /<br/> CLI option | Description |
|--------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `config` _(str)_ | <ul><li>Path to configuration `.ini` or `.json` file</li><li>Default value expects a file called `config.ini` in the directory that websnap is executed from</li></ul> |
| `log_level` _(str)_ | <ul><li>Level to use for logging</li><li>Default value is `INFO`</li><li>Valid logging levels are `DEBUG`, `INFO`, `WARNING`, `ERROR`, or `CRITICAL`</li><li><a href="https://docs.python.org/3/library/logging.html#levels" target="_blank">Click here to learn more about logging levels</a></li></ul> |
| `file_logs` _(bool)_ | <ul><li>Enable rotating file logs</li></ul> |
| `s3_uploader` _(bool)_ | <ul><li>Enable uploading of files to S3 bucket</li></ul> |
| `backup_s3_count` _(int \| None)_ | <ul><li>Copy and backup file in each config section to the configured S3 bucket `backup_s3_count` times</li><li>Remove file with the oldest last modified timestamp</li><li>If omitted then files are not copied or removed</li><li>If enabled then backup files are copied and assigned the original file's name with the last modified timestamp appended</li></ul> |
| `timeout` _(int)_ | <ul><li>Number of seconds to wait for response for each HTTP request before timing out</li><li>Default value is `32` seconds</li></ul> |
| `early_exit` _(bool)_ | <ul><li>Enable early program termination after error occurs</li><li>If omitted logs errors but continues program execution</li></ul> |
| `repeat_minutes` _(int \| None)_ | <ul><li>Run websnap continuously every `repeat_minutes` minutes</li><li>If omitted then websnap does not repeat</li></ul> |
| `section_config` _(str \| None)_ | <ul><li>File or URL to obtain additional configuration sections</li><li>If omitted then default value is `None` and only config specified in `config` argument is used</li><li>Cannot be used to assign "DEFAULT" values in config</li><li>Currently only supports JSON config and can only be used if `config` argument is also a JSON file</li><li>Duplicate sections will overwrite values with the same section passed in the `config` argument</li></ul> |
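
The `backup_s3_count` behaviour described above (keep a limited number of backups and remove the one with the oldest last modified timestamp) can be illustrated with a small sketch. This is not websnap's implementation: the function `keys_to_remove` and the backup key names are hypothetical, and the real timestamp format appended to backup file names is not shown in this document.

```python
from datetime import datetime


def keys_to_remove(backups: dict[str, datetime], backup_s3_count: int) -> list[str]:
    """Return the oldest backup keys that exceed the allowed number of backups.

    `backups` maps each backup object key to its last modified timestamp.
    """
    if backup_s3_count < 0:
        raise ValueError("backup_s3_count must be a non-negative integer")
    # Sort keys from newest to oldest, then drop everything past the limit
    newest_first = sorted(backups, key=backups.get, reverse=True)
    return newest_first[backup_s3_count:]


# Hypothetical backup keys, used only for illustration
existing_backups = {
    "project_backup_a.json": datetime(2024, 9, 9),
    "project_backup_b.json": datetime(2024, 9, 10),
    "project_backup_c.json": datetime(2024, 9, 11),
    "project_backup_d.json": datetime(2024, 9, 12),
}
print(keys_to_remove(existing_backups, backup_s3_count=3))
```

With four backups and `backup_s3_count=3`, only the backup with the oldest timestamp is selected for removal.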
</details>
## Usage: S3 Bucket
<details>
<summary>
Click to unfold S3 bucket usage
</summary>
### **Copy files retrieved from an API to an S3 bucket.**
Uses the AWS SDK for Python (Boto3) to add and back up API files to an S3 bucket.
### Examples
#### Function
```python
from websnap import websnap

# The s3_uploader argument must be passed as True to copy files to an S3 bucket
# Copy files to an S3 bucket using default argument values
websnap(s3_uploader=True)
# Copy files to an S3 bucket and repeat every 1440 minutes (24 hours);
# file logs are enabled and only 3 backup files are allowed for each config section
websnap(file_logs=True, s3_uploader=True, backup_s3_count=3, repeat_minutes=1440)
```
#### CLI
- The following CLI option **must** be used to enable websnap to upload files to an S3 bucket: `--s3_uploader`
- Copy files to an S3 bucket using default argument values:
```bash
websnap_cli --s3_uploader
```
- Copy files to an S3 bucket and repeat every 1440 minutes (24 hours), with file
logs enabled and only 3 backup files allowed for each config section:
```bash
websnap_cli --file_logs --s3_uploader --backup_s3_count 3 --repeat_minutes 1440
```
### Configuration
- The following environment variables are **required**: `ENDPOINT_URL`,
`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
- A valid `.ini` or `.json` configuration file is **required**.
- By default websnap expects the config to be a file called `config.ini` in the
directory that websnap is executed from.
  - This can be changed using the `config` function argument (or CLI
  `--config` option).
- All keys in tables below are **mandatory**.
#### S3 Configuration Example Files
| Format | Example Configuration File |
|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `.ini` | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/s3_config_template.ini" target="_blank">src/websnap/config_templates/s3_config_template.ini</a> |
| `.json` | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/s3_config_template.json" target="_blank">src/websnap/config_templates/s3_config_template.json</a> |
#### Environment Variables
Supports setting environment variables in a `.env` file.
Example `.env` file:
```
ENDPOINT_URL=https://dreamycloud.com
AWS_ACCESS_KEY_ID=1234567abcdefg
AWS_SECRET_ACCESS_KEY=hijklmn1234567
```
| Environment Variable | Description |
|-------------------------|------------------------------------------|
| `ENDPOINT_URL` | URL to use for the constructed S3 client |
| `AWS_ACCESS_KEY_ID` | AWS access key ID |
| `AWS_SECRET_ACCESS_KEY` | AWS secret access key |
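
Because all three environment variables are required, it can be useful to check them before starting an upload. The following is a minimal sketch using only the standard library; the helper `missing_env_vars` is hypothetical and not part of websnap, and it assumes the variables have already been exported or loaded from the `.env` file.

```python
import os

REQUIRED_VARS = ("ENDPOINT_URL", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_env_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```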
#### Other Sections (one per API URL endpoint)
- _Each file retrieved from an API requires its **own config section!**_
- The section name can be anything, but a name that relates to the copied file
is suggested.
Example S3 config section configuration with key prefix:
```
[resource]
url=https://www.example.com/api/resource
bucket=exampledata
key=subdirectory_resource/resource.xml
```
Example S3 config section configuration without key prefix:
```
[project]
url=https://www.example.com/api/project
bucket=exampledata
key=project.json
```
| Key | Value Description |
|----------|---------------------------------------------------------|
| `url` | API URL endpoint that file will be retrieved from |
| `bucket` | Bucket that file will be written in |
| `key` | File name with extension, can optionally include prefix |
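
To make the relationship between `key`, its optional prefix, and the file name concrete, the two example sections above can be parsed with Python's standard `configparser`. This is only an illustrative sketch: `describe_sections` is a hypothetical helper, and websnap's own parsing and validation may differ.

```python
import configparser
from pathlib import PurePosixPath

EXAMPLE_CONFIG = """
[resource]
url=https://www.example.com/api/resource
bucket=exampledata
key=subdirectory_resource/resource.xml

[project]
url=https://www.example.com/api/project
bucket=exampledata
key=project.json
"""

REQUIRED_KEYS = ("url", "bucket", "key")


def describe_sections(config_text):
    """Check that each section has the mandatory keys and split key into prefix and file name."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    summary = {}
    for name in parser.sections():
        section = parser[name]
        missing = [k for k in REQUIRED_KEYS if k not in section]
        if missing:
            raise ValueError(f"Section [{name}] is missing keys: {missing}")
        key = PurePosixPath(section["key"])
        # A key without a prefix has "." as its parent
        prefix = "" if str(key.parent) == "." else str(key.parent)
        summary[name] = {
            "bucket": section["bucket"],
            "prefix": prefix,
            "file_name": key.name,
        }
    return summary


for section_name, details in describe_sections(EXAMPLE_CONFIG).items():
    print(section_name, details)
```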
</details>
## Usage: Local Machine
<details>
<summary>
Click to unfold local machine usage
</summary>
### **Copy files retrieved from an API to a local machine.**
### Examples
#### Function
```python
from websnap import websnap

# Write files retrieved from an API to local machine using default argument values
websnap()
# Write files retrieved from an API locally and repeat every 60 minutes (1 hour),
# with file logs enabled
websnap(file_logs=True, repeat_minutes=60)
```
#### CLI
- Write copied files to local machine using default argument values:
```bash
websnap_cli
```
- Write copied files locally and repeat every 60 minutes (1 hour), with file logs
enabled:
```bash
websnap_cli --file_logs --repeat_minutes 60
```
### Configuration
- A valid `.ini` or `.json` configuration file is **required** for both function and
CLI usage.
- By default websnap expects the config to be a file called `config.ini` in the
directory that websnap is executed from.
  - This can be changed using the `config` function argument (or CLI
  `--config` option).
- Each file that will be retrieved from an API requires its _own section_.
- If the optional `directory` key/value pair is omitted then the file will be written in the directory that the program is executed from.
#### Configuration Example Files
| Format | Example Configuration File |
|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `.ini` | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/config_template.ini" target="_blank">src/websnap/config_templates/config_template.ini</a> |
| `.json` | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/config_template.json" target="_blank">src/websnap/config_templates/config_template.json</a> |
#### Sections (one per API URL endpoint)
Example local machine configuration section:
```
[project]
url=https://www.example.com/api/project
file_name=project.json
directory=projectdata
```
| Key | Value Description |
|--------------------------|---------------------------------------------------|
| `url` | API URL endpoint that file will be retrieved from |
| `file_name` | File name with extension |
| `directory` (_optional_) | Local directory name that file will be written in |
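
The effect of the optional `directory` key can be sketched with the standard library. The helper `target_path` below is hypothetical, and it assumes that a relative `directory` value is resolved against the directory the program is executed from; websnap's actual path handling may differ.

```python
import configparser
from pathlib import Path

EXAMPLE_CONFIG = """
[project]
url=https://www.example.com/api/project
file_name=project.json
directory=projectdata

[resource]
url=https://www.example.com/api/resource
file_name=resource.xml
"""


def target_path(section, cwd=None):
    """Build the output path for a section, falling back to the working directory."""
    base = Path(cwd) if cwd else Path.cwd()
    # An omitted directory leaves the file directly in the base directory
    directory = section.get("directory", fallback="")
    return base / directory / section["file_name"]


parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONFIG)
for name in parser.sections():
    print(name, "->", target_path(parser[name]))
```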
</details>
## Logs
<details>
<summary>
Click to unfold logs
</summary>
Websnap supports optional rotating file logs.
- The following CLI option **must** be used to enable websnap to support rotating file logs: `--file_logs`
- In function usage the following argument must be passed to support rotating file
logs: `file_logs=True`
- If log keys are not specified in the configuration `[DEFAULT]` section then default values in the table below will be used.
- `log_when` expects a value used by logging module TimedRotatingFileHandler.
- <a href="https://docs.python.org/3/library/logging.handlers.html#timedrotatingfilehandler" target="_blank">Click here for more information about how to use TimedRotatingFileHandler.</a>
- The default values result in the file logs being rotated once every day and no removal of backup log files.
### Configuration
Example log configuration:
```
[DEFAULT]
log_when=midnight
log_interval=1
log_backup_count=7
```
#### `[DEFAULT]` Section
| Key | Default | Value Description |
|--------------------|---------|--------------------------------------------------------------------------------------------------------------------------------|
| `log_when` | `D` | Specifies type of interval |
| `log_interval` | `1` | Duration of interval (must be positive integer) |
| `log_backup_count` | `0` | If nonzero then at most `log_backup_count` files will be kept,<br/>oldest log file is deleted (must be non-negative integer) |
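
The three log keys correspond to arguments of the standard library's `TimedRotatingFileHandler`. As a rough sketch of what the example configuration above expresses (rotate at midnight, keep at most 7 old log files), the handler could be built as follows. The helper `build_handler`, the logger name, and the log file name are assumptions for illustration, not websnap's actual setup.

```python
import logging
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler


def build_handler(log_path, log_when="D", log_interval=1, log_backup_count=0):
    """Create a rotating file handler from the three log configuration values."""
    return TimedRotatingFileHandler(
        log_path,
        when=log_when,
        interval=log_interval,
        backupCount=log_backup_count,
    )


# Hypothetical log location, used only for this example
log_dir = tempfile.mkdtemp()
handler = build_handler(
    os.path.join(log_dir, "example.log"),
    log_when="midnight",
    log_interval=1,
    log_backup_count=7,
)

logger = logging.getLogger("websnap_log_example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("Rotating file logs are enabled")

# Release the file handle when finished
logger.removeHandler(handler)
handler.close()
```

With the default values (`D`, `1`, `0`) the same handler rotates once per day and never deletes old log files, because a backup count of `0` disables removal.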
</details>
## Minimum Download Size
<details>
<summary>
Click to unfold minimum download size
</summary>
Websnap supports optionally specifying the minimum size (in kilobytes) that a
file must have to be copied from the configured API URL endpoint.
- **By default the minimum download size is 0 kB.**
  - Unless a different value is specified in the configuration, a file of any size can be downloaded by websnap.
- Configured minimum download size must be a non-negative integer.
- If the content from the API URL endpoint is smaller than the configured size:
  - An error will be logged and the program continues to the next config section.
  - If the CLI option `--early_exit` (or function argument `early_exit=True`) is
  enabled then the program will terminate early.
### Configuration
Example minimum download size configuration:
```
[DEFAULT]
min_size_kb=1
```
#### `[DEFAULT]` Section
| Key | Default | Value Description |
|---------------|---------|-------------------------------------------------------------------|
| `min_size_kb` | `0` | Minimum download size in kilobytes (must be non-negative integer) |
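
The size check described above can be sketched in a few lines. The function `meets_min_size` is hypothetical, and the sketch assumes that one kilobyte means 1024 bytes; this document does not state which definition websnap uses.

```python
BYTES_PER_KB = 1024  # Assumption: websnap may define a kilobyte differently


def meets_min_size(content: bytes, min_size_kb: int = 0) -> bool:
    """Return True if the downloaded content is at least min_size_kb kilobytes."""
    if min_size_kb < 0:
        raise ValueError("min_size_kb must be a non-negative integer")
    return len(content) >= min_size_kb * BYTES_PER_KB


# With the default of 0, content of any size passes the check
print(meets_min_size(b"", min_size_kb=0))
# With min_size_kb=1, content smaller than one kilobyte is rejected
print(meets_min_size(b"x" * 500, min_size_kb=1))
```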
</details>
## Author
Rebecca Kurup Buchholz
## Purpose
This project was developed to facilitate EnviDat resiliency and support continuous
operation during server maintenance.
<a href="https://www.envidat.ch" target="_blank">EnviDat</a> is the environmental data
portal of the Swiss Federal Institute for Forest, Snow and Landscape Research WSL.
## License
<a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/LICENSE" target="_blank">MIT License</a>