websnap

| Field           | Value                                                                 |
|-----------------|-----------------------------------------------------------------------|
| Name            | websnap                                                               |
| Version         | 2.0.0                                                                 |
| Summary         | Copies files retrieved from an API to an S3 bucket or a local machine. |
| Upload time     | 2024-09-12 09:44:36                                                   |
| Requires Python | >=3.11                                                                |
| License         | MIT                                                                   |
| Keywords        | s3, boto3, boto3, api, backup, aws, aws sdk, aws sdk for python       |

# websnap

<div>
  <img alt="PyPI - Version" src="https://img.shields.io/pypi/v/websnap">
  <img alt="PyPI - Downloads" src="https://static.pepy.tech/badge/websnap">
  <img alt="PyPI - License" src="https://img.shields.io/pypi/l/websnap?color=%232780C1">
  <img alt="Coverage" src="https://gitlabext.wsl.ch/EnviDat/websnap/badges/main/coverage.svg?job=test&min_good=90">
  <img alt="Code Style - Black" src="https://img.shields.io/badge/code%20style-black-000000.svg">
</div>

### Copies files retrieved from an API to an S3 bucket or a local machine.

###

---


## Installation

```bash
pip install websnap
```


## Quickstart

### Websnap can be used as a function or as a CLI. 

<p>
<a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/overview_diagram.png" 
target="_blank">Click here to view a websnap overview diagram.</a>
</p>


###
#### Function

```python
from websnap import websnap

# Execute websnap using default arguments
websnap()

# Execute websnap passing arguments
websnap(file_logs=True, s3_uploader=True, backup_s3_count=7, early_exit=True)
```

###
#### CLI

To view the CLI documentation, run the following in a terminal:
```bash
websnap_cli --help
```


## Function Parameters / CLI Options

<details>
  <summary>
  Click to unfold function parameters / CLI options
  </summary>

### Function Parameters
| Parameter         | Type          | Default        |
|-------------------|---------------|----------------|
| `config`          | `str`         | `"config.ini"` |
| `log_level`       | `str`         | `"INFO"`       |
| `file_logs`       | `bool`        | `False`        |
| `s3_uploader`     | `bool`        | `False`        |
| `backup_s3_count` | `int \| None` | `None`         |
| `timeout`         | `int`         | `32`           |
| `early_exit`      | `bool`        | `False`        |
| `repeat_minutes`  | `int \| None` | `None`         |
| `section_config`  | `str \| None` | `None`         |

### CLI Options
| Option              | Shortcut | Default      |
|---------------------|----------|--------------|
| `--config`          | `-c`     | `config.ini` |
| `--log_level`       | `-l`     | `INFO`       |
| `--file_logs`       | `-f`     | `False`      |
| `--s3_uploader`     | `-s`     | `False`      |
| `--backup_s3_count` | `-b`     | `None`       |
| `--timeout`         | `-t`     | `32`         |
| `--early_exit`      | `-e`     | `False`      |
| `--repeat_minutes`  | `-r`     | `None`       |
| `--section_config`  | `-n`     | `None`       |

### Description

| Function parameter /<br/> CLI option | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|--------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `config` _(str)_                     | <ul><li>Path to the configuration `.ini` or `.json` file</li><li>The default value expects a file called `config.ini` in the directory that websnap is executed from</li></ul> |
| `log_level` _(str)_                  | <ul><li>Level to use for logging</li><li>Default value is `INFO`</li><li>Valid logging levels are `DEBUG`, `INFO`, `WARNING`, `ERROR`, or `CRITICAL`</li><li><a href="https://docs.python.org/3/library/logging.html#levels" target="_blank">Click here to learn more about logging levels</a></li></ul>                                                                                                                                                      |
| `file_logs` _(bool)_                 | <ul><li>Enable rotating file logs</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                   |
| `s3_uploader` _(bool)_               | <ul><li>Enable uploading of files to an S3 bucket</li></ul> |
| `backup_s3_count` _(int \| None)_    | <ul><li>Maximum number of backup copies kept in the configured S3 bucket for the file in each config section</li><li>When the limit is exceeded, the backup with the oldest last modified timestamp is removed</li><li>If omitted then files are not copied or removed</li><li>If enabled then backup files are copied and assigned the original file's name with the last modified timestamp appended</li></ul> |
| `timeout` _(int)_                    | <ul><li>Number of seconds to wait for response for each HTTP request before timing out</li><li>Default value is `32` seconds</li></ul>                                                                                                                                                                                                                                                                                                                        |
| `early_exit` _(bool)_                | <ul><li>Enable early program termination after an error occurs</li><li>If omitted, errors are logged and program execution continues</li></ul> |
| `repeat_minutes` _(int \| None)_     | <ul><li>Run websnap continuously every `repeat_minutes` minutes</li><li>If omitted then websnap does not repeat</li></ul>                                                                                                                                                                                                                                                                                                                                     |
| `section_config` _(str \| None)_     | <ul><li>File or URL to obtain additional configuration sections from</li><li>If omitted then the default value is `None` and only the config specified in the `config` argument is used</li><li>Cannot be used to assign "DEFAULT" values in the config</li><li>Currently only supports JSON config and can only be used if the `config` argument is also a JSON file</li><li>Sections with the same name as a section in the `config` argument overwrite that section's values</li></ul> |


</details>
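
The `section_config` overwrite rule can be pictured as a dictionary merge. The sketch below is an illustration only, not websnap's implementation: `merge_sections` is a hypothetical helper, the JSON layout (section names as top-level keys) is an assumption, and it assumes a duplicate section replaces the whole section from the main config.

```python
import json


def merge_sections(config: dict, section_config: dict) -> dict:
    """Return config with the extra sections applied on top (hypothetical helper)."""
    # "DEFAULT" values cannot be assigned through section_config, so skip them
    extra = {
        name: values for name, values in section_config.items() if name != "DEFAULT"
    }
    # Sections with the same name replace the ones from the main config
    return {**config, **extra}


# Stand-ins for the parsed `config` and `section_config` JSON documents
main_config = json.loads(
    '{"DEFAULT": {"min_size_kb": 1},'
    ' "project": {"url": "https://www.example.com/api/project",'
    ' "file_name": "project.json"}}'
)
additional = json.loads(
    '{"DEFAULT": {"min_size_kb": 50},'
    ' "project": {"url": "https://www.example.com/api/v2/project",'
    ' "file_name": "project.json"},'
    ' "resource": {"url": "https://www.example.com/api/resource",'
    ' "file_name": "resource.xml"}}'
)

merged = merge_sections(main_config, additional)
print(sorted(merged))
```

In this sketch the `project` section comes from the additional config, `resource` is added, and `DEFAULT` keeps the values from the main config.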

## Usage: S3 Bucket

<details>
  <summary>
  Click to unfold S3 bucket usage
  </summary>


### **Copy files retrieved from an API to an S3 bucket.**

Uses the AWS SDK for Python (Boto3) to add and back up API files to an S3 bucket.

### Examples

#### Function
```python
from websnap import websnap

# The s3_uploader argument must be passed as True to copy files to an S3 bucket
# Copy files to an S3 bucket using default argument values
websnap(s3_uploader=True)

# Copy files to an S3 bucket and repeat every 1440 minutes (24 hours),
# with file logs enabled and at most 3 backup files kept for each config section
websnap(file_logs=True, s3_uploader=True, backup_s3_count=3, repeat_minutes=1440)
```
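To picture how `backup_s3_count` limits the number of backups, here is a minimal sketch of the retention idea using plain Python data instead of S3 calls. It is not websnap's code: `keys_to_remove` and the backup key names are hypothetical, and it assumes that backups beyond the allowed count are removed oldest first.

```python
from datetime import datetime, timezone


def keys_to_remove(backups: dict[str, datetime], backup_s3_count: int) -> list[str]:
    """Return the backup keys to delete so that at most backup_s3_count remain."""
    # Order keys from the oldest to the newest last modified timestamp
    oldest_first = sorted(backups, key=backups.get)
    excess = max(len(oldest_first) - backup_s3_count, 0)
    return oldest_first[:excess]


# Hypothetical backup keys mapped to their last modified timestamps
backups = {
    "project_2024-09-09.json": datetime(2024, 9, 9, tzinfo=timezone.utc),
    "project_2024-09-10.json": datetime(2024, 9, 10, tzinfo=timezone.utc),
    "project_2024-09-11.json": datetime(2024, 9, 11, tzinfo=timezone.utc),
    "project_2024-09-12.json": datetime(2024, 9, 12, tzinfo=timezone.utc),
}

to_remove = keys_to_remove(backups, backup_s3_count=3)
print(to_remove)
```

With four backups and a limit of three, only the oldest key is selected for removal.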

#### CLI
- The following CLI option **must** be used to enable websnap to upload files to an S3 bucket: `--s3_uploader`

- Copy files to an S3 bucket using default argument values:
     ```bash
      websnap_cli --s3_uploader
     ```

- Copy files to an S3 bucket and repeat every 1440 minutes (24 hours), with file
  logs enabled and at most 3 backup files kept for each config section:
     ```bash
      websnap_cli --file_logs --s3_uploader --backup_s3_count 3 --repeat_minutes 1440
     ```

### Configuration

- The following environment variables are **required**: `ENDPOINT_URL`, 
  `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
- A valid `.ini` or `.json` configuration file is **required**.
- By default, websnap expects the config file to be named `config.ini` and located in
  the directory that websnap is executed from.
  - This can be changed using the `config` function argument (or CLI
   `--config` option).
- All keys in tables below are **mandatory**.

#### S3 Configuration Example Files

| Format  | Example Configuration File                                                                                                                                                                   |
|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `.ini`  | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/s3_config_template.ini" target="_blank">src/websnap/config_templates/s3_config_template.ini</a>   |
| `.json` | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/s3_config_template.json" target="_blank">src/websnap/config_templates/s3_config_template.json</a> |


#### Environment Variables

Supports setting environment variables in a `.env` file.

Example `.env` file:

```
ENDPOINT_URL=https://dreamycloud.com
AWS_ACCESS_KEY_ID=1234567abcdefg
AWS_SECRET_ACCESS_KEY=hijklmn1234567
```
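Before starting an S3 run it can help to confirm that the three required variables are set. The following sketch is an assumption-laden illustration, not part of websnap: `missing_env_vars` is a hypothetical helper, and it only inspects the process environment (it does not read the `.env` file itself).

```python
import os
from collections.abc import Mapping

REQUIRED_ENV_VARS = ("ENDPOINT_URL", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_env_vars(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set")
```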

| Environment Variable    | Description                              |
|-------------------------|------------------------------------------|
| `ENDPOINT_URL`          | URL to use for the constructed S3 client |
| `AWS_ACCESS_KEY_ID`     | AWS access key ID                        |
| `AWS_SECRET_ACCESS_KEY` | AWS secret access key                    |

#### Other Sections (one per API URL endpoint)

- _Each file retrieved from an API requires its **own config section!**_
- The section name can be anything, but a name that relates to the copied file is
  suggested.

Example S3 config section with a key prefix:

```
[resource]
url=https://www.example.com/api/resource
bucket=exampledata
key=subdirectory_resource/resource.xml
```

Example S3 config section without a key prefix:

```
[project]
url=https://www.example.com/api/project
bucket=exampledata
key=project.json
```
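Because every S3 section needs the `url`, `bucket`, and `key` entries shown in the examples above, a quick check with the standard library `configparser` can catch incomplete sections before a run. This is only a sketch: the check is not part of websnap, and the second section deliberately omits `key` to show what gets reported.

```python
import configparser

S3_KEYS = {"url", "bucket", "key"}

INI_TEXT = """
[resource]
url=https://www.example.com/api/resource
bucket=exampledata
key=subdirectory_resource/resource.xml

[project]
url=https://www.example.com/api/project
bucket=exampledata
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Map each section name to the mandatory keys it lacks
missing = {name: sorted(S3_KEYS - set(parser[name])) for name in parser.sections()}
incomplete = {name: keys for name, keys in missing.items() if keys}
print(incomplete)
```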

| Key      | Value Description                                       |
|----------|---------------------------------------------------------|
| `url`    | API URL endpoint that file will be retrieved from       |
| `bucket` | Bucket that file will be written in                     |
| `key`    | File name with extension, can optionally include prefix |


</details>


## Usage: Local Machine

<details>
  <summary>
  Click to unfold local machine usage
  </summary>

### **Copy files retrieved from an API to a local machine.** 

### Examples

#### Function
```python
from websnap import websnap

# Write files retrieved from an API to the local machine using default argument values
websnap()

# Write files retrieved from an API locally and repeat every 60 minutes (1 hour),
# with file logs enabled
websnap(file_logs=True, repeat_minutes=60)
```

#### CLI 

- Write copied files to the local machine using default argument values:
     ```bash
      websnap_cli
     ```

- Write copied files locally and repeat every 60 minutes (1 hour), with file logs
  enabled:
     ```bash
      websnap_cli --file_logs --repeat_minutes 60
     ```

### Configuration

- A valid `.ini` or `.json` configuration file is **required** for both function and 
  CLI usage.
- By default, websnap expects the config file to be named `config.ini` and located in
  the directory that websnap is executed from.
  - This can be changed using the `config` function argument (or CLI
   `--config` option).
- Each file that will be retrieved from an API requires its _own section_. 
- If the optional `directory` key/value pair is omitted then the file will be written in the directory that the program is executed from.


#### Configuration Example Files

| Format  | Example Configuration File                                                                                                                                                             |
|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `.ini`  | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/config_template.ini" target="_blank">src/websnap/config_templates/config_template.ini</a>   |
| `.json` | <a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/config_template.json" target="_blank">src/websnap/config_templates/config_template.json</a> |


#### Sections (one per API URL endpoint)

Example local machine configuration section:

```
[project]
url=https://www.example.com/api/project
file_name=project.json
directory=projectdata
```
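The effect of the optional `directory` key can be sketched with the standard library. The snippet below is an illustration rather than websnap's own logic: `target_path` is a hypothetical helper, and the `resource` section is added here only to show the fallback to the current directory.

```python
import configparser
from pathlib import Path

INI_TEXT = """
[project]
url=https://www.example.com/api/project
file_name=project.json
directory=projectdata

[resource]
url=https://www.example.com/api/resource
file_name=resource.xml
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)


def target_path(section: configparser.SectionProxy) -> Path:
    """Build the output path, falling back to the current directory."""
    directory = section.get("directory", fallback=".")
    return Path(directory) / section["file_name"]


for name in parser.sections():
    print(name, "->", target_path(parser[name]))
```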

| Key                      | Value Description                                 |
|--------------------------|---------------------------------------------------|
| `url`                    | API URL endpoint that file will be retrieved from |
| `file_name`              | File name with extension                          |
| `directory` (_optional_) | Local directory name that file will be written in |

</details>


## Logs

<details>
  <summary>
  Click to unfold logs
  </summary>

Websnap supports optional rotating file logs.

- The following CLI option **must** be used to enable websnap to support rotating file logs: `--file_logs`
  - In function usage the following argument must be passed to support rotating file 
    logs: `file_logs=True`
- If log keys are not specified in the configuration `[DEFAULT]` section then default values in the table below will be used. 
- `log_when` expects a value accepted by the `when` argument of the logging module's `TimedRotatingFileHandler`.
- <a href="https://docs.python.org/3/library/logging.handlers.html#timedrotatingfilehandler" target="_blank">Click here for more information about how to use TimedRotatingFileHandler.</a>
- The default values result in the file logs being rotated once every day and no removal of backup log files. 

### Configuration

Example log configuration:

```
[DEFAULT]
log_when=midnight
log_interval=1
log_backup_count=7
```
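For orientation, the three log keys correspond naturally to the `when`, `interval`, and `backupCount` arguments of `TimedRotatingFileHandler`. The sketch below shows that mapping with the example values; the log file name, location, logger name, and message format are assumptions made for this illustration and may differ from what websnap uses.

```python
import logging
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler

# Values taken from the example [DEFAULT] section
log_when, log_interval, log_backup_count = "midnight", 1, 7

# The log file name and location are assumptions made for this sketch
log_path = os.path.join(tempfile.mkdtemp(), "websnap.log")

handler = TimedRotatingFileHandler(
    log_path, when=log_when, interval=log_interval, backupCount=log_backup_count
)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("websnap_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("Rotating at midnight, keeping at most %d old log files", log_backup_count)

# Detach and close the handler so the temporary file is released
logger.removeHandler(handler)
handler.close()
```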

#### `[DEFAULT]` Section
| Key                | Default | Value Description                                                                                                              |
|--------------------|---------|--------------------------------------------------------------------------------------------------------------------------------|
| `log_when`         | `D`     | Specifies type of interval                                                                                                     |
| `log_interval`     | `1`     | Duration of interval (must be positive integer)                                                                                |
| `log_backup_count` | `0`     | If nonzero then at most `log_backup_count` files will be kept,<br/>oldest log file is deleted (must be non-negative integer) |


</details>


## Minimum Download Size

<details>
  <summary>
  Click to unfold minimum download size
  </summary>

Websnap optionally supports a minimum download size (in kilobytes): a file is copied
from the configured API URL endpoint only if it is at least this large.

- **By default the minimum download size is 0 kB.**
  - Unless specified in the configuration this means that a file of any size can be downloaded by websnap.
- Configured minimum download size must be a non-negative integer.
- If the content from the API URL endpoint is smaller than the configured size:
  - An error is logged and the program continues to the next config section.
  - If the CLI option `--early_exit` (or function argument `early_exit=True`) is
    enabled then the program terminates early.

### Configuration

Example minimum download size configuration:

```
[DEFAULT]
min_size_kb=1
```
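The size rule can be expressed as a small predicate. This is a hedged sketch rather than websnap's implementation: `is_large_enough` is a hypothetical name, and it assumes 1 kilobyte equals 1000 bytes (websnap may count 1024 bytes instead).

```python
def is_large_enough(content: bytes, min_size_kb: int = 0) -> bool:
    """Return True if the downloaded content meets the configured minimum size."""
    if min_size_kb < 0:
        raise ValueError("min_size_kb must be a non-negative integer")
    # Assumption: 1 kilobyte = 1000 bytes
    return len(content) >= min_size_kb * 1000


# With the default of 0 any content passes, including an empty response
print(is_large_enough(b""))
# With min_size_kb=1 a 999-byte response would be rejected
print(is_large_enough(b"x" * 999, min_size_kb=1))
```

A caller following the documented behavior would log an error when the predicate is false and then either continue to the next config section or stop, depending on `early_exit`.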

#### `[DEFAULT]` Section
| Key           | Default | Value Description                                                 |
|---------------|---------|-------------------------------------------------------------------|
| `min_size_kb` | `0`     | Minimum download size in kilobytes (must be non-negative integer) |


</details>


## Author

Rebecca Kurup Buchholz


## Purpose

This project was developed to facilitate EnviDat resiliency and support continuous 
operation during server maintenance.

<a href="https://www.envidat.ch" target="_blank">EnviDat</a> is the environmental data 
portal of the Swiss Federal Institute for Forest, Snow and Landscape Research WSL. 


## License 

<a href="https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/LICENSE" target="_blank">MIT License</a>

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "websnap",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": "EnviDat <envidat@wsl.ch>",
    "keywords": "S3, Boto3, boto3, API, backup, AWS, AWS SDK, AWS SDK for Python",
    "author": null,
    "author_email": "Rebecca Kurup Buchholz <rebecca.kurup@wsl.ch>",
    "download_url": "https://files.pythonhosted.org/packages/22/f4/2db1ab38aaf5915d1de94a51907ebb7150bc61ae0c64fefc91a5d7baece6/websnap-2.0.0.tar.gz",
    "platform": null,
    "description": "# websnap\n\n<div>\n  <img alt=\"PyPI - Version\" src=\"https://img.shields.io/pypi/v/websnap\">\n  <img alt=\"PyPI - Downloads\" src=\"https://static.pepy.tech/badge/websnap\">\n  <img alt=\"PyPI - License\" src=\"https://img.shields.io/pypi/l/websnap?color=%232780C1\">\n  <img alt=\"Coverage\" src=\"https://gitlabext.wsl.ch/EnviDat/websnap/badges/main/coverage.svg?job=test&min_good=90\">\n  <img alt=\"Code Style - Black\" src=\"https://img.shields.io/badge/code%20style-black-000000.svg\">\n</div>\n\n### Copies files retrieved from an API to a S3 bucket or a local machine.\n\n###\n\n---\n\n\n## Installation\n\n   ```bash\n  pip install websnap\n   ```\n\n\n## Quickstart\n\n### Websnap can be used as a function or as a CLI. \n\n<p>\n<a href=\"https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/overview_diagram.png\" \ntarget=\"_blank\">Click here to view a websnap overview diagram.</a>\n</p>\n\n\n###\n#### Function\n\n```python\nfrom websnap import websnap\n\n# Execute websnap using default arguments\nwebsnap()\n\n# Execute websnap passing arguments\nwebsnap(file_logs=True, s3_uploader=True, backup_s3_count=7, early_exit=True)\n```\n\n###\n#### CLI\n\nTo access CLI documentation in terminal execute: \n   ```bash\n  websnap_cli --help\n   ```\n\n\n## Function Parameters / CLI Options\n\n<details>\n  <summary>\n  Click to unfold function parameters / CLI options\n  </summary>\n\n### Function Parameters\n| Parameter         | Type          | Default        |\n|-------------------|---------------|----------------|\n| `config`          | `str`         | `\"config.ini\"` |\n| `log_level`       | `str`         | `\"INFO\"`       |\n| `file_logs`       | `bool`        | `False`        |\n| `s3_uploader`     | `bool`        | `False`        |\n| `backup_s3_count` | `int \\| None` | `None`         |\n| `timeout`         | `int`         | `32`           |\n| `early_exit`      | `bool`        | `False`        |\n| `repeat_minutes`  | `int \\| None` | 
`None`         |\n| `section_config`  | `str \\| None` | `None`         |\n\n### CLI Options\n| Option              | Shortcut | Default      |\n|---------------------|----------|--------------|\n| `--config`          | `-c`     | `config.ini` |\n| `--log_level`       | `-l`     | `INFO`       |\n| `--file_logs`       | `-f`     | `False`      |\n| `--s3_uploader`     | `-s`     | `False`      |\n| `--backup_s3_count` | `-b`     | `None`       |\n| `--timeout`         | `-t`     | `32`         |\n| `--early_exit`      | `-e`     | `False`      |\n| `--repeat_minutes`  | `-r`     | `None`       |\n| `--section_config`  | `-n`     | `None`       |\n\n### Description\n\n| Function parameter /<br/> CLI option | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                   |\n|--------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `config` _(str)_                     | <ul><li>Path to configuration `.ini` file</li><li>Default value expects file called `config.ini` in same directory as websnap package is being executed from</li></ul>                                                                                                                                         
                                                                                                                                               |\n| `log_level` _(str)_                  | <ul><li>Level to use for logging</li><li>Default value is `INFO`</li><li>Valid logging levels are `DEBUG`, `INFO`, `WARNING`, `ERROR`, or `CRITICAL`</li><li><a href=\"https://docs.python.org/3/library/logging.html#levels\" target=\"_blank\">Click here to learn more about logging levels</a></li></ul>                                                                                                                                                      |\n| `file_logs` _(bool)_                 | <ul><li>Enable rotating file logs</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                   |\n| `s3_uploader` _(bool)_               | <ul><li>Enable uploading of files to S3 bucket</li><ul>                                                                                                                                                                                                                                                                                                                                                                                                       |\n| `backup_s3_count` _(int \\| None)_    | <ul><li>Copy and backup file in each config section to the configured S3 bucket `backup_s3_count` times</li><li>Remove file with the oldest last modified timestamp</li><li>If omitted then files are not copied or removed</li><li>If enabled then backup files are copied and assigned the original file's name with the last modified 
timestamp appended</li></ul>                                                                                         |\n| `timeout` _(int)_                    | <ul><li>Number of seconds to wait for response for each HTTP request before timing out</li><li>Default value is `32` seconds</li></ul>                                                                                                                                                                                                                                                                                                                        |\n| `early_exit` _(bool)_                | <ul><li>Enable early program termination after error occurs</li><li>If omitted logs errors but continues program execution</li></ul>                                                                                                                                                                                                                                                                                                                          |\n| `repeat_minutes` _(int \\| None)_     | <ul><li>Run websnap continuously every `repeat_minutes` minutes</li><li>If omitted then websnap does not repeat</li></ul>                                                                                                                                                                                                                                                                                                                                     |\n| `section_config` _(str \\| None)_     | <ul><li>File or URL to obtain additional configuration sections</li><li>If omitted then default value is `None` and only config specified in `config` argument is used</li><li>Cannot be used to assign \"DEFAULT\" values in config</li><li>Currently only supports JSON config and can only be used if `config` argument is also a JSON file</li><li>Duplicate sections will 
overwrite values with the same section passed in the `config` argument</li></ul> |                                                                                                                                                                                                                                                                                                                                                                      |\n\n\n</details>\n\n## Usage: S3 Bucket\n\n<details>\n  <summary>\n  Click to unfold S3 bucket usage\n  </summary>\n\n\n### **Copy files retrieved from an API to a S3 bucket.**\n\nUses the AWS SDK for Python (Boto3) to add and backup API files to a S3 bucket. \n\n### Examples\n\n#### Function\n```python\n# The s3_uploader argument must be passed as True to copy files to a S3 bucket\n# Copies files to a S3 bucket using default argument values\nwebsnap(s3_uploader=True)\n\n# Copies files to a S3 bucket and repeat every 1440 minutes (24 hours), \n# file logs are enabled and only 3 backup files are allowed for each config section\nwebsnap(file_logs=True, s3_uploader=True, backup_s3_count=3, repeat_minutes=1440)\n```\n\n#### CLI\n- The following CLI option **must** be used to enable websnap to upload files to a S3 bucket: `--s3_uploader`\n\n- Copies files to a S3 bucket using default argument values:\n     ```bash\n      websnap_cli --s3_uploader \n     ```\n\n- Copies files to a S3 bucket and repeat every 1440 minutes (24 hours), file \n  logs are enabled and only 3 backup files are allowed for each config section:\n     ```bash\n      websnap_cli --file_logs --s3_uploader --backup_s3_count 3 --repeat_minutes 1440\n     ```\n\n### Configuration\n\n- The following environment variables are **required**: `ENDPOINT_URL`, \n  `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`\n- A valid `.ini` or `.json `configuration file is **required**.\n- Websnap expects the config to be `config.ini` in the same directory as websnap \n  package is being 
executed from.\n  - However, this can be changed using the `config` function argument (or CLI \n   `--config` option).\n- All keys in tables below are **mandatory**.\n\n#### S3 Configuration Example Files\n\n| Format  | Example Configuration File                                                                                                                                                                   |\n|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `.ini`  | <a href=\"https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/s3_config_template.ini\" target=\"_blank\">src/websnap/config_templates/s3_config_template.ini</a>   |\n| `.json` | <a href=\"https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/s3_config_template.json\" target=\"_blank\">src/websnap/config_templates/s3_config_template.json</a> |\n\n\n#### Environment Variables\n\nSupports setting environment variables in a `.env` file.\n\nExample `.env` file:\n\n```\nENDPOINT_URL=https://dreamycloud.com\nAWS_ACCESS_KEY_ID=1234567abcdefg\nAWS_SECRET_ACCESS_KEY=hijklmn1234567\n```\n\n| Environment Variable    | Description                              |\n|-------------------------|------------------------------------------|\n| `ENDPOINT_URL`          | URL to use for the constructed S3 client |\n| `AWS_ACCESS_KEY_ID`     | AWS access key ID                        |\n| `AWS_SECRET_ACCESS_KEY` | AWS secret access key                    |\n\n#### Other Sections (one per API URL endpoint)\n\n- _Each file retrieved from an API requires its **own config section!**_\n- The section name be anything, it is suggested to have a name that relates to the \n  copied file.\n\nExample S3 config section configuration with key 
prefix:\n\n```\n[resource]\nurl=https://www.example.com/api/resource\nbucket=exampledata\nkey=subdirectory_resource/resource.xml\n```\n\nExample S3 config section configuration without key prefix:\n\n```\n[project]\nurl=https://www.example.com/api/project\nbucket=exampledata\nkey=project.json\n```\n\n| Key      | Value Description                                       |\n|----------|---------------------------------------------------------|\n| `url`    | API URL endpoint that file will be retrieved from       |\n| `bucket` | Bucket that file will be written in                     |\n| `key`    | File name with extension, can optionally include prefix |\n\n\n</details>\n\n\n## Usage: Local Machine\n\n<details>\n  <summary>\n  Click to unfold local machine usage\n  </summary>\n\n### **Copy files retrieved from an API to a local machine.** \n\n### Examples\n\n#### Function\n```python\n# Write files retrieved from an API to local machine using default argument values\nwebsnap()\n\n# Write files retrieved from an API locally and repeats every 60 minutes (1 hour), \n# file logs are enabled\nwebsnap(file_logs=True, repeat_minutes=60)\n```\n\n#### CLI \n\n- Write copied files to local machine using default argument values:\n     ```bash\n      websnap_cli \n     ```\n\n- Write copied files locally and repeats every 60 minutes (1 hour), file logs \n  are enabled:\n     ```bash\n      websnap_cli --file_logs --repeat_minutes 60\n     ```\n\n### Configuration\n\n- A valid `.ini` or `.json` configuration file is **required** for both function and \n  CLI usage.\n- Websnap expects the config to be `config.ini` in the same directory as websnap \n  package is being executed from.\n  - However, this can be changed using the `config` function argument (or CLI \n   `--config` option).\n- Each file that will be retrieved from an API requires its _own section_. 
\n- If the optional `directory` key/value pair is omitted then the file will be written in the directory that the program is executed from.\n\n\n#### Configuration Example Files\n\n| Format  | Example Configuration File                                                                                                                                                             |\n|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `.ini`  | <a href=\"https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/config_template.ini\" target=\"_blank\">src/websnap/config_templates/config_template.ini</a>   |\n| `.json` | <a href=\"https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/src/websnap/config_templates/config_template.json\" target=\"_blank\">src/websnap/config_templates/config_template.json</a> |\n\n\n#### Sections (one per API URL endpoint)\n\nExample local machine configuration section:\n\n```\n[project]\nurl=https://www.example.com/api/project\nfile_name=project.json\ndirectory=projectdata\n```\n\n| Key                      | Value Description                                 |\n|--------------------------|---------------------------------------------------|\n| `url`                    | API URL endpoint that file will be retrieved from |\n| `file_name`              | File name with extension                          |\n| `directory` (_optional_) | Local directory name that file will be written in |\n\n</details>\n\n\n## Logs\n\n<details>\n  <summary>\n  Click to unfold logs\n  </summary>\n\nWebsnap supports optional rotating file logs.\n\n- The following CLI option **must** be used to enable websnap to support rotating file logs: `--file_logs`\n  - In function usage the following argument must be passed to support rotating file \n    logs: `file_logs=True`\n- If log keys are not specified in 
the configuration `[DEFAULT]` section then default values in the table below will be used. \n- `log_when` expects a value used by the logging module's TimedRotatingFileHandler.\n- <a href=\"https://docs.python.org/3/library/logging.handlers.html#timedrotatingfilehandler\" target=\"_blank\">Click here for more information about how to use TimedRotatingFileHandler.</a>\n- With the default values, file logs are rotated once a day and backup log files are never removed. \n\n### Configuration\n\nExample log configuration:\n\n```\n[DEFAULT]\nlog_when=midnight\nlog_interval=1\nlog_backup_count=7\n```\n\n#### `[DEFAULT]` Section\n| Key                | Default | Value Description                                                                                                              |\n|--------------------|---------|--------------------------------------------------------------------------------------------------------------------------------|\n| `log_when`         | `D`     | Specifies type of interval                                                                                                     |\n| `log_interval`     | `1`     | Duration of interval (must be positive integer)                                                                                |\n| `log_backup_count` | `0`     | If nonzero then at most <`log_backup_count`> files will be kept,<br>oldest log file is deleted (must be non-negative integer)  |\n\n\n</details>\n\n\n## Minimum Download Size\n\n<details>\n  <summary>\n  Click to unfold minimum download size\n  </summary>\n\nWebsnap supports optionally specifying the minimum download size (in kilobytes) a \nfile must be to copy it from the configured API URL endpoint.\n\n- **By default the minimum download size is 0 kB.**\n  - Unless specified in the configuration this means that a file of any size can be downloaded by websnap.\n- Configured minimum download size must be a non-negative integer.\n- If the content from the API URL 
endpoint is smaller than the configured size:\n  - An error will be logged and the program continues to the next config section.\n  - If the CLI option `--early_exit` (or function argument `early_exit=True`) is \n    enabled, the program will terminate early.\n\n### Configuration\n\nExample minimum download size configuration:\n\n```\n[DEFAULT]\nmin_size_kb=1\n```\n\n#### `[DEFAULT]` Section\n| Key           | Default | Value Description                                                 |\n|---------------|---------|-------------------------------------------------------------------|\n| `min_size_kb` | `0`     | Minimum download size in kilobytes (must be non-negative integer) |\n\n\n</details>\n\n\n## Author\n\nRebecca Kurup Buchholz\n\n\n## Purpose\n\nThis project was developed to facilitate EnviDat resiliency and support continuous \noperation during server maintenance.\n\n<a href=\"https://www.envidat.ch\" target=\"_blank\">EnviDat</a> is the environmental data \nportal of the Swiss Federal Institute for Forest, Snow and Landscape Research WSL. \n\n\n## License \n\n<a href=\"https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/LICENSE\" target=\"_blank\">MIT License</a>\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Copies files retrieved from an API to a S3 bucket or a local machine.",
    "version": "2.0.0",
    "project_urls": {
        "Changelog": "https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/CHANGELOG.md",
        "Documentation": "https://gitlabext.wsl.ch/EnviDat/websnap/-/blob/main/README.md",
        "Repository": "https://gitlabext.wsl.ch/EnviDat/websnap"
    },
    "split_keywords": [
        "s3",
        " boto3",
        " boto3",
        " api",
        " backup",
        " aws",
        " aws sdk",
        " aws sdk for python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4474e9ce29a3aae959f4299c83d5bc84f30975968990163da55db284da5ca731",
                "md5": "77378289e8b10d2193df0660afd408b3",
                "sha256": "5e0a0a2a2f129d52d6cd70d398358d875ea51daca145625c3ca38ad072752e5c"
            },
            "downloads": -1,
            "filename": "websnap-2.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "77378289e8b10d2193df0660afd408b3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 19575,
            "upload_time": "2024-09-12T09:44:34",
            "upload_time_iso_8601": "2024-09-12T09:44:34.813629Z",
            "url": "https://files.pythonhosted.org/packages/44/74/e9ce29a3aae959f4299c83d5bc84f30975968990163da55db284da5ca731/websnap-2.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "22f42db1ab38aaf5915d1de94a51907ebb7150bc61ae0c64fefc91a5d7baece6",
                "md5": "fb3ddaa603098e6da8f0969e108233d2",
                "sha256": "c779403c9cc3b4e924a265f298c70c88b0d08d68a9323cac27470c5258c19d21"
            },
            "downloads": -1,
            "filename": "websnap-2.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "fb3ddaa603098e6da8f0969e108233d2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 21889,
            "upload_time": "2024-09-12T09:44:36",
            "upload_time_iso_8601": "2024-09-12T09:44:36.287470Z",
            "url": "https://files.pythonhosted.org/packages/22/f4/2db1ab38aaf5915d1de94a51907ebb7150bc61ae0c64fefc91a5d7baece6/websnap-2.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-12 09:44:36",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "websnap"
}
        