artvee-scraper-cli


- Name: artvee-scraper-cli
- Version: 1.0.1
- Home page: https://github.com/zduclos/artvee-scraper-cli
- Summary: Fetch public domain artwork from Artvee
- Upload time: 2024-10-29 04:04:37
- Author: Zach Duclos
- Requires Python: >=3.10
- License: MIT
- Keywords: artvee, artwork, webscraper
- Requirements: artvee-scraper, pytest, mockito, coverage, black, mypy, setuptools, wheel

# artvee-scraper-cli

**artvee-scraper-cli** is an easy-to-use command-line utility for fetching public domain artwork from Artvee (https://www.artvee.com).

- [Artvee Web-scraper CLI](#artvee-scraper-cli)
  - [Installation](#installation)
  - [Synopsis](#synopsis)
  - [Examples](#examples)
  - [Available Commands](#available-commands)

## Installation

Using PyPI
```console
$ python -m pip install artvee-scraper-cli
```
Python 3.10+ is officially supported.

## Synopsis
```console
artvee-scraper-cli <command> [optional arguments] [positional arguments]
```
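
When the tool is driven from a script, the synopsis maps naturally onto an argument list. The following is a minimal sketch, not part of the project: `build_command` is a hypothetical helper name, and it assumes `artvee-scraper-cli` is installed and on the `PATH`. It only assembles the arguments in synopsis order; nothing is executed.

```python
import shlex


def build_command(command, positional, categories=(), worker_threads=None):
    """Assemble an artvee-scraper-cli argument list (hypothetical helper).

    Follows the synopsis order: command, optional arguments, positional arguments.
    """
    args = ["artvee-scraper-cli", command]
    if worker_threads is not None:
        args += ["--worker-threads", str(worker_threads)]
    # --category may be given more than once, one flag per category
    for category in categories:
        args += ["--category", category]
    args += list(positional)
    return args


cmd = build_command(
    "file-json",
    ["/tmp/artvee/downloads"],
    categories=["animals", "drawings"],
    worker_threads=2,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to run the scraper.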

## Examples
View help
```console
$ artvee-scraper-cli -h
usage: artvee-scraper-cli [-h] {log-json,file-json,file-multi} ...

Scrape artwork from https://www.artvee.com

positional arguments:
  {log-json,file-json,file-multi}
    log-json            Artwork is output to the log as a JSON object
    file-json           Artwork is represented as a JSON object and written to a file
    file-multi          Artwork image and metadata are written as separate files

optional arguments:
  -h, --help            show this help message and exit
```

View help for the *file-json* command
```console
$ artvee-scraper-cli file-json -h
usage: artvee-scraper-cli file-json [-h] [-t [1-10]] [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                    [-c {abstract,figurative,landscape,religion,mythology,posters,animals,illustration,still-life,botanical,drawings,asian-art}]
                    [--log-dir LOG_DIR] [--log-max-size [1-10240]] [--log-max-backups [0-100]]
                    [--space-level [2-6]] [--sort-keys] [--overwrite-existing]
                    dir_path

positional arguments:
  dir_path              JSON file output directory

optional arguments:
  -h, --help            show this help message and exit
  -t [1-10], --worker-threads [1-10]
                        Number of worker threads (1-10)
  -l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the application log level
  -c {abstract,figurative,landscape,religion,mythology,posters,animals,illustration,still-life,botanical,drawings,asian-art}, --category {abstract,figurative,landscape,religion,mythology,posters,animals,illustration,still-life,botanical,drawings,asian-art}
                        Category of artwork to scrape
  --space-level [2-6]   Enable pretty-printing; number of spaces to indent (2-6)
  --sort-keys           Sort JSON keys in alphabetical order
  --overwrite-existing  Overwrite existing files

optional log file arguments:
  --log-dir LOG_DIR     Log file output directory
  --log-max-size [1-10240]
                        Maximum log file size in MB (1-10,240)
  --log-max-backups [0-100]
                        Maximum number of log files to keep (0-100)
```

Download artwork from *artvee.com* and save each as an individual file (JSON format) in the directory *~/artvee/downloads*
```console
$ artvee-scraper-cli file-json ~/artvee/downloads
```
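
Once the command has run, each file in the output directory holds one artwork as a JSON object. The following is a minimal sketch for summarizing such a directory, assuming the field names that appear in this README's sample output (`title`, `artist`, `date`). `summarize` is a hypothetical helper, not part of the CLI, and the demonstration uses a throwaway directory rather than real downloads.

```python
import json
import tempfile
from pathlib import Path


def summarize(dir_path):
    """Return (title, artist, date) for every *.json file in dir_path."""
    rows = []
    for path in sorted(Path(dir_path).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            artwork = json.load(fh)
        # .get() tolerates records that lack a field
        rows.append((artwork.get("title"), artwork.get("artist"), artwork.get("date")))
    return rows


# Demonstration with a temporary directory and one record shaped like the sample output
with tempfile.TemporaryDirectory() as tmp:
    sample = {"title": "Komposition", "artist": "Otto Freundlich", "date": "1938"}
    (Path(tmp) / "komposition-2.json").write_text(json.dumps(sample), encoding="utf-8")
    rows = summarize(tmp)
    print(rows)
```

For a real run, pass the directory given to *file-json*, for example `summarize(Path("~/artvee/downloads").expanduser())`.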

## Available Commands
- [log-json](#log-json)
- [file-json](#file-json)
- [file-multi](#file-multi)

## log-json
Download artwork and output each to the log as a JSON object.
Note: This command is intended for development and testing; dumping the data to the log is typically not desirable.
```console
$ artvee-scraper-cli log-json [optional arguments]
```

###### Optional arguments
> `-h` | `--help` (boolean)
>> Display help message.

> `-t` | `--worker-threads` (integer)
>> The number of worker threads used for processing. Range of values is [1-10]. The default value is *3*.

> `-l` | `--log-level` (string)
>> Application log level. One of: **DEBUG, INFO, WARNING, ERROR, CRITICAL**. The default value is *INFO*.

>`-c` | `--category` (string)
>> Category of artwork to fetch. One of: **abstract, figurative, landscape, religion, mythology, posters, animals, illustration, still-life, botanical, drawings, asian-art**. May be repeated to specify multiple categories (*-c animals -c drawings*). The default is *ALL* categories.

###### Optional log file arguments
> `--log-dir` (string)
>> Path to existing directory used to store *artvee_scraper.log* log files. Disabled by default.

> `--log-max-size` (integer)
>> Maximum size in MB the log file should reach before triggering a rollover. Only applies if *--log-dir* has been specified. Range of values is [1-10240]. The default value is *1024*MB (1GB).

> `--log-max-backups` (integer)
>> Maximum number of log file archives to keep. Only applies if *--log-dir* has been specified. The actively written file is *artvee_scraper.log*. Backup files will have an incrementing numerical suffix; *artvee_scraper.log.1 ... artvee_scraper.log.N*. If this value is zero, rollovers will be disabled. Range of values is [0-100]. The default value is *10*.

###### Optional writer arguments
> `--space-level` (integer)
>> Pretty print JSON; number of spaces to indent. Range of values is [2-6]. Disabled by default.

> `--sort-keys` (boolean)
>> Sort JSON keys in alphabetical order. Disabled by default.

> `--include-image` (boolean)
>> Include the image data in the output. Warning: this produces a very large amount of output. Disabled by default.


###### Basic Example
```console
$ artvee-scraper-cli log-json
```
###### Output:
```console
  ...
2038-01-19 18:34:38.941 INFO [ThreadPoolExecutor-0_0] runner.<lambda>(79) | Processing 'Komposition' by Otto Freundlich
2038-01-19 18:34:38.943 INFO [ThreadPoolExecutor-0_0] log_writer.write(45) | {"url": "https://artvee.com/dl/komposition-2/", "resource": "komposition-2", "title": "Komposition", "category": "Abstract", "artist": "Otto Freundlich", "date": "1938", "origin": "German, 1878-1943", "image": {"source_url": "https://mdl.artvee.com/sdl/102399absdl.jpg", "width": 1423, "height": 1800, "file_size": 1.1, "file_size_unit": "MB", "format_name": "jpg"}}
  ...
```

###### Advanced Example
```console
$ artvee-scraper-cli log-json --worker-threads 2 --log-level DEBUG --category abstract --log-dir /var/log/artvee --log-max-size 2048 --log-max-backups 10 --space-level 2 --sort-keys --include-image
```
###### Output:
```console
$ cat /var/log/artvee/artvee_scraper_cli.log
  ...
2038-01-19 18:40:11.772 DEBUG [ThreadPoolExecutor-0_0] artvee_client.get_image(132) | Retrieving image; url=https://mdl.artvee.com/sdl/105042absdl.jpg
2038-01-19 18:40:11.772 DEBUG [ThreadPoolExecutor-0_0] connectionpool._new_conn(1051) | Starting new HTTPS connection (1): mdl.artvee.com:443
2038-01-19 18:40:11.853 DEBUG [ThreadPoolExecutor-0_0] connectionpool._make_request(546) | https://mdl.artvee.com:443 "GET /sdl/105042absdl.jpg HTTP/11" 200 2011451
2038-01-19 18:40:11.941 INFO [ThreadPoolExecutor-0_0] runner.<lambda>(79) | Processing 'Gare' by Joaquín Torres-García
2038-01-19 18:40:11.967 INFO [ThreadPoolExecutor-0_0] log_writer.write(45) | {
  "artist": "Joaquín Torres-García",
  "category": "Abstract",
  "date": "1928",
  "image": {
    "file_size": 1.92,
    "file_size_unit": "MB",
    "format_name": "jpg",
    "height": 1259,
    "raw": "/9j/4AAQSkZJRgABA ... o4xSSSVkumh//9k=",
    "source_url": "https://mdl.artvee.com/sdl/105042absdl.jpg",
    "width": 1800
  },
  "origin": "Uruguayan, 1874-1949",
  "resource": "gare",
  "title": "Gare",
  "url": "https://artvee.com/dl/gare/"
}
  ...
```
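
The `--log-max-size` and `--log-max-backups` semantics described above (numbered `.1 ... .N` backups, rollover disabled when the backup count is zero) match the behavior of `logging.handlers.RotatingFileHandler` in the Python standard library. Whether the CLI uses that handler internally is an assumption. The sketch below only illustrates the rollover behavior, with a deliberately tiny size limit in bytes and a hypothetical logger name.

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path

log_dir = Path(tempfile.mkdtemp())  # stand-in for --log-dir
handler = RotatingFileHandler(
    log_dir / "artvee_scraper.log",
    maxBytes=200,    # stand-in for --log-max-size (bytes here, to force rollovers quickly)
    backupCount=2,   # stand-in for --log-max-backups
)
logger = logging.getLogger("rotation_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

for i in range(50):
    logger.info("Processing artwork number %d", i)
handler.close()

# The active file plus at most backupCount numbered archives remain
names = sorted(p.name for p in log_dir.iterdir())
print(names)
```

With `backupCount=0` the handler never rolls over and the single log file grows without limit, which corresponds to the documented behavior of `--log-max-backups 0`.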

## file-json
Download artwork and write each to the filesystem. Each artwork is stored as a JSON object.
```console
$ artvee-scraper-cli file-json [optional arguments] <dir_path>
```

###### Positional arguments
> `dir_path` (string) Position *0*.
>> Path to existing directory used to store output files.

###### Optional arguments
> `-h` | `--help` (boolean)
>> Display help message.

> `-t` | `--worker-threads` (integer)
>> The number of worker threads used for processing. Range of values is [1-10]. The default value is *3*.

> `-l` | `--log-level` (string)
>> Application log level. One of: **DEBUG, INFO, WARNING, ERROR, CRITICAL**. The default value is *INFO*.

>`-c` | `--category` (string)
>> Category of artwork to fetch. One of: **abstract, figurative, landscape, religion, mythology, posters, animals, illustration, still-life, botanical, drawings, asian-art**. May be repeated to specify multiple categories (*-c animals -c drawings*). The default is *ALL* categories.

###### Optional log file arguments
> `--log-dir` (string)
>> Path to existing directory used to store *artvee_scraper.log* log files. Disabled by default.

> `--log-max-size` (integer)
>> Maximum size in MB the log file should reach before triggering a rollover. Only enabled if *--log-dir* has been specified. Range of values is [1-10240]. The default value is *1024*MB (1GB).

> `--log-max-backups` (integer)
>> Maximum number of log file archives to keep. Only enabled if *--log-dir* has been specified. The actively written file is *artvee_scraper.log*. Backup files will have an incrementing numerical suffix; *artvee_scraper.log.1 ... artvee_scraper.log.N*. If this value is zero, rollovers will be disabled. Range of values is [0-100]. The default value is *10*.

###### Optional writer arguments
> `--space-level` (integer)
>> Pretty print JSON; number of spaces to indent. Range of values is [2-6]. Disabled by default.

> `--sort-keys` (boolean)
>> Sort JSON keys in alphabetical order. Disabled by default.

> `--overwrite-existing` (boolean)
>> Allow existing duplicate files to be overwritten. Disabled by default.

###### Basic Example
```console
$ artvee-scraper-cli file-json ~/artvee/downloads
```
###### Output:
```console
$ cat ~/artvee/downloads/woman-by-the-window.json
{"url": "https://artvee.com/dl/woman-by-the-window/", "resource": "woman-by-the-window", "title": "Woman by the window", "category": "Abstract", "artist": "Mikuláš Galanda", "date": "1928", "origin": "Slovak, 1895 – 1938", "image": {"source_url": "https://mdl.artvee.com/sdl/101518absdl.jpg", "width": 1317, "height": 1800, "file_size": 2.48, "file_size_unit": "MB", "raw": "/9j/4AAQSkZJRgAB ... aK1lZLTp7i/Vn//Z", "format_name": "jpg"}}
```
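
The `raw` value in the output above begins with `/9j/`, which is how Base64-encoded JPEG data starts, so the field appears to hold the image bytes in Base64. Under that assumption, a minimal sketch for recovering the image from a *file-json* record follows. `extract_image` is a hypothetical helper, not part of the CLI.

```python
import base64
import json
from pathlib import Path


def extract_image(json_path, out_dir):
    """Decode image.raw from a file-json record and write it to out_dir.

    The output file is named <resource>.<format_name>, e.g. woman-by-the-window.jpg.
    """
    artwork = json.loads(Path(json_path).read_text(encoding="utf-8"))
    image = artwork["image"]
    data = base64.b64decode(image["raw"])  # assumption: raw is standard Base64
    out_path = Path(out_dir) / f"{artwork['resource']}.{image['format_name']}"
    out_path.write_bytes(data)
    return out_path
```

If the image is needed as a separate file from the start, the *file-multi* command writes metadata and image separately and avoids this decoding step.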

###### Advanced Example
```console
$ artvee-scraper-cli file-json ~/artvee/downloads --worker-threads 1 --log-level INFO --category mythology --log-dir /var/log/artvee --log-max-size 512 --log-max-backups 10 --space-level 4 --sort-keys --overwrite-existing
```
###### Output:
```console
$ cat ~/artvee/downloads/the-judgment-of-paris-3.json
{
    "artist": "Joachim Wtewael",
    "category": "Mythology",
    "date": "1602",
    "image": {
        "file_size": 7.42,
        "file_size_unit": "MB",
        "format_name": "jpg",
        "height": 2138,
        "raw": "/9j/4R8FRXhpZgAASUkq ... /pNfu/+89V/wB46//Z",
        "source_url": "https://mdl.artvee.com/sdl/400408mtsdl.jpg",
        "width": 2833
    },
    "origin": "Dutch, 1566 - 1638",
    "resource": "the-judgment-of-paris-3",
    "title": "The Judgment of Paris",
    "url": "https://artvee.com/dl/the-judgment-of-paris-3/"
}
```
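
The difference between the compact and pretty-printed outputs can be reproduced with `json.dumps` from the Python standard library. That the CLI maps `--space-level` and `--sort-keys` onto the `indent` and `sort_keys` parameters is an assumption. The sketch only shows how the two forms differ, using a few fields from the record above.

```python
import json

artwork = {
    "url": "https://artvee.com/dl/the-judgment-of-paris-3/",
    "resource": "the-judgment-of-paris-3",
    "title": "The Judgment of Paris",
    "artist": "Joachim Wtewael",
}

# Default: a single line, keys in insertion order
compact = json.dumps(artwork)

# Comparable to --space-level 4 --sort-keys: indented, keys sorted alphabetically
pretty = json.dumps(artwork, indent=4, sort_keys=True)

print(compact)
print(pretty)
```

Both strings parse back to the same object, so the choice only affects readability and file size.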

## file-multi
Download artwork and write each to the filesystem. Each artwork is stored as two files: metadata (JSON) and image (JPG).
```console
$ artvee-scraper-cli file-multi [optional arguments] <metadata_dir_path> <image_dir_path>
```

###### Positional arguments
> `metadata_dir_path` (string) Position *0*.
>> Path to existing directory used to store output metadata files.

> `image_dir_path` (string) Position *1*.
>> Path to existing directory used to store output image files.

###### Optional arguments
> `-h` | `--help` (boolean)
>> Display help message.

> `-t` | `--worker-threads` (integer)
>> The number of worker threads used for processing. Range of values is [1-10]. The default value is *3*.

> `-l` | `--log-level` (string)
>> Application log level. One of: **DEBUG, INFO, WARNING, ERROR, CRITICAL**. The default value is *INFO*.

> `-c` | `--category` (string)
>> Category of artwork to fetch. One of: **abstract, figurative, landscape, religion, mythology, posters, animals, illustration, still-life, botanical, drawings, asian-art**. May be repeated to specify multiple categories (*-c animals -c drawings*). The default is *ALL* categories.

###### Optional log file arguments
> `--log-dir` (string)
>> Path to existing directory used to store *artvee_scraper.log* log files. Disabled by default.

> `--log-max-size` (integer)
>> Maximum size in MB the log file should reach before triggering a rollover. Only enabled if *--log-dir* has been specified. Range of values is [1-10240]. The default value is *1024*MB (1GB).

> `--log-max-backups` (integer)
>> Maximum number of log file archives to keep. Only enabled if *--log-dir* has been specified. The actively written file is *artvee_scraper.log*. Backup files will have an incrementing numerical suffix; *artvee_scraper.log.1 ... artvee_scraper.log.N*. If this value is zero, rollovers will be disabled. Range of values is [0-100]. The default value is *10*.

###### Optional writer arguments
> `--space-level` (integer)
>> Pretty print JSON; number of spaces to indent. Range of values is [2-6]. Disabled by default.

> `--sort-keys` (boolean)
>> Sort JSON keys in alphabetical order. Disabled by default.

> `--overwrite-existing` (boolean)
>> Allow existing duplicate files to be overwritten. Disabled by default.

###### Basic Example
```console
$ artvee-scraper-cli file-multi ~/artvee/downloads/metadata ~/artvee/downloads/images
```
###### Output:
```console
$ cat ~/artvee/downloads/metadata/the-pet-pig.json
{"url": "https://artvee.com/dl/the-pet-pig/", "resource": "the-pet-pig", "title": "The pet pig", "category": "Abstract", "artist": "Edvard Munch", "date": "1908-1910", "origin": "Norwegian, 1863 - 1944", "image": {"source_url": "https://mdl.artvee.com/sdl/103755absdl.jpg", "width": 1800, "height": 1320, "file_size": 1.67, "file_size_unit": "MB", "format_name": "jpg"}}
$ hexdump -C ~/artvee/downloads/images/the-pet-pig.jpg
00000000  ff d8 ff e0 00 10 4a 46  49 46 00 01 01 01 01 2c  |......JFIF.....,|
  ...
001aa430  40 2b 9c 02 8a 2b 48 b6  d6 bd ff 00 c8 0f ff d9  |@+...+H.........|
001aa440
```

###### Advanced Example
```console
$ artvee-scraper-cli file-multi --worker-threads 1 --log-level INFO --category asian-art --log-dir /var/log/artvee --log-max-size 512 --log-max-backups 10 --space-level 2 --sort-keys --overwrite-existing ~/artvee/downloads/metadata ~/artvee/downloads/images
```
###### Output:
```console
$ cat ~/artvee/downloads/metadata/two-ronin-looking-into-yoshiwara.json
{
  "artist": "Andō Hiroshige",
  "category": "Asian-art",
  "date": "19th century",
  "image": {
    "file_size": 2.29,
    "file_size_unit": "MB",
    "format_name": "jpg",
    "height": 1179,
    "source_url": "https://mdl.artvee.com/sdl/52015jpsdl.jpg",
    "width": 1800
  },
  "origin": "Japanese, 1797 – 1858",
  "resource": "two-ronin-looking-into-yoshiwara",
  "title": "Two Ronin Looking into Yoshiwara",
  "url": "https://artvee.com/dl/two-ronin-looking-into-yoshiwara/"
}
$ hexdump -C ~/artvee/downloads/images/two-ronin-looking-into-yoshiwara.jpg
00000000  ff d8 ff e0 00 10 4a 46  49 46 00 01 01 01 01 2c  |......JFIF.....,|
  ...
002499c0  a2 b4 fe bf ad cc 4f ff  d9                       |......O..|
002499c9
```
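
The hexdumps above show the two markers that bracket a JPEG file: it starts with `ff d8` (start of image) and ends with `ff d9` (end of image). A quick sanity check for downloaded images can test for both. This is a minimal sketch with a hypothetical helper name. It only flags truncated or non-JPEG files, it does not validate the image content, and a valid JPEG with trailing bytes after the end marker would be reported as incomplete.

```python
from pathlib import Path

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def looks_like_complete_jpeg(path):
    """Return True if the file starts with SOI and ends with EOI."""
    data = Path(path).read_bytes()
    return len(data) >= 4 and data.startswith(SOI) and data.endswith(EOI)
```

Applied to the image directory passed to *file-multi*, this can list files worth downloading again, for example `[p for p in Path(image_dir).glob("*.jpg") if not looks_like_complete_jpeg(p)]`, where `image_dir` is whatever directory was used.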

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/zduclos/artvee-scraper-cli",
    "name": "artvee-scraper-cli",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "artvee, artwork, webscraper",
    "author": "Zach Duclos",
    "author_email": "zduclos.github@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/b2/58/7bcfcc141ba8eecb22129e26bdf86bd02796d83c8c0a55821d279c9e9203/artvee-scraper-cli-1.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Fetch public domain artwork from Artvee",
    "version": "1.0.1",
    "project_urls": {
        "Bug Reports": "https://github.com/zduclos/artvee-scraper-cli/issues",
        "Homepage": "https://github.com/zduclos/artvee-scraper-cli",
        "Source": "https://github.com/zduclos/artvee-scraper-cli"
    },
    "split_keywords": [
        "artvee",
        " artwork",
        " webscraper"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2cec3e6af44d51467a1fe5971aea72412ed56dca3b1495d10ab8b3b0211aa205",
                "md5": "22c649fc9c3003e96d9bd8d7d9d1137f",
                "sha256": "111d3a04cc70097e5b21c2fd85f8564cc5f97ab96b05fadf2b6126a0b9dbb7ce"
            },
            "downloads": -1,
            "filename": "artvee_scraper_cli-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "22c649fc9c3003e96d9bd8d7d9d1137f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 17667,
            "upload_time": "2024-10-29T04:04:35",
            "upload_time_iso_8601": "2024-10-29T04:04:35.836930Z",
            "url": "https://files.pythonhosted.org/packages/2c/ec/3e6af44d51467a1fe5971aea72412ed56dca3b1495d10ab8b3b0211aa205/artvee_scraper_cli-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b2587bcfcc141ba8eecb22129e26bdf86bd02796d83c8c0a55821d279c9e9203",
                "md5": "fe3c91e5eee5651c6e847a85bf238917",
                "sha256": "f8cee9ebdfc91843b24334365701dc713dcfbca0ba93f296cbed0658938d8e65"
            },
            "downloads": -1,
            "filename": "artvee-scraper-cli-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "fe3c91e5eee5651c6e847a85bf238917",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 16718,
            "upload_time": "2024-10-29T04:04:37",
            "upload_time_iso_8601": "2024-10-29T04:04:37.166583Z",
            "url": "https://files.pythonhosted.org/packages/b2/58/7bcfcc141ba8eecb22129e26bdf86bd02796d83c8c0a55821d279c9e9203/artvee-scraper-cli-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-29 04:04:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "zduclos",
    "github_project": "artvee-scraper-cli",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "artvee-scraper",
            "specs": [
                [
                    "~=",
                    "4.0.4"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    "~=",
                    "8.3.3"
                ]
            ]
        },
        {
            "name": "mockito",
            "specs": [
                [
                    "~=",
                    "1.5.1"
                ]
            ]
        },
        {
            "name": "coverage",
            "specs": [
                [
                    "~=",
                    "7.6.1"
                ]
            ]
        },
        {
            "name": "black",
            "specs": [
                [
                    "~=",
                    "24.8.0"
                ]
            ]
        },
        {
            "name": "mypy",
            "specs": [
                [
                    "~=",
                    "1.11.2"
                ]
            ]
        },
        {
            "name": "setuptools",
            "specs": [
                [
                    "~=",
                    "65.5.1"
                ]
            ]
        },
        {
            "name": "wheel",
            "specs": [
                [
                    "~=",
                    "0.38.2"
                ]
            ]
        }
    ],
    "lcname": "artvee-scraper-cli"
}
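
The `digests` entries in the `urls` list above can be used to check a downloaded archive before installing it. The following is a minimal sketch, assuming the wheel has been saved to the current directory under its published filename. The `sha256_of` and `matches_expected` helpers are hypothetical names for illustration and are not part of artvee-scraper-cli.

```python
import hashlib
from pathlib import Path

# sha256 value copied from the "digests" entry for the wheel in the metadata above.
EXPECTED_SHA256 = "111d3a04cc70097e5b21c2fd85f8564cc5f97ab96b05fadf2b6126a0b9dbb7ce"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks (hypothetical helper)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the published one, ignoring letter case."""
    return sha256_of(path).lower() == expected.lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the wheel was saved.
    wheel = Path("artvee_scraper_cli-1.0.1-py3-none-any.whl")
    if wheel.exists():
        print("digest OK" if matches_expected(wheel) else "digest MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

The same check applies to the sdist by substituting its filename and `sha256` value. pip's hash-checking mode (`--require-hashes`) offers a comparable safeguard at install time.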
        