💥 New in 0.2.0: Set record_list_level and record_level, index_key, datetime_key, and timestamp_key with jsonpath.
# tap-rest-api
A configurable REST API singer.io tap.
## What is it?
tap-rest-api is a [Singer](https://singer.io) tap that produces JSON-formatted
data following the [Singer spec](https://github.com/singer-io/getting-started).
This tap:
- Pulls JSON records from a REST API.
- Automatically infers the schema and generates the JSON schema and Singer catalog
files.
- Incrementally pulls data based on the input state (singer.io bookmark specification).
The stdout from this program is intended to be consumed by a singer.io target program, as in:
```
tap-rest-api | target-csv
```
## How to use it
Install:
```
pip install tap-rest-api
```
The following example is created using [USGS Earthquake Events data](https://earthquake.usgs.gov/fdsnws/event/1/).
`curl "https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2014-01-01&endtime=2014-01-02&minmagnitude=1"`
```
{
"type": "FeatureCollection",
"features": [
{
"geometry": {
"type": "Point",
"coordinates": [
-116.7776667,
33.6633333,
11.008
]
},
"type": "Feature",
"properties": {
"rms": 0.09,
"code": "11408890",
"cdi": null,
"sources": ",ci,",
"nst": 39,
"tz": -480,
"title": "M 1.3 - 10km SSW of Idyllwild, CA",
...
"mag": 1.29,
...
"place": "10km SSW of Idyllwild, CA",
"time": 1388620296020,
"mmi": null
},
"id": "ci11408890"
},
...
]
}
```
[examples/usgs/sample_records.json](https://raw.githubusercontent.com/anelendata/tap-rest-api/master/examples/usgs/sample_records.json)
In the following steps, we will extract the `properties` section of
each record of type `Feature` as a Singer record.
### Step 1: Default spec
Anything defined here can be set in the tap configuration file or passed as a
command-line argument:
- [default_spec.json](https://github.com/anelendata/tap-rest-api/blob/master/tap_rest_api/default_spec.json)
### Step 2: [Optional] Create a custom spec for config file:
If you would like to define more configuration variables, create a spec file.
Here is an
[example](https://github.com/anelendata/tap-rest-api/blob/master/examples/usgs/custom_spec.json):
```
{
"args": {
"min_magnitude":
{
"type": "integer",
"default": "0",
"help": "Filter based on the minimum magnitude."
}
}
}
```
Anything you define here overwrites
[default_spec.json](https://github.com/anelendata/tap-rest-api/blob/master/tap_rest_api/default_spec.json).
### Step 3. Create Config file:
**Please note that jsonpath specification is supported in version 0.2.0 and later only.**
Now create a config file. Note the difference between the spec file and the config file.
The role of the spec file is to create or alter the config specs, and the role of
the config file is to provide the values to the config variables. When a value
is not specified in the config file, the default value defined in the spec
file is used.
[Example](https://github.com/anelendata/tap-rest-api/tree/master/examples/usgs/config/tap_config.json):
```
{
"url":"https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime={start_datetime}&endtime={end_datetime}&minmagnitude={min_magnitude}&limit={items_per_page}&offset={current_offset}&eventtype=earthquake&orderby=time-asc",
"record_list_level": "features[*]",
"timestamp_key": "properties.time",
"schema": "earthquakes",
"items_per_page": 100,
"offset_start": 1,
"auth_method": "no_auth",
"min_magnitude": 1
}
```
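The relationship between the spec file and the config file described above can be sketched in a few lines of Python. This is only an illustration of the fallback rule, not the tap's actual code: `resolve_config` is a hypothetical helper, and the `spec` and `config` dictionaries are cut-down copies of the examples in this README.

```python
# Hypothetical, cut-down copies of the spec and config examples above.
spec = {
    "args": {
        "min_magnitude": {
            "type": "integer",
            "default": "0",
            "help": "Filter based on the minimum magnitude.",
        }
    }
}
config = {"items_per_page": 100}


def resolve_config(spec: dict, config: dict) -> dict:
    """Start from the defaults declared in the spec, then let config values win."""
    resolved = {name: arg.get("default") for name, arg in spec["args"].items()}
    resolved.update(config)
    return resolved


# min_magnitude is missing from the config, so the spec default is used.
print(resolve_config(spec, config))
# Once the config provides a value, it replaces the default.
print(resolve_config(spec, {**config, "min_magnitude": 1}))
```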
Below are some key concepts in the configuration file.
#### Parametric URL
You can use the `{<config_variable_name>}` notation to insert a value specified in the config into the URL.
In addition to the config variables listed in
[default_spec.json](https://github.com/anelendata/tap-rest-api/blob/master/tap_rest_api/default_spec.json)
and the custom spec file, the URL can also contain parameters from the following run-time variables:
- current_offset: Offset by the number of records to skip
- current_page: The current page if the endpoint supports paging
- last_update: The last retrieved value of the column specified by index_key, timestamp_key, or datetime_key
(See next section)
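As a rough mental model, the substitution behaves like Python's `str.format` applied to the URL with the config values and the run-time variables merged together. The sketch below assumes exactly that; `build_url` is a hypothetical helper and the values passed in are made up for illustration.

```python
# A shortened version of the USGS URL from the config example above.
url_template = (
    "https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson"
    "&starttime={start_datetime}&endtime={end_datetime}"
    "&minmagnitude={min_magnitude}&limit={items_per_page}&offset={current_offset}"
)


def build_url(template: str, config: dict, runtime: dict) -> str:
    """Fill the placeholders from config values plus run-time variables."""
    params = {**config, **runtime}  # run-time variables win on a name clash
    return template.format(**params)


config = {
    "start_datetime": "2014-01-01",
    "end_datetime": "2014-01-02",
    "min_magnitude": 1,
    "items_per_page": 100,
}
# current_offset changes from one request to the next while paging.
print(build_url(url_template, config, {"current_offset": 1}))
print(build_url(url_template, config, {"current_offset": 101}))
```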
#### timestamp_key, datetime_key, index_key
If you want to use a timestamp, datetime, or index in the parameterized URL, or
want to use a field of one of those types as a bookmark, one of timestamp_key,
datetime_key, or index_key must be set to indicate which field in the record
corresponds to that data type.
- timestamp_key: POSIX timestamp
- datetime_key: ISO 8601 formatted datetime (it can be truncated to a date, etc.)
It also works when the character between the date and time components is " " instead of "T".
- index_key: A sequential index (integer or string)
In the USGS example, the individual record contains the top-level objects `properties`
and `geometry`. The timestamp key is `time`, defined under `properties`, so the config
value `timestamp_key` is set to `properties.time`, following the
[jsonpath](https://goessner.net/articles/JsonPath/) specification.
When you specify timestamp_key, datetime_key, or index_key in the config,
you also need to set start_timestamp, start_datetime, or start_index in
config or as a command-line argument.
Optionally, you can set end_timestamp, end_datetime, or end_index
so that the process stops once that threshold is encountered, assuming the data
is sorted by the field.
For human convenience, start/end_datetime (more human readable) is also looked
up when timestamp_key is set but start/end_timestamp is not set.
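To make the bookmark key and the end threshold concrete, here is a minimal sketch using only the standard library. It is not the tap's implementation: `get_by_path` handles only simple dotted names (real jsonpath supports much more), `take_until` is a hypothetical helper, the record values are made up, and treating the end value as exclusive is an assumption.

```python
from functools import reduce


def get_by_path(record: dict, path: str):
    """Resolve a simple dotted path such as 'properties.time'."""
    return reduce(lambda node, key: node[key], path.split("."), record)


def take_until(records, key_path: str, end_value):
    """Yield records until the bookmark field reaches end_value.

    Assumes the records are already sorted by that field, as the text requires.
    """
    for record in records:
        if get_by_path(record, key_path) >= end_value:
            break
        yield record


# Made-up records shaped like the USGS features, sorted by properties.time.
records = [
    {"id": "a", "properties": {"time": 10}},
    {"id": "b", "properties": {"time": 20}},
    {"id": "c", "properties": {"time": 30}},
]
kept = list(take_until(records, "properties.time", end_value=30))
print([r["id"] for r in kept])
```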
#### Record list level and record level
- record_list_level:
Some APIs wrap a set of records under a property. Others respond with newline-separated JSON objects.
For the former, we need to specify a key so the tap can find the record level.
The USGS earthquake response is an example of a single JSON object. The records are listed under
the `features` object, so the config value `record_list_level` is set to the jsonpath `features[*]`.
- record_level:
Under the individual record, there may be another layer of properties that separates
the data from the metadata, and we may only be interested in the former. If this is the case,
we can specify record_level. In the USGS example, we can ignore the `geometry` object and output
only the content of the `properties` object. Set the `record_level` config value to a jsonpath
to achieve this:
```
{
"url":"https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime={start_datetime}&endtime={end_datetime}&minmagnitude={min_magnitude}&limit={items_per_page}&offset={current_offset}&eventtype=earthquake&orderby=time-asc",
"record_list_level": "features[*]",
"record_level": "properties",
"timestamp_key": "time",
"schema": "earthquakes",
"items_per_page": 100,
"offset_start": 1,
"auth_method": "no_auth",
"min_magnitude": 1
}
```
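The effect of the two settings can be pictured with plain dictionary access. The sketch below is an illustration only: `extract_records` is a hypothetical helper that takes bare key names instead of jsonpath expressions, and the response is a trimmed copy of the USGS sample above.

```python
# A trimmed copy of the USGS sample response shown earlier.
response = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-116.7776667, 33.6633333, 11.008]},
            "properties": {"mag": 1.29, "time": 1388620296020},
            "id": "ci11408890",
        }
    ],
}


def extract_records(response: dict, list_key: str, record_key=None) -> list:
    """Pick the list of records, then optionally descend one level into each."""
    records = response[list_key]  # plays the role of record_list_level "features[*]"
    if record_key is not None:
        # Plays the role of record_level "properties".
        records = [record[record_key] for record in records]
    return records


# Without record_level, each record still has geometry, properties, and id.
print(sorted(extract_records(response, "features")[0]))
# With record_level, only the content of properties remains.
print(extract_records(response, "features", "properties"))
```

Note that once `record_level` is set, `timestamp_key` is given relative to the new record root, which is why the config above uses `time` rather than `properties.time`.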
### Step 4. Create schema and catalog files
```
$ tap-rest-api custom_spec.json --config config/tap_config.json --schema_dir ./schema --catalog_dir ./catalog --start_datetime="2020-08-06" --infer_schema
```
The schema and catalog files are created under the schema and catalog directories, respectively.
Note:
- If no customization is needed, you can omit the spec file (custom_spec.json).
- `start_datetime` and `end_datetime` are copied to `start_timestamp` and `end_timestamp`.
- `end_timestamp` and `end_datetime` are automatically set as UTC now if not present in the config file or command-line argument.
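The notes above can be read as the following conversion. This is a sketch rather than the tap's own code: `to_timestamp` and `resolve_range` are hypothetical helpers, and interpreting a datetime without a timezone as UTC is an assumption.

```python
from datetime import datetime, timezone


def to_timestamp(value: str) -> float:
    """Convert an ISO 8601 date or datetime string to a POSIX timestamp."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: values without a timezone are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()


def resolve_range(start_datetime: str, end_datetime=None) -> tuple:
    """Return (start_timestamp, end_timestamp), defaulting the end to UTC now."""
    if end_datetime is None:
        end = datetime.now(timezone.utc).timestamp()
    else:
        end = to_timestamp(end_datetime)
    return to_timestamp(start_datetime), end


print(to_timestamp("2020-08-06"))  # 1596672000.0
print(resolve_range("2020-08-06", "2020-08-07"))
```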
### Step 5. Run the tap
```
$ tap-rest-api ./custom_spec.json --config config/tap_config.json --start_datetime="2020-08-06" --catalog ./catalog/earthquakes.json
```
## Authentication
The example above does not require login. tap-rest-api currently supports
basic auth. If this is needed, add something like the following to the config file:
```
{
"auth_method": "basic",
"username": "my_username",
"password": "my_password",
...
}
```
Or add those on the command line:
```
tap-rest-api config/custom_spec.json --config config/tap_config.json --schema_dir ./config/schema --catalog ./config/catalog/some_catalog.json --start_datetime="2020-08-06" --username my_username --password my_password --auth_method basic
```
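For readers unfamiliar with basic auth, the username and password end up in an `Authorization` request header. The sketch below shows what that header looks like under the HTTP Basic scheme; `basic_auth_header` is a hypothetical helper, and how tap-rest-api builds the request internally is not shown here.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header used by HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


header = basic_auth_header("my_username", "my_password")
print(header)
```

Because the credentials are only base64-encoded, not encrypted, basic auth should be used over HTTPS.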
## Custom http-headers
In addition to the authentication method, you can specify the HTTP headers
in the config file:
Example:
```
...
"http_headers":
{
"User-Agent": "Mozilla/5.0 (Macintosh; scitylana.singer.io) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36",
"Content-type": "application/json",
"Authorization": "Bearer <some-key>"
},
...
```
Here is the default value:
```
{
"User-Agent": "Mozilla/5.0 (Macintosh; scitylana.singer.io) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36",
"Content-type": "application/json"
}
```
When you define the http_headers config value, the default value is discarded,
so you should redefine "User-Agent" and "Content-type" when you need them.
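In other words, the headers are replaced as a whole rather than merged key by key. A minimal sketch of that rule, with `effective_headers` as a hypothetical helper and the default copied from above:

```python
# The default value listed above.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; scitylana.singer.io) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36",
    "Content-type": "application/json",
}


def effective_headers(config: dict) -> dict:
    """Use http_headers from the config as is; fall back to the default only if absent."""
    if "http_headers" in config:
        return dict(config["http_headers"])  # no merge with the default
    return dict(DEFAULT_HEADERS)


# Only Authorization is sent: User-Agent and Content-type are gone.
print(effective_headers({"http_headers": {"Authorization": "Bearer <some-key>"}}))
# No http_headers in the config: the default applies.
print(effective_headers({}))
```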
## Multiple streams
tap-rest-api supports settings for multiple streams.
- `url` is set as a string and serves as the default.
- `urls` is a dictionary that overrides the default `url` for the stream ID given as the dictionary key.
- `{stream}` can be used as a parameter in the URL.
- `timestamp_key`, `datetime_key`, and `index_key` can each be set either as a string or as a dictionary. If a stream ID exists as a key in one of the dictionaries, that entry is used. If not, the key defaults to one defined as a string, with priority timestamp_key > datetime_key > index_key.
- Active streams must be defined as comma-separated stream IDs, either in the config file or on the command line with `--stream <streams>`.
- Streams must be registered in the catalog file with `selected: true` ([example](https://github.com/anelendata/tap-rest-api/blob/master/examples/usgs/catalog/earthquakes.json)).
Here is an example for the [Chargify API](https://developers.chargify.com/docs/api-docs):
```
{
"url": "https://{{ subdomain }}.chargify.com/{stream}.json?direction=asc&per_page={items_per_page}&page={current_page_one_base}&date_field={datetime_key}&start_datetime={start_datetime}",
"urls": {
"events": "https://{{ subdomain }}.chargify.com/events.json?direction=asc&per_page={items_per_page}&page={current_page_one_base}&date_field=created_at&since_id={start_index}",
"price_points": "https://{{ subdomain }}.chargify.com/products_price_points.json?direction=asc&per_page={items_per_page}&page={current_page_one_base}&filter[date_field]=updated_at&filter[start_datetime]={start_datetime}&filter[end_datetime]={end_datetime}",
"segments": "https://{{ subdomain }}.chargify.com/components/{{ component_id }}/price_points/{{ price_point_id }}/segments.json?per_page={items_per_page}&page={current_page_one_base}",
"statements": "https://{{ subdomain }}.chargify.com/statements.json?direction=asc&per_page={items_per_page}&page={current_page_one_base}&sort=created_at",
"transactions": "https://{{ subdomain }}.chargify.com/transactions.json?direction=asc&per_page={items_per_page}&page={current_page_one_base}&since_id={start_index}&order_by=id",
"customers_meta": "https://{{ subdomain }}.chargify.com/customers/metadata.json?direction=asc&date_field=updated_at&per_page={items_per_page}&page={current_page_one_base}&with_deleted=true&start_datetime={start_datetime}&end_datetime={end_datetime}",
"subscriptions_meta": "https://{{ subdomain }}.chargify.com/subscriptions/metadata.json?direction=asc&date_field=updated_at&per_page={items_per_page}&page={current_page_one_base}&with_deleted=true&start_datetime={start_datetime}&end_datetime={end_datetime}"
},
"streams": "components,coupons,customers,events,invoices,price_points,products,product_families,subscriptions,subscriptions_components,transactions",
"auth_method": "basic",
"username": "{{ api_key }}",
"password": "x",
"record_list_level": {
"customers_meta": "$.metadata[*]",
"invoices": "$.invoices[*]",
"price_points": "$.price_points[*]",
"segments": "$.segments[*]",
"subscriptions_components": "$.subscriptions_components[*]",
"subscriptions_meta": "$.metadata[*]"
},
"record_level": {
"components": "$.component",
"coupons": "$.coupon",
"customers": "$.customer",
"events": "$.event",
"product_families": "$.product_family",
"products": "$.product",
"statements": "$.statement",
"subscriptions": "$.subscription",
"transactions": "$.transaction"
},
"datetime_key": {
"components": "updated_at",
"coupons": "updated_at",
"customers": "updated_at",
"invoices": "updated_at",
"price_points": "updated_at",
"product_families": "updated_at",
"products": "updated_at",
"subscriptions": "updated_at",
"subscriptions_components": "updated_at"
},
"index_key": {
"events": "id",
"transactions": "id",
"segments": "id",
"statements": "id",
"customers_meta": "id",
"subscriptions_meta": "id"
},
"items_per_page": 200
}
```
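The per-stream lookup rules listed at the top of this section can be sketched as follows. This is one reading of those rules, not the tap's actual code: `pick_url` and `resolve_bookmark_key` are hypothetical helpers, and the config is a reduced, made-up variant of the example above.

```python
KEY_PRIORITY = ("timestamp_key", "datetime_key", "index_key")


def pick_url(config: dict, stream: str) -> str:
    """Use the stream-specific URL from urls if present, else the default url."""
    template = config.get("urls", {}).get(stream, config["url"])
    # Only {stream} is filled in here; the other placeholders are left alone.
    return template.replace("{stream}", stream)


def resolve_bookmark_key(config: dict, stream: str):
    """Return (setting_name, field) for the stream, or None if nothing applies."""
    # A dictionary entry for this stream wins.
    for name in KEY_PRIORITY:
        value = config.get(name)
        if isinstance(value, dict) and stream in value:
            return name, value[stream]
    # Otherwise fall back to the first string-valued setting by priority.
    for name in KEY_PRIORITY:
        value = config.get(name)
        if isinstance(value, str):
            return name, value
    return None


# A reduced, made-up config in the style of the example above.
config = {
    "url": "https://example.com/{stream}.json?page={current_page_one_base}",
    "urls": {"events": "https://example.com/events.json?since_id={start_index}"},
    "timestamp_key": "time",
    "datetime_key": {"customers": "updated_at"},
    "index_key": {"events": "id"},
}
for stream in ("customers", "events", "coupons"):
    print(stream, pick_url(config, stream), resolve_bookmark_key(config, stream))
```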
## State
This tap emits [state](https://github.com/singer-io/getting-started/blob/master/docs/CONFIG_AND_STATE.md#state-file).
The command also takes a state file as input with the `--state <file-name>` option.
The tap itself does not output a state file. It expects the target program or a downstream process to finalize the state safely and produce a state file.
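If no target does this for you, a downstream process could keep the last STATE message it sees. The sketch below relies only on the Singer message format (one JSON object per line, with STATE messages carrying a `value`); `last_state` is a hypothetical helper and the bookmark contents in the sample lines are made up.

```python
import json


def last_state(lines):
    """Return the value of the last STATE message in a stream of Singer messages."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("type") == "STATE":
            state = message["value"]
    return state


# Made-up tap output; the shape of the bookmark value is only an illustration.
sample_output = [
    '{"type": "RECORD", "stream": "earthquakes", "record": {"mag": 1.29}}',
    '{"type": "STATE", "value": {"bookmark": 1}}',
    '{"type": "RECORD", "stream": "earthquakes", "record": {"mag": 2.1}}',
    '{"type": "STATE", "value": {"bookmark": 2}}',
]
print(json.dumps(last_state(sample_output)))
```

In practice the lines would come from the tap's stdout, and the result would be written to the file later passed with `--state`, ideally only after the records before it have been loaded.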
## Raw output mode
If you want to use this tap outside the Singer framework, set `--raw` as a
command-line argument. The process then writes out the records as
newline-separated JSON.
A use case for this mode is when you expect the schema to change or to be inconsistent
and would rather extract first and clean up after loading.
([Example](https://articles.anelen.co/elt-google-cloud-storage-bigquery/))
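As an illustration of post-loading inspection, newline-separated JSON can be read one line at a time and scanned for fields whose type varies. This is a sketch with made-up records; `read_records` and `field_types` are hypothetical helpers, not part of tap-rest-api.

```python
import json


def read_records(lines):
    """Parse newline-separated JSON, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def field_types(records):
    """Map each top-level field to the sorted list of Python type names seen for it."""
    seen = {}
    for record in records:
        for key, value in record.items():
            seen.setdefault(key, set()).add(type(value).__name__)
    return {key: sorted(names) for key, names in seen.items()}


# Made-up raw output in which "mag" is not consistently a number.
raw_lines = [
    '{"mag": 1.29, "place": "10km SSW of Idyllwild, CA"}',
    '{"mag": "n/a", "place": "somewhere", "tz": -480}',
]
print(field_types(read_records(raw_lines)))
```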
## Schema validation and cleanups
- on_invalid_property: Behavior when schema validation fails.
- "raise": Raise exception
- "null": Impute with null
- "force" (default): Keep the record value as is (string). This may fail in the singer target.
- drop_unknown_properties: If true, unknown (sub-)properties are excluded from the record before it is written to stdout. Default is false.
Config example to add them:
```
{
...
"on_invalid_property": "force",
"drop_unknown_properties": true,
...
}
```
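The two options can be pictured as follows. This is a simplified sketch, not the tap's validation code: `clean_value` checks against a Python type instead of a JSON schema, `drop_unknown` reads only the `properties` part of a schema, and the sample schema and record are made up.

```python
def clean_value(value, expected_type, on_invalid_property="force"):
    """Apply the on_invalid_property policy to a single value."""
    if isinstance(value, expected_type):
        return value
    if on_invalid_property == "raise":
        raise ValueError(f"{value!r} is not of type {expected_type.__name__}")
    if on_invalid_property == "null":
        return None
    return value  # "force": keep the value as is


def drop_unknown(record: dict, schema: dict) -> dict:
    """Keep only the (sub-)properties that the schema declares."""
    properties = schema.get("properties", {})
    cleaned = {}
    for key, value in record.items():
        if key not in properties:
            continue
        if isinstance(value, dict):
            value = drop_unknown(value, properties[key])
        cleaned[key] = value
    return cleaned


# Made-up schema and record for illustration.
schema = {"properties": {"mag": {}, "origin": {"properties": {"lat": {}}}}}
record = {"mag": 1.29, "extra": True, "origin": {"lat": 33.6, "note": "x"}}
print(drop_unknown(record, schema))
print(clean_value("n/a", float, on_invalid_property="null"))
```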
# About this project
This project is developed by
ANELEN and friends. Please check out ANELEN's
[open innovation philosophy and other projects](https://anelen.co/open-source.html).

---
Copyright © 2020~ Anelen Co., LLC