| Field | Value |
| -----|------ |
| Name | dz-mongodb |
| Version | 1.4.9 |
| home_page | https://github.com/datazip/dz-mongodb |
| Summary | Singer.io tap for extracting data from MongoDB - Datazip compatible |
| upload_time | 2024-12-30 07:59:17 |
| maintainer | None |
| docs_url | None |
| author | Wise |
| requires_python | None |
| license | None |
| requirements | No requirements were recorded. |
# dz-mongodb
This is a [Singer](https://singer.io) tap that produces JSON-formatted data following the [Singer spec](https://github.com/singer-io/getting-started/blob/master/SPEC.md) from a MongoDB source.
## Set up local dev environment:
```shell script
make setup
```
## Activate virtual environment
```shell script
. venv/bin/activate
```
## Set up Config file
Create a JSON file called `config.json` with the following contents:
```json
{
"password": "<password>",
"user": "<username>",
"host": "<host ip address>",
"auth_database": "<database name to authenticate on>",
"database": "<database name to sync from>"
}
```
The following parameters are optional for your config file:
| Name | Type | Default value| Description |
| -----|------|--------|------------ |
| `srv` | Boolean | false | Connects using the `mongodb+srv` protocol. Disables the use of the `port` argument when set to `true`. |
| `port` | Integer | false | Connection port. Required for non-srv connections. |
| `replica_set` | String | null | Name of the replica set. |
| `ssl` | Boolean | false | Set to `true` to connect using SSL. |
| `verify_mode` | Boolean | true | Default SSL verify mode. |
| `include_schemas_in_destination_stream_name` | Boolean | false | Forces stream names to take the form `<database_name>-<collection_name>` instead of `<collection_name>`. |
| `update_buffer_size` | Integer | 1 | [LOG_BASED] Size of the in-memory buffer that holds detected update operations. The buffer is flushed once this size is reached. |
| `await_time_ms` | Integer | 1000 | [LOG_BASED] Maximum time in milliseconds that the `LOG_BASED` method waits for new data changes before exiting. |
| `full_load_on_empty_state` | Boolean | false | [LOG_BASED] Forces a full load when no previous token is found in state. |
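The attributes in the `config.json` example above (`password`, `user`, `host`, `auth_database`, `database`) are required by the tap to connect to your MongoDB instance; the parameters in the table are optional.
Here is a [sample configuration file](./sample_config.json).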
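The configuration rules in this section (five connection keys, plus `port` whenever `srv` is not enabled) can be checked before running the tap. The following is a minimal sketch, not part of dz-mongodb itself: the function name `check_config` and the wording of its messages are assumptions made for illustration.

```python
# Keys shown in the config.json example in this section.
REQUIRED_KEYS = ("password", "user", "host", "auth_database", "database")


def check_config(config):
    """Return a list of problems found in a config dict; an empty list means none were found.

    Hypothetical helper: the tap may perform its own, different validation.
    """
    problems = [
        "missing required key: " + key for key in REQUIRED_KEYS if key not in config
    ]
    # The options table states that `port` is required for non-srv connections.
    if not config.get("srv", False) and "port" not in config:
        problems.append("`port` is required when `srv` is not true")
    return problems
```

To check a real file, load it with `json.load` and pass the resulting dict to `check_config`.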
## Run in discovery mode
Run the following command and redirect the output into the catalog file:
```shell script
dz-mongodb --config ~/config.json --discover > ~/catalog.json
```
Your catalog file should now look like this:
```json
{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ]
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}
```
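To see what discovery found, the catalog can be summarised with a few lines of Python. This is a sketch that relies only on the catalog layout shown above; the helper name `list_streams` is an assumption made for illustration.

```python
def list_streams(catalog):
    """Return (tap_stream_id, database-name, row-count) for each stream in a catalog dict."""
    rows = []
    for stream in catalog["streams"]:
        # Stream-level metadata is the entry whose breadcrumb is an empty list.
        top = next(
            (entry["metadata"] for entry in stream["metadata"] if entry["breadcrumb"] == []),
            {},
        )
        rows.append((stream["tap_stream_id"], top.get("database-name"), top.get("row-count")))
    return rows
```

Load `catalog.json` with `json.load` and pass the result to `list_streams` to get one tuple per discovered collection.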
## Edit Catalog file
### Using valid JSON, edit the catalog.json file
To select a stream, add the following to the stream's metadata:
```json
"selected": true,
"replication-method": "<replication method>",
```
`<replication method>` must be one of `FULL_TABLE`, `INCREMENTAL`, or `LOG_BASED`. If it is `INCREMENTAL`, make sure to also add a `"replication-key"`.
For example, after editing the example stream to select it and set a replication method, `catalog.json` should look like this:
```json
{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ],
            "selected": true,
            "replication-method": "<replication method>"
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}
```
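Editing the catalog by hand is error-prone when there are many streams, so the same change can be scripted. The sketch below applies the selection rules from this section to a catalog dict. The function name `select_stream` and its error messages are assumptions for illustration; the tap does not ship this helper.

```python
import copy

# Replication methods listed in this section.
VALID_METHODS = {"FULL_TABLE", "INCREMENTAL", "LOG_BASED"}


def select_stream(catalog, tap_stream_id, method, replication_key=None):
    """Return a copy of `catalog` with one stream selected and its replication method set."""
    if method not in VALID_METHODS:
        raise ValueError("unknown replication method: " + str(method))
    if method == "INCREMENTAL" and not replication_key:
        raise ValueError("INCREMENTAL requires a replication key")

    updated = copy.deepcopy(catalog)  # leave the caller's catalog untouched
    for stream in updated["streams"]:
        if stream["tap_stream_id"] != tap_stream_id:
            continue
        for entry in stream["metadata"]:
            if entry["breadcrumb"] == []:  # stream-level metadata entry
                entry["metadata"]["selected"] = True
                entry["metadata"]["replication-method"] = method
                if replication_key:
                    entry["metadata"]["replication-key"] = replication_key
                return updated
    raise KeyError("no stream-level metadata found for " + str(tap_stream_id))
```

A typical use would be to read `catalog.json` with `json.load`, call `select_stream` once per stream to sync, and write the result back with `json.dump`.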
## Run in sync mode:
```shell script
dz-mongodb --config ~/config.json --catalog ~/catalog.json
```
The tap writes bookmarks to stdout, which can be captured and passed to the tap through the optional `--state state.json` parameter on the next sync.
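In the Singer spec, bookmarks travel as `STATE` messages: JSON lines with `"type": "STATE"` and a `"value"` object. A minimal sketch of capturing the most recent one from the tap's output follows. The helper name `last_state` is an assumption, and the shape of the bookmark value is whatever the tap emits.

```python
import json


def last_state(lines):
    """Return the value of the last STATE message in an iterable of Singer output lines.

    Returns None if no STATE message is present. Blank lines are skipped.
    """
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("type") == "STATE":
            state = message["value"]
    return state
```

Writing the returned value to `state.json` with `json.dump` gives a file that can be passed to the next run with `--state`.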
## Logging configuration
The tap uses a predefined logging config if none is provided. To use your own, set the environment variable `LOGGING_CONFIG_FILE` to the path of your logging config file.
A sample config is available [here](./sample_logging.conf).
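As an illustration, the sketch below writes a small logging config and checks that it parses. It assumes the file uses the INI format read by Python's standard `logging.config.fileConfig`, which the `.conf` extension suggests but this README does not confirm. The helper names, the file name `logging.conf`, and the chosen level and format are all hypothetical.

```python
import logging
import logging.config
import os

# Assumed format: the INI layout understood by logging.config.fileConfig.
LOGGING_CONF = """\
[loggers]
keys=root

[handlers]
keys=stderr

[formatters]
keys=simple

[logger_root]
level=WARNING
handlers=stderr

[handler_stderr]
class=StreamHandler
formatter=simple
args=(sys.stderr,)

[formatter_simple]
format=%(asctime)s %(levelname)s %(name)s %(message)s
"""


def write_logging_config(directory):
    """Write the config above into `directory` and return the file path."""
    path = os.path.join(directory, "logging.conf")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(LOGGING_CONF)
    return path


def verify_logging_config(path):
    """Load the file to catch syntax errors early; return the resulting root log level."""
    logging.config.fileConfig(path, disable_existing_loggers=False)
    return logging.getLogger().level
```

Once the file loads cleanly, export its path as `LOGGING_CONFIG_FILE` in the shell that runs the tap.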
---
Copyright © 2020 TransferWise
Raw data
{
"_id": null,
"home_page": "https://github.com/datazip/dz-mongodb",
"name": "dz-mongodb",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Wise",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/fa/ce/203de109503407c699780f4f7398b7d2a5ce53e55b3ec180ee826c749256/dz-mongodb-1.4.9.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Singer.io tap for extracting data from MongoDB - Datazip compatible",
"version": "1.4.9",
"project_urls": {
"Homepage": "https://github.com/datazip/dz-mongodb"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "face203de109503407c699780f4f7398b7d2a5ce53e55b3ec180ee826c749256",
"md5": "7625a1c104a68abd6cff1f9a87510bf2",
"sha256": "a2e97ed80813e1cecb92262d3a6bfb2d2d1f90cebf991d99aab8ce08dce733f4"
},
"downloads": -1,
"filename": "dz-mongodb-1.4.9.tar.gz",
"has_sig": false,
"md5_digest": "7625a1c104a68abd6cff1f9a87510bf2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 31004,
"upload_time": "2024-12-30T07:59:17",
"upload_time_iso_8601": "2024-12-30T07:59:17.068264Z",
"url": "https://files.pythonhosted.org/packages/fa/ce/203de109503407c699780f4f7398b7d2a5ce53e55b3ec180ee826c749256/dz-mongodb-1.4.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-30 07:59:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "datazip",
"github_project": "dz-mongodb",
"github_not_found": true,
"lcname": "dz-mongodb"
}