# pipelinewise-tap-mongodb

| Field | Value |
| ----- | ----- |
| Name | pipelinewise-tap-mongodb |
| Version | 1.5.0 |
| Home page | https://github.com/transferwise/pipelinewise-tap-mongodb |
| Summary | Singer.io tap for extracting data from MongoDB - Pipelinewise compatible |
| Upload time | 2024-06-19 10:52:54 |
| Author | Wise |

This is a [Singer](https://singer.io) tap that produces JSON-formatted data following the [Singer spec](https://github.com/singer-io/getting-started/blob/master/SPEC.md) from a MongoDB source.

## Set up local dev environment:

```shell script
make setup
```

## Activate virtual environment

```shell script
. venv/bin/activate
```

## Set up Config file

Create a JSON file called `config.json` with the following contents:
```json
{
  "password": "<password>",
  "user": "<username>",
  "host": "<host ip address>",
  "auth_database": "<database name to authenticate on>",
  "database": "<database name to sync from>"
}
```
The following parameters are optional for your config file:

| Name | Type | Default value | Description |
| -----|------|--------|------------ |
| `srv` | Boolean | false | Connects using the `mongodb+srv` protocol. When set to `true`, the `port` argument is not used. |
| `port` | Integer | false | Connection port. Required for non-srv connections. |
| `replica_set` | String | null | Name of the replica set. |
| `ssl` | Boolean | false | Set to `true` to connect using SSL. |
| `verify_mode` | Boolean | true | Default SSL verify mode. |
| `include_schemas_in_destination_stream_name` | Boolean | false | Forces stream names to take the form `<database_name>-<collection_name>` instead of `<collection_name>`. |
| `update_buffer_size` | Integer | 1 | [LOG_BASED] Size of the in-memory buffer that holds detected update operations. The buffer is flushed once this size is reached. |
| `await_time_ms` | Integer | 1000 | [LOG_BASED] Maximum time in milliseconds that the `LOG_BASED` method waits for new data changes before exiting. |

The attributes shown in the `config.json` example above (`password`, `user`, `host`, `auth_database` and `database`) are required by the tap to connect to your MongoDB instance.
Here is a [sample configuration file](./sample_config.json).
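
The rules above (five required keys, plus `port` whenever `srv` is not enabled) can be checked before running the tap. The following is a minimal sketch, not part of the tap itself: `validate_config` and `REQUIRED_KEYS` are hypothetical names, and it assumes `srv` may be given either as a JSON boolean or as the string `"true"`.

```python
# Hypothetical pre-flight check for config.json; not the tap's own validation.
REQUIRED_KEYS = ("password", "user", "host", "auth_database", "database")


def validate_config(config: dict) -> list:
    """Return a list of problems found in a tap config dict (empty if none)."""
    problems = [
        f"missing required key: {key}" for key in REQUIRED_KEYS if key not in config
    ]
    # Per the options table, `port` is needed unless an srv connection is used.
    # Assumption: `srv` may be a JSON boolean or the string "true".
    uses_srv = str(config.get("srv", "false")).lower() == "true"
    if not uses_srv and "port" not in config:
        problems.append("`port` is required for non-srv connections")
    return problems


if __name__ == "__main__":
    example = {
        "user": "reader",
        "password": "secret",
        "host": "127.0.0.1",
        "port": 27017,
        "auth_database": "admin",
        "database": "shop",
    }
    print(validate_config(example))
```

To check a real file, load it with `json.load` and pass the resulting dict to `validate_config`.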

## Run in discovery mode
Run the following command and redirect the output into the catalog file:
```shell script
tap-mongodb --config ~/config.json --discover > ~/catalog.json
```

Your catalog file should now look like this:
```json
{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ]
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}
```
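
To get a quick overview of what discovery found, the catalog structure above can be walked programmatically. This is a small sketch that assumes only the fields shown in the example catalog; `summarize_catalog` is a hypothetical helper, not something the tap provides.

```python
# Hypothetical helper for inspecting a discovered catalog.
def summarize_catalog(catalog: dict) -> list:
    """Return (tap_stream_id, database-name, row-count) for each stream."""
    summary = []
    for stream in catalog.get("streams", []):
        # The stream-level metadata entry is the one with an empty breadcrumb.
        top_level = next(
            (
                entry["metadata"]
                for entry in stream.get("metadata", [])
                if entry.get("breadcrumb") == []
            ),
            {},
        )
        summary.append(
            (
                stream["tap_stream_id"],
                top_level.get("database-name"),
                top_level.get("row-count"),
            )
        )
    return summary


if __name__ == "__main__":
    # Illustrative catalog with made-up values.
    example = {
        "streams": [
            {
                "table_name": "orders",
                "tap_stream_id": "shop-orders",
                "stream": "orders",
                "metadata": [
                    {
                        "breadcrumb": [],
                        "metadata": {"row-count": 42, "database-name": "shop"},
                    }
                ],
                "schema": {"type": "object"},
            }
        ]
    }
    for row in summarize_catalog(example):
        print(row)
```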

## Edit Catalog file
### Using valid JSON, edit the `catalog.json` file
To select a stream, add the following to the stream's metadata:
```json
"selected": true,
"replication-method": "<replication method>",
```

`<replication method>` must be one of `FULL_TABLE`, `INCREMENTAL` or `LOG_BASED`. If it is `INCREMENTAL`, make sure to also add a `"replication-key"`.


For example, after editing the example stream to select it and set its replication method, `catalog.json` should look like this:
```json
{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ],
            "selected": true,
            "replication-method": "<replication method>"
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}

```
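
The same edit can be scripted instead of done by hand. The sketch below applies the rules stated above (one of three replication methods, and a replication key when the method is `INCREMENTAL`); `select_stream` and `VALID_METHODS` are hypothetical names introduced for illustration.

```python
import copy

# Replication methods named in this README.
VALID_METHODS = {"FULL_TABLE", "INCREMENTAL", "LOG_BASED"}


def select_stream(catalog, tap_stream_id, method, replication_key=None):
    """Return a copy of the catalog with one stream selected for syncing."""
    if method not in VALID_METHODS:
        raise ValueError(f"unknown replication method: {method}")
    if method == "INCREMENTAL" and not replication_key:
        raise ValueError("INCREMENTAL replication needs a replication-key")

    updated = copy.deepcopy(catalog)  # leave the caller's catalog untouched
    for stream in updated["streams"]:
        if stream["tap_stream_id"] != tap_stream_id:
            continue
        for entry in stream["metadata"]:
            # Stream-level settings live in the entry with an empty breadcrumb.
            if entry["breadcrumb"] == []:
                entry["metadata"]["selected"] = True
                entry["metadata"]["replication-method"] = method
                if replication_key:
                    entry["metadata"]["replication-key"] = replication_key
                return updated
    raise KeyError(f"stream not found: {tap_stream_id}")


if __name__ == "__main__":
    # Illustrative catalog with made-up identifiers.
    catalog = {
        "streams": [
            {
                "tap_stream_id": "shop-orders",
                "metadata": [{"breadcrumb": [], "metadata": {}}],
            }
        ]
    }
    result = select_stream(catalog, "shop-orders", "INCREMENTAL", "_id")
    print(result["streams"][0]["metadata"][0]["metadata"])
```

Working on a deep copy keeps the freshly discovered catalog available for comparison if the edited version turns out to be wrong.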
## Run in sync mode:
```shell script
tap-mongodb --config ~/config.json --catalog ~/catalog.json
```

The tap will write bookmarks to stdout which can be captured and passed as an optional `--state state.json` parameter to the tap for the next sync.
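
In the Singer spec, bookmarks are emitted as `STATE` messages, one JSON object per line, with the bookmark payload under `value`. A minimal sketch of capturing the most recent one from the tap's output follows; `last_state` is a hypothetical helper, and the sample bookmark contents are made up for illustration.

```python
import json


def last_state(lines):
    """Return the value of the last STATE message in Singer output lines."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if isinstance(message, dict) and message.get("type") == "STATE":
            state = message.get("value")
    return state


if __name__ == "__main__":
    # Made-up sample output; real bookmark contents depend on the tap.
    sample_output = [
        '{"type": "RECORD", "stream": "orders", "record": {"_id": "1"}}',
        '{"type": "STATE", "value": {"bookmarks": {"shop-orders": {"version": 1}}}}',
    ]
    print(json.dumps(last_state(sample_output)))
```

To use it on a real run, pass `sys.stdin` as `lines` while piping the tap's output into the script, then write the returned value to `state.json` with `json.dump`.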

## Logging configuration
The tap uses a predefined logging config if none is provided. To use your own, set the environment variable `LOGGING_CONFIG_FILE` to the path of your logging config file.
A sample config is available [here](./sample_logging.conf).
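
The pattern described above, an environment variable pointing at a logging config with a fallback when it is absent, can be sketched as follows. This is not the tap's actual implementation: `configure_logging` is a hypothetical function, `logging.basicConfig` stands in for the tap's predefined config, and the config file is assumed to be in the INI format understood by `logging.config.fileConfig`.

```python
import logging
import logging.config
import os


def configure_logging(env=None):
    """Load logging config from LOGGING_CONFIG_FILE if set, else use a fallback.

    Returns the path that was loaded, or None when the fallback was applied.
    """
    env = os.environ if env is None else env
    path = env.get("LOGGING_CONFIG_FILE")
    if path and os.path.isfile(path):
        # Assumption: the file uses the INI format that fileConfig expects.
        logging.config.fileConfig(path, disable_existing_loggers=False)
        return path
    # Stand-in for the tap's predefined config.
    logging.basicConfig(level=logging.INFO)
    return None


if __name__ == "__main__":
    used = configure_logging()
    logging.getLogger(__name__).info("logging configured from %s", used or "fallback")
```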

---

Copyright &copy; 2020 TransferWise

            
