as-mongodb


| Field | Value |
| -----|------ |
| Name | as-mongodb |
| Version | 1.0.0 |
| Home page | https://github.com/datazip/as-mongodb |
| Summary | Singer.io tap for extracting data from MongoDB - Datazip compatible |
| Upload time | 2023-06-23 10:14:08 |
| Author | Wise |
| Requirements | No requirements were recorded. |

# as-mongodb

This is a [Singer](https://singer.io) tap that produces JSON-formatted data following the [Singer spec](https://github.com/singer-io/getting-started/blob/master/SPEC.md) from a MongoDB source.

## Set up local dev environment:

```shell script
make setup
```

## Activate virtual environment

```shell script
. venv/bin/activate
```

## Set up Config file

Create a JSON file called `config.json` with the following contents:
```json
{
  "password": "<password>",
  "user": "<username>",
  "host": "<host ip address>",
  "auth_database": "<database name to authenticate on>",
  "database": "<database name to sync from>"
}
```
The following parameters are optional for your config file:

| Name | Type | Default value | Description |
| -----|------|--------|------------ |
| `srv` | Boolean | false | Uses the `mongodb+srv` protocol to connect. Disables the `port` argument if set to `true`. |
| `port` | Integer | false | Connection port. Required if a non-srv connection is being used. |
| `replica_set` | String | null | Name of the replica set. |
| `ssl` | Boolean | false | Set to `true` to connect using SSL. |
| `verify_mode` | Boolean | true | Default SSL verify mode. |
| `include_schemas_in_destination_stream_name` | Boolean | false | Forces stream names to take the form `<database_name>-<collection_name>` instead of `<collection_name>`. |
| `update_buffer_size` | Integer | 1 | [LOG_BASED] Size of the buffer that holds detected update operations in memory. The buffer is flushed once this size is reached. |
| `await_time_ms` | Integer | 1000 | [LOG_BASED] Maximum time in milliseconds that the LOG_BASED method waits for new data changes before exiting. |
| `full_load_on_empty_state` | Boolean | false | [LOG_BASED] Forces a full load when no previous token is found in state. |

The attributes in the `config.json` example above are required by the tap to connect to your MongoDB instance; the parameters in the table are optional.
Here is a [sample configuration file](./sample_config.json).
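
As a rough illustration of how the required keys and the `srv`/`port` options interact, the sketch below checks a config dictionary and builds a connection URI from it. The helper names `missing_keys` and `build_uri` and the example values are hypothetical, and the tap's own connection logic may differ; the sketch only assumes the standard `mongodb://` and `mongodb+srv://` URI forms.

```python
from urllib.parse import quote_plus

# Keys shown in the config.json example above.
REQUIRED_KEYS = ("password", "user", "host", "auth_database", "database")


def missing_keys(config):
    """Return the required keys that are absent from the config dict."""
    return [key for key in REQUIRED_KEYS if key not in config]


def build_uri(config):
    """Build a MongoDB URI (hypothetical helper, not the tap's own code)."""
    # Percent-encode credentials so characters such as '@' stay valid in a URI.
    user = quote_plus(config["user"])
    password = quote_plus(config["password"])
    if config.get("srv", False):
        # mongodb+srv URIs resolve hosts through DNS, so no port is included.
        scheme, host = "mongodb+srv", config["host"]
    else:
        if "port" not in config:
            raise ValueError("port is required for a non-srv connection")
        scheme, host = "mongodb", f'{config["host"]}:{config["port"]}'
    return f'{scheme}://{user}:{password}@{host}/{config["auth_database"]}'


# Made-up example values, for illustration only.
example = {
    "user": "reader",
    "password": "p@ss",
    "host": "127.0.0.1",
    "port": 27017,
    "auth_database": "admin",
    "database": "shop",
}
print(missing_keys(example))
print(build_uri(example))
```

Running a check like this before starting the tap can surface a missing key or a missing `port` early, instead of at connection time.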

## Run in discovery mode
Run the following command and redirect the output into the catalog file:
```shell script
as-mongodb --config ~/config.json --discover > ~/catalog.json
```

Your catalog file should now look like this:
```json
{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ]
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}
```
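
To get a quick overview of what discovery found, a catalog with the structure above can be inspected with a few lines of Python. This is a minimal sketch: `list_streams` is a hypothetical helper, and the in-memory sample (including the `shop-orders` stream id) is made up to stand in for the contents of `~/catalog.json`.

```python
import json


def list_streams(catalog):
    """Return (tap_stream_id, selected) pairs for every stream in a catalog dict."""
    result = []
    for stream in catalog.get("streams", []):
        selected = False
        for entry in stream.get("metadata", []):
            # Stream-level metadata is the entry with an empty breadcrumb.
            if entry.get("breadcrumb") == []:
                selected = bool(entry.get("metadata", {}).get("selected", False))
        result.append((stream["tap_stream_id"], selected))
    return result


# Small in-memory catalog with made-up values, shaped like the output above.
sample = json.loads("""
{
  "streams": [
    {
      "table_name": "orders",
      "tap_stream_id": "shop-orders",
      "metadata": [
        {"breadcrumb": [], "metadata": {"database-name": "shop"}}
      ],
      "stream": "orders",
      "schema": {"type": "object"}
    }
  ]
}
""")
print(list_streams(sample))
```

For a real catalog, replace the sample with `json.load(open(path))` on the file produced by the discovery command.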

## Edit Catalog file
### Using valid JSON, edit the catalog.json file
To select a stream, add the following to the stream's metadata:
```json
"selected": true,
"replication-method": "<replication method>",
```

`<replication method>` must be one of `FULL_TABLE`, `INCREMENTAL`, or `LOG_BASED`. If it is `INCREMENTAL`, make sure to also add a `"replication-key"`.


For example, after editing the example stream to select it and set a replication method, catalog.json should look like this:
```json
{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ],
            "selected": true,
            "replication-method": "<replication method>"
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}

```
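
The same edit can be scripted instead of done by hand. The sketch below is one possible approach, not part of the tap: `select_stream` is a hypothetical helper that sets `selected`, `replication-method`, and (optionally) `replication-key` on the stream-level metadata entry, and the sample catalog values are made up.

```python
import copy

VALID_METHODS = {"FULL_TABLE", "INCREMENTAL", "LOG_BASED"}


def select_stream(catalog, tap_stream_id, method, replication_key=None):
    """Return a copy of the catalog with one stream selected."""
    if method not in VALID_METHODS:
        raise ValueError(f"unknown replication method: {method}")
    if method == "INCREMENTAL" and replication_key is None:
        raise ValueError("INCREMENTAL requires a replication key")
    updated = copy.deepcopy(catalog)  # leave the caller's catalog untouched
    for stream in updated["streams"]:
        if stream["tap_stream_id"] != tap_stream_id:
            continue
        for entry in stream["metadata"]:
            # Only the stream-level entry (empty breadcrumb) is modified.
            if entry["breadcrumb"] == []:
                entry["metadata"]["selected"] = True
                entry["metadata"]["replication-method"] = method
                if replication_key is not None:
                    entry["metadata"]["replication-key"] = replication_key
    return updated


# Made-up catalog shaped like the discovery output.
catalog = {
    "streams": [
        {
            "table_name": "orders",
            "tap_stream_id": "shop-orders",
            "metadata": [{"breadcrumb": [], "metadata": {"database-name": "shop"}}],
            "stream": "orders",
            "schema": {"type": "object"},
        }
    ]
}
selected = select_stream(catalog, "shop-orders", "INCREMENTAL", replication_key="_id")
print(selected["streams"][0]["metadata"][0]["metadata"])
```

Writing the result back with `json.dump` gives a catalog file that can be passed to the tap with `--catalog`.
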
## Run in sync mode:
```shell script
as-mongodb --config ~/config.json --catalog ~/catalog.json
```

The tap will write bookmarks to stdout which can be captured and passed as an optional `--state state.json` parameter to the tap for the next sync.
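
One way to capture the bookmarks is to keep the value of the last `STATE` message found in the tap's output. The sketch below assumes the output follows the Singer message format, where a state message looks like `{"type": "STATE", "value": {...}}`; `last_state` is a hypothetical helper and the sample messages, including the bookmark contents, are made up.

```python
import json


def last_state(lines):
    """Return the value of the last STATE message in an iterable of output lines."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if isinstance(message, dict) and message.get("type") == "STATE":
            state = message.get("value")
    return state


# Made-up sample of tap output, one Singer message per line.
output = [
    '{"type": "SCHEMA", "stream": "orders", "schema": {"type": "object"}, "key_properties": ["_id"]}',
    '{"type": "RECORD", "stream": "orders", "record": {"_id": "1"}}',
    '{"type": "STATE", "value": {"bookmarks": {"orders": {"version": 1}}}}',
]
print(json.dumps(last_state(output)))
```

Saving the returned value to `state.json` produces a file that can be passed to the next run with `--state state.json`.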

## Logging configuration
The tap uses a predefined logging config if none is provided. To use your own, set the environment variable `LOGGING_CONFIG_FILE` to the path of your logging config file.
A sample config is available [here](./sample_logging.conf).
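
As a hedged illustration, the sketch below writes a minimal logging config and loads it through the `LOGGING_CONFIG_FILE` variable. It assumes the file uses the format read by Python's `logging.config.fileConfig`, which the `.conf` extension suggests but the README does not confirm; the section names and the logger name `as-mongodb-example` are made up for this example.

```python
import logging
import logging.config
import os
import tempfile

# Minimal config in the fileConfig (INI-style) format: one root logger
# that writes INFO and above to stderr.
CONF = """\
[loggers]
keys=root

[handlers]
keys=stderr

[formatters]
keys=plain

[logger_root]
level=INFO
handlers=stderr

[handler_stderr]
class=StreamHandler
level=INFO
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(asctime)s %(levelname)s %(name)s %(message)s
"""

# Write the config to a temporary file and point the variable at it.
with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as handle:
    handle.write(CONF)
    conf_path = handle.name

os.environ["LOGGING_CONFIG_FILE"] = conf_path

# Load the file the way a fileConfig-based application would.
logging.config.fileConfig(os.environ["LOGGING_CONFIG_FILE"], disable_existing_loggers=False)
logging.getLogger("as-mongodb-example").info("logging configured from %s", conf_path)
```

In a shell, the equivalent step is exporting `LOGGING_CONFIG_FILE` before running the tap.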

---

Copyright &copy; 2020 TransferWise



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/datazip/as-mongodb",
    "name": "as-mongodb",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Wise",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/3e/5b/4626f5abe5cebe76bd7e0519fe44b7498c7fce61aafb4c633172972391fc/as-mongodb-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Singer.io tap for extracting data from MongoDB - Datazip compatible",
    "version": "1.0.0",
    "project_urls": {
        "Homepage": "https://github.com/datazip/as-mongodb"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "32166b83f1da0f9c4848d748141cdafbc7d2be2d2ea1251aa090b9da90c27770",
                "md5": "f1fe9dc7e5ae4e786977a6938d46dcab",
                "sha256": "b3d70a05152ef0c340100687efc0992865cf3ebf92b2bbb549a2edc76e2ac1c3"
            },
            "downloads": -1,
            "filename": "as_mongodb-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f1fe9dc7e5ae4e786977a6938d46dcab",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 33320,
            "upload_time": "2023-06-23T10:14:05",
            "upload_time_iso_8601": "2023-06-23T10:14:05.398598Z",
            "url": "https://files.pythonhosted.org/packages/32/16/6b83f1da0f9c4848d748141cdafbc7d2be2d2ea1251aa090b9da90c27770/as_mongodb-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3e5b4626f5abe5cebe76bd7e0519fe44b7498c7fce61aafb4c633172972391fc",
                "md5": "c6df301e1601312d7339e5995ea4af6c",
                "sha256": "59da48b8eab13148176ec9f7d4961c9d91258d29450b1770bd45293f227e22f8"
            },
            "downloads": -1,
            "filename": "as-mongodb-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "c6df301e1601312d7339e5995ea4af6c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 30663,
            "upload_time": "2023-06-23T10:14:08",
            "upload_time_iso_8601": "2023-06-23T10:14:08.130891Z",
            "url": "https://files.pythonhosted.org/packages/3e/5b/4626f5abe5cebe76bd7e0519fe44b7498c7fce61aafb4c633172972391fc/as-mongodb-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-23 10:14:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "datazip",
    "github_project": "as-mongodb",
    "github_not_found": true,
    "lcname": "as-mongodb"
}
        