airbyte-cdk


- Name: airbyte-cdk
- Version: 0.83.1
- Home page: https://github.com/airbytehq/airbyte
- Summary: A framework for writing Airbyte Connectors.
- Upload time: 2024-04-19 20:38:30
- Author: Airbyte
- Requires Python: <4.0,>=3.9
- License: MIT
- Keywords: airbyte, connector-development-kit, cdk

# Airbyte Python CDK and Low-Code CDK

Airbyte Python CDK is a framework for building Airbyte API Source Connectors. It provides a set of
classes and helpers that make it easy to build a connector against an HTTP API (REST, GraphQL, etc.),
or a generic Python source connector.

## Usage

If you're looking to build a connector, we highly recommend that you
[start with the Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview).
It should be enough for 90% of connectors out there. For more flexible and complex connectors, use the
[low-code CDK and `SourceDeclarativeManifest`](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview).

If that doesn't work, then consider building on top of the
[lower-level Python CDK itself](https://docs.airbyte.com/connector-development/cdk-python/).

### Quick Start

To get started on a Python CDK based connector or a low-code connector, you can generate a connector
project from a template:

```bash
# from the repo root
cd airbyte-integrations/connector-templates/generator
./generate.sh
```

This generates a project with the type and name of your choice and puts it in
`airbyte-integrations/connectors`. Open your connector's directory in an editor and follow the
`TODO` items.

### Example Connectors

**HTTP Connectors**:

- [Stripe](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-stripe/)
- [Salesforce](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/)

**Python connectors using the bare-bones `Source` abstraction**:

- [Google Sheets](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-google-sheets/google_sheets_source/google_sheets_source.py)

## Python CDK Overview

The Airbyte CDK code lives in the `airbyte_cdk` directory. Here's a high-level overview of what's inside:

- `connector_builder`. Internal wrapper that helps the Connector Builder platform run a declarative
  manifest (low-code connector). You should not use this code directly. If you need to run a
  `SourceDeclarativeManifest`, take a look at
  [`source-declarative-manifest`](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-declarative-manifest)
  connector implementation instead.
- `destinations`. Basic Destination connector support. If you're building a Destination connector in
  Python, start here. Some of our vector DB destinations, such as `destination-pinecone`, use this
  code.
- `models` exposes `airbyte_protocol.models` as part of the `airbyte_cdk` package.
- `sources/concurrent_source` is the Concurrent CDK implementation. It supports reading data from
  streams concurrently per slice / partition, which is useful for connectors with high throughput
  and a large number of records.
- `sources/declarative` is the low-code CDK. It works on top of Airbyte Python CDK, but provides a
  declarative manifest language to define streams, operations, etc. This makes it easier to build
  connectors without writing Python code.
- `sources/file_based` is the CDK for file-based sources. Examples include S3, Azure, GCS, etc.
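
To make the "concurrently per slice / partition" idea concrete, here is a minimal sketch using only the Python standard library. It is not the Concurrent CDK's implementation or API: `read_partitions_concurrently` and `fake_read` are hypothetical names, and the month-based partitions are an assumed example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def read_partitions_concurrently(
    partitions: Iterable[Dict[str, str]],
    read_partition: Callable[[Dict[str, str]], List[dict]],
    max_workers: int = 4,
) -> List[dict]:
    """Read each partition on a worker thread and return all records.

    Hypothetical helper for illustration only; it is not part of airbyte_cdk.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of the input partitions.
        per_partition = executor.map(read_partition, partitions)
        return [record for records in per_partition for record in records]


def fake_read(partition: Dict[str, str]) -> List[dict]:
    # Stand-in for an API call scoped to one slice (here, one month).
    return [{"month": partition["month"], "id": i} for i in range(2)]


records = read_partitions_concurrently(
    [{"month": "2024-01"}, {"month": "2024-02"}], fake_read
)
print(len(records))  # 4
```

Threads suit this kind of workload because reading from an API is mostly waiting on the network.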

## Contributing

Thank you for being interested in contributing to Airbyte Python CDK! Here are some guidelines to
get you started:

- We adhere to the [code of conduct](/CODE_OF_CONDUCT.md).
- You can contribute by reporting bugs, posting GitHub discussions, opening issues, improving
  [documentation](/docs/), and submitting pull requests with bugfixes and new features alike.
- If you're changing the code, please add unit tests for your change.
- When submitting issues or PRs, please add a small reproduction project. Using the changes in your
  connector and providing that connector code as an example (or a satellite PR) helps!

### First time setup

Install the project dependencies and development tools:

```bash
poetry install --all-extras
```

Installing all extras is required to run the full suite of unit tests.

#### Running tests locally

- Iterate on the CDK code locally
- Run tests via `poetry run poe unit-test-with-cov`, or `python -m pytest -s unit_tests` if you want
  to pass pytest options.
- Run `poetry run poe check-local` to lint all code, type-check modified code, and run unit tests
  with coverage in one command.

To see all available scripts, run `poetry run poe`.

##### Autogenerated files

Low-code CDK models are generated from `sources/declarative/declarative_component_schema.yaml`. If
your changes touch the models or the connector generator, you may need to regenerate them. To do
so, run:

```bash
poetry run poe build
```

This builds the code generator Docker image and generates the component manifest files based on the
schemas and templates.

#### Testing

All tests are located in the `unit_tests` directory. Run `poetry run poe unit-test-with-cov` to run
them. This also presents a test coverage report. For faster iteration with no coverage report and
more options, `python -m pytest -s unit_tests` is a good place to start.

#### Building and testing a connector with your local CDK

When developing a new feature in the CDK, you may find it helpful to run a connector that uses that
new feature. You can test this in one of two ways:

- Running a connector locally
- Building and running a source via Docker

##### Installing your local CDK into a local Python connector

Open the connector's `pyproject.toml` file and replace the `airbyte_cdk` dependency line with the
following:

```toml
airbyte_cdk = { path = "../../../airbyte-cdk/python/airbyte_cdk", develop = true }
```

Then, running `poetry update` should reinstall `airbyte_cdk` from your local working directory.
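
One way to confirm that the connector's environment picked up your working copy is to check where Python would import `airbyte_cdk` from. The sketch below uses only the standard library; `module_origin` is a hypothetical helper name, not part of the CDK.

```python
import importlib.util
from typing import Optional


def module_origin(module_name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


# Run this inside the connector's virtual environment, for example via `poetry run python`.
print(module_origin("airbyte_cdk"))
```

If the printed path points into your local checkout rather than a `site-packages` directory, the connector is using the local CDK.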

##### Building a Python connector in Docker with your local CDK installed

_Pre-requisite: Install the
[`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_

You can build your connector image with the local CDK using

```bash
# from the airbytehq/airbyte base directory
airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
```

Note that the local CDK is injected at build time, so if you make changes, you will have to run the
build command again to see them reflected.

##### Running Connector Acceptance Tests for a single connector in Docker with your local CDK installed

_Pre-requisite: Install the
[`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_

To run acceptance tests for a single connector using the local CDK, from the connector directory,
run

```bash
airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test
```

#### When you don't have access to the API

There may be times when you do not have access to the API (for example, because you lack
credentials or network access). You will probably still want to do end-to-end testing at least
once. To do so, you can emulate the server you would be reaching with a server stubbing tool.

For example, using [mockserver](https://www.mock-server.com/), you can set up an expectation file
like this:

```json
{
  "httpRequest": {
    "method": "GET",
    "path": "/data"
  },
  "httpResponse": {
    "body": "{\"data\": [{\"record_key\": 1}, {\"record_key\": 2}]}"
  }
}
```
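
Note that `httpResponse.body` is itself a JSON-encoded string, which is why the quotes are escaped. If hand-escaping becomes tedious, the file can be generated instead. This is a minimal sketch that reproduces only the shape shown above; `build_expectation` and `write_expectation` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def build_expectation(
    path: str, records: List[Dict[str, Any]], method: str = "GET"
) -> Dict[str, Any]:
    """Build one expectation in the shape shown above."""
    return {
        "httpRequest": {"method": method, "path": path},
        # The response body is a string, so the payload is JSON-encoded separately.
        "httpResponse": {"body": json.dumps({"data": records})},
    }


def write_expectation(expectation: Dict[str, Any], target: Path) -> None:
    """Write the expectation file, creating parent directories if needed."""
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(expectation, indent=2) + "\n")


expectation = build_expectation("/data", [{"record_key": 1}, {"record_key": 2}])
print(json.dumps(expectation, indent=2))
```

Calling `write_expectation(expectation, Path("secrets/mock_server_config/expectations.json"))` would then create the file at the location used below.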

Assuming this file has been created at `secrets/mock_server_config/expectations.json`, the
following command starts a mock server that matches any request on path `/data` and returns the
response defined in the expectation file:

```bash
docker run -d --rm \
  -v "$(pwd)/secrets/mock_server_config:/config" \
  -p 8113:8113 \
  --env MOCKSERVER_LOG_LEVEL=TRACE \
  --env MOCKSERVER_SERVER_PORT=8113 \
  --env MOCKSERVER_WATCH_INITIALIZATION_JSON=true \
  --env MOCKSERVER_PERSISTED_EXPECTATIONS_PATH=/config/expectations.json \
  --env MOCKSERVER_INITIALIZATION_JSON_PATH=/config/expectations.json \
  mockserver/mockserver:5.15.0
```

HTTP requests to `localhost:8113/data` should now return the body defined in the expectations file.
To point a connector at the mock server, either change the code that defines the base URL (for a
Python source) or update the `url_base` of the low-code connector. If the Connector Builder is
running in Docker, use the domain `host.docker.internal` instead of `localhost`, because the
requests are executed within Docker.
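
A quick way to check the mock endpoint from Python, before wiring it into a connector, might look like the sketch below. It uses only the standard library; the helper names are hypothetical, and it assumes the port and response shape from the example above.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def mock_base_url(inside_docker: bool, port: int = 8113) -> str:
    """Pick the host name depending on where the request is executed."""
    host = "host.docker.internal" if inside_docker else "localhost"
    return f"http://{host}:{port}"


def extract_records(body: bytes) -> List[Dict[str, Any]]:
    """Parse a response body and return the list under the "data" key."""
    return json.loads(body.decode("utf-8"))["data"]


def fetch_records(base_url: str, path: str = "/data", timeout: float = 5.0) -> List[Dict[str, Any]]:
    """GET base_url + path; requires the mock server to be running."""
    with urlopen(f"{base_url}{path}", timeout=timeout) as response:
        return extract_records(response.read())


print(mock_base_url(inside_docker=False))  # http://localhost:8113
```

With the container running, `fetch_records(mock_base_url(inside_docker=False))` should return the two records from the expectation file.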

#### Publishing a new version to PyPI

Python CDK has a
[GitHub workflow](https://github.com/airbytehq/airbyte/actions/workflows/publish-cdk-command-manually.yml)
that manages the CDK changelog, makes a new release of `airbyte_cdk`, publishes it to PyPI, and
then makes a commit to update (and subsequently auto-release)
[`source-declarative-manifest`](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-declarative-manifest)
and Connector Builder (in the platform repository).

> [!Note]: The workflow will handle the `CHANGELOG.md` entry for you. You should not add changelog
> lines in your PRs to the CDK itself.

> [!Warning]: The workflow bumps the version on its own; please don't change the CDK version in
> `pyproject.toml` manually.

1. Trigger the release workflow only after all the PRs that you want to include are merged into
   the `master` branch.
2. Run the
   [`Publish CDK Manually`](https://github.com/airbytehq/airbyte/actions/workflows/publish-cdk-command-manually.yml)
   workflow from `master` using `release-type=major|minor|patch` and set the changelog message.
3. When the workflow runs, it will commit a new version directly to the `master` branch.
4. The workflow will bump the version of `source-declarative-manifest` according to the
   `release-type` of the CDK, then commit these changes back to master. The commit to master will
   kick off a publish of the new version of `source-declarative-manifest`.
5. The workflow will also open a pull request against the `airbyte-platform-internal` repo to bump the
   dependency in Connector Builder.
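
To illustrate what each `release-type` means for the version number, here is a sketch of a conventional semantic-version bump. The workflow's actual implementation is not shown in this document, so treat `bump_version` as a hypothetical stand-in rather than the code the workflow runs.

```python
def bump_version(version: str, release_type: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for the given release type."""
    major, minor, patch = (int(part) for part in version.split("."))
    if release_type == "major":
        return f"{major + 1}.0.0"
    if release_type == "minor":
        return f"{major}.{minor + 1}.0"
    if release_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown release type: {release_type!r}")


print(bump_version("0.83.1", "patch"))  # 0.83.2
print(bump_version("0.83.1", "minor"))  # 0.84.0
```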

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/airbytehq/airbyte",
    "name": "airbyte-cdk",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": null,
    "keywords": "airbyte, connector-development-kit, cdk",
    "author": "Airbyte",
    "author_email": "contact@airbyte.io",
    "download_url": "https://files.pythonhosted.org/packages/63/54/610677ca3825243b0b5f0f4c1bacc42903f24709f8fd77cbab9af0b250ac/airbyte_cdk-0.83.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A framework for writing Airbyte Connectors.",
    "version": "0.83.1",
    "project_urls": {
        "Documentation": "https://docs.airbyte.io/",
        "Homepage": "https://github.com/airbytehq/airbyte",
        "Repository": "https://github.com/airbytehq/airbyte"
    },
    "split_keywords": [
        "airbyte",
        " connector-development-kit",
        " cdk"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4bdf5095fb1ddbafc067ed4724d3054c036e9665d908ad9eab062573a7be51e7",
                "md5": "d00fb5c4f1fe815e01406e5ca9cc4b03",
                "sha256": "c1e1b5b24ce145575b5605179ff8e4c9fc8ae34e30f35a466846ffbba54b858a"
            },
            "downloads": -1,
            "filename": "airbyte_cdk-0.83.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d00fb5c4f1fe815e01406e5ca9cc4b03",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 409406,
            "upload_time": "2024-04-19T20:38:25",
            "upload_time_iso_8601": "2024-04-19T20:38:25.105216Z",
            "url": "https://files.pythonhosted.org/packages/4b/df/5095fb1ddbafc067ed4724d3054c036e9665d908ad9eab062573a7be51e7/airbyte_cdk-0.83.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6354610677ca3825243b0b5f0f4c1bacc42903f24709f8fd77cbab9af0b250ac",
                "md5": "6bf5c183590614ef75b6eddea8c67198",
                "sha256": "73342874ebb99791afa5da1e6b5ff9decd226644a2fd6cbffa5934819c2de0c5"
            },
            "downloads": -1,
            "filename": "airbyte_cdk-0.83.1.tar.gz",
            "has_sig": false,
            "md5_digest": "6bf5c183590614ef75b6eddea8c67198",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 278160,
            "upload_time": "2024-04-19T20:38:30",
            "upload_time_iso_8601": "2024-04-19T20:38:30.255641Z",
            "url": "https://files.pythonhosted.org/packages/63/54/610677ca3825243b0b5f0f4c1bacc42903f24709f8fd77cbab9af0b250ac/airbyte_cdk-0.83.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-19 20:38:30",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "airbytehq",
    "github_project": "airbyte",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "airbyte-cdk"
}
        