toucan-connectors


- Name: toucan-connectors
- Version: 7.6.0
- Summary: Toucan Toco Connectors
- Upload time: 2025-01-16 15:12:04
- Author: Toucan Toco (dev@toucantoco.com)
- Requires Python: `<3.14,>=3.11`
- License: BSD

[![Pypi-v](https://img.shields.io/pypi/v/toucan-connectors.svg)](https://pypi.python.org/pypi/toucan-connectors)
[![Pypi-pyversions](https://img.shields.io/pypi/pyversions/toucan-connectors.svg)](https://pypi.python.org/pypi/toucan-connectors)
[![Pypi-l](https://img.shields.io/pypi/l/toucan-connectors.svg)](https://pypi.python.org/pypi/toucan-connectors)
[![Pypi-wheel](https://img.shields.io/pypi/wheel/toucan-connectors.svg)](https://pypi.python.org/pypi/toucan-connectors)
[![GitHub Actions](https://github.com/ToucanToco/toucan-connectors/workflows/CI/badge.svg)](https://github.com/ToucanToco/toucan-connectors/actions?query=workflow%3ACI)
[![Coverage](https://sonarcloud.io/api/project_badges/measure?project=ToucanToco_toucan-connectors&metric=coverage)](https://sonarcloud.io/dashboard?id=ToucanToco_toucan-connectors)

# Toucan Connectors
[Toucan Toco](https://toucantoco.com/fr/) data connectors are plugins to the Toucan Toco platform. Their role is to return [Pandas DataFrames](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) from many different sources.

![Components Diagram](doc/ComponentsDiagram.jpeg)

Each connector is dedicated to a single type of source (PostgreSQL, Mongo, Salesforce, etc.) and is made of two classes:

- `Connector`, which contains all the information needed to *use a data provider* (e.g. hostname, auth method and details).
- `DataSource`, which contains all the information needed to *get a dataframe* (query, path, etc.) using the `Connector` class above.

The Toucan Toco platform instantiates these classes using values provided by Toucan admins and app designers. It then uses the following methods to get data and metadata:

- `Connector._retrieve_data` returns an instance of `pandas.DataFrame`. It is used to return data to a Toucan Toco end user.
- `Connector.get_slice` returns an instance of `DataSlice`. It is used to return data to a Toucan Toco application designer who is building a query.
- `Connector.get_status` returns an instance of `ConnectorStatus`. It is used to inform an admin or a Toucan Toco application designer of the status of the connection to a third-party data service: is it reachable from our servers, and are the authentication details and method working?
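
The split between the two classes and the three methods can be sketched without the library. What follows is a minimal stand-in, not the actual `toucan_connectors` API: the names `ToyConnector`, `ToyDataSource` and `TABLES` are hypothetical, and a list of dicts replaces `pandas.DataFrame`, `DataSlice` and `ConnectorStatus` so that the example stays self-contained.

```python
from dataclasses import dataclass

# Hypothetical in-memory "data provider", standing in for a real database.
TABLES = {"sales": [{"city": "Paris", "amount": 10}, {"city": "Lyon", "amount": 7}]}


@dataclass
class ToyDataSource:
    """What to fetch: here, just a table name (stand-in for a query or path)."""
    table: str


@dataclass
class ToyConnector:
    """How to reach the provider: here, just a host name."""
    host: str

    def _retrieve_data(self, data_source: ToyDataSource) -> list[dict]:
        # A real connector would return a pandas.DataFrame.
        return list(TABLES[data_source.table])

    def get_slice(self, data_source: ToyDataSource, limit: int) -> list[dict]:
        # A real connector would return a DataSlice; here we only truncate the rows.
        return self._retrieve_data(data_source)[:limit]

    def get_status(self) -> bool:
        # A real connector would return a ConnectorStatus; here a plain boolean.
        return bool(self.host)


connector = ToyConnector(host="localhost")
source = ToyDataSource(table="sales")
rows = connector._retrieve_data(source)
preview = connector.get_slice(source, limit=1)
```

The connector holds only connection details, while the data source holds only the description of what to fetch, so one connector instance can serve many data sources.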

## Installing for development

We use `poetry` for packaging and development. Use the following command to install the project for development:

```
poetry install -E all
```

## Dependencies

This project uses `make` and Python 3.11 or later (the package metadata requires `>=3.11,<3.14`). Install the main dependencies:

```bash
pip install -e .
```

We are using the `setuptools` construct `extras_require` to define each connector's dependencies separately. For example, to install the MySQL connector dependencies:

```bash
pip install -e ".[mysql]"
```
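
After installing an extra, it can be useful to check that the modules a connector relies on are actually importable. The helper below is a small sketch using only the standard library. The function name `missing_modules` and the module names passed to it are illustrative assumptions; which modules a given extra provides is not stated here.

```python
from importlib.util import find_spec


def missing_modules(module_names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be imported here."""
    return [name for name in module_names if find_spec(name) is None]


# "json" ships with Python; the second name is a deliberately fake module.
print(missing_modules(["json", "surely_not_an_installed_module"]))
```

An empty list means every listed module was found in the current environment.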

There is a shortcut called `all` to install all the dependencies for all the connectors. We do not recommend it for contributors to this package, but if you do use it, see the System packages section below to install the necessary system packages.

```bash
pip install -e ".[all]"
```

You may face issues when installing the repo locally because of its dependencies.
That's why a dev container is available for use with Visual Studio Code.
Refer to [this doc](https://code.visualstudio.com/docs/remote/containers) to use it.


### System packages

Some connectors' dependencies require specific system packages. As each connector defines its dependencies separately, you only need these packages when you want to use the connectors concerned.

#### ODBC

On Linux, installing `pyodbc` from the requirements needs the `unixodbc` development bindings. To install them with apt, run:

```bash
sudo apt-get update
sudo apt-get install unixodbc-dev
```
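
Once the system package and `pyodbc` are installed, a quick check from Python can confirm that the bindings load. This is a sketch rather than part of the project: the helper name `list_odbc_drivers` is hypothetical, and it returns an empty list when `pyodbc` (or the underlying unixODBC library) cannot be imported.

```python
def list_odbc_drivers() -> list[str]:
    """Return the ODBC drivers pyodbc can see, or [] if pyodbc cannot be imported."""
    try:
        import pyodbc  # raises ImportError if pyodbc or the unixODBC library is missing
    except ImportError:
        return []
    return list(pyodbc.drivers())


print(list_odbc_drivers())
```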

#### MSSQL

To test and use `mssql` (and `azure_mssql`), you need to install the Microsoft ODBC driver for SQL Server for
[Linux](https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver15)
or [macOS](https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/install-microsoft-odbc-driver-sql-server-macos?view=sql-server-ver15).

#### PostgreSQL

On macOS, to test the `postgres` connector, you need to install `postgresql` by running for instance `brew install postgres`.
You can then install the library with `env LDFLAGS='-L/usr/local/lib -L/usr/local/opt/openssl/lib -L/usr/local/opt/readline/lib' pip install psycopg2`

#### Other

You can find all connector-specific documentation [here](doc/connectors).

## Testing

We are using `pytest` and various packages of its ecosystem.
To install the testing dependencies, run:

```bash
pip install -r requirements-testing.txt
```

As each connector is an independent plugin, its tests are written independently from the rest of the codebase.
Run the tests for a specific connector (`http_api` in this example) like this:

```bash
pytest tests/http_api
```

Note: running the tests above requires the specific dependencies of the `http_api` connector to be installed (using the `pip install -e ".[http_api]"` command).

Our CI runs all the tests for all the connectors, like this:

```
pip install -e ".[all]"
make test
```

Some connectors are tested using mocks (cf. `trello`); others are tested by making calls to data providers (cf. `elasticsearch`) running on the system in Docker containers. The required images are listed in the `tests/docker-compose.yml` file and need to be pulled (cf. `pytest --pull`) to run the relevant tests.

## Contributing

This is an open source repository under the [BSD 3-Clause Licence](https://github.com/ToucanToco/toucan-connectors/blob/master/LICENSE). The Toucan Toco tech team maintains this repository, and we welcome contributions.

At the moment, the main use of this code is its integration into Toucan Toco's commercially licensed software. As a result, our development and maintenance efforts here are mostly driven by Toucan Toco's internal priorities.

The starting point of a contribution should be an [Issue](https://github.com/ToucanToco/toucan-connectors/issues), either one you create or an existing one. This allows us (maintainers) to discuss the contribution before it is produced and avoids back and forth in reviews or stalled pull requests.

### Step 1: Generate base classes and test files

To generate the connector and test modules from boilerplate, run:

```
make new_connector type=mytype
```

`mytype` should be the name of a system we would like to build a connector for,
such as `MySQL` or `Magento`.

Open the folder in `tests` for the new connector. You can start writing your tests before implementing it.

Some connectors are tested with calls to the actual data systems that they target,
for example `elasticsearch`, `mongo`, `mssql`.

Others are tested with mocks of the
classes or functions returning data that you are wrapping (see `HttpAPI` or
`microstrategy`).
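
The mock-based approach can be sketched with the standard library's `unittest.mock`. The class `FakeApiConnector` and its `_fetch` method are hypothetical and exist only for this illustration; a real connector test would patch whichever client call the connector wraps.

```python
from unittest import mock


class FakeApiConnector:
    """Hypothetical connector that wraps a client call returning rows."""

    def _fetch(self, endpoint: str) -> list[dict]:
        raise RuntimeError("would call the real data provider")

    def _retrieve_data(self, endpoint: str) -> list[dict]:
        # Keep only the rows flagged as active.
        return [row for row in self._fetch(endpoint) if row.get("active")]


def test_retrieve_data_filters_inactive_rows():
    fake_rows = [{"id": 1, "active": True}, {"id": 2, "active": False}]
    # Replace the network-facing method so no real service is contacted.
    with mock.patch.object(FakeApiConnector, "_fetch", return_value=fake_rows) as fetch:
        result = FakeApiConnector()._retrieve_data("/users")
    fetch.assert_called_once_with("/users")
    assert result == [{"id": 1, "active": True}]
```

Patching the method that talks to the provider keeps the test fast and lets it run without Docker, at the cost of not exercising the real service.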

If you have a container for your target system, add a Docker image to
the `docker-compose.yml`, then use the `pytest` fixture `service_container` to automatically
start the container and shut it down when you are running tests.

The fixture does not pull the image on each test run. You need to pull the image on your machine (at least once) using the `pytest --pull` option.

### Step 2: New connector

Open the folder `mytype` in `toucan_connectors` for your new connector and create your classes.

```python
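import pandas as pd

# Careful: import ToucanConnector from the deep path, not from the package __init__.
# DataSlice and ConnectorStatus are assumed to be exported by the same module.
from toucan_connectors.toucan_connector import (
    ConnectorStatus,
    DataSlice,
    ToucanConnector,
    ToucanDataSource,
)


class MyTypeDataSource(ToucanDataSource):
    """Model of my datasource"""
    query: str


class MyTypeConnector(ToucanConnector, data_source_model=MyTypeDataSource):
    """Model of my connector"""
    host: str
    port: int
    database: str

    def _retrieve_data(self, data_source: MyTypeDataSource) -> pd.DataFrame:
        """Return the data described by `data_source` as a DataFrame."""
        ...

    def get_slice(self, *args, **kwargs) -> DataSlice:
        """Return a slice of the data; the exact parameters are elided here."""
        ...

    def get_status(self) -> ConnectorStatus:
        """Report whether the data provider is reachable and the credentials work."""
        ...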
```

### Step 3: Register your connector, add documentation

Add your connector to `toucan_connectors/__init__.py`.
The key is what we call the `type` of the connector, which
is an id used to retrieve it in the Toucan Toco platform.

```python
CONNECTORS_CATALOGUE = {
  ...,
  'MyType': 'mytype.mytype_connector.MyTypeConnector',
  ...
}
```
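
The catalogue values are dotted paths to classes. One common way to turn such a path into a class is `importlib.import_module`. The sketch below illustrates that general technique and is not the library's actual loading code: `load_class` and `DEMO_CATALOGUE` are hypothetical names, and the demo entry points at a standard-library class so that the example runs anywhere.

```python
from importlib import import_module


def load_class(dotted_path: str) -> type:
    """Import the module part of dotted_path and return the named attribute."""
    module_path, _, class_name = dotted_path.rpartition(".")
    return getattr(import_module(module_path), class_name)


# Demo catalogue pointing at a standard-library class.
DEMO_CATALOGUE = {"Ordered": "collections.OrderedDict"}
cls = load_class(DEMO_CATALOGUE["Ordered"])
```

Storing paths as strings rather than importing every class up front means a connector's optional dependencies are only needed when that connector is actually requested.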

Add your connector requirements to the `setup.py` in the `extras_require` dictionary:

```python
extras_require = {
    ...
    'mytype': ['my_dependency_pkg1==x.x.x', 'my_dependency_pkg2>=x.x.x']
}
```

If you need to add testing dependencies, add them to the `requirements-testing.txt` file.

You can now generate and edit the documentation page for your connector:

```shell
# Example: PYTHONPATH=. python doc/generate.py github > doc/connectors/github.md
PYTHONPATH=. python doc/generate.py myconnectormodule > doc/connectors/mytypeconnector.md
```

### Step 4: Create a pull request

Make sure your new code is properly formatted by running `make lint`. If it's not, please use `make format`. You can now create a pull request.

## Publishing a release

1. Create a pull request updating **only** the [changelog](CHANGELOG.md) and the `version`
   attribute of the `[tool.poetry]` section in the `pyproject.toml` file.

2. Once the pull request is approved, merge it using the **squash and merge** strategy.

3. Create an annotated tag for the release commit. It should be in the `vX.Y.Z` format, where `X.Y.Z` is the semver
   version defined in `pyproject.toml`. Example:
   ```
   git tag -a v1.23.45 -m v1.23.45 ea3768a
   git push origin v1.23.45
   ```

4. On the project's *Releases* page, click on the *Draft a new release* button. Pick the tag you just pushed,
   and click on *Generate release notes*. Adapt the release notes if needed, and click on *Publish release*.

5. A GitHub action in charge of publishing the required artifacts to PyPI should now be running. Make sure
   the action is successful.

            
