meltanolabs-tap-github


- Name: meltanolabs-tap-github
- Version: 1.20.0
- Summary: Singer tap for GitHub, built with the Singer SDK.
- Upload time: 2025-07-08 17:38:36
- Maintainer: Meltano and Meltano Community
- Author: Meltano and Meltano Community
- Requires Python: >=3.9
- License: Apache-2.0
- Keywords: meltano, singer, meltano sdk, singer sdk, elt, github

# tap-github

`tap-github` is a Singer tap for GitHub.

Built with the [Singer SDK](https://gitlab.com/meltano/singer-sdk).

## Installation

```bash
# use uv (https://docs.astral.sh/uv/)
uv tool install meltanolabs-tap-github

# or pipx (https://pipx.pypa.io/stable/)
pipx install meltanolabs-tap-github

# or Meltano
meltano add extractor tap-github
```

A list of release versions is available at https://github.com/MeltanoLabs/tap-github/releases

## Configuration

### Accepted Config Options

This tap accepts the following configuration options:

- Required: One and only one of the following modes:
  1. `repositories`: An array of strings specifying the GitHub repositories to be included. Each element of the array should be of the form `<org>/<repository>`, e.g. `MeltanoLabs/tap-github`.
  2. `organizations`: An array of strings containing the GitHub organizations to be included.
  3. `searches`: An array of search descriptor objects with the following properties:
     - `name`: A human-readable name for the search query.
     - `query`: A GitHub search string (generally the same as would come after `?q=` in the URL).
  4. `user_usernames`: A list of GitHub usernames.
  5. `user_ids`: A list of GitHub user IDs (integers).
- Highly recommended:
  - Personal access tokens (PATs) for authentication can be provided in 3 ways:
    - `auth_token` - Takes a single token.
    - `additional_auth_tokens` - Takes a list of tokens. Can be used together with `auth_token` or as the sole source of PATs.
    - Any environment variables beginning with `GITHUB_TOKEN` will be assumed to be PATs. These tokens will be used in addition to `auth_token` (if provided), but will not be used if `additional_auth_tokens` is provided.
  - GitHub App keys are another option for authentication, and can be used in combination with PATs if desired. App IDs and keys should be assembled into the format `:app_id:;;-----BEGIN RSA PRIVATE KEY-----\n_YOUR_P_KEY_\n-----END RSA PRIVATE KEY-----` where the key can be generated from the `Private keys` section on https://github.com/organizations/:organization_name/settings/apps/:app_name.  Read more about GitHub App quotas [here](https://docs.github.com/en/enterprise-server@3.3/developers/apps/building-github-apps/rate-limits-for-github-apps#server-to-server-requests). Formatted app keys can be provided in 2 ways:
    - `auth_app_keys` - List of GitHub App keys in the prescribed format.
    - If `auth_app_keys` is not provided but there is an environment variable with the name `GITHUB_APP_PRIVATE_KEY`, it will be assumed to be an App key in the prescribed format.
- Optional:
  - `user_agent`
  - `start_date`
  - `metrics_log_level`
  - `stream_maps`
  - `stream_maps_config`
  - `stream_options`: Options which can change the behaviour of a specific stream are nested within.
    - `milestones`: Valid options for the `milestones` stream are nested within.
      - `state`: Determines which milestones will be extracted. One of `open` (default), `closed`, `all`.
  - `rate_limit_buffer`: A buffer to avoid consuming all query points for the auth_token at hand. Defaults to 1000.
  - `expiry_time_buffer`: A buffer used when determining when to refresh GitHub app tokens. Only relevant when authenticating as a GitHub app. Defaults to 10 minutes. Tokens generated by GitHub apps expire 1 hour after creation, and will be refreshed once fewer than `expiry_time_buffer` minutes remain until the anticipated expiry time.

Note that modes 1-3 are `repository` modes and modes 4-5 are `user` modes; the two groups do not run the same set of streams.
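
As a minimal sketch of the "one and only one mode" rule, the snippet below builds an example config and checks it before it is written to a JSON file. The helper `detect_mode` is hypothetical and not part of the tap, the token is a placeholder, and the ISO 8601 form of `start_date` is an assumption.

```python
import json

# Mode keys taken from the list above, grouped as in the note on
# repository modes (1-3) and user modes (4-5).
REPOSITORY_MODES = ("repositories", "organizations", "searches")
USER_MODES = ("user_usernames", "user_ids")


def detect_mode(config: dict) -> str:
    """Return the single mode key set in config, or raise ValueError.

    Hypothetical helper for illustration; the tap performs its own validation.
    """
    present = [key for key in REPOSITORY_MODES + USER_MODES if config.get(key)]
    if len(present) != 1:
        raise ValueError(f"expected exactly one mode, found: {present or 'none'}")
    return present[0]


# Illustrative values only. The start_date format is an assumption.
config = {
    "repositories": ["MeltanoLabs/tap-github"],
    "auth_token": "<YOUR_PAT>",
    "start_date": "2024-01-01T00:00:00Z",
    "rate_limit_buffer": 1000,
    "stream_options": {"milestones": {"state": "all"}},
}

mode = detect_mode(config)
kind = "repository" if mode in REPOSITORY_MODES else "user"
print(f"{mode} ({kind} mode)")
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and passed to the tap with `--config`.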

A full list of supported settings and capabilities for this tap is available by running:

```bash
tap-github --about
```
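
Of the options listed above, `expiry_time_buffer` has the most intricate rule. The sketch below illustrates it under the stated one-hour token lifetime; the function `needs_refresh` is a hypothetical name and is not the tap's actual implementation.

```python
from datetime import datetime, timedelta, timezone

# Stated above: tokens generated by GitHub Apps expire 1 hour after creation.
TOKEN_LIFETIME = timedelta(hours=1)


def needs_refresh(created_at: datetime, now: datetime, expiry_time_buffer: int = 10) -> bool:
    """Return True once fewer than expiry_time_buffer minutes remain before expiry."""
    expires_at = created_at + TOKEN_LIFETIME
    return expires_at - now < timedelta(minutes=expiry_time_buffer)


created = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
# 15 minutes remain: above the default 10-minute buffer, so no refresh yet.
print(needs_refresh(created, created + timedelta(minutes=45)))
# 5 minutes remain: below the buffer, so the token would be refreshed.
print(needs_refresh(created, created + timedelta(minutes=55)))
```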

### Source Authentication and Authorization

A small number of records may be pulled without an auth token. However, a GitHub auth token should generally be considered "required", because authenticated requests are given far higher rate limits. (See the GitHub API docs for more info.)
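
The precedence rules for personal access tokens described under "Accepted Config Options" can be sketched as follows. This is an illustration of the documented behaviour only: `collect_tokens` is a hypothetical helper, and the order-preserving de-duplication is an assumption rather than something the tap is known to do.

```python
def collect_tokens(config: dict, environ: dict) -> list:
    """Assemble the pool of PATs from config and environment variables."""
    tokens = []
    if config.get("auth_token"):
        tokens.append(config["auth_token"])

    additional = config.get("additional_auth_tokens")
    if additional:
        # When additional_auth_tokens is provided, environment tokens are ignored.
        tokens.extend(additional)
    else:
        # Any variable whose name begins with GITHUB_TOKEN is treated as a PAT.
        tokens.extend(
            value
            for name, value in sorted(environ.items())
            if name.startswith("GITHUB_TOKEN") and value
        )

    # Assumption: drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(tokens))


# A fake environment keeps the example free of real secrets.
fake_env = {"GITHUB_TOKEN": "pat-env-1", "GITHUB_TOKEN_2": "pat-env-2", "HOME": "/root"}

print(collect_tokens({"auth_token": "pat-cfg"}, fake_env))
print(collect_tokens({"auth_token": "pat-cfg", "additional_auth_tokens": ["pat-extra"]}, fake_env))
```

In real use, `os.environ` would be passed in place of `fake_env`.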

## Usage

### API Limitation - Pagination

The GitHub API limits pagination for some resources, such as `/events`. For these resources, users might encounter the following error:

```
In order to keep the API fast for everyone, pagination is limited for this resource. Check the rel=last link relation in the Link response header to see how far back you can traverse.
```

To avoid this, the GitHub streams exit early, i.e. as soon as no `next` page is available. If you are fetching `/events` at the repository level, beware of leaving the tap disabled for longer than a few days, or you will have gaps in your data.
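
The early-exit idea can be illustrated by reading the `Link` response header and stopping when it carries no `rel="next"` entry. This is a minimal sketch, not the tap's own paginator: `next_page_url` is a hypothetical helper and the example URLs are made up.

```python
import re
from typing import Optional

# Matches entries of the form: <https://...>; rel="next"
LINK_PATTERN = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next", or None when pagination is exhausted."""
    if not link_header:
        return None
    for url, rel in LINK_PATTERN.findall(link_header):
        if rel == "next":
            return url
    return None


# Illustrative header values, not captured from a real response.
middle_page = (
    '<https://api.github.com/repositories/1/events?page=2>; rel="next", '
    '<https://api.github.com/repositories/1/events?page=3>; rel="last"'
)
final_page = '<https://api.github.com/repositories/1/events?page=1>; rel="first"'

print(next_page_url(middle_page))  # URL of page 2
print(next_page_url(final_page))   # None, so a stream would stop here
```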

You can easily run `tap-github` by itself or in a pipeline using [Meltano](https://www.meltano.com).

### Notes regarding permissions

* For the `traffic_*` streams, [you will need write access to the repository](https://docs.github.com/en/rest/metrics/traffic?apiVersion=2022-11-28). You can enable extraction for these streams by [selecting them in the catalog](https://hub.meltano.com/singer/spec/#metadata).
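
Selecting streams in the catalog means setting `selected` to `true` in each stream's top-level metadata entry (the one with an empty `breadcrumb`), following the Singer catalog format. The sketch below does this for every stream whose ID starts with `traffic_`; `select_streams` is a hypothetical helper and the stream IDs in the example catalog are illustrative.

```python
def select_streams(catalog: dict, prefix: str = "traffic_") -> dict:
    """Mark matching streams as selected. Modifies the catalog in place."""
    for stream in catalog.get("streams", []):
        if not stream.get("tap_stream_id", "").startswith(prefix):
            continue
        metadata = stream.setdefault("metadata", [])
        # The stream-level entry is the one with an empty breadcrumb.
        root = next((m for m in metadata if m.get("breadcrumb") == []), None)
        if root is None:
            root = {"breadcrumb": [], "metadata": {}}
            metadata.append(root)
        root.setdefault("metadata", {})["selected"] = True
    return catalog


# Illustrative catalog; real stream IDs come from `tap-github --discover`.
catalog = {
    "streams": [
        {"tap_stream_id": "traffic_clones", "metadata": []},
        {
            "tap_stream_id": "issues",
            "metadata": [{"breadcrumb": [], "metadata": {"selected": False}}],
        },
    ]
}

select_streams(catalog)
for stream in catalog["streams"]:
    print(stream["tap_stream_id"], stream["metadata"][0]["metadata"].get("selected"))
```

In practice the catalog would be loaded from the `catalog.json` produced by `--discover` and written back with `json.dump` after selection.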

### Executing the Tap Directly

```bash
tap-github --version
tap-github --help
tap-github --config CONFIG --discover > ./catalog.json
```

## Contributing
This project uses parent-child streams. Learn more about them [here](https://gitlab.com/meltano/sdk/-/blob/main/docs/parent_streams.md).

### Initialize your Development Environment

```bash
pipx install poetry
poetry install
```

### Create and Run Tests

Create tests within the `tap_github/tests` subfolder and
then run:

```bash
poetry run pytest
```

You can also test the `tap-github` CLI interface directly using `poetry run`:

```bash
poetry run tap-github --help
```

### Testing with [Meltano](https://meltano.com)

_**Note:** This tap will work in any Singer environment and does not require Meltano.
Examples here are for convenience and to streamline end-to-end orchestration scenarios._

Your project comes with a custom `meltano.yml` project file already created. Open the `meltano.yml` and follow any _"TODO"_ items listed in
the file.

Next, install Meltano (if you haven't already) and any needed plugins:

```bash
# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-github
meltano install
```

Now you can test and orchestrate using Meltano:

```bash
# Test invocation:
meltano invoke tap-github --version
# OR run a test `elt` pipeline:
meltano elt tap-github target-jsonl
```

One-liner to recreate the output directory, run `elt`, and write out the state file:

```bash
# Update this when you want a fresh state file:
TESTJOB=testjob1

# Run everything in one line
mkdir -p .output && meltano elt tap-github target-jsonl --job_id $TESTJOB && meltano elt tap-github target-jsonl --job_id $TESTJOB --dump=state > .output/state.json
```

### Singer SDK Dev Guide

See the [dev guide](../../docs/dev_guide.md) for more instructions on how to use the Singer SDK to
develop your own taps and targets.


            
