# tap-csv-folder
`tap-csv-folder` is a Singer tap for CSV files stored in a folder on a local or remote filesystem.
Built with the [Meltano Tap SDK](https://sdk.meltano.com) for Singer Taps.
## Installation
<!-- TODO
Install from PyPI:
```bash
pipx install tap-csv-folder
```
-->
Install from GitHub:
```bash
pipx install git+https://github.com/MeltanoLabs/tap-csv-folder.git@main
```
Install in a Meltano project:
```bash
meltano add tap-csv-folder
```
## Configuration
### Accepted Config Options
| Setting | Required | Default | Description |
| :------------------------- | :------- | :------ | :-------------------------------------------------------------------------------------------------------------------- |
| delimiter | False | , | Field delimiter character. |
| quotechar | False | " | Quote character. |
| escapechar | False | None | Escape character. |
| doublequote | False | true | Whether quotechar inside a field should be doubled. |
| lineterminator | False | | Line terminator character. |
| filesystem | False | local | The filesystem to use. |
| path | False | None | Path to the directory where the files are stored. |
| read_mode | False | None | Use `one_stream_per_file` to read each file as a separate stream, or `merge` to merge all files into a single stream. |
| stream_name | False | files | Name of the stream to use when `read_mode` is `merge`. |
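
As an illustration, the options above can be collected into a JSON config file and passed to the tap with `--config`. The sketch below is a minimal example rather than an official template: the folder `/data/csv` and the file name `config.json` are hypothetical, and only setting names listed in the table are used.

```python
import json
from pathlib import Path

# Setting names come from the table above; the values are illustrative.
config = {
    "filesystem": "local",
    "path": "/data/csv",      # hypothetical folder containing the CSV files
    "delimiter": ",",
    "quotechar": '"',
    "read_mode": "merge",     # or "one_stream_per_file"
    "stream_name": "files",   # only relevant when read_mode is "merge"
}

# Guard against typos in read_mode before handing the file to the tap.
allowed_read_modes = {"one_stream_per_file", "merge"}
if config["read_mode"] not in allowed_read_modes:
    raise ValueError(f"Unsupported read_mode: {config['read_mode']!r}")

config_path = Path("config.json")  # hypothetical file name
config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
```

The resulting file could then be supplied as `tap-csv-folder --config config.json --discover`.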
#### Filesystem settings
The following settings are provided by the Singer SDK for filesystems and supported by the tap.
| Setting | Required | Default | Description |
| :----------------------- | :------- | :------ | :---------------------------------------- |
| ftp | False | None | FTP connection settings |
| ftp.host | True | None | FTP server host |
| ftp.port | False | 21 | FTP server port |
| ftp.username | False | None | FTP username |
| ftp.password | False | None | FTP password |
| ftp.timeout | False | 60 | Timeout of the FTP connection in seconds |
| ftp.encoding | False | utf-8 | FTP server encoding |
| sftp | False | None | SFTP connection settings |
| sftp.host | True | None | SFTP server host |
| sftp.ssh_kwargs | False | None | SSH connection settings |
| sftp.ssh_kwargs.port | False | 22 | SFTP server port |
| sftp.ssh_kwargs.username | True | None | SFTP username |
| sftp.ssh_kwargs.password | False | None | SFTP password |
| sftp.ssh_kwargs.pkey | False | None | Private key |
| sftp.ssh_kwargs.timeout | False | 60 | Timeout of the SFTP connection in seconds |
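
The dotted setting names in this table presumably denote nested objects in the config file; that reading is an assumption here, not something the table states. Under that assumption, a small helper (the name `nest_settings` is hypothetical) can expand flat dotted keys into the nested structure. The host and username are placeholders, and `"sftp"` as a `filesystem` value is likewise assumed.

```python
from typing import Any


def nest_settings(flat: dict[str, Any]) -> dict[str, Any]:
    """Expand dotted keys such as "sftp.ssh_kwargs.port" into nested dicts."""
    nested: dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create the intermediate object on first use, then descend into it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


sftp_config = nest_settings(
    {
        "filesystem": "sftp",  # assumed value; the table only documents "local"
        "sftp.host": "sftp.example.com",      # placeholder host
        "sftp.ssh_kwargs.port": 22,
        "sftp.ssh_kwargs.username": "demo",   # placeholder username
        "sftp.ssh_kwargs.timeout": 60,
    }
)
```

Serializing `sftp_config` with `json.dumps` would then give a config file in which `sftp` contains `host` and an `ssh_kwargs` object.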
#### Built-in Singer SDK capabilities
The following settings are provided by the Singer SDK and automatically supported by the tap.
| Setting | Required | Default | Description |
| :-------------------------------- | :------- | :------ | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| stream_maps | False | None | Config object for stream maps capability. For more information check out [Stream Maps](https://sdk.meltano.com/en/latest/stream_maps.html). |
| stream_map_config | False | None | User-defined config values to be used within map expressions. |
| faker_config | False | None | Config for the [`Faker`](https://faker.readthedocs.io/en/master/) instance variable `fake` used within map expressions. Only applicable if the plugin specifies `faker` as an additional dependency (through the `singer-sdk` `faker` extra or directly). |
| faker_config.seed | False | None | Value to seed the Faker generator for deterministic output: https://faker.readthedocs.io/en/master/#seeding-the-generator |
| faker_config.locale | False | None | One or more LCID locale strings to produce localized output for: https://faker.readthedocs.io/en/master/#localization |
| flattening_enabled | False | None | 'True' to enable schema flattening and automatically expand nested properties. |
| flattening_max_depth | False | None | The max depth to flatten schemas. |
| batch_config | False | None | |
| batch_config.encoding | False | None | Specifies the format and compression of the batch files. |
| batch_config.encoding.format | False | None | Format to use for batch files. |
| batch_config.encoding.compression | False | None | Compression format to use for batch files. |
| batch_config.storage | False | None | Defines the storage layer to use when writing batch files |
| batch_config.storage.root | False | None | Root path to use when writing batch files. |
| batch_config.storage.prefix | False | None | Prefix to use when writing batch files. |
A full list of supported settings and capabilities for this tap is available by running:
```bash
tap-csv-folder --about
```
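
For the `batch_config` settings in the table above, the following sketch shows one plausible shape of the nested object. The values (`jsonl`, `gzip`, the `file://` root, and the prefix) are assumptions modeled on the Singer SDK's batch documentation, not values confirmed for this tap.

```python
import json

# Nested layout follows the dotted names in the table; all values are assumed.
batch_settings = {
    "batch_config": {
        "encoding": {
            "format": "jsonl",       # assumed batch file format
            "compression": "gzip",   # assumed compression format
        },
        "storage": {
            "root": "file:///tmp/batches",  # hypothetical root path
            "prefix": "csv-folder-",        # hypothetical file prefix
        },
    }
}

batch_json = json.dumps(batch_settings, indent=2)
```

The `batch_config` key would sit alongside the other top-level settings in the same config file.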
### Configure using environment variables
When `--config=ENV` is provided, this Singer tap automatically imports any environment variables defined in the working directory's `.env` file. A config value is then picked up whenever a matching environment variable is set, either in the terminal context or in the `.env` file.
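
The Singer SDK conventionally derives variable names from the plugin name: upper-cased, with dashes replaced by underscores, followed by the upper-cased setting name. For this tap that would give names such as `TAP_CSV_FOLDER_PATH`. The sketch below imitates that lookup for illustration only; it is not the SDK's code, the naming rule is stated here as an assumption, and the function name `config_from_env` is hypothetical.

```python
import os
from typing import Mapping, Optional


def config_from_env(
    plugin_name: str,
    setting_names: list[str],
    environ: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Collect settings from variables named <PLUGIN>_<SETTING> (assumed convention)."""
    env = os.environ if environ is None else environ
    prefix = plugin_name.upper().replace("-", "_") + "_"
    found: dict[str, str] = {}
    for name in setting_names:
        env_name = prefix + name.upper().replace("-", "_")
        if env_name in env:
            found[name] = env[env_name]
    return found


# Example environment; the values are placeholders.
example_env = {
    "TAP_CSV_FOLDER_PATH": "/data/csv",
    "TAP_CSV_FOLDER_READ_MODE": "merge",
}
settings = config_from_env(
    "tap-csv-folder", ["path", "read_mode", "delimiter"], example_env
)
```

Settings without a matching variable, such as `delimiter` in this example, are simply left out, so their defaults would apply.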
## Usage
You can easily run `tap-csv-folder` by itself or in a pipeline using [Meltano](https://meltano.com/).
### Executing the Tap Directly
```bash
tap-csv-folder --version
tap-csv-folder --help
tap-csv-folder --config CONFIG --discover > ./catalog.json
```
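
When run without `--discover`, a Singer tap writes one JSON message per line to standard output, each carrying a `type` such as `SCHEMA`, `RECORD`, or `STATE`. The sketch below counts records per stream from such lines. The sample messages are invented for illustration and are not actual output of this tap, and the function name `count_records` is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def count_records(lines: Iterable[str]) -> Counter:
    """Count Singer RECORD messages per stream, skipping blank lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("type") == "RECORD":
            counts[message["stream"]] += 1
    return counts


# Invented sample messages in the Singer format.
sample_output = [
    '{"type": "SCHEMA", "stream": "files", "schema": {"properties": {}}, "key_properties": []}',
    '{"type": "RECORD", "stream": "files", "record": {"id": "1"}}',
    '{"type": "RECORD", "stream": "files", "record": {"id": "2"}}',
    '{"type": "STATE", "value": {}}',
]
record_counts = count_records(sample_output)
```

The same function could be fed `sys.stdin` instead of the sample list when the tap's output is piped into a script.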
## Developer Resources
Follow these instructions to contribute to this project.
### Initialize your Development Environment
```bash
pipx install poetry
poetry install
```
### Create and Run Tests
Create tests within the `tests` subfolder and then run:
```bash
poetry run pytest
```
You can also test the `tap-csv-folder` CLI directly using `poetry run`:
```bash
poetry run tap-csv-folder --help
```
### Testing with [Meltano](https://www.meltano.com)
_**Note:** This tap will work in any Singer environment and does not require Meltano.
Examples here are for convenience and to streamline end-to-end orchestration scenarios._
Next, install Meltano (if you haven't already) and any needed plugins:
```bash
# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-csv-folder
meltano install
```
Now you can test and orchestrate using Meltano:
```bash
# Test invocation:
meltano invoke tap-csv-folder --version
# OR run a test `elt` pipeline:
meltano run tap-csv-folder target-jsonl
```
### SDK Dev Guide
See the [dev guide](https://sdk.meltano.com/en/latest/dev_guide.html) for more instructions on how to use the SDK to develop your own taps and targets.
Raw data
{
"_id": null,
"home_page": null,
"name": "tap-csv-folder",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "ELT, CSVFolder",
"author": "Edgar Ram\u00edrez-Mondrag\u00f3n",
"author_email": "edgar@arch.dev",
"download_url": "https://files.pythonhosted.org/packages/2c/b0/bee0c73eb8470b8e00feaa4aab2e5b97806467d1b7f35c4606301bf1e56f/tap_csv_folder-0.0.1a1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Singer tap for CSVFolder, built with the Meltano Singer SDK.",
"version": "0.0.1a1",
"project_urls": null,
"split_keywords": [
"elt",
" csvfolder"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "69c1221f122654590484f7278e37a13e177f480c7737bd3efc6a708aa16a55ad",
"md5": "af458c55160adb88f4d425c2b978d84d",
"sha256": "b4ba507f487bbd1d5cdb70f95a153f719ffffe8c6a12345528e80aea2a97b6d8"
},
"downloads": -1,
"filename": "tap_csv_folder-0.0.1a1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "af458c55160adb88f4d425c2b978d84d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 9882,
"upload_time": "2024-10-09T17:05:37",
"upload_time_iso_8601": "2024-10-09T17:05:37.565166Z",
"url": "https://files.pythonhosted.org/packages/69/c1/221f122654590484f7278e37a13e177f480c7737bd3efc6a708aa16a55ad/tap_csv_folder-0.0.1a1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2cb0bee0c73eb8470b8e00feaa4aab2e5b97806467d1b7f35c4606301bf1e56f",
"md5": "675a686e052e2df9b72de30d5975cb38",
"sha256": "e31970c150916ffc9fe712c2829269bc3d45e5072f8f459245a38d77f5da1c19"
},
"downloads": -1,
"filename": "tap_csv_folder-0.0.1a1.tar.gz",
"has_sig": false,
"md5_digest": "675a686e052e2df9b72de30d5975cb38",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 8682,
"upload_time": "2024-10-09T17:05:38",
"upload_time_iso_8601": "2024-10-09T17:05:38.458285Z",
"url": "https://files.pythonhosted.org/packages/2c/b0/bee0c73eb8470b8e00feaa4aab2e5b97806467d1b7f35c4606301bf1e56f/tap_csv_folder-0.0.1a1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-09 17:05:38",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "tap-csv-folder"
}