# EsXport
[![codecov](https://codecov.io/gh/nikhilbadyal/esxport/graph/badge.svg?token=zaoNlW2YXq)](https://codecov.io/gh/nikhilbadyal/esxport)
A Python-based CLI utility and module for querying Elasticsearch and exporting the results as a CSV file.

Requirements
------------
1. This tool is intended for use with Elasticsearch 8.x.
2. Python 3.8 or later is required.

Installation
------------
------------
From PyPI:
```bash
pip install esxport
```
For development:
```bash
pip install "esxport[dev]"
```
Usage
-----
### CLI Usage
Run `esxport --help` for detailed information on available options:

OPTIONS
---------
```text
Usage: esxport [OPTIONS]

Options:
  -q, --query JSON                Query string in Query DSL syntax.  [required]
  -o, --output-file PATH          CSV file location.  [required]
  -i, --index-prefixes TEXT       Index name prefix(es).  [required]
  -u, --url URL                   Elasticsearch host URL.  [default: https://localhost:9200]
  -U, --user TEXT                 Elasticsearch basic authentication user.  [default: elastic]
  -p, --password TEXT             Elasticsearch basic authentication password.  [required]
  -f, --fields TEXT               List of _source fields to present in the output.  [default: _all]
  -S, --sort ELASTIC SORT         List of fields to sort in the format `<field>:<direction>`.
  -d, --delimiter TEXT            Delimiter to use in the CSV file.  [default: ,]
  -m, --max-results INTEGER       Maximum number of results to return.  [default: 10]
  -s, --scroll-size INTEGER       Scroll size for each batch of results.  [default: 100]
  -e, --meta-fields [_id|_index|_score]
                                  Add meta-fields to the output.
  --verify-certs                  Verify SSL certificates.
  --ca-certs PATH                 Location of CA bundle.
  --client-cert PATH              Location of Client Auth cert.
  --client-key PATH               Location of Client Cert Key.
  -v, --version                   Show version and exit.
  --debug                         Enable debug mode.
  --help                          Show this message and exit.
```
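Because `--query` takes a JSON string, quoting it by hand in a shell is error-prone. The following is a minimal sketch of assembling the command line from Python instead. It uses only flags listed in the help text above; the helper name `build_esxport_command` is hypothetical, and it assumes the CLI accepts the same query body shape that the module examples use.

```python
import json
import shlex
from typing import Any, Dict, List


def build_esxport_command(
    query: Dict[str, Any],
    output_file: str,
    index_prefix: str,
    password: str,
    max_results: int = 10,
) -> List[str]:
    """Hypothetical helper: build an argv list for the esxport CLI."""
    return [
        "esxport",
        "--query", json.dumps(query),  # serialize the Query DSL dict to JSON
        "--output-file", output_file,
        "--index-prefixes", index_prefix,
        "--password", password,
        "--max-results", str(max_results),
    ]


command = build_esxport_command(
    {"query": {"match_all": {}}},
    "output.csv",
    "my-index-prefix",
    "password",
    max_results=1000,
)

# Print a shell-safe version of the command; to run it directly, pass the
# list to subprocess.run(command, check=True).
print(shlex.join(command))
```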
Module Usage
---------
In addition to the CLI, EsXport can now be used as a Python module. Below is an example of how to integrate it into
your Python application:
```python
from esxport import CliOptions, EsXport

kwargs = {
    "query": {
        "query": {"match_all": {}},
        "size": 1000
    },
    "output_file": "output.csv",
    "index_prefixes": ["my-index-prefix"],
    "url": "https://localhost:9200",
    "user": "elastic",
    "password": "password",
    "verify_certs": False,
    "debug": True,
    "max_results": 1000,
    "scroll_size": 100,
    "sort": ["field_name:asc"],
    "ca_certs": "path/to/ca.crt"
}

# Create CLI options and initialize EsXport
cli_options = CliOptions(kwargs)
es = EsXport(cli_options)
# Export data
es.export()
```
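The example above hardcodes the password for brevity. As a sketch of one alternative, the connection settings could be read from environment variables and merged into the options dictionary before it is passed to `CliOptions`. The variable names `ESXPORT_URL`, `ESXPORT_USER`, and `ESXPORT_PASSWORD` and the helper `connection_options` are assumptions made for illustration; they are not something esxport itself reads.

```python
import os
from typing import Any, Dict, Mapping


def connection_options(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Hypothetical helper: read connection settings from the environment."""
    password = env.get("ESXPORT_PASSWORD")
    if not password:
        raise RuntimeError("ESXPORT_PASSWORD is not set")
    return {
        # Fall back to the documented defaults when a variable is absent.
        "url": env.get("ESXPORT_URL", "https://localhost:9200"),
        "user": env.get("ESXPORT_USER", "elastic"),
        "password": password,
    }


# Merge the result with the export-specific keys, for example:
# kwargs = {**connection_options(), "output_file": "output.csv"}
```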
Class Descriptions
------------------
### `CliOptions`
A configuration class to manage CLI arguments programmatically when using the module.
#### Attributes
| **Attribute** | **Type** | **Description** | **Default** |
|------------------|-------------|---------------------------------------------------------|-------------------------------|
| `query` | `dict` | Elasticsearch Query DSL syntax for filtering data. | N/A |
| `output_file` | `str` | Path to save the exported CSV file. | N/A |
| `url` | `str` | Elasticsearch host URL. | `"https://localhost:9200"` |
| `user` | `str` | Basic authentication username for Elasticsearch. | `"elastic"` |
| `password` | `str` | Basic authentication password for Elasticsearch. | N/A |
| `index_prefixes` | `list[str]` | List of index prefixes to query. | N/A |
| `fields` | `list[str]` | List of `_source` fields to include in the output. | `["_all"]` |
| `sort` | `list[str]` | Fields to sort the output by, in the format `field_name:asc` or `field_name:desc`. | N/A |
| `delimiter` | `str` | Delimiter for the CSV output. | `","` |
| `max_results` | `int` | Maximum number of results to fetch. | `10` |
| `scroll_size` | `int` | Batch size for scroll queries. | `100` |
| `meta_fields` | `list[str]` | Metadata fields to include in the output. | `["_id", "_index", "_score"]` |
| `verify_certs` | `bool` | Whether to verify SSL certificates. | `False` |
| `ca_certs` | `str` | Path to the CA certificate bundle. | N/A |
| `client_cert` | `str` | Path to the client certificate for authentication. | N/A |
| `client_key` | `str` | Path to the client key for authentication. | N/A |
| `debug` | `bool` | Enable debugging. | `False` |
---
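As an illustration of how the table above might be applied before constructing `CliOptions`, here is a minimal sketch of a pre-flight check. The helper `check_options` is hypothetical and not part of esxport. It assumes the required keys are the ones the CLI help marks as `[required]`, copies the defaults from the table, and enforces the documented `field_name:asc` / `field_name:desc` sort format, which may be stricter than what the library itself accepts.

```python
from typing import Any, Dict

# Defaults copied from the attribute table above.
DEFAULTS: Dict[str, Any] = {
    "url": "https://localhost:9200",
    "user": "elastic",
    "fields": ["_all"],
    "delimiter": ",",
    "max_results": 10,
    "scroll_size": 100,
    "verify_certs": False,
    "debug": False,
}

# Assumption: these mirror the options the CLI help marks as [required].
REQUIRED = ("query", "output_file", "index_prefixes", "password")


def check_options(options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical pre-flight check: validate keys and fill in defaults."""
    missing = [key for key in REQUIRED if key not in options]
    if missing:
        raise ValueError(f"missing required options: {', '.join(missing)}")
    for item in options.get("sort", []):
        # Split on the last colon so field names containing ':' still work.
        field, sep, direction = item.rpartition(":")
        if not sep or not field or direction not in ("asc", "desc"):
            raise ValueError(f"expected 'field:asc' or 'field:desc', got {item!r}")
    # Explicit options win over defaults.
    return {**DEFAULTS, **options}
```

The returned dictionary can then be passed to `CliOptions` in the same way as in the examples in this document.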
#### Example Initialization
```python
from esxport import CliOptions

cli_options = CliOptions({
    "query": {"query": {"match_all": {}}},
    "output_file": "data.csv",
    "url": "https://localhost:9200",
    "user": "elastic",
    "password": "password",
    "index_prefixes": ["my-index-prefix"],
    "fields": ["field1", "field2"],
    "sort": ["field1:asc"],
    "max_results": 1000,
    "scroll_size": 100
})
```
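The examples in this document all use `match_all`. Since `query` is a Query DSL dictionary, a narrower export might use a `bool` query with filters instead. The sketch below builds such a dictionary; the helper name `filtered_query` and the field names `level` and `@timestamp` are hypothetical and need to match the mapping of the indices being exported.

```python
from typing import Any, Dict


def filtered_query(level: str = "error", since: str = "now-1d") -> Dict[str, Any]:
    """Hypothetical helper: Query DSL body matching one level within a time window."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Exact match on a keyword field.
                    {"term": {"level": level}},
                    # Only documents at or after the given date-math expression.
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        }
    }


query = filtered_query("warn", "now-7d")
```

The resulting dictionary would take the place of the `"query"` value in the options shown above.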
### `EsXport`
The main class for executing the export operation.
#### Methods
| **Method** | **Description** |
|-----------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|
| `__init__(opts: CliOptions, es_client: ElasticsearchClient \| None = None)` | Initializes the `EsXport` object with options (`CliOptions`) and an optional Elasticsearch client. |
| `export()` | Executes the query and exports the results to the specified CSV file. |
---
#### Example Initialization and Usage
```python
from esxport import CliOptions, EsXport
# Define CLI options
cli_options = CliOptions({
    "query": {"query": {"match_all": {}}},
    "output_file": "output.csv",
    "url": "https://localhost:9200",
    "user": "elastic",
    "password": "password",
    "index_prefixes": ["my-index-prefix"]
})

# Initialize EsXport
esxport = EsXport(cli_options)
# Export data
esxport.export()
```