esxport


| **Field**       | **Value**                                                                           |
|-----------------|-------------------------------------------------------------------------------------|
| Name            | esxport                                                                             |
| Version         | 8.16.0                                                                              |
| Summary         | A Python CLI utility for querying Elasticsearch and exporting results as a CSV file. |
| Upload time     | 2024-12-07 13:59:54                                                                 |
| Author          | Nikhil Badyal                                                                       |
| Requires Python | >=3.8                                                                               |
| Keywords        | bulk, csv, elasticsearch, es, export                                                |
| Homepage        | https://github.com/nikhilbadyal/esxport                                             |
| Bug tracker     | https://github.com/nikhilbadyal/esxport/issues                                      |
| Requirements    | click, click-params, elasticsearch, loguru, tenacity, tqdm, typing-extensions       |

# EsXport
[![codecov](https://codecov.io/gh/nikhilbadyal/esxport/graph/badge.svg?token=zaoNlW2YXq)](https://codecov.io/gh/nikhilbadyal/esxport)

A Python-based CLI utility and module designed for querying Elasticsearch and exporting results as a CSV file.

Requirements
------------
1. Elasticsearch 8.x.
2. Python 3.8 or later.

Installation
------------

From PyPI:

```bash
pip install esxport
```

For development, install the `dev` extra:

```bash
pip install "esxport[dev]"
```
Usage
-----

### CLI Usage

Run `esxport --help` for detailed information on the available options.

#### Options
```text
Usage: esxport [OPTIONS]

Options:
  -q, --query JSON           Query string in Query DSL syntax. [required]
  -o, --output-file PATH     CSV file location. [required]
  -i, --index-prefixes TEXT  Index name prefix(es). [required]
  -u, --url URL              Elasticsearch host URL. [default: https://localhost:9200]
  -U, --user TEXT            Elasticsearch basic authentication user. [default: elastic]
  -p, --password TEXT        Elasticsearch basic authentication password. [required]
  -f, --fields TEXT          List of _source fields to present in the output. [default: _all]
  -S, --sort ELASTIC SORT    List of fields to sort in the format `<field>:<direction>`.
  -d, --delimiter TEXT       Delimiter to use in the CSV file. [default: ,]
  -m, --max-results INTEGER  Maximum number of results to return. [default: 10]
  -s, --scroll-size INTEGER  Scroll size for each batch of results. [default: 100]
  -e, --meta-fields [_id|_index|_score]
                             Add meta-fields to the output.
  --verify-certs             Verify SSL certificates.
  --ca-certs PATH            Location of CA bundle.
  --client-cert PATH         Location of Client Auth cert.
  --client-key PATH          Location of Client Cert Key.
  -v, --version              Show version and exit.
  --debug                    Enable debug mode.
  --help                     Show this message and exit.
```
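
As a minimal sketch of how the flags above combine, the following Python snippet assembles an `esxport` command line from the documented options and prints it in shell-quoted form. The helper name `build_esxport_command` is hypothetical, and the index prefix, output path, and the `ES_PASSWORD` environment variable are placeholders rather than values defined by the project.

```python
import json
import os
import shlex


def build_esxport_command(query, output_file, index_prefix, password, max_results=10):
    """Assemble an esxport argv list using only flags listed in `esxport --help`."""
    return [
        "esxport",
        "-q", json.dumps(query),   # Query DSL, passed as a JSON string
        "-o", output_file,         # CSV file location
        "-i", index_prefix,        # index name prefix
        "-p", password,            # basic-auth password (user defaults to "elastic")
        "-m", str(max_results),    # maximum number of results to return
    ]


command = build_esxport_command(
    query={"query": {"match_all": {}}},
    output_file="output.csv",                            # placeholder path
    index_prefix="my-index-prefix",                      # placeholder prefix
    password=os.environ.get("ES_PASSWORD", "password"),  # placeholder secret
    max_results=1000,
)

# Print a copy-pasteable, shell-quoted command line.
print(shlex.join(command))
```

Building the argument list in Python and quoting it with `shlex.join` avoids hand-escaping the JSON query for the shell.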


### Module Usage

In addition to the CLI, EsXport can now be used as a Python module. Below is an example of how to integrate it into
your Python application:

```python
from esxport import CliOptions, EsXport

kwargs = {
    "query": {
        "query": {"match_all": {}},
        "size": 1000
    },
    "output_file": "output.csv",
    "index_prefixes": ["my-index-prefix"],
    "url": "https://localhost:9200",
    "user": "elastic",
    "password": "password",
    "verify_certs": False,
    "debug": True,
    "max_results": 1000,
    "scroll_size": 100,
    "sort": ["field_name:asc"],
    "ca_certs": "path/to/ca.crt"
}

# Create CLI options and initialize EsXport
cli_options = CliOptions(kwargs)
es = EsXport(cli_options)

# Export data
es.export()
```
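
Once `export()` has finished, the result is an ordinary CSV file that the standard library can read. The sketch below is illustrative only. It writes a small stand-in file instead of calling Elasticsearch, and the column names `_id` and `field_name` and the two rows are made up, since the real columns depend on your index and on the `fields` and `meta_fields` options. It also assumes the exported file starts with a header row, which this document does not state explicitly.

```python
import csv
import tempfile
from pathlib import Path

DELIMITER = ","  # should match the `delimiter` option used for the export


def read_export(path, delimiter=DELIMITER):
    """Read a CSV export into a list of dictionaries keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the file EsXport would write; the contents are hypothetical.
    sample = Path(tmp) / "output.csv"
    sample.write_text("_id,field_name\n1,alpha\n2,beta\n", encoding="utf-8")
    rows = read_export(sample)

print(len(rows), rows[0]["field_name"])
```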

Class Descriptions
------------------

### `CliOptions`

A configuration class to manage CLI arguments programmatically when using the module.

#### Attributes

| **Attribute**    | **Type**    | **Description**                                         | **Default**                   |
|------------------|-------------|---------------------------------------------------------|-------------------------------|
| `query`          | `dict`      | Elasticsearch Query DSL syntax for filtering data.      | N/A                           |
| `output_file`    | `str`       | Path to save the exported CSV file.                     | N/A                           |
| `url`            | `str`       | Elasticsearch host URL.                                 | `"https://localhost:9200"`    |
| `user`           | `str`       | Basic authentication username for Elasticsearch.        | `"elastic"`                   |
| `password`       | `str`       | Basic authentication password for Elasticsearch.        | N/A                           |
| `index_prefixes` | `list[str]` | List of index prefixes to query.                        | N/A                           |
| `fields`         | `list[str]` | List of `_source` fields to include in the output.      | `["_all"]`                    |
| `sort`           | `list[str]` | Fields to sort the output by, in the format `field_name:asc\|desc`. | N/A            |
| `delimiter`      | `str`       | Delimiter for the CSV output.                           | `","`                         |
| `max_results`    | `int`       | Maximum number of results to fetch.                     | `10`                          |
| `scroll_size`    | `int`       | Batch size for scroll queries.                          | `100`                         |
| `meta_fields`    | `list[str]` | Metadata fields to include in the output.               | `["_id", "_index", "_score"]` |
| `verify_certs`   | `bool`      | Whether to verify SSL certificates.                     | `False`                       |
| `ca_certs`       | `str`       | Path to the CA certificate bundle.                      | N/A                           |
| `client_cert`    | `str`       | Path to the client certificate for authentication.      | N/A                           |
| `client_key`     | `str`       | Path to the client key for authentication.              | N/A                           |
| `debug`          | `bool`      | Enable debugging.                                       | `False`                       |

---
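
The defaults in the table can be made explicit when building the options dictionary. The following is a sketch under stated assumptions: `DOCUMENTED_DEFAULTS` simply copies the values from the table, and `build_options` is a hypothetical helper that is not part of esxport. Whether `CliOptions` fills in missing keys on its own is not stated here, so the helper supplies every documented default and checks for the four options that the CLI help marks as required.

```python
from copy import deepcopy

# Copied from the attribute table; not read from esxport itself.
DOCUMENTED_DEFAULTS = {
    "url": "https://localhost:9200",
    "user": "elastic",
    "fields": ["_all"],
    "delimiter": ",",
    "max_results": 10,
    "scroll_size": 100,
    "meta_fields": ["_id", "_index", "_score"],
    "verify_certs": False,
    "debug": False,
}

# Options marked [required] in `esxport --help`.
REQUIRED_KEYS = ("query", "output_file", "index_prefixes", "password")


def build_options(**overrides):
    """Merge caller-supplied options over the documented defaults."""
    missing = [key for key in REQUIRED_KEYS if key not in overrides]
    if missing:
        raise ValueError(f"missing required options: {', '.join(missing)}")
    options = deepcopy(DOCUMENTED_DEFAULTS)  # avoid sharing the default lists
    options.update(overrides)
    return options


options = build_options(
    query={"query": {"match_all": {}}},
    output_file="data.csv",
    index_prefixes=["my-index-prefix"],
    password="password",
    max_results=1000,
)
print(options["max_results"], options["scroll_size"])
```

The resulting dictionary could then be passed to `CliOptions(options)` in the same way as the dictionaries in the other examples.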

#### Example Initialization

```python
from esxport import CliOptions

cli_options = CliOptions({
    "query": {"query": {"match_all": {}}},
    "output_file": "data.csv",
    "url": "https://localhost:9200",
    "user": "elastic",
    "password": "password",
    "index_prefixes": ["my-index-prefix"],
    "fields": ["field1", "field2"],
    "sort": ["field1:asc"],
    "max_results": 1000,
    "scroll_size": 100
})
```
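
Each `sort` entry is a string of the form `field_name:asc|desc`. As a rough illustration of what that format encodes, the sketch below converts such strings into Elasticsearch Query DSL sort clauses. `parse_sort` is a hypothetical helper written for this example. It is not esxport's internal implementation, and treating a bare field name as ascending is an assumption.

```python
VALID_DIRECTIONS = {"asc", "desc"}


def parse_sort(entries):
    """Turn ["field:asc", ...] into Query DSL sort clauses."""
    clauses = []
    for entry in entries:
        field, _, direction = entry.partition(":")
        direction = direction or "asc"  # assumption: no direction means ascending
        if not field or direction not in VALID_DIRECTIONS:
            raise ValueError(f"invalid sort entry: {entry!r}")
        clauses.append({field: {"order": direction}})
    return clauses


print(parse_sort(["field1:asc", "timestamp:desc"]))
```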


### `EsXport`

The main class for executing the export operation.

#### Methods

| **Method**                                                                  | **Description**                                                                                    |
|-----------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|
| `__init__(opts: CliOptions, es_client: ElasticsearchClient \| None = None)` | Initializes the `EsXport` object with options (`CliOptions`) and an optional Elasticsearch client. |
| `export()`                                                                  | Executes the query and exports the results to the specified CSV file.                              |

---

#### Example Initialization and Usage

```python
from esxport import CliOptions, EsXport

# Define CLI options
cli_options = CliOptions({
    "query": {"query": {"match_all": {}}},
    "output_file": "output.csv",
    "url": "https://localhost:9200",
    "user": "elastic",
    "password": "password",
    "index_prefixes": ["my-index-prefix"]
})

# Initialize EsXport
esxport = EsXport(cli_options)

# Export data
esxport.export()
```
