.. contents:: **elasticsearch-faker**
   :backlinks: top
   :depth: 2

Summary
============================================
``elasticsearch-faker`` is a CLI tool to generate fake data for Elasticsearch.

.. image:: https://badge.fury.io/py/elasticsearch-faker.svg
    :target: https://badge.fury.io/py/elasticsearch-faker
    :alt: PyPI package version

.. image:: https://img.shields.io/pypi/pyversions/elasticsearch-faker.svg
    :target: https://pypi.org/project/elasticsearch-faker
    :alt: Supported Python versions

.. image:: https://github.com/thombashi/elasticsearch-faker/workflows/Tests/badge.svg
    :target: https://github.com/thombashi/elasticsearch-faker/actions?query=workflow%3ATests
    :alt: Tests CI status

.. image:: https://github.com/thombashi/elasticsearch-faker/actions/workflows/build_and_release.yml/badge.svg
    :target: https://github.com/thombashi/elasticsearch-faker/actions/workflows/build_and_release.yml
    :alt: Build and release CI status

Installation
============================================
Installation: pip
------------------------------
::

    pip install elasticsearch-faker

Installation: dpkg (Ubuntu)
--------------------------------------------
1. Navigate to the `Releases page <https://github.com/thombashi/elasticsearch-faker/releases>`__
2. Download the latest ``deb`` package
3. Install it with the ``dpkg -i`` command

Installation: Docker container
--------------------------------------------
Container images are listed on the `Packages page <https://github.com/thombashi/elasticsearch-faker/pkgs/container/elasticsearch-faker>`__.

Usage
============================================
Command help
----------------------------------------------
::

    Usage: elasticsearch-faker [OPTIONS] COMMAND [ARGS]...

      Faker for Elasticsearch.

    Options:
      --version                       Show the version and exit.
      --debug                         For debug print.
      -q, --quiet                     Suppress execution log messages.
      -v, --verbose
      --locale [ar_EG|ar_PS|ar_SA|bs_BA|bg_BG|cs_CZ|de_DE|dk_DK|el_GR|en_AU|en_CA|en_GB|en_NZ|en_US|es_ES|es_MX|et_EE|fa_IR|fi_FI|fr_FR|hi_IN|hr_HR|hu_HU|it_IT|ja_JP|ko_KR|lt_LT|lv_LV|ne_NP|nl_NL|no_NO|pl_PL|pt_BR|pt_PT|ro_RO|ru_RU|sl_SI|sv_SE|tr_TR|uk_UA|zh_CN|zh_TW|ka_GE]
                                      Specify localization for fake data. Defaults
                                      to en_US.
      --seed INTEGER                  Random seed for faker.
      --basic-auth-user TEXT          User name for Elasticsearch basic
                                      authentication. Or you can set the value via
                                      ES_BASIC_AUTH_USER environment variable.
      --basic-auth-password TEXT      Password for Elasticsearch basic
                                      authentication. Or you can set the value via
                                      ES_BASIC_AUTH_PASSWORD environment variable.
      --verify-certs                  Verify Elasticsearch server certificate. Or
                                      you can set the value via
                                      ES_SSL_ASSERT_FINGERPRINT environment
                                      variable.
      --ssl-assert-fingerprint TEXT   SSL certificate fingerprint to verify.
      --ignore-es-warn                Ignore ElasticsearchWarning.
      -h, --help                      Show this message and exit.

    Commands:
      generate    Generate fake data and put it to an Elasticsearch index.
      provider    Show or search providers for doc templates.
      show-stats  Fetch and show statistics of an index.
      validate    Check that a faker doc template file is well formed.
      version     Show version information.

      Issue tracker: https://github.com/thombashi/elasticsearch-faker/issues

::

    Usage: elasticsearch-faker generate [OPTIONS] ENDPOINT

      Generate fake data and put it to an Elasticsearch index.

    Options:
      --index NAME                    Name of an index to create. Defaults to
                                      'test_index'.
      --mapping PATH                  Path to a mapping file. See also https://www
                                      .elastic.co/guide/en/elasticsearch/reference
                                      /current/explicit-mapping.html
      --doc-template, --template PATH
                                      Path to a faker doc template file.
      -n, --num-doc INTEGER           Number of generating documents. The command
                                      uses bulk API if the value equals or is
                                      greater than two. Defaults to 1000.
      --bulk-size INTEGER             Number of creating documents for a single
                                      bulk API call. Defaults to 200.
      --delete-index                  Delete the index if already exists before
                                      generating documents.
      -j, --jobs INTEGER              Number of workers that create docs. Defaults
                                      to 1.
      --stdin                         Read a faker doc template from stdin.
      --dry-run                       Do no harm.
      -h, --help                      Show this message and exit.

      Issue tracker: https://github.com/thombashi/elasticsearch-faker/issues

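
As a rough illustration of how ``--num-doc`` and ``--bulk-size`` relate, the sketch below splits a document count into bulk-sized batches. It is plain arithmetic on the documented defaults (1000 documents, 200 per bulk API call). The helper name ``plan_bulk_batches`` is hypothetical, and the tool's internal batching may differ, for example when ``--jobs`` is greater than 1.

```python
from typing import List


def plan_bulk_batches(num_doc: int = 1000, bulk_size: int = 200) -> List[int]:
    """Return the document count of each bulk call (hypothetical helper).

    This only models the arithmetic implied by the help text; it does not
    talk to Elasticsearch and is not the tool's own code.
    """
    if num_doc < 0 or bulk_size <= 0:
        raise ValueError("num_doc must be >= 0 and bulk_size must be > 0")
    full, rest = divmod(num_doc, bulk_size)
    batches = [bulk_size] * full
    if rest:
        # The final batch carries whatever does not fill a whole bulk call.
        batches.append(rest)
    return batches


if __name__ == "__main__":
    print(plan_bulk_batches())          # batches for the default settings
    print(plan_bulk_batches(450, 200))  # a count that is not a multiple of 200
```

With the defaults this yields five batches of 200 documents, which is consistent with the ``bulk size: 200`` line in the results summary of the execution examples.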
Execution example
----------------------------------------------
Create 1000 docs in an Elasticsearch index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:Execution:
    ::

        $ elasticsearch-faker generate --doc-template doc_template.jinja2 https://localhost:9200 -n 1000
        document generator #0: 100%|█████████████████████| 1000/1000 [00:01<00:00, 590.53docs/s]
        [INFO] generate 1000 docs to test_index

        [Results]
        target index: test_index
        completed in 10.4 secs
        current store.size: 3.0 MB
        current docs.count: 1,000
        generated store.size: 3.0 MB
        average size[byte]/doc: 3,164
        generated docs.count: 1,000
        generated docs/secs: 96.3
        bulk size: 200
        $ curl -sS localhost:9200/test_index/_search | jq '.hits.hits[:2]'
        [
          {
            "_index": "test_index",
            "_id": "4bdd73c0-7744-4c6f-9736-50e3e8515f1c-0",
            "_score": 1,
            "_source": {
              "name": "jennifer17",
              "userId": 56561230,
              "createdAt": "2009-07-17T06:31:04.000+0000",
              "body": "Present blue happen thus miss toward. Itself race so successful build real beyond score. Look different she receive.Compare miss federal lawyer. Herself prevent approach east.",
              "ext": "course",
              "blobId": "c35769a9-3468-43fc-93c7-3c2f27ec9f64"
            }
          },
          {
            "_index": "test_index",
            "_id": "88238d96-5ecc-4639-bb8f-c3f816027560-0",
            "_score": 1,
            "_source": {
              "name": "dnicholson",
              "userId": 457,
              "createdAt": "2008-08-29T22:14:43.000+0000",
              "body": "I sit another health president bring. Very expect international television job parent into.Authority read few stock. International hope yard left measure.Player them get move.",
              "ext": "trial",
              "blobId": "e43faf58-9b66-4a43-b1b7-7540b3996cde"
            }
          }
        ]

:doc template file (doc_template.jinja2):
    .. code-block:: jinja

        {
          "name": "{{ user_name }}",
          "userId": {{ random_number }},
          "createdAt": "{{ date_time }}",
          "body": "{{ text }}",
          "ext": "{{ word }}",
          "blobId": "{{ uuid4 }}"
        }

Each ``{{ XXX }}`` placeholder in the template file names the Faker provider used to generate that value.
The available providers can be listed with the ``elasticsearch-faker provider list`` and ``elasticsearch-faker provider example`` subcommands.

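
The placeholder substitution described above can be pictured with a small standard-library sketch. This is not how ``elasticsearch-faker`` is implemented (the tool relies on Faker providers and a Jinja2 template); the stand-in provider functions, ``make_providers``, and ``render`` below are all hypothetical and only mimic the shape of the values shown in the sample output.

```python
import json
import random
import re
import uuid
from datetime import datetime, timezone


def make_providers(rng):
    """Build stand-in providers keyed by placeholder name (assumed, not Faker)."""
    return {
        "user_name": lambda: "user{}".format(rng.randint(1, 9999)),
        "random_number": lambda: rng.randint(0, 99_999_999),
        "date_time": lambda: datetime.fromtimestamp(
            rng.randint(0, 1_600_000_000), tz=timezone.utc
        ).strftime("%Y-%m-%dT%H:%M:%S.000+0000"),
        "text": lambda: "Sample text number {}.".format(rng.randint(0, 999)),
        "word": lambda: rng.choice(["course", "trial", "series", "before"]),
        "uuid4": lambda: str(uuid.UUID(int=rng.getrandbits(128), version=4)),
    }


# Matches "{{ name }}" with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, providers):
    """Replace every placeholder with a freshly generated value."""
    def substitute(match):
        name = match.group(1)
        if name not in providers:
            raise KeyError("unknown provider: {}".format(name))
        return str(providers[name]())
    return PLACEHOLDER.sub(substitute, template)


TEMPLATE = """
{
  "name": "{{ user_name }}",
  "userId": {{ random_number }},
  "createdAt": "{{ date_time }}",
  "body": "{{ text }}",
  "ext": "{{ word }}",
  "blobId": "{{ uuid4 }}"
}
"""

if __name__ == "__main__":
    # A fixed seed makes the output repeatable, similar in spirit to --seed.
    document = json.loads(render(TEMPLATE, make_providers(random.Random(0))))
    print(json.dumps(document, indent=2))
```

Note that ``userId`` is left unquoted in the template so that the rendered document contains a JSON number rather than a string.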
Use Elasticsearch authentication
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:Execution:
    ::

        $ export ES_BASIC_AUTH_USER=elastic
        $ export ES_BASIC_AUTH_PASSWORD=<PASSWORD>
        $ export ES_SSL_ASSERT_FINGERPRINT=<HTTP CA certificate SHA-256 fingerprint>

        $ elasticsearch-faker --verify-certs generate --doc-template doc_template.jinja2 https://localhost:9200 -n 1000
        [INFO] generate 1000 docs to test_index

        [Results]
        target index: test_index
        completed in 0.7 secs
        current store.size: 3.9 MB
        current docs.count: 6,000
        generated store.size: 0.0 MB
        average size[byte]/doc: 690
        generated docs.count: 1,000
        generated docs/secs: 1,338.6
        bulk size: 200

        $ curl --insecure -sS https://${ES_BASIC_AUTH_USER}:${ES_BASIC_AUTH_PASSWORD}@localhost:9200/test_index/_search | jq '.hits.hits[:2]'
        [
          {
            "_index": "test_index",
            "_id": "8PMd9ocBtCWmUGxHBM9L",
            "_score": 1,
            "_source": {
              "name": "lclarke",
              "userId": 331837,
              "createdAt": "1980-07-18T23:42:30.000+0000",
              "body": "Large address animal husband present. In act call animal.Yes plant pressure year me.",
              "ext": "series",
              "blobId": "ede46099-ac97-4447-b86b-0a87ef0180f1"
            }
          },
          {
            "_index": "test_index",
            "_id": "71b76118-91fa-4ed3-a1e0-305694b3d34d-0",
            "_score": 1,
            "_source": {
              "name": "shawnyoder",
              "userId": 80039293,
              "createdAt": "1972-09-28T19:04:31.000+0000",
              "body": "Book television political surface fill position security itself. Not man support attorney attorney which amount finish. Ground mother board natural wait about lot.",
              "ext": "before",
              "blobId": "8913b0a4-dd44-442a-8961-a6be87eb68a6"
            }
          }
        ]

Alternatively, run without the ``--verify-certs`` option:

:Execution:
    ::

        $ export ES_BASIC_AUTH_USER=elastic
        $ export ES_BASIC_AUTH_PASSWORD=<PASSWORD>

        $ elasticsearch-faker generate --doc-template doc_template.jinja2 https://localhost:9200 -n 1000

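
The ``ES_SSL_ASSERT_FINGERPRINT`` value used with ``--verify-certs`` is the SHA-256 fingerprint of the Elasticsearch HTTP CA certificate. The following sketch shows one way such a fingerprint could be computed from a PEM file with the Python standard library. The helper name ``sha256_fingerprint`` and the certificate path are hypothetical, the input is assumed to contain a single certificate, and whether the tool expects plain hex or a colon-separated form is an assumption to check against your setup.

```python
import hashlib
import ssl
import sys


def sha256_fingerprint(pem_cert: str, with_colons: bool = False) -> str:
    """Return the SHA-256 fingerprint of one PEM-encoded certificate.

    Hypothetical helper: it hashes the DER bytes of the certificate, which
    is how certificate fingerprints are conventionally defined.
    """
    der = ssl.PEM_cert_to_DER_cert(pem_cert.strip())
    digest = hashlib.sha256(der).hexdigest()
    if with_colons:
        # Group the hex digest into byte pairs, e.g. "AB:CD:...".
        pairs = [digest[i:i + 2] for i in range(0, len(digest), 2)]
        return ":".join(pairs).upper()
    return digest


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (the path is an assumption): python fingerprint.py /path/to/http_ca.crt
    with open(sys.argv[1], encoding="ascii") as cert_file:
        print(sha256_fingerprint(cert_file.read()))
```

The printed value could then be exported as ``ES_SSL_ASSERT_FINGERPRINT`` before running the ``generate`` command.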
Dependencies
============================================
- Elasticsearch 8 or newer
- Python 3.7+
- `Python package dependencies (automatically installed) <https://github.com/thombashi/elasticsearch-faker/network/dependencies>`__
Raw data
{
"_id": null,
"home_page": "https://github.com/thombashi/elasticsearch-faker",
"name": "elasticsearch-faker",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "elasticsearch,faker",
"author": "Tsuyoshi Hombashi",
"author_email": "tsuyoshi.hombashi@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/35/3d/b8865ec3972d2aa0f38b2a5b24d8216a2a5bfa05145af02a0539c3e504d0/elasticsearch-faker-0.3.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "elasticsearch-faker is a CLI tool to generate fake data for Elasticsearch.",
"version": "0.3.0",
"project_urls": {
"Homepage": "https://github.com/thombashi/elasticsearch-faker",
"Source": "https://github.com/thombashi/elasticsearch-faker",
"Tracker": "https://github.com/thombashi/elasticsearch-faker/issues"
},
"split_keywords": [
"elasticsearch",
"faker"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "a08caffa154dc3187cb31ef19e40ced19f1530f785e9c1f4d7fa823b3c8875e0",
"md5": "145758325fa7e79e1933487770d7657c",
"sha256": "e80ade8a6c39dfd8bb9b9d41c608384b640f641860e2d9d603de4815557944a0"
},
"downloads": -1,
"filename": "elasticsearch_faker-0.3.0-py3-none-any.whl",
"has_sig": true,
"md5_digest": "145758325fa7e79e1933487770d7657c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 17977,
"upload_time": "2023-05-07T14:29:27",
"upload_time_iso_8601": "2023-05-07T14:29:27.087242Z",
"url": "https://files.pythonhosted.org/packages/a0/8c/affa154dc3187cb31ef19e40ced19f1530f785e9c1f4d7fa823b3c8875e0/elasticsearch_faker-0.3.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "353db8865ec3972d2aa0f38b2a5b24d8216a2a5bfa05145af02a0539c3e504d0",
"md5": "760dbce083c0029df4815874f87f3f12",
"sha256": "d725ba6116d2bbbf6ac753011476935eb08d759781995e0fffb95edac3a6bef9"
},
"downloads": -1,
"filename": "elasticsearch-faker-0.3.0.tar.gz",
"has_sig": true,
"md5_digest": "760dbce083c0029df4815874f87f3f12",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 21190,
"upload_time": "2023-05-07T14:29:30",
"upload_time_iso_8601": "2023-05-07T14:29:30.163034Z",
"url": "https://files.pythonhosted.org/packages/35/3d/b8865ec3972d2aa0f38b2a5b24d8216a2a5bfa05145af02a0539c3e504d0/elasticsearch-faker-0.3.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-05-07 14:29:30",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "thombashi",
"github_project": "elasticsearch-faker",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"tox": true,
"lcname": "elasticsearch-faker"
}