pg-es-fdw
=========

* Name: pg-es-fdw
* Version: 0.12.0
* Summary: Connect PostgreSQL and Elastic Search with this Foreign Data Wrapper
* Home page: https://github.com/matthewfranglen/postgres-elasticsearch-fdw
* Author: Matthew Franglen
* License: MIT
* Keywords: postgresql, postgres, elasticsearch, es, fdw
* Requirements: elasticsearch
* Upload time: 2024-05-10 23:05:14

PostgreSQL Elastic Search foreign data wrapper
==============================================

This allows you to index data in Elastic Search and then search it from
PostgreSQL. You can write as well as read.

SYNOPSIS
--------

### Supported Versions

| Elastic Search | Dependency Installation Command |
|----------------|---------------------------------|
| 5 | `sudo pip install "elasticsearch>=5,<6"` |
| 6 | `sudo pip install "elasticsearch>=6,<7"` |
| 7 | `sudo pip install "elasticsearch>=7,<8"` |
| 8 | `sudo pip install "elasticsearch>=8,<9"` |

| PostgreSQL | Dependency Installation Command                                     |
|------------|---------------------------------------------------------------------|
| 9.4        | `sudo apt-get install postgresql-9.4-python-multicorn`              |
| 9.5        | `sudo apt-get install postgresql-9.5-python-multicorn`              |
| 9.6        | `sudo apt-get install postgresql-9.6-python-multicorn`              |
| 10         | `sudo apt-get install postgresql-10-python-multicorn`               |
| 11         | `sudo apt-get install postgresql-11-python-multicorn`               |
| 12         | `sudo apt-get install postgresql-12-python3-multicorn`              |
| 13         | `sudo apt-get install postgresql-13-python3-multicorn`              |

Please note that the Debian package for Multicorn on PostgreSQL 12+ requires Python 3.

#### PostgreSQL 13+

For PostgreSQL versions 13+ you can use [Multicorn 2](https://github.com/pgsql-io/multicorn2)
which is actively maintained. You can consult the [Multicorn 2 README](https://github.com/pgsql-io/multicorn2#using-in-oscgio)
for installation instructions, or review the Dockerfiles associated with this
project to see how I have installed it. I would recommend using the OSCG.IO
approach as it seems to be the easiest.

| PostgreSQL | Dockerfile                                                                                                            |
|------------|-----------------------------------------------------------------------------------------------------------------------|
| 13         | [Dockerfile](https://github.com/matthewfranglen/postgres-elasticsearch-fdw/blob/master/tests/docker/pg-13/Dockerfile) |
| 14         | [Dockerfile](https://github.com/matthewfranglen/postgres-elasticsearch-fdw/blob/master/tests/docker/pg-14/Dockerfile) |
| 15         | [Dockerfile](https://github.com/matthewfranglen/postgres-elasticsearch-fdw/blob/master/tests/docker/pg-15/Dockerfile) |


### Installation

This requires installation on the PostgreSQL server, and has system level dependencies.
You can install the dependencies with:

```
sudo apt-get install python python-pip
```

You should install the version of multicorn that is specific to your postgres
version. See the table in _Supported Versions_ for installation commands. The
multicorn package is also only available from Ubuntu Xenial (16.04) onwards. If
you cannot install multicorn in this way then you can use
[pgxn](http://pgxnclient.projects.pgfoundry.org/) to install it.

This uses the Elastic Search client which has release versions that correspond
to the major version of the Elastic Search server. You should install the
`elasticsearch` dependency separately. See the table in _Supported Versions_
for installation commands.

Once the dependencies are installed you can install the foreign data wrapper
using pip:

```
sudo pip install pg_es_fdw
```

### Usage

A running configuration for this can be found in the `docker-compose.yml`
within this folder.

The basic steps are:

 * Load the extension
 * Create the server
 * Create the foreign table
 * Populate the foreign table
 * Query the foreign table...

#### Load extension and Create server

```sql
CREATE EXTENSION multicorn;

CREATE SERVER multicorn_es FOREIGN DATA WRAPPER multicorn
OPTIONS (
  wrapper 'pg_es_fdw.ElasticsearchFDW'
);
```

#### Create the foreign table

```sql
CREATE FOREIGN TABLE articles_es
    (
        id BIGINT,
        title TEXT,
        body TEXT,
        metadata JSON,
        query TEXT,
        score NUMERIC,
        sort TEXT
    )
SERVER multicorn_es
OPTIONS
    (
        host 'elasticsearch',
        port '9200',
        index 'article-index',
        type 'article',
        rowid_column 'id',
        query_column 'query',
        query_dsl 'false',
        score_column 'score',
        default_sort 'last_updated:desc',
        sort_column 'sort',
        refresh 'false',
        complete_returning 'false',
        timeout '20',
        username 'elastic',
        password 'changeme'
    )
;
```

Elastic Search 7 and greater do not require the `type` option, which
corresponds to the `doc_type` used in prior versions of Elastic Search.

This corresponds to an Elastic Search index which contains `title` and `body`
fields. The other fields have special meanings:

 * The `rowid_column` (`id` above) is mapped to the Elastic Search document id
 * The `query_column` (`query` above) accepts Elastic Search queries to filter the rows
 * The `query_dsl` (`false` above) indicates if the query is in the [URI Search](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html) syntax or the json [Query DSL](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html). The query will use the URI Search syntax by default. To use the Query DSL set this option to `"true"`.
 * The `score_column` (`score` above) returns the score for the document against the query
 * The `sort_column` (`sort` above) accepts an Elastic Search column to sort by
 * The `refresh` option controls if inserts and updates should wait for an index refresh ([Elastic Search documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-refresh.html)). The acceptable values are `"false"` (default), `"wait_for"` and `"true"`
 * The `complete_returning` option controls whether Elastic Search is queried for the document after an insert to support `RETURNING` fields other than the document id. The acceptable values are `"false"` (default) and `"true"`
 * The `scheme` field specifies the scheme of the Elastic Search index
 * The `timeout` field specifies the connection timeout in seconds
 * The `username` field specifies the basic auth username used
 * The `password` field specifies the basic auth password used
 * Any other options are passed through to the elasticsearch client; use this to specify things like SSL (see the sketch after this list)

All of these are optional.
Enabling `refresh` or `complete_returning` comes with a performance penalty.

To use basic auth you must provide both a username and a password,
even if the password is blank.
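
For example, a table that connects over https with basic auth might be declared like this (a minimal sketch: the `ca_certs` option is only an illustration of an option passed through to the elasticsearch client, and the table name, index name and credentials are placeholders):

```sql
CREATE FOREIGN TABLE articles_es_secure
    (
        id BIGINT,
        title TEXT,
        body TEXT,
        query TEXT,
        score NUMERIC
    )
SERVER multicorn_es
OPTIONS
    (
        host 'elasticsearch',
        port '9200',
        scheme 'https',
        index 'article-index',
        rowid_column 'id',
        query_column 'query',
        score_column 'score',
        timeout '20',
        username 'elastic',            -- basic auth needs both username and password
        password 'changeme',
        ca_certs '/path/to/ca.pem'     -- illustration only: passed through to the client
    )
;
```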

##### JSON and JSONB

When Elastic Search returns nested data it is serialized to TEXT as JSON before being passed to PostgreSQL.
This means you can create columns with JSON or JSONB types and the data will be correctly converted on read.
If you write to a JSON or JSONB column then the data is passed to elasticsearch as json.

As the data is converted on the fly per query the benefits of using JSONB over JSON are limited.
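
For example, using the `metadata` column from the table above (and the `score` key written in the insert example further down), nested values can be queried with the standard JSON operators:

```sql
SELECT
    id,
    metadata->>'score' AS metadata_score
FROM
    articles_es
;
```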

##### Elastic Search Authentication

Currently basic auth is supported for authentication.
You can provide the username and password by setting the `username` and `password` options when creating the table.
You must provide both, even if the password is blank.
If you do not provide them then basic auth is disabled for the table.

If you need to use other forms of authentication then please open an issue.

#### Populate the foreign table

```sql
INSERT INTO articles_es
    (
        id,
        title,
        body,
        metadata
    )
VALUES
    (
        1,
        'foo',
        'spike',
        '{"score": 3}'::json
    );
```

It is possible to write documents to Elastic Search using the foreign data
wrapper; writable foreign tables were introduced in PostgreSQL 9.3.
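
Updates and deletes go through the same foreign table. A minimal sketch, reusing the row inserted above:

```sql
-- Update the body of the document with id 1
UPDATE articles_es
SET body = 'bishop takes knight'
WHERE id = 1;

-- Remove the document again
DELETE FROM articles_es
WHERE id = 1;
```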

#### Query the foreign table

To select all documents:

```sql
SELECT
    id,
    title,
    body,
    metadata
FROM
    articles_es
;
```

##### URI Search Query

To filter the documents using a URI Search query:

```sql
SELECT
    id,
    title,
    body,
    metadata,
    score
FROM
    articles_es
WHERE
    query = 'body:chess'
;
```

This uses the [URI Search](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html) from Elastic Search.
This is the default search syntax, and can be explicitly selected by setting the `query_dsl` option to `"false"`.

##### Query DSL Query

To filter the documents using a Query DSL query you must ensure that you have set the `query_dsl` option to `"true"` when creating the table.

```sql
SELECT
    id,
    title,
    body,
    metadata,
    score
FROM
    articles_es
WHERE
    query = '{"query":{"bool":{"filter":[{"term":{"body":"chess"}}]}}}'
;
```

This uses the [Query DSL](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html) from Elastic Search.
This is not the default search syntax and must be specifically enabled.
You cannot enable this on a per-query basis.

#### Sorting the Results

By default Elastic Search returns the documents in score order, descending.
If you wish to sort by a different field you can use the `sort_column` option.

The `sort_column` accepts a column to sort by and a direction, for example `last_updated:desc`.
This is passed to Elastic Search so it should be the name of the Elastic Search column.
If you always want sorted results then you can use the `default_sort` option to specify the sort when creating the table.

To break ties you can specify further columns to sort on.
You just need to separate the columns with a comma, for example `unit:asc,last_updated:desc`.

```sql
SELECT
    id,
    title,
    body,
    metadata,
    score
FROM
    articles_es
WHERE
    sort = 'id:asc'
;
```
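
A sketch of a multi-column sort, assuming the index has the `last_updated` field used in `default_sort` above:

```sql
SELECT
    id,
    title,
    body
FROM
    articles_es
WHERE
    sort = 'id:asc,last_updated:desc'
;
```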

#### Refresh and RETURNING

When inserting or updating documents in Elastic Search the document ID is returned.
This can be accessed through the `RETURNING` clause without any additional performance cost.

Returning further fields requires reading the document from Elastic Search again.
This comes at a cost because an immediate read after an insert may not return the updated document.
Elastic Search periodically refreshes its indexes, and at that point the document becomes available.

To wait for a refresh before returning you can use the `refresh` parameter.
This accepts three values: `"false"` (the default), `"true"` and `"wait_for"`.
You can read about them [here](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-refresh.html).
If you choose to use the refresh setting then it is recommended to use `"wait_for"`.

Once you have chosen to wait for the refresh you can enable full returning support by setting `complete_returning` to `"true"`.
Both the `refresh` and `complete_returning` options are set during table creation.
If you do not wish to incur the associated costs for every query then you can create two tables with different settings.
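
A sketch of a second table over the same index with both options enabled, together with an insert that returns the full row (the column and option names follow the earlier example; the table name is a placeholder):

```sql
CREATE FOREIGN TABLE articles_es_returning
    (
        id BIGINT,
        title TEXT,
        body TEXT,
        query TEXT,
        score NUMERIC
    )
SERVER multicorn_es
OPTIONS
    (
        host 'elasticsearch',
        port '9200',
        index 'article-index',
        rowid_column 'id',
        query_column 'query',
        score_column 'score',
        refresh 'wait_for',            -- wait for the index refresh
        complete_returning 'true'      -- read the document back after the insert
    );

INSERT INTO articles_es_returning (id, title, body)
VALUES (2, 'bar', 'checkmate')
RETURNING id, title, body;
```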

Caveats
-------

Elastic Search does not support transactions, so the Elastic Search index
is not guaranteed to be synchronized with the canonical version in PostgreSQL.
Unfortunately this is the case even for transactions at the serializable isolation level.
It would, however, be possible to guard against this using the Elastic Search version field and locking.

Rollback is currently not supported.

Tests
-----

There are end to end tests that use docker to create a PostgreSQL and Elastic
Search database. These are then populated with data and tests are run against
them.

These require docker and docker-compose (see the [installation
instructions](https://docs.docker.com/engine/install/ubuntu/)).

These currently target Python 3.11.9 which you can install with
[pyenv](https://github.com/pyenv/pyenv/#installation).

These also require [poetry](https://python-poetry.org/docs/#installation) to
manage the python dependencies.

Once you have installed python 3.11.9, docker, docker-compose and poetry you
can install the requirements with:

```bash
make requirements
```

The makefile will test all versions if you run `make test`:

```bash
➜ make test
poetry run tests/run.py --pg 13 14 15 --es 5 6 7 8
Testing PostgreSQL 13 with Elasticsearch 5
PostgreSQL 13 with Elasticsearch 5: Test read - PASS
PostgreSQL 13 with Elasticsearch 5: Test nested-read - PASS
PostgreSQL 13 with Elasticsearch 5: Test sorted-read - PASS
PostgreSQL 13 with Elasticsearch 5: Test query - PASS
PostgreSQL 13 with Elasticsearch 5: Test json-query - PASS
PostgreSQL 13 with Elasticsearch 5: Test insert returning id - PASS
PostgreSQL 13 with Elasticsearch 5: Test insert returning row - PASS
PostgreSQL 13 with Elasticsearch 5: Test insert waiting for refresh - PASS
PostgreSQL 13 with Elasticsearch 5: Test delete returning row - PASS
Testing PostgreSQL 13 with Elasticsearch 6
...
Testing PostgreSQL 15 with Elasticsearch 8
PostgreSQL 15 with Elasticsearch 8: Test read - PASS
PostgreSQL 15 with Elasticsearch 8: Test nested-read - PASS
PostgreSQL 15 with Elasticsearch 8: Test sorted-read - PASS
PostgreSQL 15 with Elasticsearch 8: Test query - PASS
PostgreSQL 15 with Elasticsearch 8: Test json-query - PASS
PostgreSQL 15 with Elasticsearch 8: Test insert returning id - PASS
PostgreSQL 15 with Elasticsearch 8: Test insert returning row - PASS
PostgreSQL 15 with Elasticsearch 8: Test insert waiting for refresh - PASS
PostgreSQL 15 with Elasticsearch 8: Test delete returning row - PASS
PASS
```

If you want to run the tests for specific versions then you can then run the
tests using `tests/run.py`.  This takes the PostgreSQL version(s) to test using
the `--pg` argument and the Elastic Search versions to test with the `--es`
argument.  The currently supported versions of PostgreSQL are 13, 14 and 15.
PostgreSQL versions back to 9.4 are supported on a best-effort basis, but the docker images are unavailable, so the tests cannot run and fixes cannot be made if there are errors.
The currently supported versions of Elastic Search are 5 to 8. You can pass
multiple versions to test against all of them.

### Test Failure Messages

```
Error starting userland proxy: listen tcp 0.0.0.0:5432: bind: address already in use
```
You are already running something that listens on port 5432.
Try stopping your running postgres server:
```
sudo /etc/init.d/postgresql stop
```

```
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
```
Your system does not have the appropriate limits in place to run a production-ready instance of Elastic Search.
Try increasing it:
```
sudo sysctl -w vm.max_map_count=262144
```
This setting will revert after a reboot.

### Migrating from <=0.6.0

In version 0.7.0 the TEXT representation of json objects changed from HSTORE to JSON.
If you have been mapping json objects to HSTORE columns then you should change the column type to JSON.
The arrow operator exists for json so queries should not need rewriting.
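
A minimal sketch of the change, assuming an existing foreign table with an HSTORE `metadata` column (dropping and recreating the table also works):

```sql
-- Change the declared column type; no data is rewritten because the
-- documents live in Elastic Search.
ALTER FOREIGN TABLE articles_es
    ALTER COLUMN metadata TYPE JSON;

-- The arrow operators exist for json, so existing queries keep working
SELECT metadata->>'score' FROM articles_es;
```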

            
