wildcard.hps


  * Name: wildcard.hps
  * Version: 1.4.3
  * Home page: https://github.com/castlecms/wildcard.hps
  * Summary: opensearch integration with CastleCMS and Plone
  * Upload time: 2023-10-11 14:02:37
  * Author: Wildcard Corp.
  * Requires Python: `>2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*`
  * License: GPL version 2
  * Keywords: castlecms, plone, opensearch, search, indexing
  * Requirements: none recorded

wildcard.hps
============

CastleCMS and Plone integration with [OpenSearch](https://opensearch.org)

This product was forked from [collective.elasticsearch](https://github.com/collective/collective.elasticsearch)
to provide integration with OpenSearch instead of ElasticSearch. OpenSearch is itself
a fork of ElasticSearch and is compatible with at least the ES 7.10.x series of releases
(as of opensearch-py 1.1.0). Compatibility may diverge in the future: the collective.elasticsearch
package will likely try to maintain compatibility with ElasticSearch, while wildcard.hps is intended
to maintain compatibility with OpenSearch.

## Quickstart

First, start up an OpenSearch instance (for official guides, see the [OpenSearch project documentation](https://opensearch.org/docs/latest/opensearch/install/index/)):

```
$ docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:latest
$ curl -XGET https://localhost:9200 -u 'admin:admin' -k
```

Second, set up Plone/CastleCMS:

1. add `wildcard.hps` to the `eggs` section of your buildout
2. run buildout
3. restart your instance, using the relevant Environment Variables to connect to your OpenSearch instance
4. install the 'Wildcard HPS' product
5. under the 'Wildcard HPS' control panel, click 'Convert Catalog' then 'Rebuild Catalog'

Configuration Settings are passed as environment variables. See the "Configuration" section
below for more details.
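
As a rough sketch, the environment for the Quickstart Docker instance could be assembled as below. The helper name `quickstart_env` is hypothetical, and the values assume the single-node container shown above: HTTPS on port 9200, `admin:admin` credentials, and a certificate that is not verified (matching the `-k` flag in the `curl` check).

```python
import os


def quickstart_env(base=None):
    """Return a copy of the environment with Quickstart connection settings.

    Hypothetical helper: the variable names come from the Configuration
    section, the values are assumptions that match the Docker example.
    """
    env = dict(os.environ if base is None else base)
    env.update({
        "OPENSEARCH_HOSTS": "https://localhost:9200",
        # Credentials are passed separately instead of inside the URL.
        "OPENSEARCH_HTTP_USERNAME": "admin",
        "OPENSEARCH_HTTP_PASSWORD": "admin",
        "OPENSEARCH_USE_SSL": "Yes",
        # Any value other than Yes/True/1/On is treated as False.
        "OPENSEARCH_VERIFY_CERTS": "No",
    })
    return env
```

The resulting dict can be passed as the `env` argument when starting the instance from a wrapper script, or the same names and values can be exported in the shell before the restart.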


## Overview

This package aims to index all fields that the `portal_catalog` indexes.
It also allows you to delete the `Title`, `Description` and `SearchableText`
indexes, which can significantly improve performance and reduce RAM usage.

OpenSearch queries are ONLY used when `Title`, `Description` and `SearchableText`
text are in the query. Otherwise, Plone's default catalog is used, because it is
faster than OpenSearch on normal queries.
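
The routing rule can be illustrated with a minimal sketch. This is not the package's code: the function name `uses_opensearch` is hypothetical, and it assumes that the presence of any one of the three full-text keys is enough to route a query to OpenSearch, which the text above does not spell out.

```python
# Index names taken from the Overview text.
FULL_TEXT_INDEXES = frozenset(["Title", "Description", "SearchableText"])


def uses_opensearch(query):
    """Return True when a catalog query dict touches a full-text index.

    Assumption: one matching key is sufficient. Queries without any of
    these keys would stay on Plone's default catalog.
    """
    return any(key in FULL_TEXT_INDEXES for key in query)
```

For example, `uses_opensearch({"SearchableText": "budget", "portal_type": "Document"})` is true under this assumption, while a query on `portal_type` alone is not.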


## Configuration

Configuration for OpenSearch connections, and custom index naming, is done through
Environment Variables. This allows per-instance customization without the need to
modify site data, and allows for many deployments to use the same cluster(s) without
_needing_ to do per-site customized index names.

Available Environment Variable Options:

  * `HPS_ZOPE_CONF_PATH`
    * path to a zope.conf to get a Zope app instance
    * NOTE: this is only needed for the `reindex_hps` script that gets installed.
      See `wildcard/hps/scripts/reindex.py`.
  * `HPS_OVERRIDE_LOGGING`
    * if present, will tell the `reindex_hps` script to override the root logging
      configuration, and print logging to console at INFO level.
  * `HPS_FORCE_ENABLE`
    * default: no
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * will force the "enabled" lookup to be True
  * `HPS_INSTANCE_INDEX_PREFIX`
    * default: None
    * a string value prepended to index names used by the Plone instances this addon is installed into
  * `HPS_INCLUDE_TRASHED_BY_DEFAULT`
    * default: no
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * will default searchResults to include trashed entries (which are not included by default)
  * `HPS_FORCE_EXTERNAL_INDEXES`
    * default: None
    * a list of object properties that will be included in the externally indexed object (i.e.
      the indexed object in OpenSearch)
  * `OPENSEARCH_HOSTS`
    * default: https://admin:admin@localhost:9200
    * a list of RFC-1738 formatted URLs. Multiple URLs can be specified by putting a space between them.
    * NOTE: for now, opensearch-py (1.1.0) does not respect HTTP auth info that is formatted
      as part of the URL. Instead, use `OPENSEARCH_HTTP_USERNAME` and `OPENSEARCH_HTTP_PASSWORD` to pass
      the same HTTP auth to each request to any node listed as a host.
  * `OPENSEARCH_HTTP_USERNAME`
    * default: None
    * a username to use in all connections to any node in the `OPENSEARCH_HOSTS` list
  * `OPENSEARCH_HTTP_PASSWORD`
    * default: None
    * a password to use in all connections to any node in the `OPENSEARCH_HOSTS` list
  * `OPENSEARCH_TIMEOUT`
    * default connection timeout
  * `OPENSEARCH_RETRY_ON_TIMEOUT`
    * default: Off
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * retry connection to different node when connection fails
  * `OPENSEARCH_SNIFF_ON_START`
    * default: False
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * refresh nodes before doing anything
  * `OPENSEARCH_SNIFF_ON_CONNECTION_FAIL`
    * default: False
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * refresh nodes after a node fails to respond
  * `OPENSEARCH_SNIFFER_TIMEOUT`
    * default: None
    * refresh the node list at this interval (in seconds)
  * `OPENSEARCH_SNIFF_TIMEOUT`
    * default: 0.1
    * timeout of sniff request
  * `OPENSEARCH_USE_SSL`
    * default: False
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * connections to OpenSearch will use SSL
  * `OPENSEARCH_VERIFY_CERTS`
    * default: True
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * verify SSL certificates when using SSL connections to OpenSearch
  * `OPENSEARCH_SSL_SHOW_WARN`
    * default: True
    * accepted values (all other values are equivalent to False): Yes, True, 1, On
    * when verifying SSL certificates is disabled, then a warning will be shown by default
  * `OPENSEARCH_CERTS_PATH`
    * default: None
    * a path to a directory containing CA Certificates used in SSL verification
  * `OPENSEARCH_CLIENT_CERT_PATH`
    * default: None
    * a path to a PEM formatted SSL client certificate for SSL client auth
  * `OPENSEARCH_CLIENT_CERT_KEY`
    * default: None
    * a path to a PEM formatted SSL client key for SSL client auth
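
The conventions in the list above (boolean flags, space-separated hosts, credentials kept out of the URL) could be read along these lines. This is a sketch under stated assumptions, not the package's implementation: the helper names are hypothetical, and the boolean match is assumed to be case-insensitive, which the list does not specify.

```python
import os
from urllib.parse import urlsplit

# Accepted truthy values from the option list; case-insensitivity is assumed.
TRUTHY = {"yes", "true", "1", "on"}


def env_bool(name, default=False, environ=None):
    """Read a flag: unset gives the default, any other value gives False."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def env_hosts(environ=None, default="https://admin:admin@localhost:9200"):
    """Split OPENSEARCH_HOSTS on whitespace into a list of URLs."""
    environ = os.environ if environ is None else environ
    return environ.get("OPENSEARCH_HOSTS", default).split()


def split_auth(url):
    """Return (url_without_credentials, username, password).

    Useful because credentials embedded in the URL are not honored and
    have to be supplied through the USERNAME/PASSWORD variables instead.
    """
    parts = urlsplit(url)
    netloc = parts.hostname or ""
    if parts.port is not None:
        netloc += ":%d" % parts.port
    bare = parts._replace(netloc=netloc).geturl()
    return bare, parts.username, parts.password
```

With these helpers, `split_auth` applied to the documented default host yields the bare URL plus the `admin` username and password, which can then be placed in `OPENSEARCH_HTTP_USERNAME` and `OPENSEARCH_HTTP_PASSWORD`.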


## Compatibility

Only tested with Plone 5 with Dexterity types.

Only compatible with versions of OpenSearch (and ElasticSearch) compatible
with the `opensearch-py` library.

For ElasticSearch integration, see [collective.elasticsearch](https://github.com/collective/collective.elasticsearch).


## State

All index column types are supported EXCEPT the DateRecurringIndex
index column type. A full text search combined with a query that
contains a DateRecurringIndex column will not work.


## Celery support

This package comes with Celery support, where all indexing operations are pushed
into Celery to be run asynchronously.

Please see the instructions for collective.celery to see how this works.


## Running tests

First, start an instance of OpenSearch.

Second, build the environment and run the tests:

```
$ virtualenv ./env
$ ./env/bin/pip install -r requirements.txt
$ ./env/bin/buildout -c buildout.cfg
$ ./bin/test
```


Changelog
=========

1.4.3 (2023-10-11)
------------------

- handle unicode for index data derived from IAdditionalIndexDataProvider adapters


1.4.2 (2023-05-15)
------------------

- abstract unicode handling code for hook when getting index data, and handle
  tuples, lists, and dict values


1.4.1 (2023-05-11)
------------------

- handle unicode error and fix bug in hook when getting index data


1.4.0 (2022-11-04)
------------------

- allow a custom prefix to be defined for fetching connection settings from the
  environment (default to the previous hard-coded 'OPENSEARCH_' value)


1.3.0 (2022-08-17)
------------------

- add HPS_FORCE_EXTERNAL_INDEXES
- update default set returned when external indexes setting is not configured yet


1.2.1 (2022-06-23)
------------------

- fix some view names in the control panel templates


1.2.0 (2022-05-25)
------------------

- add HPS_INCLUDE_TRASHED_BY_DEFAULT env for disabling a filter on searchResults
  from WildcardHPSCatalog (see readme entry for HPS_INCLUDE_TRASHED_BY_DEFAULT)


1.1.1 (2022-05-12)
------------------

- add property on wildcard.hps.opensearch.WildcardHPSCatalog for the instance prefix


1.1.0 (2022-05-12)
------------------

- initial fork from: https://github.com/collective/collective.elasticsearch/commit/d21bf7b9311a9fc923283eeff11c42f4145180b4
  this fork aims to primarily maintain compatibility with the OpenSearch project, which
  itself has forked from ElasticSearch 7.10.

            
