| Field       | Value                 |
|-------------|-----------------------|
| Name        | ckanext-search-tweaks |
| Version     | 1.0.0                 |
| Upload time | 2025-07-23 10:41:56   |
| License     | AGPL                  |
| Keywords    | ckan                  |

# ckanext-search-tweaks
Set of tools providing control over search results, sorting, etc.
## Requirements
Compatibility with core CKAN versions:
| CKAN version | Compatible? |
|-----------------|-------------|
| 2.9 and earlier | no |
| 2.10+ | yes |
## Installation
To install ckanext-search-tweaks:
1. Activate your CKAN virtual environment, for example:

        . /usr/lib/ckan/default/bin/activate

2. Install the extension into the virtualenv:

        pip install ckanext-search-tweaks

3. Add `search_tweaks` to the `ckan.plugins` setting in your CKAN
config file (by default the config file is located at
`/etc/ckan/default/ckan.ini`).
4. Restart CKAN.
## Usage
This extension consists of multiple plugins. `search_tweaks` is the main
(major) plugin and must always be enabled. Depending on which secondary
(minor) plugins are enabled, extra features and config options become
available. All plugins and their effects are listed below.
| Plugin | Functionality |
|-----------------------------------------------------------------|---------------------------------------------------------------------------------|
| [search_tweaks](#search_tweaks) | Allow all the other plugins to be enabled |
| [search_tweaks_query_relevance](#search_tweaks_query_relevance) | Promote datasets that were visited most frequently for the current search query |
| [search_tweaks_field_relevance](#search_tweaks_field_relevance) | Promote a dataset depending on the value of its field |
| [search_tweaks_spellcheck](#search_tweaks_spellcheck) | Provides "Did you mean?" feature |
### <a id="search_tweaks"></a> search_tweaks
Provides base functionality and essential pieces of logic used by all the other
plugins. Must be enabled as long as at least one other plugin from this
extension is enabled.
- Switches search to the `edismax` query parser if none was specified.
- Enables the `ckanext.search_tweaks.interfaces.ISearchTweaks` interface with the
following methods:
```python
from typing import Any, Optional

# Methods of ISearchTweaks; override them in your plugin class.

def get_search_boost_fn(self, search_params: dict[str, Any]) -> Optional[str]:
    """Return an optional boost function that will be applied to the search query."""
    return None

def get_extra_qf(self, search_params: dict[str, Any]) -> Optional[str]:
    """Return an additional fragment of Solr's qf.

    This fragment will be appended to the current qf.
    """
    return None
```
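As a rough illustration, a plugin might implement these two methods along the following lines. This is a minimal sketch, not code from the extension: the class name `RecencyTweaks`, the boost expression, and the extra `qf` fragment are assumptions chosen for the example, and the CKAN wiring is only described in comments so that the snippet runs without CKAN installed.

```python
from __future__ import annotations

from typing import Any, Optional


class RecencyTweaks:
    """Hypothetical implementation of the two ISearchTweaks methods.

    In a real extension this class would presumably also inherit from
    ckan.plugins.SingletonPlugin and declare
    ckan.plugins.implements(ISearchTweaks, inherit=True); that wiring is
    omitted here so the example has no CKAN dependency.
    """

    def get_search_boost_fn(self, search_params: dict[str, Any]) -> Optional[str]:
        # Assumption: only boost free-text searches, not the "match all" query.
        if search_params.get("q") in (None, "", "*:*"):
            return None
        # Solr function query that favours recently modified datasets.
        # `metadata_modified` is assumed to be a date field in the Solr schema.
        return "recip(ms(NOW/DAY,metadata_modified),3.16e-11,1,1)"

    def get_extra_qf(self, search_params: dict[str, Any]) -> Optional[str]:
        # Assumption: give matches in the `tags` field extra weight.
        return "tags^2"


tweaks = RecencyTweaks()
print(tweaks.get_search_boost_fn({"q": "water quality"}))
print(tweaks.get_extra_qf({"q": "water quality"}))
```

Returning `None` from either method keeps the default behaviour, which is why the sketch skips the boost for empty and `*:*` queries.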
#### CLI
    ckan search-tweaks -
        Root of all the extension-specific commands.
        Every command from minor plugins is registered under this section.
#### Config settings
```ini
# Rewrite the default value of the qf parameter sent to Solr
# (optional, default: value of ckan.lib.search.query.QUERY_FIELDS).
ckanext.search_tweaks.common.qf = title^5 text
# Search by misspelled queries.
# (optional, default: false).
ckanext.search_tweaks.common.fuzzy_search.enabled = on
# Maximum number of misspelled letters. Possible values are 1 and 2.
# (optional, default: 1).
ckanext.search_tweaks.common.fuzzy_search.distance = 2
# Use `boost` instead of `bf` when `edismax` query parser is active
# (optional, default: true).
ckanext.search_tweaks.common.prefer_boost = no
# MinimumShouldMatch used in queries
# (optional, default: 1).
ckanext.search_tweaks.common.mm = 2<-1 5<80%
# Keep original query when using fuzzy search, e.g. "(hello~2) OR (hello)" if true
# (optional, default: true).
ckanext.search_tweaks.common.fuzzy_search.keep_original = true
```
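To make two of these settings more concrete, the sketch below shows how a fuzzy query such as `(hello~2) OR (hello)` could be assembled, and how a `mm` value like `2<-1 5<80%` translates into a required number of matching clauses. Both functions are illustrative assumptions: `fuzzify` and `min_should_match` are hypothetical names, the extension's own query rewriting may differ, and the `mm` arithmetic follows one reading of Solr's documented rules rather than Solr's source.

```python
def fuzzify(query: str, distance: int = 1, keep_original: bool = True) -> str:
    """Append Solr's fuzzy operator (~N) to every term of a plain query."""
    if distance not in (1, 2):
        raise ValueError("distance must be 1 or 2")
    fuzzy = " ".join(f"{term}~{distance}" for term in query.split())
    # Optionally keep the exact query as an alternative to the fuzzy one.
    return f"({fuzzy}) OR ({query})" if keep_original else fuzzy


def min_should_match(clauses: int, spec: str) -> int:
    """Number of optional clauses that must match for a given mm spec."""
    spec = spec.strip()
    result = clauses
    if "<" in spec:
        # Conditional form: "N<rule" applies `rule` only above N clauses.
        for part in spec.split():
            bound, _, rule = part.partition("<")
            if clauses <= int(bound):
                return result
            result = min_should_match(clauses, rule)
        return result
    if spec.endswith("%"):
        # Percentages are truncated toward zero.
        calc = int(clauses * int(spec[:-1]) / 100)
    else:
        calc = int(spec)
    # Negative values mean "all clauses except this many".
    result = clauses + calc if calc < 0 else calc
    return max(0, min(clauses, result))


print(fuzzify("hello", distance=2))  # (hello~2) OR (hello)
for n in (2, 4, 10):
    print(n, min_should_match(n, "2<-1 5<80%"))
```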
---
### <a id="search_tweaks_query_relevance"></a> search_tweaks_query_relevance
Increases the relevance of datasets for a particular query depending on the
number of direct visits to the dataset after running that search. For example,
if a user searches for `something` and then visits dataset **B**, which
initially appears in the third row of search results, this dataset will
eventually move up to the second or even the first row.
This is implemented in two stages:
- In the first stage, statistics are collected and stored in Redis.
- During search, we apply Solr's boost function to scale the dataset score based on the number of visits.
#### CLI
```
relevance query export - export statistics as CSV.
relevance query import - import statistics from CSV. Note that records already in storage but
not listed in the CSV won't be removed; remove them manually.
relevance query reset - reset all the query relevance scores
```
#### Config settings
```ini
# Minimum boost to apply to a search query
# (optional, default: 1).
ckanext.search_tweaks.query_relevance.min_boost = 1
# Maximum boost to apply to a search query. Set more to promote datasets higher
# (optional, default: 1.5).
ckanext.search_tweaks.query_relevance.max_boost = 2
# Maximum number of boosts to apply to a search query
# Set more to promote more datasets at once. Note, that a higher
# number of boosts may increase the query time.
# (optional, default: 60).
ckanext.search_tweaks.query_relevance.max_boost_count = 60
```
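The interplay of `min_boost`, `max_boost` and `max_boost_count` can be pictured with a small sketch. It assumes a simple linear scaling of visit counts, which is an illustrative choice and not necessarily the formula the plugin uses; `scale_boosts` and the sample dataset IDs are hypothetical, and the step that turns these factors into a Solr boost function is left out.

```python
from __future__ import annotations


def scale_boosts(
    visits: dict[str, int],
    min_boost: float = 1.0,
    max_boost: float = 1.5,
    max_boost_count: int = 60,
) -> dict[str, float]:
    """Map per-dataset visit counts for one query to boost factors."""
    # Only the most visited datasets receive a boost at all.
    top = sorted(visits.items(), key=lambda item: item[1], reverse=True)
    top = top[:max_boost_count]
    if not top or top[0][1] <= 0:
        return {dataset_id: min_boost for dataset_id, _ in top}
    peak = top[0][1]
    # Assumed linear scaling: the most visited dataset gets max_boost.
    return {
        dataset_id: min_boost + (max_boost - min_boost) * count / peak
        for dataset_id, count in top
    }


stats = {"dataset-a": 10, "dataset-b": 5, "dataset-c": 1}
print(scale_boosts(stats, min_boost=1, max_boost=2, max_boost_count=2))
```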
---
### <a id="search_tweaks_field_relevance"></a> search_tweaks_field_relevance
Increases the relevance of a dataset depending on the value of its *numeric*
field. For now, it's impossible to promote a dataset using a field with a
textual type.

No magic here either: this plugin lets you specify a Solr boost function
that will be used during all searches. You can achieve exactly the same
result using `ISearchTweaks.get_search_boost_fn`, but this option is expected
to be used often, so it is possible to update relevance without writing any
extra code.
#### Config settings
```ini
# Solr boost function for static numeric field
# (optional, default: None).
ckanext.search_tweaks.field_relevance.boost_function = pow(promoted_level,2)
# Field with dataset promotion level
# (optional, default: promotion_level).
ckanext.search_tweaks.field_relevance.blueprint.promotion.field_name = promotion
# Register package promotion route
# (optional, default: False).
ckanext.search_tweaks.field_relevance.blueprint.promotion.enabled = true
```
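The effect of a boost function such as `pow(promoted_level,2)` depends on whether it is applied through eDisMax's multiplicative `boost` parameter or the additive `bf` parameter (see `ckanext.search_tweaks.common.prefer_boost`). The sketch below is a simplified model of that difference, assuming the base score is a single number; real Solr scoring has more moving parts, and `apply_boost` is a hypothetical helper.

```python
def apply_boost(score: float, promoted_level: float, prefer_boost: bool = True) -> float:
    """Combine a base relevance score with pow(promoted_level,2)."""
    fn_value = promoted_level ** 2
    # `boost` multiplies the score, `bf` adds to it.
    return score * fn_value if prefer_boost else score + fn_value


for level in (0, 1, 3):
    multiplied = apply_boost(2.0, level, prefer_boost=True)
    added = apply_boost(2.0, level, prefer_boost=False)
    print(f"level={level} boost={multiplied} bf={added}")
```

In this model a multiplicative boost of zero removes the score entirely, so a function used with `boost` probably should not evaluate to zero for datasets that are simply not promoted.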
#### Auth functions
    search_tweaks_field_relevance_promote: access package promotion route. Calls `package_update` by default.
---
### <a id="search_tweaks_spellcheck"></a> search_tweaks_spellcheck
Exposes search suggestions from Solr's spellcheck component to CKAN
templates. This plugin doesn't do much itself and relies mainly on Solr's
built-in functionality, so you have to make several changes inside Solr in
order to use it:
- `solrconfig.xml`. Configure the spellcheck component. Find the `<searchComponent
name="spellcheck" class="solr.SpellCheckComponent">` section and add the
following item under it:
```xml
<lst name="spellchecker">
<str name="name">did_you_mean</str>
<str name="field">did_you_mean</str>
<str name="buildOnCommit">false</str>
</lst>
```
- Add a cron job that updates the suggestions dictionary periodically:
```sh
ckan search-tweaks spellcheck rebuild
```
- `solrconfig.xml`. Add the spellcheck component to the search handler (`<requestHandler
name="/select" class="solr.SearchHandler">`):
```xml
<arr name="last-components">
<str>spellcheck</str>
</arr>
```
- Define the spellcheck field in the schema. If you want to use an existing
field (`text`, for example), change the `<str name="field">did_you_mean</str>`
value inside `solrconfig.xml` to the name of the selected field instead.
```xml
<field name="did_you_mean" type="textgen" indexed="true" multiValued="true" />
```
- **Note:** skip this step if you've decided to use an existing field in the previous step.
<br/>
Copy meaningful values into this field:
```xml
<copyField source="title" dest="did_you_mean"/>
<copyField source="notes" dest="did_you_mean"/>
<copyField source="res_name" dest="did_you_mean"/>
<copyField source="res_description" dest="did_you_mean"/>
<copyField source="extras_*" dest="did_you_mean"/>
```
After that, restart the Solr service and rebuild the search index:
```sh
ckan search-index rebuild
```
Now you can use the `spellcheck_did_you_mean` template helper, which returns a
better search query than the current one when available. Consider including the
`search_tweaks/did_you_mean.html` fragment under the search form.
#### Config settings
```ini
# Do not show suggestions that have fewer results than current query
# (optional, default: true).
ckanext.search_tweaks.spellcheck.more_results_only = off
# How many different suggestions you expect to see for query
# (optional, default: 1).
ckanext.search_tweaks.spellcheck.max_suggestions = 3
```
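One way to picture these two options is as a filter over candidate queries. The sketch below is an assumption about their combined effect, not the plugin's code: `pick_suggestions` is a hypothetical helper, the hit counts are made-up sample data, and ordering suggestions by number of results is an illustrative choice.

```python
from __future__ import annotations


def pick_suggestions(
    current_hits: int,
    candidates: list[tuple[str, int]],
    more_results_only: bool = True,
    max_suggestions: int = 1,
) -> list[str]:
    """Choose which suggested queries to show for the current search."""
    kept = [
        (query, hits)
        for query, hits in candidates
        # Drop suggestions with fewer results than the current query.
        if not more_results_only or hits >= current_hits
    ]
    # Assumed ordering: suggestions with the most results come first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return [query for query, _ in kept[:max_suggestions]]


candidates = [("water", 40), ("waiter", 2), ("wafer", 7)]
print(pick_suggestions(5, candidates, more_results_only=True, max_suggestions=3))
print(pick_suggestions(5, candidates, more_results_only=False, max_suggestions=3))
```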
#### CLI
    spellcheck rebuild - rebuild/reload spellcheck dictionary.
---
## Developer installation
To install ckanext-search-tweaks for development, activate your CKAN virtualenv and
do:
```sh
git clone https://github.com/DataShades/ckanext-search-tweaks.git
cd ckanext-search-tweaks
python setup.py develop
pip install -r dev-requirements.txt
```
## Tests
Apart from the default configuration for CKAN testing, you have to create a
Solr core named `ckan_search_tweaks`, replace its schema with
`ckanext/search_tweaks/tests/schema.xml`, and make the changes to
`solrconfig.xml` that are required by `search_tweaks_spellcheck`.
To run the tests, do:

    pytest --ckan-ini=test.ini ckanext/search_tweaks/tests

## License
[AGPL](https://www.gnu.org/licenses/agpl-3.0.en.html)
Raw data
{
"_id": null,
"home_page": null,
"name": "ckanext-search-tweaks",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": "DataShades <datashades@linkdigital.com.au>",
"keywords": "CKAN",
"author": null,
"author_email": "DataShades <datashades@linkdigital.com.au>, Sergey Motornyuk <sergey.motornyuk@linkdigital.com.au>",
"download_url": "https://files.pythonhosted.org/packages/d4/56/8520f9adec0b47b7ca642917dbc07fe692e77a5fce9edd15d43925720623/ckanext_search_tweaks-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "AGPL",
"summary": null,
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/DataShades/ckanext-search-tweaks"
},
"split_keywords": [
"ckan"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "9901570f3ccc20b13a1321e508cf01a850fee1111586584aac659cbf82abf789",
"md5": "228d4ed8a417ae11418cbbfd50510b32",
"sha256": "d2d6f826c3f1f4fdc39bb6f1d6017773511c3199fa0527edf0b5fdb7728e9d91"
},
"downloads": -1,
"filename": "ckanext_search_tweaks-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "228d4ed8a417ae11418cbbfd50510b32",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 51204,
"upload_time": "2025-07-23T10:41:55",
"upload_time_iso_8601": "2025-07-23T10:41:55.293816Z",
"url": "https://files.pythonhosted.org/packages/99/01/570f3ccc20b13a1321e508cf01a850fee1111586584aac659cbf82abf789/ckanext_search_tweaks-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d4568520f9adec0b47b7ca642917dbc07fe692e77a5fce9edd15d43925720623",
"md5": "d08d72341dfe1cac0e5fe01d541ebb62",
"sha256": "a76f67012717f46c2276145a4ad752f3b1c1578d399c406d43b0131ce3e6f175"
},
"downloads": -1,
"filename": "ckanext_search_tweaks-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "d08d72341dfe1cac0e5fe01d541ebb62",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 42566,
"upload_time": "2025-07-23T10:41:56",
"upload_time_iso_8601": "2025-07-23T10:41:56.927273Z",
"url": "https://files.pythonhosted.org/packages/d4/56/8520f9adec0b47b7ca642917dbc07fe692e77a5fce9edd15d43925720623/ckanext_search_tweaks-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-23 10:41:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "DataShades",
"github_project": "ckanext-search-tweaks",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"requirements": [],
"lcname": "ckanext-search-tweaks"
}