| Field | Value |
| --------------- | --------------------------- |
| Name | horsebox |
| Version | 0.7.0 |
| Summary | You Know, for local Search. |
| Upload time | 2025-07-18 17:29:00 |
| Requires Python | <3.14,>=3.9 |
| Keywords | cli, search, tantivy |
# Horsebox
A versatile and autonomous command line tool for search.
[Python Tests](https://github.com/michelcaradec/horsebox/actions/workflows/python-tests.yml)
<details>
<summary>Table of contents</summary>
- [Abstract](#abstract)
- [TL;DR](#tldr)
- [Requirements](#requirements)
- [Tool Installation](#tool-installation)
- [Project Setup](#project-setup)
  - [Python Environment](#python-environment)
- [Usage](#usage)
  - [Naming Conventions](#naming-conventions)
  - [Getting Help](#getting-help)
  - [Rendering](#rendering)
  - [Searching](#searching)
  - [Building An Index](#building-an-index)
  - [Refreshing An Index](#refreshing-an-index)
  - [Inspecting An Index](#inspecting-an-index)
  - [Analyzing Some Text](#analyzing-some-text)
- [Concepts](#concepts)
  - [Collectors](#collectors)
    - [Raw Collector](#raw-collector)
    - [Guess Collector](#guess-collector)
    - [Collectors Usage Matrix](#collectors-usage-matrix)
  - [Index](#index)
  - [Strategies](#strategies)
- [Annexes](#annexes)
  - [Project Bootstrap](#project-bootstrap)
  - [Unit Tests](#unit-tests)
  - [Manual Testing In Docker](#manual-testing-in-docker)
  - [Samples](#samples)
    - [Advanced Searches](#advanced-searches)
  - [Using A Custom Analyzer](#using-a-custom-analyzer)
    - [Custom Analyzer Definition](#custom-analyzer-definition)
    - [Custom Analyzer Limitations](#custom-analyzer-limitations)
  - [Configuration](#configuration)
  - [Where Does This Name Come From](#where-does-this-name-come-from)
</details>
## Abstract
Everybody has faced, at least once, a situation where they needed to search for some information, whether in a project folder or any other place containing information of interest.
[Horsebox](#where-does-this-name-come-from) is a tool whose purpose is to offer such a search feature from the command line (thanks to the full-text search engine library [Tantivy](https://github.com/quickwit-oss/tantivy)), without any external dependencies.
While it was built with a developer persona in mind, it can be used by anybody who is not afraid of typing a few characters in a terminal ([samples](#samples) are here to guide you).
Disclaimer: this tool was tested on Linux (Ubuntu, Debian) and macOS only.
## TL;DR
*For the ones who want to go **straight** to the point.*
```bash
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
# Install Horsebox
uv tool install horsebox
```
You are ready to [search](#searching).
## Requirements
All the commands described in this project rely on the Python package and project manager [uv](https://docs.astral.sh/uv/).
1. Install uv:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
2. Or update it:
```bash
uv self update
```
## Tool Installation
*For the ones who just want to **use** the tool.*
1. Install the tool:
- From PyPI:
```bash
uv tool install horsebox
```
- From the online GitHub project:
```bash
uv tool install git+https://github.com/michelcaradec/horsebox
```
2. [Use](#usage) the tool.
## Project Setup
*For the ones who want to **develop** on the project.*
### Python Environment
1. Clone the project:
```bash
git clone https://github.com/michelcaradec/horsebox.git
cd horsebox
```
2. Create a Python virtual environment:
```bash
uv sync
# Install the development requirements
uv sync --extra dev
# Activate the environment
source .venv/bin/activate
```
3. Check the tool execution:
```bash
uv run horsebox
```
Alternate commands:
- `uv run hb`.
- `uv run ./src/horsebox/main.py`.
- `python ./src/horsebox/main.py`.
4. The tool can also be installed from the local project with the command:
```bash
uv tool install --editable .
```
5. [Use](#usage) the tool.
## Usage
### Naming Conventions
The following terms are used:
- **Datasource**: the place where the information will be collected from. It can be a folder, a web page, an RSS feed, etc.
- **Container**: the "box" containing the information. It can be a file, a web page, an RSS article, etc.
- **Content**: the information contained in a container. It is mostly text, but can also be a date of last update for a file.
- **[Collector](#collectors)**: a working unit in charge of gathering information and converting it into searchable documents.
### Getting Help
To list the available commands:
```bash
hb --help
```
To get help for a given command (here `search`):
```bash
hb search --help
```
### Rendering
For any command, the option `--format` specifies the output format:
- `txt`: text mode (default).
- `json`: JSON. The shortcut option `--json` can also be used.
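For example, a sketch that renders search results as JSON (reusing the `demo` example from the [Searching](#searching) section below):
```bash
# Render the search results as JSON (equivalent to --format json)
hb search --from ./demo/ --pattern "*.txt" --query "better" --json
```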
### Searching
The query string syntax, specified with the option `--query`, is the one supported by the [Tantivy's query parser](https://docs.rs/tantivy/latest/tantivy/query/struct.QueryParser.html).
Example: search in text files (with extension `.txt`) under the folder `demo`.
```bash
hb search --from ./demo/ --pattern "*.txt" --query "better" --highlight
```
Options used:
- `--from`: folder to (recursively) index.
- `--pattern`: files to index.
**Attention!** The pattern must be enclosed in quotes to prevent wildcard expansion.
- `--query`: search query.
- `--highlight`: shows the places where the result was found in the content of the files.
One result is returned, as there is only one document (i.e. container) in the index.
A different [collector](#collectors) can be used to index line by line:
```bash
hb search --from ./demo/ --pattern "*.txt" --using fileline --query "better" --highlight --limit 5
```
Options used:
- `--using`: collector to use for indexing.
- `--limit`: maximum number of results to return (default is 10).
The option `--count` can be added to show the total number of results found:
```bash
hb search --from ./demo/ --pattern "*.txt" --using fileline --query "better" --count
```
*See the section [samples](#samples) for advanced usage.*
### Building An Index
Example: build an index `.index-demo` from the text files (with extension `.txt`) under the folder `demo`.
```bash
hb build --from ./demo/ --pattern "*.txt" --index ./.index-demo
```
Options used:
- `--from`: folder to (recursively) index.
- `--pattern`: files to index.
**Attention!** The pattern must be enclosed in quotes to prevent wildcard expansion.
- `--index`: location where to persist the index.
By default, the [collector](#collectors) `filecontent` is used.
An alternate collector can be specified with the option `--using`.
The option `--dry-run` can be used to show the items to be indexed, without creating the index.
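A minimal dry-run sketch, reusing the build command above:
```bash
# List the items that would be indexed, without writing the index
hb build --from ./demo/ --pattern "*.txt" --index ./.index-demo --dry-run
```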
The built index can be searched:
```bash
hb search --index ./.index-demo --query "better" --highlight
```
Searching on a persisted index will trigger a warning if the age of the index (i.e. the time elapsed since it was built) goes over a given threshold (which can be [configured](#configuration)).
The index can be [refreshed](#refreshing-an-index) to contain the most up-to-date data.
### Refreshing An Index
A built index can be refreshed to contain the most up-to-date data.
Example: refresh the index `.index-demo` [previously built](#building-an-index).
```bash
hb refresh --index ./.index-demo
```
There are cases where an index can't be refreshed:
- The index was built with a version prior to `0.4.0`.
- The index data source was provided by pipe (see the section [Collectors Usage Matrix](#collectors-usage-matrix)).
### Inspecting An Index
To get technical information on an existing index:
```bash
hb inspect --index ./.index-demo
```
To get the most frequent keywords (option `--top`):
```bash
hb search --index ./.index-demo --top
```
### Analyzing Some Text
**Attention!** Version `0.7.0` introduced a [new option](#using-a-custom-analyzer) `--analyzer`, which replaces the legacy ones (`--tokenizer`, `--tokenizer-params`, `--filter` and `--filter-params`). Even though the use of this new option is strongly recommended, the legacy options are still available with the command `analyze`.
The command `analyze` is used to play with the [tokenizers](https://docs.rs/tantivy/latest/tantivy/tokenizer/trait.Tokenizer.html) and [filters](https://docs.rs/tantivy/latest/tantivy/tokenizer/trait.TokenFilter.html) supported by Tantivy to index documents.
To tokenize a text:
```bash
hb analyze \
--text "Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust." \
--tokenizer whitespace
```
To filter a text:
```bash
hb analyze \
--text "Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust." \
--filter lowercase
```
*Multiple examples can be found in the script [usage.sh](./demo/usage.sh).*
## Concepts
Horsebox has been built around a few concepts:
- [Collectors](#collectors).
- [Index](#index).
Understanding them will help in choosing the right usage [strategy](#strategies).
### Collectors
A collector is in charge of **gathering information** from a given **datasource**, and returning **documents** to [index](#index).
It acts as a level of abstraction, which returns documents to be ingested.
Horsebox supports different types of collectors:
| Collector | Description |
| ------------- | --------------------------------------------------------------- |
| `filename` | One document per file, containing the name of the file only. |
| `filecontent` | One document per file, with the content of the file (default). |
| `fileline` | One document per line and per file. |
| `rss` | RSS feed, one document per article. |
| `html` | Collect the content of an HTML page. |
| `raw` | Collect ready to index [JSON documents](#raw-collector). |
| `pdf` | Collect the content of a PDF document. |
| `guess` | Used to identify the [best collector](#guess-collector) to use. |
The collector to use is specified with the option `--using`.
The default collector is `filecontent`.
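As an illustration, a sketch selecting the `filename` collector to index file names only (reusing the `demo` folder; the query term is arbitrary):
```bash
# Index and search file names only, ignoring the content of the files
hb search --from ./demo/ --pattern "*.txt" --using filename --query "demo"
```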
*See the script [usage.sh](./demo/usage.sh) for sample commands.*
#### Raw Collector
The collector `raw` can be used to collect ready-to-index JSON documents.
Each document must have the following fields [^4]:
- `name` (`text`): name of the [container](#naming-conventions).
- `type` (`text`): type of the container.
- `content` (`text`): content of the container.
- `path` (`text`): full path to the content.
- `size` (`integer`): size of the content.
- `date` (`text`): date-time of the content (formatted as `YYYY-mm-dd H:M:S`, for example `2025-03-14 12:34:56`).
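A minimal sketch of such a document, with hypothetical values (here an array containing a single object):
```json
[
  {
    "name": "notes.txt",
    "type": "file",
    "content": "Tantivy is a full-text search engine library written in Rust.",
    "path": "/home/user/notes.txt",
    "size": 61,
    "date": "2025-03-14 12:34:56"
  }
]
```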
The JSON file can contain either an **array** of JSON objects (default), or one JSON object per **line** ([JSON Lines](https://jsonlines.org/) format).
The JSON Lines format is automatically detected from the file extension (`.jsonl` or `.ndjson`).
The option `--jsonl` can be used to **force** the detection (this is required, for example, when the data source is provided by pipe).
Some examples can be found with the files [raw.json](./demo/raw.json) (array of objects) and [raw.jsonl](./demo/raw.jsonl) (JSON Lines).
[^4]: Run the command `hb schema` for a full description.
#### Guess Collector
*Disclaimer: starting with version `0.5.0`.*
The collector `guess` can be used to identify the best collector to use.
The detection is done on a [best-effort](#collectors-usage-matrix) basis from the options `--from` and `--pattern`.
An error will be returned if no collector could be guessed.
The collector `guess` is used by default, meaning that the option `--using` can be skipped.
Examples:
```bash
hb search --from "https://planetpython.org/rss20.xml" --query "some text" --using rss
# Can be simplified as (guess from the https scheme and the extension .xml)
hb search --from "https://planetpython.org/rss20.xml" --query "some text"
```
```bash
hb search --from ./raw.json --query "some text" --using raw
# Can be simplified as (guess from the file extension .json)
hb search --from ./raw.json --query "some text"
```
```bash
hb search --from ./raw.jsonl --query "some text" --using raw --jsonl
# Can be simplified as (guess from the file extension .jsonl)
hb search --from ./raw.jsonl --query "some text"
```
This feature is mainly for command line usage, to help reduce the number of keystrokes.
When used in a script, it is advised to explicitly set the required collector with the option `--using`.
#### Collectors Usage Matrix
The following table shows the options supported by each collector.
| Collector | Multi-Sources Mode | Single Source Mode | Pipe Support |
| ------------- | -------------------------------- | ------------------ | ------------------------------ |
| `filename` | `--from $folder --pattern *.xxx` | - | - |
| `filecontent` | `--from $folder --pattern *.xxx` | - | `--from - --using filecontent` |
| `fileline` | `--from $folder --pattern *.xxx` | - | `--from - --using fileline` |
| `rss` | - | `--from $feed` | - |
| `html` | - | `--from $page` | - |
| `raw` | - | `--from $json` | `--from - --using raw` |
| `pdf` | `--from $folder --pattern *.pdf` | `--from $file.pdf` | - |
*`-`: not supported.*
These options are also used by the [guess collector](#guess-collector) in its detection.
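For instance, a sketch of the pipe support of the collector `raw` (assuming the demo file `raw.jsonl` and an arbitrary query term; `--jsonl` is needed because the JSON Lines format cannot be detected from a pipe):
```bash
# Feed JSON Lines documents to the raw collector through a pipe
cat ./demo/raw.jsonl | hb search --from - --using raw --jsonl --query "engine"
```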
### Index
The index is the place where the [collected](#collectors) information lies. It is required to allow the search.
An index is built with the help of [Tantivy](https://github.com/quickwit-oss/tantivy) (a full-text search engine library), and can be either stored in **memory** or persisted on **disk** (see the section [strategies](#strategies)).
### Strategies
Horsebox can be used in different ways to achieve the goal of searching (and hopefully finding) some information.
- One-step search:
Index and [search](#searching), with **no** index **retention**.
This fits an **unstable** source of information, with frequent changes.
```bash
hb search --from ./demo/ --pattern "*.txt" --query "better" --highlight
```
- Two-step search:
[Build](#building-an-index) and persist an index, then [search](#searching) in the existing index.
This fits a **stable** and **voluminous** (i.e. long to index) source of information.
Build the index once:
```bash
hb build --from ./demo/ --pattern "*.txt" --index ./.index-demo
```
Then search it (multiple times):
```bash
hb search --index ./.index-demo --query "better" --highlight
```
- All-in-one search:
Like a two-step search, but in **one step**.
For the ones who want to do everything in a single command.
```bash
hb search --from ./demo/ --pattern "*.txt" --index ./.index-demo --query "better" --highlight
```
The use of the options `--from` and `--index` with the command `search` will [build and persist](#building-an-index) an index, which will be immediately [searched](#searching), and will also be available for future searches.
## Annexes
### Project Bootstrap
The project was created with the command:
```bash
# Will create a directory `horsebox`
uv init --app --package --python 3.10 horsebox
```
### Unit Tests
The Python module [doctest](https://docs.python.org/3.10/library/doctest.html) has been used to write some unit tests:
```bash
python -m doctest -v ./src/**/*.py
```
### Manual Testing In Docker
Horsebox can be installed in a fresh environment to demonstrate its straightforward setup:
```bash
# From the project
docker run --interactive --tty --name horsebox --volume=$(pwd):/home/project --rm debian:stable /bin/bash
# Alternative: Docker image with OhMyZsh (for colors)
docker run --interactive --tty --name horsebox --volume=$(pwd):/home/project --rm ohmyzsh/ohmyzsh:main
# Install a few dependencies
source /home/project/demo/docker-setup.sh
# Install Horsebox
uv tool install .
```
### Samples
The script [usage.sh](./demo/usage.sh) contains multiple sample commands:
```bash
bash ./demo/usage.sh
```
#### Advanced Searches
The query string syntax conforms to [Tantivy's query parser](https://docs.rs/tantivy/latest/tantivy/query/struct.QueryParser.html).
- Search on multiple datasources:
Multiple datasources can be collected to build/search an index by repeating the option `--from`.
```bash
hb search \
--from "https://www.blog.pythonlibrary.org/feed/" \
--from "https://planetpython.org/rss20.xml" \
--from "https://realpython.com/atom.xml?format=xml" \
--using rss --query "duckdb" --highlight
```
*Source: [Top 60 Python RSS Feeds](https://rss.feedspot.com/python_rss_feeds/).*
- Search on date:
A date must be formatted using the [RFC3339](https://en.wikipedia.org/wiki/ISO_8601) standard.
Example: `2025-01-01T10:00:00.00Z`.
The field `date` must be specified, and the date must be enclosed in single quotes:
```bash
hb search --from ./demo/raw.json --using raw --query "date:'2025-01-01T10:00:00.00Z'"
```
- Search on range of dates:
**Inclusive boundaries** are specified with square brackets (`[` `]`):
```bash
hb search --from ./demo/raw.json --using raw --query "date:[2025-01-01T10:00:00.00Z TO 2025-01-04T10:00:00.00Z]"
```
**Exclusive boundaries** are specified with curly brackets (`{` `}`):
```bash
hb search --from ./demo/raw.json --using raw --query "date:{2025-01-01T10:00:00.00Z TO 2025-01-04T10:00:00.00Z}"
```
Inclusive and exclusive boundaries can be **mixed**:
```bash
hb search --from ./demo/raw.json --using raw --query "date:[2025-01-01T10:00:00.00Z TO 2025-01-04T10:00:00.00Z}"
```
- Fuzzy search:
Fuzzy search is not supported by Tantivy's query parser [^6].
Horsebox comes with a simple implementation, which supports expressing a fuzzy search on a **single word**.
Example: the search `engne~` will find the word "engine", as it differs by 1 change according to the [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) measure.
The distance can be set after the marker `~`, with a maximum of 2: `engne~1`, `engne~2`.
```bash
hb search --from ./demo/raw.json --using raw --query "engne~1"
```
**Attention!** The highlight (option `--highlight`) will not work [^5].
- Proximity search:
The two words to search are enclosed in single quotes, followed by the maximum distance.
```bash
hb search --from ./demo/raw.json --using raw --query "'engine inspired'~1" --highlight
```
*Will find all documents where the words "engine" and "inspired" are separated by a maximum of 1 word.*
[^5]: See <https://github.com/quickwit-oss/tantivy/issues/2576>.
[^6]: Even though Tantivy implements it with [FuzzyTermQuery](https://docs.rs/tantivy/latest/tantivy/query/struct.FuzzyTermQuery.html).
### Using A Custom Analyzer
*Disclaimer: starting with version `0.7.0`.*
By default, the [content of a container](#naming-conventions) is indexed in the [field](#raw-collector) `content` using the [default](https://docs.rs/tantivy/latest/tantivy/tokenizer/#default) [text analyzer](https://docs.rs/tantivy/latest/tantivy/tokenizer/), which splits the text on every white space and punctuation character [^8], removes words (a.k.a. tokens) that are longer than 40 characters [^9], and lowercases the text [^10].
While this text analyzer fits most cases, it may not be suitable for more specific content such as code.
The option `--analyzer` can be used with the commands `build` and `search` to apply a custom tokenizer and filters to the content to be indexed.
The [definition of the custom analyzer](#custom-analyzer-definition) is described in a JSON file.
The analyzed content will be indexed in an extra field `custom`.
To build an index `.index-analyzer` with a custom analyzer `analyzer-python.json`:
```bash
hb build \
--index .index-analyzer \
--from ./demo --pattern "*.py" \
--using fileline \
--analyzer ./demo/analyzer-python.json
```
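A hypothetical follow-up search against that index, assuming the same analyzer definition is passed so the query is analyzed consistently (the query term is just an example):
```bash
# Search the index built above with the same custom analyzer
hb search \
    --index .index-analyzer \
    --analyzer ./demo/analyzer-python.json \
    --query "def"
```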
A full set of examples can be found in the script [usage.sh](./demo/usage.sh).
#### Custom Analyzer Definition
The custom analyzer definition is described in a JSON file.
It is composed of two parts:
- `tokenizer`: the tokenizer to use to split the content. There must be one and only one tokenizer.
- `filters`: the filters to use to transform and select the tokenized content. There can be zero or more filters.
```json
{
    "tokenizer": {
        "$tokenize_type": {...}
    },
    "filters": [
        {
            "$filter_type": {...}
        },
        {
            "$filter_type": {...}
        }
    ]
}
```
Each object `$tokenize_type` and `$filter_type` may contain extra configuration fields.
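As a sketch, the definition below combines a `whitespace` tokenizer with a `lowercase` filter; it assumes that the type names accepted in the file match the tokenizer and filter names used by the command `analyze`, and that types without parameters take an empty object:
```json
{
    "tokenizer": {
        "whitespace": {}
    },
    "filters": [
        {
            "lowercase": {}
        }
    ]
}
```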
The file [analyzer-schema.json](./demo/analyzer-schema.json) is a [JSON Schema](https://json-schema.org/) which can be used to **validate** any custom analyzer definition.
The site [JSON Editor Online](https://jsoneditoronline.org/) offers a [playground](https://jsoneditoronline.org/indepth/validate/json-schema-validator/#Try_it_out) to test it from your browser.
The Python library [jsonschema](https://pypi.org/project/jsonschema/) provides an implementation of JSON Schema validation.
#### Custom Analyzer Limitations
- When a custom analyzer is defined, the [highlight](#searching) is done on the field `custom`.
- The tokenizer [regex](https://docs.rs/tantivy/latest/tantivy/tokenizer/struct.RegexTokenizer.html) uses the pattern syntax supported by the [Regex](https://docs.rs/tantivy-fst/latest/tantivy_fst/struct.Regex.html) implementation.
- The option `--top` is not applied to the field `custom`, because the [fast](https://docs.rs/tantivy/latest/tantivy/fastfield/) mode required for aggregation is not compatible with the tokenizer [regex](https://docs.rs/tantivy/latest/tantivy/tokenizer/struct.RegexTokenizer.html).
[^8]: Using the tokenizer [simple](https://docs.rs/tantivy/latest/tantivy/tokenizer/struct.SimpleTokenizer.html).
[^9]: Using the filter [remove_long](https://docs.rs/tantivy/latest/tantivy/tokenizer/struct.RemoveLongFilter.html).
[^10]: Using the filter [lowercase](https://docs.rs/tantivy/latest/tantivy/tokenizer/struct.LowerCaser.html).
### Configuration
Horsebox can be configured through **environment variables**:
| Setting | Description | Default Value |
| ------------------------ | ---------------------------------------------------------------------------- | ------------: |
| `HB_INDEX_BATCH_SIZE` | Batch size when indexing. | 1000 |
| `HB_HIGHLIGHT_MAX_CHARS` | Maximum number of characters to show for highlights. | 200 |
| `HB_PARSER_MAX_LINE` | Maximum size of a line in a container (unlimited if null). | |
| `HB_PARSER_MAX_CONTENT` | Maximum size of a container (unlimited if null). | |
| `HB_RENDER_MAX_CONTENT` | Maximum size of a document content to render (unlimited if null). | |
| `HB_INDEX_EXPIRATION` | Index freshness threshold (in seconds). | 3600 |
| `HB_CUSTOM_STOPWORDS` | Custom list of stop-words (separated by a comma). | |
| `HB_STRING_NORMALIZE` | Normalize strings [^7] when reading files (0=disabled, other value=enabled). | 1 |
| `HB_TOP_MIN_CHARS` | Minimum number of characters of a top keyword. | 1 |
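A minimal sketch of overriding a setting for a single run (the value is arbitrary):
```bash
# Show longer highlight excerpts for this invocation only
HB_HIGHLIGHT_MAX_CHARS=500 hb search --from ./demo/ --pattern "*.txt" --query "better" --highlight
```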
To get help on configuration:
```bash
hb config
```
*The default and current values are displayed.*
[^7]: The normalization of a string consists of replacing accented characters with their non-accented equivalents, and converting Unicode-escaped characters. This is a CPU-intensive process, which may not be required for some datasources.
### Where Does This Name Come From
I had some requirements to find a name:
- Short and easy to remember.
- Preferably a compound one, so it could be shortened at the command line to the first letters of each part.
- Connected to Tantivy, whose logo is a rider on a horse.
I then remembered the nickname of a very good friend I met during my studies in Cork, Ireland: "Horsebox".
That was it: the name would be "Horsebox", with its easy-to-type shortcut "hb".