busca-py


- Name: busca-py
- Version: 2.3.1
- Summary: Library to search for files with content that most closely match the lines of a reference string
- Upload time: 2023-10-31 17:01:50
- Requires Python: >=3.7
- Keywords: file, match, closest

# busca

[![CICD](https://github.com/noahbaculi/busca/actions/workflows/cicd.yml/badge.svg)](https://github.com/noahbaculi/busca/actions/workflows/cicd.yml)
[![PyPI version](https://badge.fury.io/py/busca-py.svg)](https://badge.fury.io/py/busca-py)

<img src="https://github.com/noahbaculi/busca/assets/49008873/443ead58-ff6f-4e16-982d-ba57096a6068" alt="busca logo" width="200">

CLI and library to search for files whose content most closely matches the lines of a reference string.

![Busca Demo](https://github.com/noahbaculi/busca/assets/49008873/dbb40dc1-427e-4d55-839b-31e8c287bc43)

## Table of Contents

- [busca](#busca)
  - [Table of Contents](#table-of-contents)
  - [Python Library](#python-library)
  - [Command Line Interface](#command-line-interface)
    - [CLI Usage](#cli-usage)
      - [Examples](#examples)
        - [Find files that most closely match the source `file_5.py` file in a search directory](#find-files-that-most-closely-match-the-source-file_5py-file-in-a-search-directory)
        - [Find files that most closely match the source `path_to_reference.json` file in a search directory](#find-files-that-most-closely-match-the-source-path_to_referencejson-file-in-a-search-directory)
        - [Change search to scan the current working directory](#change-search-to-scan-the-current-working-directory)
        - [Narrow search to only consider `.json` files whose paths include the substring "foo" and that contain fewer than 1,000 lines](#narrow-search-to-only-consider-json-files-whose-paths-include-the-substring-foo-and-that-contain-fewer-than-1000-lines)
        - [Piped input mode to search the output of a command](#piped-input-mode-to-search-the-output-of-a-command)
    - [CLI Installation](#cli-installation)
      - [Mac OS](#mac-os)
        - [Homebrew](#homebrew)
      - [All platforms (Windows, MacOS, Linux)](#all-platforms-windows-macos-linux)
        - [Compile from source](#compile-from-source)

## Python Library

> 🐍 The Python library is named `busca_py` due to a name conflict with an [existing (possibly abandoned) project](https://pypi.org/project/Busca/).

```shell
pip install busca_py
```

```python
import busca_py as busca


reference_file_path = "./sample_dir_hello_world/file_1.py"
with open(reference_file_path, "r") as file:
    reference_string = file.read()

# Perform search with required parameters
all_file_matches = busca.search_for_lines(
    reference_string=reference_string,
    search_path="./sample_dir_hello_world",
)

# File matches are returned in descending order of percent match
closest_file_match = all_file_matches[0]
assert closest_file_match.path == reference_file_path
assert closest_file_match.percent_match == 1.0
assert closest_file_match.lines == reference_string

# Perform search for top 5 matches with additional filters
# to speed up runtime by skipping files that will not match
relevant_file_matches = busca.search_for_lines(
    reference_string=reference_string,
    search_path="./sample_dir_hello_world",
    max_lines=10_000,
    include_globs=["*.py"],
    count=5,
)

assert len(relevant_file_matches) < len(all_file_matches)

# Create new file match object
new_file_match = busca.FileMatch("file/path", 1.0, "file\ncontent")
```
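Because each result exposes `path` and `percent_match` (as used in the assertions above), results can be post-processed with ordinary Python. The helper below is a hypothetical sketch and not part of `busca_py`; it keeps only matches at or above a threshold and works on any objects that carry those two attributes.

```python
from typing import Iterable, List, Tuple


def paths_above_threshold(
    matches: Iterable, threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return (path, percent_match) pairs whose score is at least `threshold`.

    Hypothetical helper, not part of busca_py. It relies only on the `path`
    and `percent_match` attributes shown in the example above.
    """
    return [
        (match.path, match.percent_match)
        for match in matches
        if match.percent_match >= threshold
    ]
```

For instance, `paths_above_threshold(all_file_matches, 0.9)` would list only the files that are near-duplicates of the reference string.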

## Command Line Interface

### CLI Usage

🧑‍💻️ To see usage documentation, run

```shell
busca -h
```

Output for v2.1.3

```text
Simple utility to search for files with content that most closely match the lines of a reference string

Usage: busca --ref-file-path <REF_FILE_PATH> [OPTIONS]
       <SomeCommand> | busca [OPTIONS]

Options:
  -r, --ref-file-path <REF_FILE_PATH>  Local or absolute path to the reference comparison file. Overrides any piped input
  -s, --search-path <SEARCH_PATH>      Directory or file in which to search. Defaults to CWD
  -m, --max-lines <MAX_LINES>          The number of lines to consider when comparing files. Files with more lines will be skipped [default: 10000]
  -i, --include-glob <INCLUDE_GLOB>    Globs that qualify a file for comparison
  -x, --exclude-glob <EXCLUDE_GLOB>    Globs that disqualify a file from comparison
  -c, --count <COUNT>                  Number of results to display [default: 10]
  -h, --help                           Print help
  -V, --version                        Print version
```

#### Examples

##### Find files that most closely match the source `file_5.py` file in a search directory

```shell
❯ busca --ref-file-path sample_dir_mix/file_5.py --search-path sample_dir_mix

? Select a file to compare:  
  sample_dir_mix/file_5.py                  ++++++++++  100.0%
> sample_dir_mix/file_5v2.py                ++++++++++   97.5%
  sample_dir_mix/nested_dir/file_7.py       ++++         42.3%
  sample_dir_mix/aldras/aldras_settings.py  ++           24.1%
  sample_dir_mix/aldras/aldras_core.py      ++           21.0%
  sample_dir_mix/file_3.py                  +            13.2%
  sample_dir_mix/file_1.py                  +            11.0%
  sample_dir_mix/file_2.py                  +             9.4%
  sample_dir_mix/aldras/aldras_execute.py   +             7.5%
  sample_dir_mix/file_4.py                  +             6.9%
[↑↓ to move, enter to select, type to filter]
```
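The percentages in this listing are line-similarity scores. For intuition only, the sketch below computes a comparable line-based ratio with Python's standard `difflib` module. It is an illustration under assumptions, not busca's implementation, and busca's scores may differ from what `difflib` reports.

```python
import difflib


def line_similarity(reference: str, candidate: str) -> float:
    """Ratio in [0, 1] of matching lines between two strings (illustration only)."""
    ref_lines = reference.splitlines()
    cand_lines = candidate.splitlines()
    # autojunk=False disables difflib's popularity heuristic for long inputs.
    matcher = difflib.SequenceMatcher(None, ref_lines, cand_lines, autojunk=False)
    return matcher.ratio()


reference = "import os\nprint('hello')\nprint('world')\n"
edited = "import os\nprint('hello')\nprint('there')\n"

print(line_similarity(reference, reference))  # identical content
print(line_similarity(reference, edited))     # two of three lines shared
```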

##### Find files that most closely match the source `path_to_reference.json` file in a search directory

```shell
busca --ref-file-path path_to_reference.json --search-path path_to_search_dir
```

##### Change search to scan the current working directory

```shell
busca --ref-file-path path_to_reference.json
```

##### Narrow search to only consider `.json` files whose paths include the substring "foo" and that contain fewer than 1,000 lines

```shell
busca --ref-file-path path_to_reference.json --include-glob '*.json' --include-glob '**foo**' --max-lines 1000
```

- [Glob reference](https://en.wikipedia.org/wiki/Glob_(programming))
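
To preview which paths a set of patterns might keep, the sketch below uses Python's standard `fnmatch` module. Two assumptions are baked in: that a path must satisfy every pattern (as the heading above suggests), and that `fnmatch` wildcards behave closely enough to busca's glob engine for a rough preview. Neither is confirmed here, so treat the result as an approximation.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def preview_includes(paths: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Keep paths that match every pattern (assumed semantics, see the note above)."""
    pattern_list = list(patterns)
    return [
        path
        for path in paths
        if all(fnmatchcase(path, pattern) for pattern in pattern_list)
    ]


candidates = [
    "data/foo_settings.json",
    "data/bar_settings.json",
    "src/foo_module.py",
]
print(preview_includes(candidates, ["*.json", "**foo**"]))
```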

##### Piped input mode to search the output of a command

```shell
# <SomeCommand> | busca [OPTIONS]
echo 'String to find in files.' | busca
```
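
The same idea can be approximated from Python by capturing a command's output and passing it as `reference_string`. This is a sketch: `command_output` and `search_command_output` are hypothetical helper names, and only the `search_for_lines` parameters shown in the Python Library section are used.

```python
import subprocess
import sys
from typing import Sequence


def command_output(command: Sequence[str]) -> str:
    """Run a command and return its standard output as text."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout


def search_command_output(command: Sequence[str], search_path: str = "."):
    """Search `search_path` for files resembling the command's output."""
    # Imported lazily so command_output() stays usable without busca_py installed.
    import busca_py as busca

    return busca.search_for_lines(
        reference_string=command_output(command),
        search_path=search_path,
    )


# Capture the output of a small command, here a Python one-liner.
text = command_output([sys.executable, "-c", "print('String to find in files.')"])
```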

<details style="margin-bottom: 2em">
<summary><h5>MacOS piped input mode</h5></summary>

📝 There is an [open issue](https://github.com/crossterm-rs/crossterm/issues/396) for MacOS in [`crossterm`](https://github.com/crossterm-rs/crossterm), one of busca's dependencies, that prevents prompt interactivity when using piped input. Therefore, when a non-interactive mode is detected, the file matches are displayed but cannot be selected interactively.

This can be worked around by adding the following wrapper function to your shell's `.bashrc` or `.zshrc` file:

>   ```bash
>   # Wrap commands for busca search
>   busca_cmd_output() {
>       eval "$* > /tmp/busca_search.tmp" && busca -r /tmp/busca_search.tmp
>   }
>   ```

One-liners to add the wrapper function:

| Shell | Command                                                                                                                 |
| ----- | ----------------------------------------------------------------------------------------------------------------------- |
| Bash  | `echo -e 'busca_cmd_output() {\n\teval "$* > /tmp/busca_search.tmp" && busca -r /tmp/busca_search.tmp\n}' >> ~/.bashrc` |
| Zsh   | `echo -e 'busca_cmd_output() {\n\teval "$* > /tmp/busca_search.tmp" && busca -r /tmp/busca_search.tmp\n}' >> ~/.zshrc`  |

Reload your shell for the function to become available, then pass it the command whose output you want to search:

```shell
# busca_cmd_output <SomeCommand>
busca_cmd_output echo 'String to find in files.'
```
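
A Python equivalent of the wrapper is sketched below, assuming `busca` is on your `PATH`. It writes the text to a temporary file and builds a `busca --ref-file-path` invocation; `write_reference_file` and `busca_command` are hypothetical helper names. Unlike the shell function, it creates a uniquely named temporary file instead of reusing `/tmp/busca_search.tmp`.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def write_reference_file(text: str) -> Path:
    """Write text to a new temporary file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".tmp", prefix="busca_search_", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(text)
    return Path(handle.name)


def busca_command(ref_path: Path, search_path: Optional[str] = None) -> List[str]:
    """Build the argument list for a busca run against a reference file."""
    command = ["busca", "--ref-file-path", str(ref_path)]
    if search_path is not None:
        command += ["--search-path", search_path]
    return command


ref_path = write_reference_file("String to find in files.\n")
command = busca_command(ref_path)
# To launch busca with this reference file: subprocess.run(command, check=True)
```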

</details>

### CLI Installation

#### Mac OS

##### Homebrew

```shell
brew tap noahbaculi/busca
brew install busca
```

To update, run

```shell
brew update
brew upgrade busca
```

#### All platforms (Windows, MacOS, Linux)

##### Compile from source

0. Install Rust [using `rustup`](https://www.rust-lang.org/tools/install).

1. Clone this repo.

2. In the root of this repo, run

    ```shell
    cargo build --release
    ```

3. Add the compiled binary to your path, for example by copying it to your local bin directory.

    ```shell
    cp target/release/busca $HOME/bin/
    ```


            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "busca-py",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "file,match,closest",
    "author": null,
    "author_email": "Noah Baculi <noahbaculi@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/61/27/1135a79007f0e35efe2937bfe207651394be7293380e5100f449dd924b07/busca_py-2.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Library to search for files with content that most closely match the lines of a reference string",
    "version": "2.3.1",
    "project_urls": {
        "Bug Reports": "https://github.com/noahbaculi/busca/issues",
        "Homepage": "https://github.com/noahbaculi/busca",
        "Source": "https://github.com/noahbaculi/busca"
    },
    "split_keywords": [
        "file",
        "match",
        "closest"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "46c013621793bbb1747abe2578748a865302108b71053eefb330124dff81b63d",
                "md5": "6d32a60b042318e07ef21124cdb5bd8d",
                "sha256": "b7cb0eb9d49c078d223f3205e4fcd2c7801f43002e766cc080fdb44c8b984718"
            },
            "downloads": -1,
            "filename": "busca_py-2.3.1-cp310-cp310-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "6d32a60b042318e07ef21124cdb5bd8d",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.7",
            "size": 334979,
            "upload_time": "2023-10-31T17:01:47",
            "upload_time_iso_8601": "2023-10-31T17:01:47.384036Z",
            "url": "https://files.pythonhosted.org/packages/46/c0/13621793bbb1747abe2578748a865302108b71053eefb330124dff81b63d/busca_py-2.3.1-cp310-cp310-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "61271135a79007f0e35efe2937bfe207651394be7293380e5100f449dd924b07",
                "md5": "faf574c237d128de0fdf6c23f8a9c014",
                "sha256": "6cbda6d894412deb9d2af2b761ddc5999de7922e86d575726c05157d36d83e2a"
            },
            "downloads": -1,
            "filename": "busca_py-2.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "faf574c237d128de0fdf6c23f8a9c014",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 2574998,
            "upload_time": "2023-10-31T17:01:50",
            "upload_time_iso_8601": "2023-10-31T17:01:50.515946Z",
            "url": "https://files.pythonhosted.org/packages/61/27/1135a79007f0e35efe2937bfe207651394be7293380e5100f449dd924b07/busca_py-2.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-31 17:01:50",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "noahbaculi",
    "github_project": "busca",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "busca-py"
}
        