- Name: marcgrep
- Version: 1.2.0
- Home page: https://github.com/phette23/marcgreppy
- Summary: search MARC files for regex matches
- Upload time: 2024-10-24 16:18:17
- Author: phette23
- Requires Python: >3.8
- License: MIT
- Keywords: marc, grep, regex, libraries, cli, metadata, bibliographic, cataloging

# marcgrep ![PyPI](https://img.shields.io/pypi/v/marcgrep)

A CLI for searching MARC files, like [MARCgrep.pl](https://pusc.it/bib/MARCgrep) but written in Python and with a slightly different syntax.

[marcli](https://github.com/hectorcorrea/marcli) is a similar project that is faster but a little less flexible.

## Installation

Requires Python 3.9 or later.

```sh
pipx install marcgrep # install globally with pipx
pip install marcgrep # or use pip/pip3
```

## Usage

```sh
# general command format - pass one or more files or pipe stdin
marcgrep OPTIONS FILE1.mrc FILE2.mrc
cat FILE.mrc | marcgrep OPTIONS
# full usage information
marcgrep --help
Usage: marcgrep [OPTIONS] [FILES]...

  Find MARC records matching patterns in a file.

Options:
  -h, --help           Show this message and exit.
  -c, --count          Count matching records
  -i, --include TEXT   Include matching records (repeatable)
  -e, --exclude TEXT   Exclude matching records (repeatable)
  -f, --fields TEXT    Comma-separated list of fields to print
  -l, --limit INTEGER  Limit number of records to process
  --color              Colorize mnemonic MARC output
  --invert             Invert color scheme (for light terminal backgrounds)
  --version            Show the version and exit.
```

The `--include` and `--exclude` flags can be repeated to specify multiple criteria. Each takes a pattern: a comma-separated filter expression that matches MARC fields. Examples:

```sh
# records with a 780 field
marcgrep -i 780 FILE.mrc
# records with Ulysses in the 245 field
marcgrep -i '245,Ulysses' FILE.mrc
# titles _without_ "Collected Poems" in the 245 ‡a subfield
marcgrep -e '245,a,Collected Poems' FILE.mrc
# titles with second indicator = 4 that do not start with "The "
marcgrep -i '245,,4,,^(?!The )' FILE.mrc
```

The meaning of the filter expression's components depends upon their number:

- 1: field, `910` -> 910 is in record
- 2: field and value (regular expression), `100,Lorde` -> 100 contains string "Lorde"
- 3: field, subfield, and value, `506,a,Open Access` -> 506‡a contains string "Open Access"
- 4: field, first indicator, subfield, and value, `856,0,u,@lcsh\.gov` -> 856‡u with 1st indicator 0 contains string "@lcsh.gov"
- 5: field, first indicator, second indicator, subfield, and value, `245,0,4,a,The Communist Manifesto` -> 245‡a with 1st indicator 0 and 2nd indicator 4 contains string "The Communist Manifesto"

This syntax is meant to make searching subfields and field values easier than in MARCgrep.pl, since we care about them more often than indicators. To ignore a component but use one of lesser priority, leave the component empty. For instance, `856,s,` matches records with an `856` field that has an `s` subfield; the trailing comma means we don't care about the subfield's value. The pattern `245,,4,,` matches records with a `245` field whose second indicator is `4`, regardless of its subfields or value.

To use a literal comma in a value pattern, include all the other components. For instance, to search for "Morrison, Toni" anywhere in a `100` field, use `100,,,,Morrison, Toni`.
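The way the component count changes the meaning of each position can be sketched in Python. This is an illustration only, not marcgrep's actual implementation: `Criterion` and `parse_pattern` are hypothetical names, and the component order (field, indicators, subfield, value) is inferred from example expressions such as `856,0,u,@lcsh\.gov` and `245,,4,,`.

```python
from typing import NamedTuple, Optional


class Criterion(NamedTuple):
    """One parsed filter expression; None means "ignore this component"."""

    field: str
    ind1: Optional[str] = None
    ind2: Optional[str] = None
    subfield: Optional[str] = None
    value: Optional[str] = None


def parse_pattern(pattern: str) -> Criterion:
    """Map a comma-separated expression onto named components (hypothetical helper)."""
    # Split on at most four commas so any further commas stay inside the value.
    parts = [part or None for part in pattern.split(",", 4)]
    if parts[0] is None:
        raise ValueError("a filter expression must start with a field tag")
    if len(parts) == 1:
        return Criterion(field=parts[0])
    if len(parts) == 2:
        field, value = parts
        return Criterion(field, value=value)
    if len(parts) == 3:
        field, subfield, value = parts
        return Criterion(field, subfield=subfield, value=value)
    if len(parts) == 4:
        field, ind1, subfield, value = parts
        return Criterion(field, ind1=ind1, subfield=subfield, value=value)
    field, ind1, ind2, subfield, value = parts
    return Criterion(field, ind1, ind2, subfield, value)


# Show how a few of the README's expressions would be interpreted.
for example in ("910", "100,Lorde", "856,s,", "245,,4,,", "100,,,,Morrison, Toni"):
    print(f"{example!r:28} -> {parse_pattern(example)}")
```

Capping the split at five components is what keeps the comma in `100,,,,Morrison, Toni` inside the value, which is why a value containing a comma needs all the other components spelled out.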

Multiple criteria are combined with logical AND. Using several `--include` flags returns fewer records than using one, as does combining an `--include` with an `--exclude`.
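A minimal sketch of that AND logic, assuming a toy record model (a dict mapping a tag to its field values) rather than real MARC records. The names `Rule`, `field_matches`, and `keep` are hypothetical and are not part of marcgrep.

```python
import re
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, List[str]]  # toy stand-in for a MARC record: tag -> field values
Rule = Tuple[str, str]  # (field tag, regular expression)


def field_matches(record: Record, rule: Rule) -> bool:
    """True if any field with the rule's tag matches the rule's regex."""
    tag, regex = rule
    return any(re.search(regex, value) for value in record.get(tag, []))


def keep(record: Record, includes: Sequence[Rule] = (), excludes: Sequence[Rule] = ()) -> bool:
    """Keep a record only if it meets every include and none of the excludes."""
    return all(field_matches(record, rule) for rule in includes) and not any(
        field_matches(record, rule) for rule in excludes
    )


records = [
    {"100": ["Joyce, James"], "245": ["Ulysses"]},
    {"100": ["Joyce, James"], "245": ["Dubliners"]},
    {"100": ["Tennyson, Alfred"], "245": ["Ulysses"]},
]

# One include, two includes, and an include plus an exclude.
one_include = [r for r in records if keep(r, includes=[("245", "Ulysses")])]
two_includes = [r for r in records if keep(r, includes=[("245", "Ulysses"), ("100", "Joyce")])]
include_and_exclude = [
    r for r in records if keep(r, includes=[("245", "Ulysses")], excludes=[("100", "Joyce")])
]
print(len(one_include), len(two_includes), len(include_and_exclude))
```

Each extra criterion can only shrink the result set, so the second and third filters return a subset of what the single include returns.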

## Color & Formatting

The `--color` flag colorizes the output, and environment variables let you choose the color for each part of a MARC record. You can pick from [the available termcolor colors](https://github.com/termcolor/termcolor?tab=readme-ov-file#text-properties). The defaults are:

| Component | Color | Var |
|---|----|---|
| Tag | cyan | MARC_TAG_COLOR |
| Indicator | light_yellow | MARC_INDICATOR_COLOR |
| Subfield code | green | MARC_SUBFIELD_COLOR |
| Data | white | MARC_DATA_COLOR |

There is an inverted color scheme available with the `--invert` flag for use with light (e.g. white) terminal backgrounds.

You can also configure the subfield delimiter character and the symbol for an empty indicator. Those defaults are:

| Symbol | Var |
|---|---|
| ‡ | MARC_SUBFIELD_DELIMITER |
| _ | MARC_EMPTY_INDICATOR |
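The override behavior described by the two tables can be sketched as a lookup with fallbacks. This is an assumption about how such settings are typically read, not marcgrep's code: `DEFAULTS` and `load_settings` are hypothetical names, while the variable names and default values are copied from the tables above.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the tables above.
DEFAULTS: Dict[str, str] = {
    "MARC_TAG_COLOR": "cyan",
    "MARC_INDICATOR_COLOR": "light_yellow",
    "MARC_SUBFIELD_COLOR": "green",
    "MARC_DATA_COLOR": "white",
    "MARC_SUBFIELD_DELIMITER": "‡",
    "MARC_EMPTY_INDICATOR": "_",
}


def load_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return each setting, preferring the environment over the default."""
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


# Simulate a user who exported two overrides before running the tool.
settings = load_settings({"MARC_TAG_COLOR": "magenta", "MARC_SUBFIELD_DELIMITER": "$"})
print(settings["MARC_TAG_COLOR"], settings["MARC_SUBFIELD_DELIMITER"], settings["MARC_DATA_COLOR"])
```

In a shell, the equivalent would be exporting a variable such as `MARC_TAG_COLOR` before running `marcgrep --color`.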

## Development

[Poetry](https://python-poetry.org/) is used for development.

```sh
poetry install # install dependencies
poetry run pytest # run tests
poetry build # build package, used in CI
```

Any tag triggers a release to [Test PyPI](https://test.pypi.org/project/marcgrep/). Any tag beginning with the letter `v` requires manual approval to be released to [PyPI](https://pypi.org/project/marcgrep/) and [GitHub](https://github.com/phette23/marcgreppy/releases). There are protection rules on the `pypi` and `testpypi` [environments](https://github.com/phette23/marcgreppy/settings/environments) to this effect, too.

## License

[MIT](https://opensource.org/license/mit) © Eric Phetteplace 2024.


            
