# marcgrep ![PyPI](https://img.shields.io/pypi/v/marcgrep)
A CLI for searching MARC files, like [MARCgrep.pl](https://pusc.it/bib/MARCgrep) but written in Python and with a slightly different syntax.
[marcli](https://github.com/hectorcorrea/marcli) is a similar project that's faster but a little less flexible.
## Installation
Requires Python 3.9 or later.
```sh
pipx install marcgrep # install globally with pipx
pip install marcgrep # or use pip/pip3
```
## Usage
```sh
# general command format - pass one or more files or pipe stdin
marcgrep OPTIONS FILE1.mrc FILE2.mrc
cat FILE.mrc | marcgrep OPTIONS
# full usage information
Usage: marcgrep [OPTIONS] [FILES]...
Find MARC records matching patterns in a file.
Options:
-h, --help Show this message and exit.
-c, --count Count matching records
-i, --include TEXT Include matching records (repeatable)
-e, --exclude TEXT Exclude matching records (repeatable)
-f, --fields TEXT Comma-separated list of fields to print
-l, --limit INTEGER Limit number of records to process
--color Colorize mnemonic MARC output
--invert Invert color scheme (for light terminal backgrounds)
--version Show the version and exit.
```
The `--include` and `--exclude` flags can be used multiple times to specify multiple criteria. Each takes a pattern: a comma-separated filter expression for matching MARC fields. Examples:
```sh
# records with a 780 field
marcgrep -i 780 FILE.mrc
# records with Ulysses in the 245 field
marcgrep -i '245,Ulysses' FILE.mrc
# titles _without_ "Collected Poems" in the 245 ‡a subfield
marcgrep -e '245,a,Collected Poems' FILE.mrc
# titles with second indicator = 4 that do not start with "The "
marcgrep -i '245,,4,,^(?!The )' FILE.mrc
```
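The value component is a regular expression. As a rough illustration, assuming values are tested with Python's `re.search` (an assumption about the internals, not something this README confirms), the negative lookahead in the last example behaves as follows. `value_matches` is a hypothetical helper, not part of marcgrep:

```python
import re


def value_matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the regex is found anywhere in the text."""
    return re.search(pattern, text) is not None


# ^(?!The ) succeeds at the start only when the text does not begin with "The "
print(value_matches(r"^(?!The )", "The Waste Land"))  # False
print(value_matches(r"^(?!The )", "Ulysses"))  # True
# An unanchored pattern can match anywhere in the value
print(value_matches("Ulysses", "Ulysses / James Joyce"))  # True
```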
The meaning of the filter expression's components depends upon their number:
- 1: field, `910` -> 910 is in record
- 2: field and value (regular expression), `100,Lorde` -> 100 contains string "Lorde"
- 3: field, subfield, and value, `506,a,Open Access` -> 506‡a contains string "Open Access"
- 4: field, first indicator, subfield, and value, `856,0,u,@lcsh\.gov` -> 856‡u with 1st indicator 0 contains string "@lcsh.gov"
- 5: field, first & second indicators, subfield, and value, `245,0,4,a,The Communist Manifesto` -> 245‡a with indicators 0 and 4 contains string "The Communist Manifesto"
The intention of this syntax is to make searching subfields and field values easier than in MARCgrep.pl, since we care about them more often than indicators. To skip a component but still use a later one, leave the skipped component empty. For instance, `856,s,` refers to records with an `856` field that has an `s` subfield; the trailing comma means we don't care about the subfield's value. The pattern `245,,4,,` refers to records with a `245` field with a second indicator of `4`, regardless of its subfields or value.
To use a literal comma in a value pattern, include all the other components. For instance, to search for "Morrison, Toni" anywhere in a `100` field, use `100,,,,Morrison, Toni`.
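The rules above can be sketched in Python. This is a minimal illustration, not marcgrep's actual parser: `Criterion` and `parse_pattern` are hypothetical names, and the slot order (field, indicators, subfield, value) is taken from the example patterns in this README.

```python
from typing import NamedTuple, Optional


class Criterion(NamedTuple):
    """Hypothetical container for one parsed filter expression."""
    field: str
    ind1: Optional[str] = None
    ind2: Optional[str] = None
    subfield: Optional[str] = None
    value: Optional[str] = None


def parse_pattern(pattern: str) -> Criterion:
    """Hypothetical parser: map 1 to 5 comma-separated components to named slots."""
    # Split at most four times so commas in a fifth component stay in the value
    parts = [part if part else None for part in pattern.split(",", 4)]
    field = parts[0]
    if field is None:
        raise ValueError("a pattern must start with a field tag")
    if len(parts) == 1:
        return Criterion(field)
    if len(parts) == 2:
        return Criterion(field, value=parts[1])
    if len(parts) == 3:
        return Criterion(field, subfield=parts[1], value=parts[2])
    if len(parts) == 4:
        return Criterion(field, ind1=parts[1], subfield=parts[2], value=parts[3])
    return Criterion(field, parts[1], parts[2], parts[3], parts[4])


print(parse_pattern("100,,,,Morrison, Toni"))
print(parse_pattern("856,s,"))
```

In this sketch an empty component becomes `None`, meaning "don't care". It also shows why a value containing a comma needs all five components: `100,Morrison, Toni` would split into three parts and be read as field, subfield, and value.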
Multiple criteria are combined with logical AND. Using multiple `--include` flags narrows the results compared with using one, as does combining an `--include` with an `--exclude`.
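A minimal sketch of that AND logic, assuming each criterion has already been turned into a function that tests one record. The names `keep_record` and `has_field`, and the use of plain dicts as stand-ins for MARC records, are hypothetical and only for illustration:

```python
from typing import Callable, Dict, Iterable

Record = Dict[str, str]  # stand-in for a parsed MARC record: tag -> field text
Predicate = Callable[[Record], bool]


def keep_record(record: Record,
                includes: Iterable[Predicate],
                excludes: Iterable[Predicate]) -> bool:
    """Keep a record only if every include matches and no exclude matches."""
    return (all(test(record) for test in includes)
            and not any(test(record) for test in excludes))


def has_field(tag: str) -> Predicate:
    """Hypothetical criterion: the record contains the given field."""
    return lambda record: tag in record


record = {"245": "Ulysses", "780": "Earlier title"}
print(keep_record(record, [has_field("245"), has_field("780")], []))  # True
print(keep_record(record, [has_field("245")], [has_field("780")]))  # False
```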
## Color & Formatting
The `--color` flag colorizes the output. Set environment variables to choose the color for each part of a MARC record, picking from [the available termcolor colors](https://github.com/termcolor/termcolor?tab=readme-ov-file#text-properties). The defaults are:
| Component | Color | Var |
|---|----|---|
| Tag | cyan | MARC_TAG_COLOR |
| Indicator | light_yellow | MARC_INDICATOR_COLOR |
| Subfield code | green | MARC_SUBFIELD_COLOR |
| Data | white | MARC_DATA_COLOR |
There is an inverted color scheme available with the `--invert` flag for use with light (e.g. white) terminal backgrounds.
You can also configure the subfield delimiter character and the symbol for an empty indicator. Those defaults are:
| Symbol | Var |
|---|---|
| ‡ | MARC_SUBFIELD_DELIMITER |
| _ | MARC_EMPTY_INDICATOR |
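As a sketch of how these variables and their defaults fit together, the following resolves each setting from the environment and falls back to the default listed in the tables. `DEFAULTS` and `load_settings` are hypothetical names for illustration; marcgrep's own code may be organized differently.

```python
import os
from typing import Dict, Mapping, Optional

# Defaults copied from the tables in this README
DEFAULTS: Dict[str, str] = {
    "MARC_TAG_COLOR": "cyan",
    "MARC_INDICATOR_COLOR": "light_yellow",
    "MARC_SUBFIELD_COLOR": "green",
    "MARC_DATA_COLOR": "white",
    "MARC_SUBFIELD_DELIMITER": "‡",
    "MARC_EMPTY_INDICATOR": "_",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Hypothetical helper: read each variable, falling back to its default."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


# Override one color and the delimiter; everything else keeps its default
settings = load_settings({"MARC_TAG_COLOR": "magenta", "MARC_SUBFIELD_DELIMITER": "$"})
print(settings["MARC_TAG_COLOR"], settings["MARC_SUBFIELD_DELIMITER"], settings["MARC_DATA_COLOR"])
```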
## Development
[Poetry](https://python-poetry.org/) is used for development.
```sh
poetry install # install dependencies
poetry run pytest # run tests
poetry build # build package, used in CI
```
Any tag triggers a release to [Test PyPI](https://test.pypi.org/project/marcgrep/). Any tag beginning with the letter `v` requires manual approval to be released to [PyPI](https://pypi.org/project/marcgrep/) and [GitHub](https://github.com/phette23/marcgreppy/releases). There are protection rules on the `pypi` and `testpypi` [environments](https://github.com/phette23/marcgreppy/settings/environments) to this effect, too.
## License
[MIT](https://opensource.org/license/mit) © Eric Phetteplace 2024.