![logo](../images/logo.png)
# Drug Extraction CLI
- [Drug Extraction CLI](#drug-extraction-cli)
- [Demo](#demo)
- [Description](#description)
- [Requires](#requires)
- [Installation](#installation)
- [Python Developers / Data Scientists](#python-developers--data-scientists)
- [Rust Developers](#rust-developers)
- [Usage](#usage)
- [Interactive](#interactive)
- [Search](#search)
- [Output Data Dictionary](#output-data-dictionary)
- [Examples](#examples)
- [Support](#support)
- [Contributing](#contributing)
- [MIT License](#mit-license)
## Demo
![demo-gif](../images/demo.gif)
## Description
This application takes a CSV file of search terms and parses text records from another CSV file to detect and extract mentions of those terms, using string similarity algorithms to account for common misspellings. It is named for the drug searching it most commonly does for us at IPOP, but it is flexible enough to accept any type of search term.
If you are wondering about specific use cases, check out the [Examples](../examples/) folder!
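To give a feel for how edit-distance matching tolerates misspellings, here is a minimal pure-Python sketch. It is not this tool's implementation: `osa_distance` and `find_mentions` are hypothetical names, the tokenization is deliberately naive (single-word terms only), and the `max_edits=2` default is borrowed from the limit listed in the Output Data Dictionary.

```python
from typing import List, Tuple


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance.

    Insertions, deletions, substitutions, and swaps of two adjacent
    characters each cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or exact match
            )
            # Adjacent transposition, e.g. "yl" <-> "ly"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def find_mentions(text: str, terms: List[str], max_edits: int = 2) -> List[Tuple[str, str, int]]:
    """Return (search_term, matched_token, edits) for every close token."""
    hits = []
    for raw in text.lower().split():
        token = raw.strip(".,;:!?()")  # naive punctuation cleanup (assumption)
        for term in terms:
            edits = osa_distance(term.lower(), token)
            if edits <= max_edits:
                hits.append((term, token, edits))
    return hits


hits = find_mentions("Pt found with fentanly and herion.", ["fentanyl", "heroin"])
for term, token, edits in hits:
    print(term, token, edits)
```

Both misspellings in the sample sentence are a single adjacent swap away from their search term, so each is reported with one edit.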
## Requires
- [cargo](https://doc.rust-lang.org/cargo/getting-started/installation.html) package manager (rust toolchain)
- [just](https://github.com/casey/just) (optional dev-dependency if you clone this repo)
## Installation
To install the drug-extraction-cli application, use one of the following methods:
### Python Developers / Data Scientists
Please use [pipx](https://pypa.github.io/pipx/) since it is designed *specifically* for this use case of installing Python CLI apps into isolated virtual environments.
```bash
pipx install extract-drugs
```
### Rust Developers
```bash
cargo install drug-extraction-cli
```
> **IMPORTANT!** Both of these will install an executable called `extract-drugs`.
>
> No matter how you install the package from either packaging index, the binary program will be named `extract-drugs` for more intuitive commands.
>
> INFO: The naming discrepancy is due to how `maturin` handles package names and our wanting to keep the same CLI command name while maintaining the Rust namespace. Apologies, but you'll be fine 🙂.
## Usage
This application has two commands: `interactive` and `search`. Both have the same underlying functionality. `interactive` prompts you for each configuration option and is better for new or first-time users, while `search` accepts the same options as command-line arguments and is better suited to automated processing or advanced users.
API documentation for the library can be found on [docs.rs](https://docs.rs/crate/drug-extraction-cli/latest).
### Interactive
This will present you with a series of prompts to help you select the correct options. It is highly recommended for new users or one-off runs.
Usage:
```bash
extract-drugs interactive
```
This command is demoed in the GIF above.
### Search
`search` functions the same as `interactive` but allows you to declaratively provide the configuration options.
## Output Data Dictionary
This tool will output an `output.csv` file with the following format:
| Column Name | Description | Data Type | Limits/Ranges |
| :--------------: | :----------------------------------------------------------------------------------: | :------------: | :----------------------------------------------: |
| row_id | Identifier from `--id-col` if provided, else line number of row in `--data-file` | String | None |
| search_term | The search term, cleaned and normalized. This is the actual term that was compared. | String | None |
| matched_term | The matched term, cleaned and normalized. This is the actual term that was compared. | String | None |
| edits | The `osa` edit distance | Integer | 0-2 (top limit due to exclusion filter) |
| similarity_score | The `jaro_winkler` similarity score | Float | 0.95-1.0 (bottom limit due to exclusion filter) |
| search_field | The field that this match was found in, from `--search-cols` | String | None |
| metadata | The attached metadata to `search_term` in the search_terms file | String or None | None |
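For readers unfamiliar with the metric behind `similarity_score`, the following is a minimal pure-Python sketch of the textbook Jaro-Winkler formula (prefix scale 0.1, common prefix capped at 4 characters). It is not the code this tool runs, and its implementation may differ in details. The names `jaro_winkler` and `MIN_SIMILARITY` are chosen for illustration; the 0.95 value is the lower limit listed in the table above.

```python
MIN_SIMILARITY = 0.95  # lower limit from the data dictionary table


def jaro_winkler(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Textbook Jaro-Winkler similarity in the range 0.0 to 1.0."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0

    # Characters count as matching only within this distance of each other.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit = [False] * len(a)
    b_hit = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order, counted in pairs.
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2

    jaro = (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3

    # Winkler bonus: reward a shared prefix of up to 4 characters.
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)


score = jaro_winkler("fentanyl", "fentanly")
print(round(score, 3), score >= MIN_SIMILARITY)
```

Because the Winkler bonus rewards a shared prefix, misspellings near the end of a word score higher than the same mistake at the start.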
## Examples
For a whole showcase of example runs of this tool check out the shell scripts inside the [examples](../examples/) folder.
For a showcase of the potential analytical value that can be derived from running this tool, check out the Jupyter Notebooks in the same folder!
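As a rough idea of what a first pass over the results might look like, here is a small sketch using only the Python standard library. It does not reproduce the notebooks. It assumes an output file with the columns listed in the Output Data Dictionary, and the sample rows and the `count_by_term` helper are made up for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only. Real runs write `output.csv`,
# and may populate `edits` and `similarity_score` differently.
SAMPLE = """row_id,search_term,matched_term,edits,similarity_score,search_field,metadata
1,fentanyl,fentanly,1,0.975,notes,opioid
1,fentanyl,fentanyl,0,1.0,notes,opioid
2,heroin,heroin,0,1.0,notes,opioid
2,fentanyl,fentanyl,0,1.0,notes,opioid
"""


def count_by_term(handle) -> Counter:
    """Count how many distinct records (row_id) mention each search term."""
    seen = set()
    counts = Counter()
    for row in csv.DictReader(handle):
        key = (row["row_id"], row["search_term"])
        if key not in seen:  # count each record once per term
            seen.add(key)
            counts[row["search_term"]] += 1
    return counts


counts = count_by_term(io.StringIO(SAMPLE))
for term, n in counts.most_common():
    print(term, n)
```

To run the same count on a real file, pass an open file handle instead of the sample, for example `open("output.csv", newline="")`.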
## Support
If you encounter any issues or need support, please either contact [@nanthony007](https://github.com/nanthony007) or [open an issue](https://github.com/UK-IPOP/drug-extraction/issues/new).
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md).
## MIT License
[LICENSE](../LICENSE)
Raw data
{
"_id": null,
"home_page": null,
"name": "extract-drugs",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "Nick Anthony <nanthony007@gmail.com>",
"keywords": "drug,extraction,nlp,text",
"author": "Nick Anthony <nicholas.anthony@uky.edu>",
"author_email": "Nick Anthony <nanthony007@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/9f/ce/99e9ba8fac6ef5c4422091da2759aa7c4bf27819eb84a3f9d1662c48eeea/extract_drugs-1.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A CLI for extracting drugs from text records",
"version": "1.2.0",
"project_urls": {
"changelog": "https://github.com/UK-IPOP/drug-extraction/releases",
"documentation": "https://github.com/UK-IPOP/drug-extraction",
"homepage": "https://github.com/UK-IPOP/drug-extraction",
"repository": "https://github.com/UK-IPOP/drug-extraction"
},
"split_keywords": [
"drug",
"extraction",
"nlp",
"text"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "778550662bbffdb26bb942e8f2eb1fde479b06f849263aaf77555f2e65080891",
"md5": "64c7c752c76ff147dac30d68bee4d1bf",
"sha256": "85156d4c5bdf48d5237712a9545feb8c0a26da2bb63cb274340b7dbf0499b625"
},
"downloads": -1,
"filename": "extract_drugs-1.2.0-py3-none-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "64c7c752c76ff147dac30d68bee4d1bf",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 655091,
"upload_time": "2024-01-12T17:20:53",
"upload_time_iso_8601": "2024-01-12T17:20:53.376963Z",
"url": "https://files.pythonhosted.org/packages/77/85/50662bbffdb26bb942e8f2eb1fde479b06f849263aaf77555f2e65080891/extract_drugs-1.2.0-py3-none-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "9fce99e9ba8fac6ef5c4422091da2759aa7c4bf27819eb84a3f9d1662c48eeea",
"md5": "517e9f0ff669e88ac386f780b4c70dfa",
"sha256": "e1f7d5780cf7a07849c0e96cdc3931b768f864fdfedfb2f664fac5aee8943f59"
},
"downloads": -1,
"filename": "extract_drugs-1.2.0.tar.gz",
"has_sig": false,
"md5_digest": "517e9f0ff669e88ac386f780b4c70dfa",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 3557266,
"upload_time": "2024-01-12T17:20:59",
"upload_time_iso_8601": "2024-01-12T17:20:59.476160Z",
"url": "https://files.pythonhosted.org/packages/9f/ce/99e9ba8fac6ef5c4422091da2759aa7c4bf27819eb84a3f9d1662c48eeea/extract_drugs-1.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-12 17:20:59",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "UK-IPOP",
"github_project": "drug-extraction",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "extract-drugs"
}