dose-instruction-parser


Name: dose-instruction-parser
Version: 2024.1022a0
Summary: Tool for parsing free text prescription dose instructions into structured output
Upload time: 2024-09-06 09:45:47
Requires Python: >=3.9
License: MIT License

Copyright (c) 2024 royashcenazi, Rosalyn Pearson, Nathalie Thureau, Mark Macartney, John Reid, Johanna Jokio

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Keywords: prescriptions, dose instructions, medical free text
# `dose_instruction_parser`: Dose instructions free text parser for Public Health Scotland

Current version: "2024.1022-alpha"

📓 Documentation can be found at https://public-health-scotland.github.io/dose_instruction_parser/

📦 Package is available on PyPI at https://pypi.org/project/dose-instruction-parser/

* The `dose_instruction_parser` package is for parsing free text dose instructions which accompany NHS prescriptions.
* It draws upon the [`parsigs`](https://pypi.org/project/parsigs/) package, adapting and expanding the code to the context of data held by Public Health Scotland.
* `dose_instruction_parser` works by first applying a named entity recogniser (NER) model to identify parts of the text corresponding to different entities, and then using rules to extract structured output. 
* The default NER model is named `en_edris9`. This is an extension of the [`med7`](https://www.sciencedirect.com/science/article/abs/pii/S0933365721000798) model, which has the following named entities: form; dosage; frequency; duration; strength; route; drug. `en_edris9` has the additional entities as_directed and as_required, and has been further trained on approximately 7,000 gold standard examples of NHS dose instructions which were manually tagged by analysts at Public Health Scotland.
* `en_edris9` is not currently publicly available. Interested researchers and NHS colleagues can request access from the eDRIS team at Public Health Scotland via [phs.edris@phs.scot](mailto:phs.edris@phs.scot). Other users can train their own model using the code available [on GitHub](https://github.com/Public-Health-Scotland/dose_instruction_parser/).
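
To make the two-stage idea (tag entities first, then apply rules) more concrete, here is a toy sketch. It is not the package's code: the tagged entities are written by hand as a stand-in for NER model output, and `ToyResult`, `apply_rules` and the lookup tables are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, hand-written entity tags standing in for NER model output.
# A real model such as en_edris9 would produce these from the raw text.
TAGGED = [("DOSAGE", "one"), ("FORM", "tablet"), ("FREQUENCY", "daily")]

# Tiny illustrative lookup tables; real rules would need to cover far more cases.
NUMBER_WORDS = {"one": 1.0, "two": 2.0, "three": 3.0}
FREQUENCY_WORDS = {"daily": (1.0, "Day"), "weekly": (1.0, "Week")}


@dataclass
class ToyResult:
    """Hypothetical structured output with a handful of fields."""
    form: Optional[str] = None
    dosage: Optional[float] = None
    frequency: Optional[float] = None
    frequency_type: Optional[str] = None


def apply_rules(entities):
    """Stage two: turn (label, text) pairs into structured fields."""
    result = ToyResult()
    for label, text in entities:
        if label == "FORM":
            result.form = text
        elif label == "DOSAGE":
            result.dosage = NUMBER_WORDS.get(text)
        elif label == "FREQUENCY" and text in FREQUENCY_WORDS:
            result.frequency, result.frequency_type = FREQUENCY_WORDS[text]
    return result


toy = apply_rules(TAGGED)
print(toy)
```

Keeping the tagging and the rules separate means the rules can be tested against hand-written tags, without loading a model.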

## Contents

1. [File layout](#file-layout)
1. [Setup](#setup)
1. [Usage](#usage)
1. [Development](#development)

## File layout

```
📦dose_instruction_parser
 ┣ 📂dose_instruction_parser                  # source code
 ┃ ┣ 📂data
 ┃ ┃ ┣ 📜keep_words.txt         # key words which won't be spellchecked
 ┃ ┃ ┣ 📜replace_words.csv      # key words to replace
 ┃ ┃ ┗ 📜__init__.py
 ┃ ┣ 📂tests
 ┃ ┃ ┣ 📜conftest.py
 ┃ ┃ ┣ 📜test_dosage.py
 ┃ ┃ ┣ 📜test_duration.py
 ┃ ┃ ┣ 📜test_frequency.py
 ┃ ┃ ┣ 📜test_parser.py
 ┃ ┃ ┣ 📜test_prepare.py
 ┃ ┃ ┗ 📜__init__.py
 ┃ ┣ 📜di_dosage.py             # parsing dosage tags
 ┃ ┣ 📜di_duration.py           # parsing duration tags
 ┃ ┣ 📜di_frequency.py          # parsing frequency tags
 ┃ ┣ 📜di_prepare.py            # preprocessing
 ┃ ┣ 📜parser.py                # dose instruction parser
 ┃ ┣ 📜__init__.py
 ┃ ┗ 📜__main__.py              # parse_dose_instructions command line function
 ┣ 📜.coveragerc
 ┣ 📜LICENSE
 ┣ 📜MANIFEST.in
 ┣ 📜pyproject.toml
 ┗ 📜README.md
```

## Setup

### Basic setup

```bash
conda create -n di "python>=3.9"                # set up new conda env with Python
conda activate di                               # activate
python -m pip install dose_instruction_parser   # install dose_instruction_parser from PyPI
parse_dose_instructions -h                      # get help on parsing dose instructions
```

(Optional) Install the `en_edris9` model. Contact [phs.edris@phs.scot](mailto:phs.edris@phs.scot) for access.

### Development setup

1.  Clone this repository
2.  Add a file called `secrets.env` in the top level of the cloned repository with the following contents:

    ```bash
    export DI_FILEPATH="</path/to/model/folder>"
    ```

    This sets the environment variable `DI_FILEPATH` to the folder where the code reads and writes models. If you are working within Public Health Scotland, please contact
    [phs.edris@phs.scot](mailto:phs.edris@phs.scot) to receive the filepath.
3. Create a new conda environment and activate it:
    ```bash
    conda create -n di-dev "python>=3.9"
    conda activate di-dev
    ```
4. Install the package in editable mode with the development dependencies:
    ```bash
    python -m pip install -e dose_instruction_parser[dev]
    ```
> [!IMPORTANT]
> Make sure you run this from the top directory of the repository
5. (Optional) Install the `en_edris9` model. Contact [phs.edris@phs.scot](mailto:phs.edris@phs.scot) for access.
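
Before training or loading models it can help to confirm that `DI_FILEPATH` from step 2 is actually visible to Python. The sketch below is one way to check; `get_model_folder` is a hypothetical helper, and how the package itself reads the variable is not shown here.

```python
import os
from pathlib import Path


def get_model_folder(env=None):
    """Return the folder named by DI_FILEPATH, or raise a clear error.

    `env` defaults to the real process environment; passing a plain dict
    makes the check easy to try out without changing any settings.
    """
    env = os.environ if env is None else env
    value = env.get("DI_FILEPATH")
    if not value:
        raise RuntimeError("DI_FILEPATH is not set")
    return Path(value)


# Example with an explicit mapping, so it runs without touching the real environment
print(get_model_folder({"DI_FILEPATH": "/path/to/model/folder"}))
```

Calling `get_model_folder()` with no argument checks the real environment instead.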

## Usage

> [!TIP]
>   Run `parse_dose_instructions -h` on the command line to get help on parsing dose instructions

In the following examples we assume the model `en_edris9` is installed. You can provide your own path to an alternative model with the same nine entities.

### Command line interface

The simplest way to get started is to use the built-in command line interface, which you access by running `parse_dose_instructions` on the command line.

#### A single instruction

A single dose instruction can be supplied using the `-di` argument.

```bash
(di-dev)$ parse_dose_instructions -di "take one tablet daily" -mod en_edris9 

Logging to command line. Use the --logfile argument to set a log file instead.
2024-05-28 07:45:49,803 Checking input and output files
2024-05-28 07:45:49,803 Setting up parser
2024-05-28 07:46:34,205 Parsing single dose instruction

StructuredDI(inputID=None, text='take one tablet daily', form='tablet', dosageMin=1.0, dosageMax=1.0, frequencyMin=1.0, frequencyMax=1.0, frequencyType='Day', durationMin=None, durationMax=None, durationType=None, asRequired=False, asDirected=False)
```

#### Multiple instructions

Multiple dose instructions can be supplied from file using the `-f` argument, where each line in the text file supplied is a dose instruction. For example, if the file `multiple_dis.txt` contains the following:

```
daily 2 tabs
once daily when required
```

then you will get the corresponding output:

```bash
(di-dev)$ parse_dose_instructions -f "multiple_dis.txt" -mod en_edris9

Logging to command line. Use the --logfile argument to set a log file instead.
2024-05-28 07:47:56,270 Checking input and output files
2024-05-28 07:47:56,282 Setting up parser
2024-05-28 07:48:18,003 Parsing multiple dose instructions
Parsing dose instructions
Parsed 100%|██████████████████████████████████████| 2/2 [00:00<00:00, 79.78 instructions/s]

StructuredDI(inputID=0, text='daily 2 tabs', form='tablet', dosageMin=2.0, dosageMax=2.0, frequencyMin=1.0, frequencyMax=1.0, frequencyType='Day', durationMin=None, durationMax=None, durationType=None, asRequired=False, asDirected=False)
StructuredDI(inputID=1, text='once daily when required', form=None, dosageMin=None, dosageMax=None, frequencyMin=1.0, frequencyMax=1.0, frequencyType='Day', durationMin=None, durationMax=None, durationType=None, asRequired=True, asDirected=False)
```

If you have a lot of examples to parse, you may want to send the output to a file rather than the command line. To do this, specify the output file location with the `-o` argument. If the file has a **.txt** extension, the results are written line by line, as they would appear on the command line. If it has a **.csv** extension, the results are cast to a data frame with one entry per row.

```bash
(di-dev)$ parse_dose_instructions -f "multiple_dis.txt" -mod en_edris9 -o "out_dis.csv"
```

The contents of `out_dis.csv` are as follows:

```
inputID,text,form,dosageMin,dosageMax,frequencyMin,frequencyMax,frequencyType,durationMin,durationMax,durationType,asRequired,asDirected
0,daily 2 tabs,tablet,2.0,2.0,1.0,1.0,Day,,,,False,False
1,once daily when required,,,,1.0,1.0,Day,,,,True,False
```
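
If you want to post-process the CSV output in Python, one option is the standard `csv` module. The sketch below embeds the two rows shown above so that it runs on its own; `clean_row` is a hypothetical helper, and the choice to map empty cells to `None` is an assumption about what is convenient rather than something the package prescribes.

```python
import csv
import io

# Sample copied from the out_dis.csv listing above
SAMPLE = """\
inputID,text,form,dosageMin,dosageMax,frequencyMin,frequencyMax,frequencyType,durationMin,durationMax,durationType,asRequired,asDirected
0,daily 2 tabs,tablet,2.0,2.0,1.0,1.0,Day,,,,False,False
1,once daily when required,,,,1.0,1.0,Day,,,,True,False
"""

NUMERIC = {"dosageMin", "dosageMax", "frequencyMin", "frequencyMax",
           "durationMin", "durationMax"}
BOOLEAN = {"asRequired", "asDirected"}


def clean_row(row):
    """Convert one DictReader row: '' -> None, numbers -> float, flags -> bool."""
    cleaned = {}
    for key, value in row.items():
        if value == "":
            cleaned[key] = None
        elif key in NUMERIC:
            cleaned[key] = float(value)
        elif key in BOOLEAN:
            cleaned[key] = value == "True"
        else:
            cleaned[key] = value
    return cleaned


rows = [clean_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(rows[1]["form"], rows[1]["asRequired"])
```

To read a real file, replace `io.StringIO(SAMPLE)` with `open("out_dis.csv", newline="")`.
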
> [!NOTE]
> Sometimes a dose instruction really contains more than one instruction within it. 
> In this case the output will be split into multiple outputs, one corresponding
> to each part of the instruction. For example,
> "Take two tablets twice daily for one week then one tablet once daily for two weeks"
> ```bash
> $ parse_dose_instructions -di "Take two tablets twice daily for one week then one tablet once daily for two weeks"
>
> Logging to command line. Use the --logfile argument to set a log file instead.
> 2024-06-21 08:35:41,765 Checking input and output files
> 2024-06-21 08:35:41,765 Setting up parser
> 2024-06-21 08:35:59,572 Parsing single dose instruction
>
> StructuredDI(inputID=None, text='Take two tablets twice daily for one week then one tablet once daily for two weeks', form='tablet', dosageMin=2.0, dosageMax=2.0,  frequencyMin=2.0, frequencyMax=2.0, frequencyType='Day', durationMin=1.0, durationMax=1.0, durationType='Week', asRequired=False, asDirected=False)
> StructuredDI(inputID=None, text='Take two tablets twice daily for one week then one tablet once daily for two weeks', form='tablet', dosageMin=1.0, dosageMax=1.0, frequencyMin=1.0, frequencyMax=1.0, frequencyType='Day', durationMin=2.0, durationMax=2.0, durationType='Week', asRequired=False, asDirected=False)
> ```
>


#### Providing input IDs

The `inputID` value helps to keep track of which outputs correspond to which inputs. The default behaviour is:

* For a single dose instruction, set `inputID=None` 
* For multiple dose instructions, number each instruction from 0 in the order it appears in the input file

You may want to provide your own values for `inputID`. To do this, provide the input dose instructions as a **.csv** file with the columns:

* `inputID` specifying the input ID
* `di` specifying the dose instruction

For example, using `test.csv` with the following contents:

```
inputID,di
eDRIS/XXXX-XXXX/example/001,daily 2 caps
eDRIS/XXXX-XXXX/example/002,daily 0.2ml
eDRIS/XXXX-XXXX/example/003,two mane + two nocte
eDRIS/XXXX-XXXX/example/004,2 tabs twice daily increased to 2 tabs three times daily during exacerbation chest symptoms
eDRIS/XXXX-XXXX/example/005,take one in the morning and take two at night as directed
eDRIS/XXXX-XXXX/example/006,1 tablet(s) three times daily for pain/inflammation
eDRIS/XXXX-XXXX/example/007,two puffs at night
eDRIS/XXXX-XXXX/example/008,0.6mls daily
eDRIS/XXXX-XXXX/example/009,to be applied tds-qds
eDRIS/XXXX-XXXX/example/010,take 1 tablet for 3 weeks then take 3 tablets for 4 weeks
eDRIS/XXXX-XXXX/example/011,one to be taken twice a day  if sleepy do not drive/use machines. avoid alcohol. swallow whole.
eDRIS/XXXX-XXXX/example/012,1 tab take as required
eDRIS/XXXX-XXXX/example/013,take one daily for allergy
eDRIS/XXXX-XXXX/example/014,2x5ml spoonfuls with meals
eDRIS/XXXX-XXXX/example/015,one per month
eDRIS/XXXX-XXXX/example/016,1 cappful every four weeks
eDRIS/XXXX-XXXX/example/017,take two every 4-6hrs for pain
eDRIS/XXXX-XXXX/example/018,up to qid prn
eDRIS/XXXX-XXXX/example/019,one or two tabs dissolved in a glass of water at night
eDRIS/XXXX-XXXX/example/020,bid-tid
eDRIS/XXXX-XXXX/example/021,change every 2 weeks
eDRIS/XXXX-XXXX/example/022,take every fortnight
```
yields the corresponding output:
```
inputID,text,form,dosageMin,dosageMax,frequencyMin,frequencyMax,frequencyType,durationMin,durationMax,durationType,asRequired,asDirected
eDRIS/XXXX-XXXX/example/001,daily 2 caps,capsule,2.0,2.0,1.0,1.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/002,daily 0.2ml,ml,0.2,0.2,1.0,1.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/003,two mane + two nocte,,2.0,2.0,2.0,2.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/004,2 tabs twice daily increased to 2 tabs three times daily during exacerbation chest symptoms,tablet,2.0,2.0,5.0,5.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/005,take one in the morning and take two at night as directed,,3.0,3.0,1.0,1.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/006,1 tablet(s) three times daily for pain/inflammation,tablet,1.0,1.0,3.0,3.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/007,two puffs at night,puff,2.0,2.0,1.0,1.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/008,0.6mls daily,ml,0.6,0.6,1.0,1.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/009,to be applied tds-qds,,,,3.0,3.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/010,take 1 tablet for 3 weeks then take 3 tablets for 4 weeks,tablet,1.0,1.0,,,,3.0,3.0,Week,False,False
eDRIS/XXXX-XXXX/example/010,take 1 tablet for 3 weeks then take 3 tablets for 4 weeks,tablet,3.0,3.0,,,,4.0,4.0,Week,False,False
eDRIS/XXXX-XXXX/example/011,one to be taken twice a day  if sleepy do not drive/use machines. avoid alcohol. swallow whole.,,1.0,1.0,2.0,2.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/012,1 tab take as required,tablet,1.0,1.0,,,,,,,True,False
eDRIS/XXXX-XXXX/example/013,take one daily for allergy,,1.0,1.0,1.0,1.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/014,2x5ml spoonfuls with meals,ml,10.0,10.0,3.0,3.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/015,one per month,,1.0,1.0,1.0,1.0,Month,,,,False,False
eDRIS/XXXX-XXXX/example/016,1 cappful every four weeks,capful,1.0,1.0,1.0,1.0,Month,,,,False,False
eDRIS/XXXX-XXXX/example/017,take two every 4-6hrs for pain,,2.0,2.0,1.0,1.0,4 Hour,,,,True,False
eDRIS/XXXX-XXXX/example/018,up to qid prn,,,,0.0,4.0,Day,,,,True,False
eDRIS/XXXX-XXXX/example/019,one or two tabs dissolved in a glass of water at night,tablet,1.0,2.0,1.0,1.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/020,bid-tid,,,,2.0,3.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/021,change every 2 weeks,,,,1.0,1.0,2 Week,,,,False,False
eDRIS/XXXX-XXXX/example/022,take every fortnight,,,,1.0,1.0,2 Week,,,,False,False
```
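
Because one input can produce several output rows, it can be useful to group rows by `inputID` before further analysis. This sketch copies three rows from the output above and uses only the standard library; the variable names are illustrative.

```python
import csv
import io
from collections import defaultdict

# Three rows copied from the output above (IDs 009 and 010)
SAMPLE = """\
inputID,text,form,dosageMin,dosageMax,frequencyMin,frequencyMax,frequencyType,durationMin,durationMax,durationType,asRequired,asDirected
eDRIS/XXXX-XXXX/example/009,to be applied tds-qds,,,,3.0,3.0,Day,,,,False,False
eDRIS/XXXX-XXXX/example/010,take 1 tablet for 3 weeks then take 3 tablets for 4 weeks,tablet,1.0,1.0,,,,3.0,3.0,Week,False,False
eDRIS/XXXX-XXXX/example/010,take 1 tablet for 3 weeks then take 3 tablets for 4 weeks,tablet,3.0,3.0,,,,4.0,4.0,Week,False,False
"""

# Collect every output row under the input ID it came from
parts_by_id = defaultdict(list)
for row in csv.DictReader(io.StringIO(SAMPLE)):
    parts_by_id[row["inputID"]].append(row)

# IDs with more than one row were split into several dose instructions
split_ids = [input_id for input_id, parts in parts_by_id.items() if len(parts) > 1]
print(split_ids)
```

Rows within each group keep their file order, so the first part of a split instruction comes first.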

> [!NOTE]
> In this example, `eDRIS/XXXX-XXXX/example/010` has been split up into two
> dose instructions 

### Usage from Python 

For more flexible usage, you can load the package into Python and use it within a script or at the Python prompt. For example, using [IPython](https://pypi.org/project/ipython/):

```python
In [1]: import pandas as pd
   ...: from dose_instruction_parser import parser

In [2]: # Create parser
   ...: p = parser.DIParser("en_edris9")

In [3]: # Parse one dose instruction
   ...: p.parse("Take 2 tablets morning and night")
Out[3]: [StructuredDI(inputID=None, text='Take 2 tablets morning and night', form='tablet', dosageMin=2.0, dosageMax=2.0, frequencyMin=2.0, frequencyMax=2.0, frequencyType='Day', durationMin=None, durationMax=None, durationType=None, asRequired=False, asDirected=False)]

In [4]: # Parse many dose instructions
   ...: parsed_dis = p.parse_many([
   ...:     "take one tablet daily",
   ...:     "two puffs prn",
   ...:     "one cap after meals for three weeks",
   ...:     "4 caplets tid"
   ...: ])

In [5]: print(parsed_dis)
[StructuredDI(inputID=0, text='take one tablet daily', form='tablet', dosageMin=1.0, dosageMax=1.0, frequencyMin=1.0, frequencyMax=1.0, frequencyType='Day', durationMin=None, durationMax=None, durationType=None, asRequired=False, asDirected=False), StructuredDI(inputID=1, text='two puffs prn', form='puff', dosageMin=2.0, dosageMax=2.0, frequencyMin=None, frequencyMax=None, frequencyType=None, durationMin=None, durationMax=None, durationType=None, asRequired=True, asDirected=False), StructuredDI(inputID=2, text='one cap after meals for three weeks', form='capsule', dosageMin=1.0, dosageMax=1.0, frequencyMin=3.0, frequencyMax=3.0, frequencyType='Day', durationMin=3.0, durationMax=3.0, durationType='Week', asRequired=False, asDirected=False), StructuredDI(inputID=3, text='4 caplets tid', form='carpet', dosageMin=4.0, dosageMax=4.0, frequencyMin=3.0, frequencyMax=3.0, frequencyType='Day', durationMin=None, durationMax=None, durationType=None, asRequired=False, asDirected=False)]

In [6]: # Convert output to pandas dataframe
   ...: di_df = pd.DataFrame(parsed_dis)

In [7]: print(di_df)
inputID                                 text     form  dosageMin  dosageMax  frequencyMin  frequencyMax frequencyType  durationMin  durationMax durationType  asRequired  asDirected
      0                take one tablet daily   tablet        1.0        1.0           1.0           1.0           Day          NaN          NaN         None       False       False
      1                        two puffs prn     puff        2.0        2.0           NaN           NaN          None          NaN          NaN         None        True       False
      2  one cap after meals for three weeks  capsule        1.0        1.0           3.0           3.0           Day          3.0          3.0         Week       False       False
      3                        4 caplets tid   carpet        4.0        4.0           3.0           3.0           Day          NaN          NaN         None       False       False
```
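
Once you have a list of parsed results, a simple quality check is to see which fields the parser left empty. The sketch below uses `ToyDI`, a hypothetical stand-in carrying a subset of the `StructuredDI` fields shown above, so that it runs without the package or a model. It assumes the real objects expose the same fields as attributes, which the printed output suggests but which you should confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyDI:
    """Stand-in with a subset of the StructuredDI fields shown above."""
    inputID: Optional[int]
    text: str
    dosageMin: Optional[float]
    frequencyMin: Optional[float]
    asRequired: bool


# Values copied from the parse_many output above
parsed = [
    ToyDI(0, "take one tablet daily", 1.0, 1.0, False),
    ToyDI(1, "two puffs prn", 2.0, None, True),
]


def missing_fields(di):
    """List which of the checked fields are None for one result."""
    checked = ("dosageMin", "frequencyMin")
    return [name for name in checked if getattr(di, name) is None]


for di in parsed:
    print(di.inputID, di.text, missing_fields(di))
```

An empty field is not necessarily an error: "two puffs prn" states no frequency, so `frequencyMin` is legitimately missing there.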

## Development

1. Please open a new branch for any change and submit a pull request for merging to main
1. If you have ideas for an improvement, or spot a bug, please open an issue
1. Remember to include tests for any changes you might make, where appropriate

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "dose-instruction-parser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "prescriptions, dose instructions, medical free text",
    "author": null,
    "author_email": "Rosalyn Pearson <rosalyn.pearson@phs.scot>, Nathalie Thureau <nathalie.thureau@phs.scot>, Mark Macartney <mark.macartney@phs.scot>, John Reid <john.reid5@phs.scot>, Johanna Jokio <johanna.jokio@phs.scot>, David Bailey <david.bailey@phs.scot>",
    "download_url": "https://files.pythonhosted.org/packages/8b/38/31cbdf3fd56c410a6d3c845bd83a1335e3b9f29a43a0e71d59d36698206c/dose_instruction_parser-2024.1022a0.tar.gz",
    "platform": null
}
                    text     form  dosageMin  dosageMax  frequencyMin  frequencyMax frequencyType  durationMin  durationMax durationType  asRequired  asDirected\n      0                take one tablet daily   tablet        1.0        1.0           1.0           1.0           Day          NaN          NaN         None       False       False\n      1                        two puffs prn     puff        2.0        2.0           NaN           NaN          None          NaN          NaN         None        True       False\n      2  one cap after meals for three weeks  capsule        1.0        1.0           3.0           3.0           Day          3.0          3.0         Week       False       False\n      3                        4 caplets tid   carpet        4.0        4.0           3.0           3.0           Day          NaN          NaN         None       False       False\n```\n\n## Development\n\n1. Please open a new branch for any change and submit a pull request for merging to main\n1. If you have ideas for an improvement, or spot a bug, please open an issue\n1. Remember to include tests for any changes you might make, where appropriate\n",
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2024 royashcenazi, Rosalyn Pearson, Nathalie Thureau, Mark Macartney, John Reid, Johanna Jokio  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Tool for parsing free text prescription dose instructions into structured output",
    "version": "2024.1022a0",
    "project_urls": null,
    "split_keywords": [
        "prescriptions",
        " dose instructions",
        " medical free text"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b7cbb6a9629322064dc322dca76ade783c07646e0fcfcdf922c5b469b66558bc",
                "md5": "9fed3bb4a9c2244288ff1a005bf55d75",
                "sha256": "bdb776e8b77348b0724d5ad12575a3e20e2e4d90439deedee0b8b7a4a099d03a"
            },
            "downloads": -1,
            "filename": "dose_instruction_parser-2024.1022a0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9fed3bb4a9c2244288ff1a005bf55d75",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 29616,
            "upload_time": "2024-09-06T09:45:46",
            "upload_time_iso_8601": "2024-09-06T09:45:46.206215Z",
            "url": "https://files.pythonhosted.org/packages/b7/cb/b6a9629322064dc322dca76ade783c07646e0fcfcdf922c5b469b66558bc/dose_instruction_parser-2024.1022a0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8b3831cbdf3fd56c410a6d3c845bd83a1335e3b9f29a43a0e71d59d36698206c",
                "md5": "db0c53a572b0687cb032703308557807",
                "sha256": "1ecdef0645c6ddf148a1c58e9bbb7f8ee3a6e4408b507f72b4f33ccd16ccc357"
            },
            "downloads": -1,
            "filename": "dose_instruction_parser-2024.1022a0.tar.gz",
            "has_sig": false,
            "md5_digest": "db0c53a572b0687cb032703308557807",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 29662,
            "upload_time": "2024-09-06T09:45:47",
            "upload_time_iso_8601": "2024-09-06T09:45:47.580556Z",
            "url": "https://files.pythonhosted.org/packages/8b/38/31cbdf3fd56c410a6d3c845bd83a1335e3b9f29a43a0e71d59d36698206c/dose_instruction_parser-2024.1022a0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-06 09:45:47",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "dose-instruction-parser"
}
        