# mednotegen

- **Version**: 0.1.2
- **Summary**: Generate fake patient reports as PDFs.
- **Upload time**: 2025-07-08 22:31:11
- **Requires Python**: >=3.7
- **License**: MIT
- **Keywords**: synthea, medical notes, pdf
- **Requirements**: faker, fpdf

This project uses [Synthea™](https://github.com/synthetichealth/synthea) to generate realistic synthetic patient data and turns that data into fake medical notes saved as PDFs.

---

## Usage

```python
from mednotegen.generator import NoteGenerator

gen = NoteGenerator.from_config("config.yaml")
gen.generate_notes(10, "output_dir")

# Or specify Synthea CSV directory directly:
gen = NoteGenerator(synthea_csv_dir="/path/to/synthea/output/csv")
gen.generate_notes(10, "output_dir")
```

## Using a Custom Synthea Directory with config.yaml

You can specify the Synthea CSV directory in your config file by setting the `synthea_csv_dir` key.

Example `config.yaml`:
```yaml
count: 10
output_dir: output_dir
synthea_csv_dir: /path/to/synthea/output/csv
```

Then generate notes using:

```python
from mednotegen.generator import NoteGenerator

gen = NoteGenerator.from_config("config.yaml")
gen.generate_notes(10, "output_dir")
```


## ⚠️ Synthea Dependency Required

This project requires [Synthea™](https://github.com/synthetichealth/synthea), an open-source synthetic patient generator, as an **external dependency**. You must clone and build Synthea yourself before using `mednotegen`.

**To set up Synthea:**

1. **Clone Synthea**
   ```sh
   git clone https://github.com/synthetichealth/synthea.git
   ```
2. **Build the Synthea JAR**
   ```sh
   cd synthea
   ./gradlew build check test
   cp build/libs/synthea-with-dependencies.jar .
   cd ..
   ```
   Ensure `synthea-with-dependencies.jar` is in the `synthea/` directory at the root of your project.
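
Before running anything else, it can help to confirm that the JAR ended up where the step above says it should be. The following is a minimal sketch, not part of `mednotegen`: the helper name `find_synthea_jar` is hypothetical, and the only assumption is the `synthea/` layout described above.

```python
from pathlib import Path

# File name taken from the build step above.
JAR_NAME = "synthea-with-dependencies.jar"


def find_synthea_jar(project_root="."):
    """Return the JAR path under <project_root>/synthea, or None if it is absent.

    Hypothetical helper for illustration; mednotegen may locate the JAR differently.
    """
    jar = Path(project_root) / "synthea" / JAR_NAME
    return jar if jar.is_file() else None


if __name__ == "__main__":
    jar = find_synthea_jar()
    if jar is None:
        print(f"{JAR_NAME} not found in ./synthea; build Synthea and copy the JAR first.")
    else:
        print(f"Found Synthea JAR at {jar}")
```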

---

## Configuration (`config.yaml`)

You can customize patient generation and report output using a `config.yaml` file. Example options:

```yaml
count: 10                    # Number of reports to generate
output_dir: output_dir       # Output directory for PDFs
use_llm: false               # Use LLM for report generation
synthea_csv_dir: /path/to/synthea/output/csv   # Path to Synthea-generated CSV files
seed: 1234                   # Random seed for reproducibility
reference_date: "20250628"   # Reference date for data generation (YYYYMMDD)
clinician_seed: 5678         # Optional: separate seed for clinician assignment
gender: female               # male, female, or any
min_age: 30                  # Minimum patient age
max_age: 60                  # Maximum patient age
state: New York              # Synthea state parameter
modules:
  - cardiovascular-disease
  - diabetes
  - hypertension
  - asthma
local_config: custom_synthea.properties  # Custom Synthea config file
local_modules: ./synthea_modules         # Directory for custom modules
```

- **count**: Number of reports to generate
- **output_dir**: Directory to save generated PDFs
- **use_llm**: If true, uses OpenAI LLM for report text
- **synthea_csv_dir**: Path to the directory containing Synthea-generated CSV files
- **seed**: Random seed for reproducibility
- **reference_date**: Reference date for age calculations (YYYYMMDD)
- **clinician_seed**: Optional, separate seed for clinician assignment
- **gender**: Gender filter for patients (`male`, `female`, or `any`)
- **min_age**, **max_age**: Age range for patients
- **state**: US state for Synthea simulation
- **modules**: Synthea disease modules to enable
- **local_config**: Path to a custom Synthea config file
- **local_modules**: Directory for custom Synthea modules
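
Some of the options above constrain each other (for example, `min_age` should not exceed `max_age`, and `reference_date` must be in YYYYMMDD form). The sketch below shows one way to sanity-check a config that has already been loaded into a dictionary. The function `check_config` is hypothetical and is not `mednotegen`'s own validation; treating a missing `gender` as `any` is also an assumption.

```python
from datetime import datetime

# Values listed for the `gender` option above.
ALLOWED_GENDERS = {"male", "female", "any"}


def check_config(cfg):
    """Return a list of problems found in a config dict (empty list means none).

    Hypothetical helper for illustration only.
    """
    problems = []

    gender = cfg.get("gender", "any")  # assumption: a missing gender means no filter
    if gender not in ALLOWED_GENDERS:
        problems.append(f"gender must be one of {sorted(ALLOWED_GENDERS)}, got {gender!r}")

    min_age, max_age = cfg.get("min_age"), cfg.get("max_age")
    if min_age is not None and max_age is not None and min_age > max_age:
        problems.append(f"min_age ({min_age}) is greater than max_age ({max_age})")

    ref = cfg.get("reference_date")
    if ref is not None:
        text = str(ref)
        try:
            # The length check rejects inputs such as "2025628" that strptime would accept.
            if len(text) != 8:
                raise ValueError(text)
            datetime.strptime(text, "%Y%m%d")
        except ValueError:
            problems.append(f"reference_date must be YYYYMMDD, got {ref!r}")

    count = cfg.get("count")
    if count is not None and (not isinstance(count, int) or count < 1):
        problems.append(f"count must be a positive integer, got {count!r}")

    return problems
```

Running such a check before generating notes makes filter mistakes visible before any PDFs are produced.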

---

### More Synthea Modules
For an up-to-date and complete list of available modules, see the [official Synthea modules directory](https://github.com/synthetichealth/synthea/tree/master/src/main/resources/modules).

---

### Troubleshooting
#### Synthea Data Location

If you see errors about missing `patients.csv`, `medications.csv`, or `conditions.csv`, make sure you have generated Synthea data and that the path you provide (via `synthea_csv_dir`, CLI, or config) points to the correct directory containing those files.

If you installed `mednotegen` via pip, the default location for the Synthea CSV data is inside the package directory. For custom or system-wide Synthea runs, always specify the output CSV directory explicitly.
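
A quick way to see which of the three files named above is missing is to check the directory directly. This is a minimal sketch; `missing_synthea_csvs` is a hypothetical helper name, and the file list is taken from the paragraph above rather than from `mednotegen`'s source.

```python
from pathlib import Path

# The files named in the error messages described above.
REQUIRED_CSVS = ("patients.csv", "medications.csv", "conditions.csv")


def missing_synthea_csvs(csv_dir):
    """Return the required CSV file names that are not present in csv_dir."""
    base = Path(csv_dir)
    return [name for name in REQUIRED_CSVS if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_synthea_csvs("/path/to/synthea/output/csv")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required CSV files are present.")
```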

- **No CSV files generated:**
  - Make sure you edited the correct `synthea.properties` and used the `-c` flag when running Synthea.
  - Ensure `exporter.csv.export = true` is set and not overridden elsewhere in the file.
- **FileNotFoundError for CSVs:**
  - Confirm the CSV files exist in the path specified by `synthea_csv_dir` or in the expected package location.
- **ValueError: No patients found matching the specified filters:**
  - Check your age/gender filters in `config.yaml`. Try relaxing them if you have too few patients.
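
To judge whether the filters are too strict for the data you generated, you can count how many patients would pass them. The sketch below makes several assumptions: that `patients.csv` has a `BIRTHDATE` column in `YYYY-MM-DD` form and a `GENDER` column with `M`/`F` values (the layout of Synthea's CSV export), and that age is measured in whole years on a reference date. The helper names are hypothetical, and `mednotegen` may compute ages differently.

```python
import csv
from datetime import date

# Assumed mapping from config values to the codes in Synthea's GENDER column.
GENDER_CODES = {"male": "M", "female": "F"}


def age_on(birthdate, reference):
    """Whole years between birthdate and reference."""
    years = reference.year - birthdate.year
    if (reference.month, reference.day) < (birthdate.month, birthdate.day):
        years -= 1  # birthday has not yet occurred in the reference year
    return years


def count_matching_patients(patients_csv, gender="any", min_age=0, max_age=200, reference=None):
    """Count rows in patients.csv that pass the gender and age filters."""
    reference = reference or date.today()
    wanted = GENDER_CODES.get(gender)  # None (e.g. for "any") disables the gender filter
    matches = 0
    with open(patients_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            if wanted and row["GENDER"] != wanted:
                continue
            age = age_on(date.fromisoformat(row["BIRTHDATE"]), reference)
            if min_age <= age <= max_age:
                matches += 1
    return matches
```

If the count is zero or very small, widen `min_age`/`max_age`, set `gender: any`, or generate a larger population.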


### Configure Synthea to Export CSVs

Edit `src/main/resources/synthea.properties` in your Synthea directory:

```properties
exporter.csv.export = true
```

(Ensure any `exporter.csv.export = false` lines are removed or commented out.)
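
Because a later `exporter.csv.export = false` line can silently undo the setting, it may be worth checking which value actually takes effect. The sketch below is a simplified reader, not a full Java properties parser: it only handles `key = value` lines, skips `#` and `!` comments, ignores line continuations, and assumes the last assignment wins. Both function names are hypothetical.

```python
def effective_property(lines, key):
    """Return the last value assigned to key, or None if key is never set."""
    value = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # blank line or comment
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()  # later assignments replace earlier ones
    return value


def csv_export_enabled(properties_path):
    """True if exporter.csv.export ends up set to 'true' in the given file."""
    with open(properties_path, encoding="utf-8") as handle:
        return effective_property(handle, "exporter.csv.export") == "true"


if __name__ == "__main__":
    print(csv_export_enabled("src/main/resources/synthea.properties"))
```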

### Generate Patient Data with Synthea

From your Synthea directory, clean any old output and generate new data:

```sh
rm -rf output/
java -jar synthea-with-dependencies.jar -c src/main/resources/synthea.properties -p 1000
```

- The `-p 1000` flag generates 1000 patients.
- After running, check for CSV files in `output/csv/`.
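
If you would rather drive this step from Python, the same command can be run with `subprocess`. This is a minimal sketch under the assumptions above (JAR and properties file at the paths shown, `java` on your `PATH`). The function names are hypothetical, and the sketch deliberately does not delete the existing `output/` directory.

```python
import subprocess
from pathlib import Path


def synthea_command(population,
                    jar="synthea-with-dependencies.jar",
                    config="src/main/resources/synthea.properties"):
    """Build the java command shown above as an argument list."""
    return ["java", "-jar", jar, "-c", config, "-p", str(population)]


def generate_patients(synthea_dir, population):
    """Run Synthea inside synthea_dir and return the CSV file names it produced.

    Raises subprocess.CalledProcessError if the java command fails.
    """
    subprocess.run(synthea_command(population), cwd=synthea_dir, check=True)
    csv_dir = Path(synthea_dir) / "output" / "csv"
    return sorted(path.name for path in csv_dir.glob("*.csv"))
```

For example, `generate_patients("synthea", 1000)` would run the command from the `synthea/` directory and list whatever CSV files appear in `output/csv/`.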


### Attribution

See `README_SYNTHEA_NOTICE.md` and `LICENSE-APACHE-2.0` for license and attribution requirements.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "mednotegen",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "synthea, medical notes, pdf",
    "author": null,
    "author_email": "Mikael Moise <mikael.moise@protonmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/f1/9f/b4dd049b9a73a93b35a666b110ebf3569c0bcdcaeb39eb6298caa9c238b7/mednotegen-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Generate fake patient reports as PDFs.",
    "version": "0.1.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/nortelabs/mednotegen/issues",
        "Repository": "https://github.com/nortelabs/mednotegen"
    },
    "split_keywords": [
        "synthea",
        " medical notes",
        " pdf"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f7c59c20c2c5c3c9cb55e44cd904e4bb0590ab4f981d970429256b0505604d3b",
                "md5": "5db869dea0fba542ee4251ba28e80da4",
                "sha256": "9dea9607f01eb415a5c6bafb19725eed60d20db2898bcda3c000556ada22758f"
            },
            "downloads": -1,
            "filename": "mednotegen-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5db869dea0fba542ee4251ba28e80da4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 13829,
            "upload_time": "2025-07-08T22:31:10",
            "upload_time_iso_8601": "2025-07-08T22:31:10.917966Z",
            "url": "https://files.pythonhosted.org/packages/f7/c5/9c20c2c5c3c9cb55e44cd904e4bb0590ab4f981d970429256b0505604d3b/mednotegen-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f19fb4dd049b9a73a93b35a666b110ebf3569c0bcdcaeb39eb6298caa9c238b7",
                "md5": "7e41d2f361dc63cc6d6b0cd8d28d161c",
                "sha256": "23ec9e2edf97e77c004818d8335de72500e64673594a4cb3ce5c6874311fbd0e"
            },
            "downloads": -1,
            "filename": "mednotegen-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "7e41d2f361dc63cc6d6b0cd8d28d161c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 14332,
            "upload_time": "2025-07-08T22:31:11",
            "upload_time_iso_8601": "2025-07-08T22:31:11.846705Z",
            "url": "https://files.pythonhosted.org/packages/f1/9f/b4dd049b9a73a93b35a666b110ebf3569c0bcdcaeb39eb6298caa9c238b7/mednotegen-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-08 22:31:11",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nortelabs",
    "github_project": "mednotegen",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "faker",
            "specs": []
        },
        {
            "name": "fpdf",
            "specs": []
        }
    ],
    "lcname": "mednotegen"
}