# PADME Conductor Library
<p align="center">
<img src="https://raw.githubusercontent.com/sawelt/padme-conductor/main/logo.svg" width=220>
</p>
A library that simplifies interaction with the Personal Health Train (PHT) and its Stations.
## Connection Parameters
When working with Stations, you typically first need the information for connecting to the database.
Retrieve it with the `get_environment_vars` function by passing the keys of the environment variables you need.
```python
env = pc.get_environment_vars(
    [
        "DATA_SOURCE_USERNAME",
        "DATA_SOURCE_HOST",
        "DATA_SOURCE_PASSWORD",
        "DATA_SOURCE_PORT",
        "STATION_NAME",
    ]
)
```
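The exact behavior of `get_environment_vars` is defined by the library; conceptually, it reads the listed keys from the process environment. A minimal sketch using only the standard library (the function name here is illustrative, not part of `padme_conductor`):

```python
import os


def read_environment_vars(keys):
    """Read the given keys from the process environment.

    Raises KeyError for any key that is not set, so missing
    connection parameters fail fast instead of surfacing later.
    """
    missing = [k for k in keys if k not in os.environ]
    if missing:
        raise KeyError(f"missing environment variables: {missing}")
    return {k: os.environ[k] for k in keys}


os.environ["STATION_NAME"] = "Klee"  # example value for the sketch
env = read_environment_vars(["STATION_NAME"])
```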
## Database Plugins
The next step is to use these connection parameters to query the Station's database.
For that, we first instantiate a database plugin for the appropriate database type.
### SQL
```python
sql = SqlPlugin(
    db_name="patientdb",
    user=env["DATA_SOURCE_USERNAME"],
    password=env["DATA_SOURCE_PASSWORD"],
    host=env["DATA_SOURCE_HOST"],
    port=env["DATA_SOURCE_PORT"],
)
```
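Under the hood, such a plugin assembles these parameters into a database connection. Purely as an illustration, here is a libpq-style DSN string built from the same fields (whether `SqlPlugin` uses PostgreSQL or another driver is not specified here, and the values are made up):

```python
def build_dsn(db_name, user, password, host, port):
    # Illustrative libpq-style connection string; the actual plugin
    # may use a driver-specific connection API instead.
    return f"dbname={db_name} user={user} password={password} host={host} port={port}"


dsn = build_dsn("patientdb", "alice", "secret", "db.local", "5432")
```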
### FHIR
```python
fhir_plugin = FHIRClient("http://192.168.0.1:8080/fhir")
```
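`FHIRClient` takes the base URL of the Station's FHIR endpoint. Queries against such an endpoint are ordinary REST calls; for illustration, a FHIR search URL for patients aged 50 or older (via a `birthdate` upper bound) can be built with the standard library. The endpoint address is the placeholder from above, and the cutoff date is an example value:

```python
from urllib.parse import urlencode

base_url = "http://192.168.0.1:8080/fhir"
# FHIR search: Patient resources born in or before 1973,
# i.e. aged >= 50 at the time of writing.
params = urlencode({"birthdate": "le1973-12-31", "_count": 100})
search_url = f"{base_url}/Patient?{params}"
```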
## Querying Databases
With the database plugin, we can query data from the Station.
For that, we pass a `Query` object to the `query` function.
```python
result = pc.query(Query("SELECT * FROM patients WHERE age >= 50", sql))
```
If the queries for each station differ, you can pass a list of queries and the current station name instead:
```python
result = pc.query(
    [
        Query("SELECT * FROM patients WHERE age >= 50", sql, "Klee"),
        Query("SELECT * FROM patient_info WHERE age >= 50", sql, "Bruegel"),
    ],
    env["STATION_NAME"],
)
```
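Selecting from such a list presumably amounts to matching each query's station name against the current one. A minimal sketch of that dispatch, using plain tuples in place of `Query` objects (the function and data shapes here are illustrative, not the library's internals):

```python
def select_query(queries, station_name):
    """Pick the statement whose station matches; fall back to a
    statement without a station restriction if one exists."""
    fallback = None
    for statement, station in queries:
        if station == station_name:
            return statement
        if station is None:
            fallback = statement
    return fallback


queries = [
    ("SELECT * FROM patients WHERE age >= 50", "Klee"),
    ("SELECT * FROM patient_info WHERE age >= 50", "Bruegel"),
]
chosen = select_query(queries, "Bruegel")
```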
## Executing the Analysis
You can now design your analysis with whatever libraries and frameworks you like.
The analysis can be, for example, a machine learning model that you set up and train, or an analysis that collects statistical data.
To run it, pass your analysis function to the `execute_analysis` function, together with all the parameters your function expects.
```python
def my_analysis(query_result):
    res = len(query_result)
    pc.log(f"found {res} patients")
    return res


res = pc.execute_analysis(my_analysis, result)
```
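Conceptually, `execute_analysis` invokes your function with the supplied arguments (the library may add error handling or logging around the call). A bare sketch of that contract, with illustrative names:

```python
def run_analysis(analysis_fn, *args, **kwargs):
    # Illustrative stand-in for execute_analysis: call the user
    # function with whatever parameters it expects and return
    # its result unchanged.
    return analysis_fn(*args, **kwargs)


def count_rows(rows):
    return len(rows)


res = run_analysis(count_rows, [{"age": 61}, {"age": 54}])
```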
## Saving the Results
We can then save the analysis results to the Train file system.
To simplify this Train interaction, we provide the `save` function.
You can separate the saved results by run, by station, or not at all.
The `append` parameter defines whether the content is appended to the file or overwrites it.
```python
save_string = f"The result is {res}"
pc.save(save_string, "result.txt", separate_by=Separation.STATION, append=True)
```
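How the library lays out separated files on the Train file system is an implementation detail. One plausible scheme, shown purely as an assumption, derives the file name from the separation strategy:

```python
from enum import Enum, auto


class Separation(Enum):  # mirrors the library's Separation, for illustration only
    NONE = auto()
    RUN = auto()
    STATION = auto()


def result_path(filename, separate_by, station=None, run_id=None):
    """Hypothetical naming scheme: suffix the file name with the
    station or run identifier when separation is requested."""
    stem, dot, ext = filename.rpartition(".")
    if separate_by is Separation.STATION and station:
        return f"{stem}_{station}{dot}{ext}"
    if separate_by is Separation.RUN and run_id is not None:
        return f"{stem}_run{run_id}{dot}{ext}"
    return filename


path = result_path("result.txt", Separation.STATION, station="Klee")
```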
## Retrieving Previous Results
To retrieve previous results, such as an earlier state of a machine learning model, use the `retrieve_prev_result` function.
*If you have separated your results when saving, you also need to provide the separation strategy when retrieving.*
```python
prev = pc.retrieve_prev_result("result.txt", separate_by=Separation.STATION)
```
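The save/retrieve pair behaves like an append-then-read round trip on a file: each Station visit can append a line, and the next visit reads the accumulated state back. A self-contained sketch of that round trip with plain file I/O (the file path and contents are made up):

```python
import os
import tempfile

# Hypothetical round trip: append one line per Station visit, then
# read the previous state back, as save(..., append=True) followed
# by retrieve_prev_result would.
path = os.path.join(tempfile.mkdtemp(), "result.txt")
for line in ["Klee:12\n", "Bruegel:9\n"]:
    with open(path, "a") as fh:  # append=True semantics
        fh.write(line)
with open(path) as fh:
    prev = fh.read()
```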
## Logging
You can use the `log` functions to log simultaneously to stdout/stderr and to a log file in the Train file system.
You can also attach custom tags to a log call via the `extra` parameter.
```python
pc.log("hello world", extra={"tags": ["cpu_consumption"]})
pc.log_debug("hello world")
pc.log_info("hello world")
pc.log_warning("hello world")
pc.log_error("hello world")
pc.log_critical("hello world")
```
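The `extra` parameter resembles the standard-library `logging` mechanism, where extra fields become attributes on the log record and can appear in the format string. A self-contained sketch of tagging logs that way (this is plain `logging`, not the library's internal implementation; the logger name is arbitrary):

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(tags)s pulls the attribute injected via the `extra` dict.
handler.setFormatter(logging.Formatter("%(levelname)s %(tags)s %(message)s"))

logger = logging.getLogger("train")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("hello world", extra={"tags": ["cpu_consumption"]})
logged = stream.getvalue()
```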
## Simple example
This simple example Train analysis shows the concepts described above.
```python
import padme_conductor as pc
from padme_conductor import Query, Separation
from padme_conductor.Plugins.SQL import SqlPlugin

env = pc.get_environment_vars(
    [
        "DATA_SOURCE_USERNAME",
        "DATA_SOURCE_HOST",
        "DATA_SOURCE_PASSWORD",
        "DATA_SOURCE_PORT",
        "STATION_NAME",
    ]
)

sql = SqlPlugin(
    db_name="patientdb",
    user=env["DATA_SOURCE_USERNAME"],
    password=env["DATA_SOURCE_PASSWORD"],
    host=env["DATA_SOURCE_HOST"],
    port=env["DATA_SOURCE_PORT"],
)

result = pc.query(
    [
        Query("SELECT * FROM patients WHERE age >= 50", sql, "Klee"),
        Query("SELECT * FROM patient_info WHERE age >= 50", sql, "Bruegel"),
    ],
    env["STATION_NAME"],
)


def analysis(query_result):
    res = len(query_result)
    pc.log(f"found {res} patients")
    return res


res = pc.execute_analysis(analysis, result)
prev = pc.retrieve_prev_result("result.txt", separate_by=Separation.STATION)
pc.log(prev, extra={"tags": ["cpu_consumption"]})

# Write to file
save_string = env["STATION_NAME"] + ":" + str(res) + "\n"
pc.save(save_string, "result.txt", separate_by=Separation.STATION)
```