|pypi| |downloads| |clinicedc|

edc-analytics
=============

Build analytic tables from EDC data

Overview
--------

Read your data into a dataframe, for example an EDC screening table:

.. code-block:: python

    import pandas as pd
    from django_pandas.io import read_frame
    from meta_screening.models import SubjectScreening

    # BpTable is used further down; it is assumed here to be importable
    # from the same module as AgeTable and GenderTable.
    from edc_analytics.custom_tables import AgeTable, BpTable, GenderTable

    qs_screening = SubjectScreening.objects.all()
    df = read_frame(qs_screening)

Convert the numeric columns to `pandas` numeric types:

.. code-block:: python

    cols = [
        "age_in_years",
        "dia_blood_pressure_avg",
        "fbg_value",
        "hba1c_value",
        "ogtt_value",
        "sys_blood_pressure_avg",
    ]
    df[cols] = df[cols].apply(pd.to_numeric)

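By default `pd.to_numeric` raises an error when a value cannot be parsed. The sketch below uses a small made-up dataframe (the values are illustrative and not taken from any EDC table) to show how passing `errors="coerce"` turns such values into `NaN` instead. Whether coercion is appropriate depends on your data:

.. code-block:: python

    import pandas as pd

    # Hypothetical data: numbers stored as strings, with one placeholder value.
    df = pd.DataFrame(
        {
            "age_in_years": ["34", "51", "47"],
            "fbg_value": ["5.6", "not done", "7.1"],
        }
    )
    cols = ["age_in_years", "fbg_value"]

    # errors="coerce" converts unparseable values to NaN instead of raising.
    df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

After this, `df["fbg_value"]` holds a missing value where the placeholder was, so it is worth checking `df[cols].isna().sum()` before building tables.
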
Pass the dataframe to each `Table` class:

.. code-block:: python

    gender_tbl = GenderTable(main_df=df)
    age_tbl = AgeTable(main_df=df)
    bp_table = BpTable(main_df=df)

In each `Table` instance:

* `data_df` is the supporting dataframe.
* `table_df` is the dataframe to display. Its first 5 columns ("Characteristic", "Statistic", "F", "M", "All") hold the formatted values. The additional columns hold the underlying statistics from which the formatted values in ["F", "M", "All"] are derived.

As shown above, `gender_tbl.table_df` is an ordinary dataframe and can be combined with other `table_df` dataframes using `pd.concat()` to make a single `table_df`:

.. code-block:: python

    table_df = pd.concat(
        [gender_tbl.table_df, age_tbl.table_df, bp_table.table_df]
    )

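To see what row-wise concatenation does, here is a minimal sketch with two hand-built stand-in dataframes (not real `Table` output; all values are made up). Passing `ignore_index=True` is an optional choice made here so that the combined frame gets a fresh 0..n-1 index:

.. code-block:: python

    import pandas as pd

    columns = ["Characteristic", "Statistic", "F", "M", "All"]

    # Stand-ins for gender_tbl.table_df and age_tbl.table_df.
    gender_df = pd.DataFrame([["Gender", "n", "12", "8", "20"]], columns=columns)
    age_df = pd.DataFrame(
        [
            ["Age (years)", "n", "12", "8", "20"],
            ["", "18-34", "5", "3", "8"],
        ],
        columns=columns,
    )

    # Stack the frames on top of each other and renumber the rows.
    combined_df = pd.concat([gender_df, age_df], ignore_index=True)

Without `ignore_index=True` each frame keeps its own row labels, which can leave duplicate index values in the result.
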
Show just the first 5 columns:

.. code-block:: python

    table_df.iloc[:, :5]

Like any dataframe, `table_df` can be exported to CSV:

.. code-block:: python

    path = "my/path/to/csv/folder/table_df.csv"
    # index=False omits the row index; the pipe is used as the field delimiter
    table_df.to_csv(path_or_buf=path, encoding="utf-8", index=False, sep="|")

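Because the export uses a pipe delimiter, the same `sep` has to be given when reading the file back. The following sketch round-trips a small made-up dataframe through an in-memory buffer rather than a file, purely so that it runs anywhere:

.. code-block:: python

    import io

    import pandas as pd

    # Made-up stand-in for a table_df.
    table_df = pd.DataFrame(
        {"Characteristic": ["Age (years)"], "Statistic": ["n"], "All": [2175]}
    )

    # Write to an in-memory text buffer with the same delimiter options.
    buffer = io.StringIO()
    table_df.to_csv(path_or_buf=buffer, index=False, sep="|")

    # Read it back; sep must match the delimiter used on export.
    buffer.seek(0)
    restored_df = pd.read_csv(buffer, sep="|")
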
Details
-------

Assumptions
+++++++++++

The default table assumes:

* you have gender for all observations.
* gender is coded as "M" or "F", or with the `MALE` and `FEMALE` constants from edc.constants.

A `Table` presents data by characteristic per row (such as age, bp, glucose, ...).
It is a dataframe where the first columns are formatted for presentation and the
remaining columns are the descriptive statistics used to render the formatted columns
(mean, median, sd, range, IQR, proportions).

If a table is stratified by gender, then the formatted row for "Age" might be like this:

.. code-block:: text

    | Characteristic | Statistic | F    | M    | All  |
    ===================================================
    | Age (years)    | n         | 1175 | 1000 | 2175 |
    |                | 18-34     | 70   | 64   | 134  |
    |                | ...etc    |      |      |      |

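To make this layout concrete, the sketch below builds similar "n" and age-band rows from a small made-up dataframe using plain `pandas`. It is not the `edc-analytics` implementation: the column names `gender` and `age_in_years`, the age bands and the variable names are all assumptions chosen for illustration:

.. code-block:: python

    import pandas as pd

    # Made-up observations; gender is coded "F" or "M".
    df = pd.DataFrame(
        {
            "gender": ["F", "F", "M", "F", "M", "M"],
            "age_in_years": [22, 41, 30, 36, 58, 19],
        }
    )

    # Assumed age bands: 18-34, 35-49, 50+ (right edge of each bin excluded).
    df["age_band"] = pd.cut(
        df["age_in_years"],
        bins=[18, 35, 50, 200],
        right=False,
        labels=["18-34", "35-49", "50+"],
    )

    # Count observations per age band, stratified by gender, plus "All".
    counts = pd.crosstab(df["age_band"], df["gender"])
    counts.index = counts.index.astype(str)
    counts.columns.name = None
    counts["All"] = counts.sum(axis=1)

    # Overall "n" row, placed above the age-band rows.
    n_row = pd.DataFrame(
        {
            "F": [(df["gender"] == "F").sum()],
            "M": [(df["gender"] == "M").sum()],
            "All": [len(df)],
        },
        index=["n"],
    )
    age_rows = pd.concat([n_row, counts])
    age_rows.index.name = "Statistic"

Here `age_rows` has one row per statistic and the columns "F", "M" and "All", which mirrors the stratified layout shown above.
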
A `Table` contains a collection of `RowDefinitions`.

Stratification
++++++++++++++

Putting together a table
------------------------

RowDefinitions
++++++++++++++

`RowDefinitions` is a collection of `RowDefinition` objects.

To build a table, subclass `Table` and override the `build_defs` method. For example:

.. |pypi| image:: https://img.shields.io/pypi/v/edc-analytics.svg
    :target: https://pypi.python.org/pypi/edc-analytics

.. |downloads| image:: https://pepy.tech/badge/edc-analytics
    :target: https://pepy.tech/project/edc-analytics

.. |clinicedc| image:: https://img.shields.io/badge/framework-Clinic_EDC-green
    :alt: Made with clinicedc
    :target: https://github.com/clinicedc

Raw data
{
"_id": null,
"home_page": null,
"name": "edc-analytics",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.14,>=3.12",
"maintainer_email": null,
"keywords": "analytics, clinical, clinicedc, collection, data, django, pandas, trials",
"author": "Erik van Widenfelt",
"author_email": "Erik van Widenfelt <ew2789@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/77/fb/0d1e42e7830b1bfa0847f374c31ec6f9ca7cf23334ad125aebe20f400af3/edc_analytics-1.0.2.tar.gz",
"platform": null,
"description": "|pypi| |downloads| |clinicedc|\n\n\nedc-analytics\n=============\n\n\nBuild analytic tables from EDC data\n\n\nOverview\n--------\n\nRead your data into a dataframe, for example an EDC screening table:\n\n.. code-block:: python\n\n import pandas as pd\n from django_pandas.io import read_frame\n from meta_screening.models import SubjectScreening\n from edc_analytics.custom_tables import AgeTable, GenderTable\n\n qs_screening = SubjectScreening.objects.all()\n df = read_frame(qs_screening)\n\n\nConvert all numerics to `pandas` numerics:\n\n.. code-block:: python\n\n cols = [\n \"age_in_years\",\n \"dia_blood_pressure_avg\",\n \"fbg_value\",\n \"hba1c_value\",\n \"ogtt_value\",\n \"sys_blood_pressure_avg\",\n ]\n df[cols] = df[cols].apply(pd.to_numeric)\n\n\nPass the dataframe to each `Table` class\n\n.. code-block:: python\n\n gender_tbl = GenderTable(main_df=df)\n age_tbl = AgeTable(main_df=df)\n bp_table = BpTable(main_df=df)\n\n\nIn the `Table` instance,\n\n* `data_df` is the supporting dataframe\n* `table_df` is the dataframe to display. The `table_df` displays formatted data in the first 5 columns (\"Characteristic\", \"Statistic\", \"F\", \"M\", \"All\"). The `table_df` has additional columns that contain the statistics used for the statistics displayed in columns [\"F\", \"M\", \"All\"].\n\nFrom above, `gender_tbl.table_df` is just a dataframe and can be combined with other `table_df` dataframes using `pd.concat()` to make a single `table_df`.\n\n.. code-block:: python\n\n table_df = pd.concat(\n [gender_tbl.table_df, age_tbl.table_df, bp_table.table_df]\n )\n\nShow just the first 5 columns:\n\n.. code-block:: python\n\n table_df.iloc[:, :5]\n\n\nLike any dataframe, you can export to csv:\n\n.. 
code-block:: python\n\n path = \"my/path/to/csv/folder/table_df.csv\"\n table_df.to_csv(path_or_buf=path, encoding=\"utf-8\", index=0, sep=\"|\")\n\n\nDetails\n-------\n\nAssumptions\n+++++++++++\n\nThe default table assumes:\n\n* you have gender for all observations.\n* gender is \"M\", \"F\" or from edc.constants `MALE`, `FEMALE`\n\n\nA `Table` presents data by characteristic per row (such as age, bp, glucose, ...).\nIt is a dataframe where the first columns are formatted for presentation and the\nremining columns are the descriptive statistics used to render the formatted columns\n(mean, median, sd, range, IQR, proportions).\n\nIf a table is stratified by gender, then the formatted row for \"Age\" might be like this:\n\n\n\n.. code-block:: text\n\n | Characteristic | Statistic | F | M | All |\n ======================================================\n | Age (years) | n | 1175 | 1000 | 2175 |\n | | 18-34 | 70 | 64 | 134 |\n | | ...etc | | | |\n\n\n\ncontains a collection of `RowDefinitions`\n\n\nStratification\n++++++++++++++\n\n\n\nPutting together a table\n------------------------\n\nRowDefinitions\n++++++++++++++\n\n`RowDefinitions` are a collection of `RowDefinition`.\n\nTo build a table use the `Table` class and override the `build_defs` method. For example:\n\n\n\n.. |pypi| image:: https://img.shields.io/pypi/v/edc-analytics.svg\n :target: https://pypi.python.org/pypi/edc-analytics\n\n.. |downloads| image:: https://pepy.tech/badge/edc-analytics\n :target: https://pepy.tech/project/edc-analytics\n\n.. |clinicedc| image:: https://img.shields.io/badge/framework-Clinic_EDC-green\n :alt:Made with clinicedc\n :target: https://github.com/clinicedc\n",
"bugtrack_url": null,
"license": null,
"summary": "Build analytical tables for clinicedc/edc projects",
"version": "1.0.2",
"project_urls": {
"Homepage": "https://github.com/clinicedc/edc-analytics"
},
"split_keywords": [
"analytics",
" clinical",
" clinicedc",
" collection",
" data",
" django",
" pandas",
" trials"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "6c3629f7f7b5da8eeb55ede71445765864ec448fc6fbd4250fdac1c5517dcbaa",
"md5": "0048d2edc66ad756cb0563f5411718ca",
"sha256": "7d96d458c5d2cf8a5915b15da11cff80d13212ed5ff77a70a4ba747d43d13c55"
},
"downloads": -1,
"filename": "edc_analytics-1.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0048d2edc66ad756cb0563f5411718ca",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.14,>=3.12",
"size": 35499,
"upload_time": "2025-08-01T13:21:06",
"upload_time_iso_8601": "2025-08-01T13:21:06.582361Z",
"url": "https://files.pythonhosted.org/packages/6c/36/29f7f7b5da8eeb55ede71445765864ec448fc6fbd4250fdac1c5517dcbaa/edc_analytics-1.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "77fb0d1e42e7830b1bfa0847f374c31ec6f9ca7cf23334ad125aebe20f400af3",
"md5": "a4195f3a064f8424b1e37ce32260313c",
"sha256": "0c71a5e80744fc9c3a419173a3ff9ae28b576146f345581940b6a62498999955"
},
"downloads": -1,
"filename": "edc_analytics-1.0.2.tar.gz",
"has_sig": false,
"md5_digest": "a4195f3a064f8424b1e37ce32260313c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.14,>=3.12",
"size": 25949,
"upload_time": "2025-08-01T13:21:07",
"upload_time_iso_8601": "2025-08-01T13:21:07.681844Z",
"url": "https://files.pythonhosted.org/packages/77/fb/0d1e42e7830b1bfa0847f374c31ec6f9ca7cf23334ad125aebe20f400af3/edc_analytics-1.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-01 13:21:07",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "clinicedc",
"github_project": "edc-analytics",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "edc-analytics"
}