edc-analytics

Name	edc-analytics
Version	1.0.2
home_page	None
Summary	Build analytical tables for clinicedc/edc projects
upload_time	2025-08-01 13:21:07
maintainer	None
docs_url	None
author	Erik van Widenfelt
requires_python	<3.14,>=3.12
license	None
keywords	analytics clinical clinicedc collection data django pandas trials

|pypi| |downloads| |clinicedc|


edc-analytics
=============


Build analytic tables from EDC data


Overview
--------

Read your data into a dataframe, for example an EDC screening table:

.. code-block:: python

    import pandas as pd
    from django_pandas.io import read_frame
    from meta_screening.models import SubjectScreening
    # BpTable is used further down; it is assumed to live in the same module.
    from edc_analytics.custom_tables import AgeTable, BpTable, GenderTable

    qs_screening = SubjectScreening.objects.all()
    df = read_frame(qs_screening)


Convert the numeric columns to `pandas` numeric dtypes:

.. code-block:: python

    cols = [
        "age_in_years",
        "dia_blood_pressure_avg",
        "fbg_value",
        "hba1c_value",
        "ogtt_value",
        "sys_blood_pressure_avg",
    ]
    df[cols] = df[cols].apply(pd.to_numeric)
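
Columns read from a database can contain entries that do not parse as numbers. The following is a minimal sketch on made-up data, not part of edc-analytics, showing how the `errors="coerce"` option of `pd.to_numeric` turns such entries into `NaN` instead of raising. The column values and the "n/a" marker are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for the screening dataframe; the values are made up.
df = pd.DataFrame(
    {
        "age_in_years": ["34", "51", "28"],
        "fbg_value": ["5.6", "n/a", "7.1"],
    }
)
cols = ["age_in_years", "fbg_value"]

# errors="coerce" converts unparseable entries such as "n/a" to NaN
# instead of raising a ValueError.
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

print(df.dtypes)
print(df["fbg_value"].isna().sum())  # count of entries that could not be parsed
```

Counting the `NaN` values after conversion is a quick way to see how many entries were silently coerced.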


Pass the dataframe to each `Table` class:

.. code-block:: python

    gender_tbl = GenderTable(main_df=df)
    age_tbl = AgeTable(main_df=df)
    bp_table = BpTable(main_df=df)


In a `Table` instance:

* `data_df` is the supporting dataframe.
* `table_df` is the dataframe to display. Its first 5 columns ("Characteristic", "Statistic", "F", "M", "All") hold the formatted values. The remaining columns hold the underlying statistics from which the formatted values in "F", "M" and "All" are rendered.

Each `table_df`, such as `gender_tbl.table_df` above, is an ordinary dataframe, so several can be combined with `pd.concat()` into a single `table_df`.

.. code-block:: python

    table_df = pd.concat(
        [gender_tbl.table_df, age_tbl.table_df, bp_table.table_df]
    )

Show just the first 5 columns:

.. code-block:: python

    table_df.iloc[:, :5]
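
The combine-then-slice steps can be tried without an EDC project. The sketch below uses two hand-built dataframes in place of real `table_df` objects. The extra column names `f_value` and `m_value` are hypothetical and only stand in for the statistics columns that follow the first five.

```python
import pandas as pd

# Five display columns followed by two hypothetical statistics columns.
columns = ["Characteristic", "Statistic", "F", "M", "All", "f_value", "m_value"]

gender_rows = pd.DataFrame(
    [["Gender", "n", "3", "3", "6", 3.0, 3.0]], columns=columns
)
age_rows = pd.DataFrame(
    [["Age (years)", "Mean", "33.0", "35.3", "34.2", 33.0, 35.3]], columns=columns
)

# pd.concat keeps each frame's own index unless ignore_index=True is passed.
combined = pd.concat([gender_rows, age_rows], ignore_index=True)

# Positional slice: all rows, first five columns only.
display_df = combined.iloc[:, :5]
print(display_df)
```

Because the slice is positional, it relies on the display columns coming first in every frame that is concatenated.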


Like any dataframe, you can export it to CSV:

.. code-block:: python

    path = "my/path/to/csv/folder/table_df.csv"
    table_df.to_csv(path_or_buf=path, encoding="utf-8", index=False, sep="|")
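
A file written with a pipe separator has to be read back with the same separator. The sketch below is an illustration on made-up data that round-trips a small table through an in-memory buffer rather than a file. Reading with `dtype=str` and `keep_default_na=False` is one way to keep the formatted cells, including empty ones, as text.

```python
import io

import pandas as pd

# Small stand-in for a formatted table_df; the values are made up.
table_df = pd.DataFrame(
    {
        "Characteristic": ["Age (years)", ""],
        "Statistic": ["n", "18-34"],
        "F": ["1175", "70"],
        "M": ["1000", "64"],
        "All": ["2175", "134"],
    }
)

# Write to an in-memory buffer instead of a path on disk.
buffer = io.StringIO()
table_df.to_csv(buffer, index=False, sep="|")

# Read back with the same separator; keep every cell as text and
# do not turn empty cells into NaN.
buffer.seek(0)
restored = pd.read_csv(buffer, sep="|", dtype=str, keep_default_na=False)
print(restored)
```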


Details
-------

Assumptions
+++++++++++

The default table assumes that:

* gender is recorded for every observation.
* gender is coded as "M" or "F", or with the `MALE` and `FEMALE` constants from edc.constants.
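
These assumptions can be checked before building any table. The following is a minimal sketch, not part of edc-analytics, that flags rows with a missing or unexpected gender value. The column name `gender` and the values assigned to `MALE` and `FEMALE` are assumptions.

```python
import pandas as pd

# Assumed codes; in an EDC project these would come from the constants module.
MALE, FEMALE = "M", "F"

# Made-up data with one missing and one miscoded value.
df = pd.DataFrame({"gender": ["F", "M", None, "f"]})

# isin() is False for missing values and for anything outside the list.
valid = df["gender"].isin([MALE, FEMALE])
problems = df.loc[~valid]

print(f"{len(problems)} row(s) violate the gender assumptions")
print(problems)
```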


A `Table` presents data by characteristic, one per row (such as age, bp, glucose, ...).
It is a dataframe where the first columns are formatted for presentation and the
remaining columns are the descriptive statistics used to render the formatted columns
(mean, median, sd, range, IQR, proportions).
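
To make the split between formatted columns and statistics columns concrete, here is a sketch in plain pandas. It is not the library's implementation. It computes n, mean and sd of a made-up age column per gender and overall, then renders a "mean (sd)" string for each of "F", "M" and "All". The helper name `describe` is hypothetical.

```python
import pandas as pd

# Made-up observations.
df = pd.DataFrame(
    {
        "gender": ["F", "F", "M", "M", "F", "M"],
        "age_in_years": [22, 41, 30, 57, 36, 19],
    }
)


def describe(values: pd.Series) -> dict:
    """Return the raw statistics behind one formatted cell."""
    return {"n": int(values.count()), "mean": values.mean(), "sd": values.std()}


# One set of statistics per gender, plus one for all observations.
stats = {
    gender: describe(values)
    for gender, values in df.groupby("gender")["age_in_years"]
}
stats["All"] = describe(df["age_in_years"])

# The formatted cell is rendered from the raw statistics.
formatted = {key: f"{s['mean']:.1f} ({s['sd']:.1f})" for key, s in stats.items()}
print(formatted)
```

Keeping the raw statistics next to the formatted strings means the table can be re-rendered or checked without recomputing from the source data.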

If a table is stratified by gender, then the formatted rows for "Age" might look like this:



.. code-block:: text

    | Characteristic | Statistic | F      | M     | All  |
    ======================================================
    | Age (years)    | n         |  1175  | 1000  | 2175 |
    |                | 18-34     |    70  |   64  |  134 |
    |                | ...etc    |        |       |      |
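
Counts by age band and gender like those above can be reproduced with `pd.cut` and `pd.crosstab`. This is a sketch on made-up data under assumed band edges, not the code edc-analytics uses.

```python
import pandas as pd

# Made-up observations.
df = pd.DataFrame(
    {
        "gender": ["F", "F", "M", "M", "F", "M"],
        "age_in_years": [22, 41, 30, 57, 36, 19],
    }
)

# Assumed age bands; right=False makes each band include its lower edge
# and exclude its upper edge, so 18-34 covers ages 18 up to 34.
bins = [18, 35, 50, 65]
labels = ["18-34", "35-49", "50-64"]
age_band = pd.cut(df["age_in_years"], bins=bins, labels=labels, right=False)

# Cross-tabulate band by gender; margins=True adds the "All" row and column.
counts = pd.crosstab(
    age_band.astype(str), df["gender"], margins=True, margins_name="All"
)
print(counts)
```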



A `Table` contains a collection of `RowDefinitions`.


Stratification
++++++++++++++



Putting together a table
------------------------

RowDefinitions
++++++++++++++

`RowDefinitions` is a collection of `RowDefinition` objects.

To build a table, subclass `Table` and override the `build_defs` method. For example:



.. |pypi| image:: https://img.shields.io/pypi/v/edc-analytics.svg
   :target: https://pypi.python.org/pypi/edc-analytics

.. |downloads| image:: https://pepy.tech/badge/edc-analytics
   :target: https://pepy.tech/project/edc-analytics

.. |clinicedc| image:: https://img.shields.io/badge/framework-Clinic_EDC-green
   :alt: Made with clinicedc
   :target: https://github.com/clinicedc
