|pypi| |actions| |codecov| |downloads|
edc-pdutils
+++++++++++
Use pandas with the Edc
Using the management command to export to CSV and STATA
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
The ``export_models`` management command requires you to log in with an account that has export permissions.
The basic command requires an app_label (``-a``) and a path to the export folder (``-p``).

By default, the export format is CSV, delimited with the pipe character, ``|``.
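
Because the delimiter is ``|`` rather than a comma, whatever reads the exported file has to be told so. A minimal sketch using the standard library ``csv`` module follows; the column names and values are made up for illustration and are not taken from a real export.

.. code-block:: python

    import csv
    import io

    # Hypothetical content of an exported file; a real export has other columns.
    sample = "subject_identifier|gender|visit_code\n092-0001|F|1000\n092-0002|M|1000\n"


    def read_pipe_delimited(handle):
        """Return the rows of a pipe-delimited CSV file as a list of dicts."""
        return list(csv.DictReader(handle, delimiter="|"))


    # io.StringIO stands in for open("/path/to/exported_file.csv", newline="")
    rows = read_pipe_delimited(io.StringIO(sample))
    print(rows[0]["subject_identifier"])

With pandas, the equivalent is ``pandas.read_csv(path, sep="|")``.
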
Export one or more modules
==========================
.. code-block:: bash

    python manage.py export_models -a ambition_subject -p /ambition/export

The ``-a`` option accepts more than one app_label, separated by commas:

.. code-block:: bash

    python manage.py export_models -a ambition_subject,ambition_prn,ambition_ae -p /ambition/export

Export data in CSV format or STATA format
==========================================
To export as CSV where the delimiter is ``|``:

.. code-block:: bash

    python manage.py export_models -a ambition_subject -p /ambition/export

To export as STATA ``dta``, use option ``-f stata``:

.. code-block:: bash

    python manage.py export_models -a ambition_subject -p /ambition/export -f stata

Export encrypted data
=====================
To export encrypted fields, include option ``--decrypt``:

.. code-block:: bash

    python manage.py export_models -a ambition_subject -p /ambition/export --decrypt

**Note:** If using the ``--decrypt`` option, the user account will need ``PII_EXPORT`` permissions.

Export with a simple file name
==============================
To export using a simpler filename that drops the app_label prefix of the table name and does not include a datestamp suffix, add option ``--use_simple_filename``:

.. code-block:: bash

    python manage.py export_models -a ambition_subject -p /ambition/export --use_simple_filename

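
The effect on a filename can be pictured with a small sketch. The helper ``simple_filename`` below is hypothetical and is not part of edc-pdutils; it only illustrates what dropping the app_label prefix and leaving out the datestamp means for a table name such as ``ambition_subject_subjectvisit``.

.. code-block:: python

    def simple_filename(table_name: str, app_label: str, extension: str = "csv") -> str:
        """Illustration only: strip the app_label prefix and add no datestamp."""
        # "ambition_subject_subjectvisit" -> "subjectvisit"
        name = table_name.removeprefix(f"{app_label}_")
        return f"{name}.{extension}"


    print(simple_filename("ambition_subject_subjectvisit", "ambition_subject"))

The actual naming is decided by the exporter classes; treat the sketch as a reading aid rather than a description of their code.
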
Export for a country only
=========================
To limit the export to a single country, add option ``--country``:

.. code-block:: bash

    python manage.py export_models -a ambition_subject -p /ambition/export --country="uganda"

_________________________________
Export manually
+++++++++++++++
To export CRF data, for example:

.. code-block:: python

    import sys

    from edc_pdutils.df_exporters import CsvCrfTablesExporter
    from edc_pdutils.df_handlers import CrfDfHandler

    app_label = 'ambition_subject'
    csv_path = '/Users/erikvw/Documents/ambition/export/'
    date_format = '%Y-%m-%d'
    sep = '|'
    exclude_history_tables = True

    class MyDfHandler(CrfDfHandler):
        # the visit table for this app_label
        visit_tbl = f'{app_label}_subjectvisit'
        # columns to leave out of the export
        exclude_columns = ['form_as_json', 'survival_status', 'last_alive_date',
                           'screening_age_in_years', 'registration_datetime',
                           'subject_type']

    class MyCsvCrfTablesExporter(CsvCrfTablesExporter):
        visit_column = 'subject_visit_id'
        datetime_fields = ['randomization_datetime']
        df_handler_cls = MyDfHandler
        app_label = app_label
        export_folder = csv_path

    sys.stdout.write('\n')
    exporter = MyCsvCrfTablesExporter(
        export_folder=csv_path,
        exclude_history_tables=exclude_history_tables
    )
    exporter.to_csv(date_format=date_format, delimiter=sep)

To export inline data for any CRF configured with an inline (this example reuses ``csv_path``, ``date_format`` and ``sep`` from the example above):

.. code-block:: python

    import sys

    # assumed to be importable from the same package as CsvCrfTablesExporter
    from edc_pdutils.df_exporters import CsvCrfInlineTablesExporter
    from edc_pdutils.df_handlers import CrfDfHandler

    class MyDfHandler(CrfDfHandler):
        visit_tbl = 'ambition_subject_subjectvisit'
        exclude_columns = ['form_as_json', 'survival_status', 'last_alive_date',
                           'screening_age_in_years', 'registration_datetime',
                           'subject_type']

    class MyCsvCrfInlineTablesExporter(CsvCrfInlineTablesExporter):
        visit_columns = ['subject_visit_id']
        df_handler_cls = MyDfHandler
        app_label = 'ambition_subject'
        export_folder = csv_path
        # inline tables to skip
        exclude_inline_tables = [
            'ambition_subject_radiology_abnormal_results_reason',
            'ambition_subject_radiology_cxr_type']

    sys.stdout.write('\n')
    exporter = MyCsvCrfInlineTablesExporter()
    exporter.to_csv(date_format=date_format, delimiter=sep)

Using ``model_to_dataframe``
++++++++++++++++++++++++++++
.. code-block:: python

    from edc_pdutils.model_to_dataframe import ModelToDataframe
    from edc_pdutils.utils import get_model_names
    from edc_pdutils.df_exporters.csv_exporter import CsvExporter

    app_label = 'ambition_subject'
    csv_path = '/Users/erikvw/Documents/ambition/export/'
    date_format = '%Y-%m-%d'
    sep = '|'

    # export each model in the app to its own CSV file
    for model_name in get_model_names(
            app_label=app_label,
            # with_columns=with_columns,
            # without_columns=without_columns,
        ):
        m = ModelToDataframe(model=model_name)
        exporter = CsvExporter(
            data_label=model_name,
            date_format=date_format,
            delimiter=sep,
            export_folder=csv_path,
        )
        exported = exporter.to_csv(dataframe=m.dataframe)

Settings
========
``EXPORT_FILENAME_TIMESTAMP_FORMAT``: an empty string or a valid ``strftime`` format (Default: ``%Y%m%d%H%M%S``)

By default, a timestamp of format ``%Y%m%d%H%M%S`` is added as a suffix to CSV export filenames.

Set ``EXPORT_FILENAME_TIMESTAMP_FORMAT`` to any valid ``strftime`` format to change the suffix, or to an empty string, ``""``, to omit the suffix.

For example:

.. code-block:: bash

    # default
    registered_subject_20190203112555.csv

    # EXPORT_FILENAME_TIMESTAMP_FORMAT = "%Y%m%d"
    registered_subject_20190203.csv

    # EXPORT_FILENAME_TIMESTAMP_FORMAT = ""
    registered_subject.csv

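
The three filenames above follow directly from how ``strftime`` renders a format string. The sketch below reproduces them with the standard library only; ``export_filename`` is a hypothetical helper written for this illustration and is not a function of edc-pdutils.

.. code-block:: python

    from datetime import datetime

    # default format named in this section
    DEFAULT_TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


    def export_filename(label, timestamp_format=DEFAULT_TIMESTAMP_FORMAT, now=None):
        """Illustration only: build "<label>[_<timestamp>].csv"."""
        now = now or datetime.now()
        # an empty format string means no suffix at all
        suffix = now.strftime(timestamp_format) if timestamp_format else ""
        return f"{label}_{suffix}.csv" if suffix else f"{label}.csv"


    when = datetime(2019, 2, 3, 11, 25, 55)
    print(export_filename("registered_subject", now=when))
    print(export_filename("registered_subject", "%Y%m%d", now=when))
    print(export_filename("registered_subject", "", now=when))
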
.. |pypi| image:: https://img.shields.io/pypi/v/edc-pdutils.svg
    :target: https://pypi.python.org/pypi/edc-pdutils

.. |actions| image:: https://github.com/clinicedc/edc-pdutils/actions/workflows/build.yml/badge.svg
    :target: https://github.com/clinicedc/edc-pdutils/actions/workflows/build.yml

.. |codecov| image:: https://codecov.io/gh/clinicedc/edc-pdutils/branch/develop/graph/badge.svg
    :target: https://codecov.io/gh/clinicedc/edc-pdutils

.. |downloads| image:: https://pepy.tech/badge/edc-pdutils
    :target: https://pepy.tech/project/edc-pdutils