treets 1.0.4 (PyPI package metadata)

| Field | Value |
| --- | --- |
| Name | treets |
| Version | 1.0.4 |
| Home page | https://github.com/FleischerResearchLab/treets/tree/master/ |
| Summary | This library provides functions to analyze food logging data. |
| Upload time | 2023-03-14 22:46:46 |
| Author | Qiwen Zhang, Jason Fleischer |
| Requires Python | >=3.6 |
| License | Apache Software License 2.0 |
| Keywords | circadian rhythm |
| Requirements | No requirements were recorded. |

# TREETS
> Time Restricted Eating ExperimenTS.


## Install

`pip install treets`

## Example: a quick data analysis of phased studies

```python
import treets.core as treets
import pandas as pd
```

Take a brief look at the food logging dataset and the reference information sheet.

```python
treets.file_loader('data/col_test_data/yrt*').head(2)
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Unnamed: 0</th>
      <th>original_logtime</th>
      <th>desc_text</th>
      <th>food_type</th>
      <th>PID</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>0</td>
      <td>2021-05-12 02:30:00 +0000</td>
      <td>Milk</td>
      <td>b</td>
      <td>yrt1999</td>
    </tr>
    <tr>
      <th>1</th>
      <td>1</td>
      <td>2021-05-12 02:45:00 +0000</td>
      <td>Some Medication</td>
      <td>m</td>
      <td>yrt1999</td>
    </tr>
  </tbody>
</table>
</div>



```python
pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx').head(2)
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>mCC_ID</th>
      <th>Participant_Study_ID</th>
      <th>Study Phase</th>
      <th>Intervention group (TRE or HABIT)</th>
      <th>Start_Day</th>
      <th>End_day</th>
      <th>Eating_Window_Start</th>
      <th>Eating_Window_End</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>yrt1999</td>
      <td>2</td>
      <td>S-REM</td>
      <td>TRE</td>
      <td>2021-05-12</td>
      <td>2021-05-14</td>
      <td>00:00:00</td>
      <td>23:59:00</td>
    </tr>
    <tr>
      <th>1</th>
      <td>yrt1999</td>
      <td>2</td>
      <td>T3-INT</td>
      <td>TRE</td>
      <td>2021-05-15</td>
      <td>2021-05-18</td>
      <td>08:00:00</td>
      <td>18:00:00</td>
    </tr>
  </tbody>
</table>
</div>
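
The path passed to `file_loader` above contains a wildcard (`yrt*`), which suggests that several participant files are combined into one table. The sketch below shows one way such a loader could work using only the standard library. `load_matching_csvs` is a hypothetical helper, not part of treets, and the two small files it reads are invented for the demonstration.

```python
import csv
import glob
import os
import tempfile


def load_matching_csvs(pattern):
    """Read every CSV file matching a glob pattern into one list of row dicts."""
    rows = []
    for path in sorted(glob.glob(pattern)):  # sorted for a deterministic order
        with open(path, newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


# Write two tiny, made-up participant files into a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    for pid, item in [("yrt1999", "Milk"), ("yrt2000", "Tea")]:
        with open(os.path.join(folder, pid + ".csv"), "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["desc_text", "PID"])
            writer.writerow([item, pid])
    combined = load_matching_csvs(os.path.join(folder, "yrt*"))

print([row["PID"] for row in combined])  # ['yrt1999', 'yrt2000']
```

The real `file_loader` returns a pandas DataFrame; the list of dictionaries here only illustrates the pattern-matching step.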



Call the `summarize_data_with_experiment_phases()` function to build a table with the summary statistics we want.

```python
df = treets.summarize_data_with_experiment_phases(
    treets.file_loader('data/col_test_data/yrt*'),
    pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx'),
)
```

    Participant yrt1999 didn't log any food items in the following day(s):
    2021-05-18
    Participant yrt2000 didn't log any food items in the following day(s):
    2021-05-12
    2021-05-13
    2021-05-14
    2021-05-15
    2021-05-16
    2021-05-17
    2021-05-18
    Participant yrt1999 have bad logging day(s) in the following day(s):
    2021-05-12
    2021-05-15
    Participant yrt1999 have bad window day(s) in the following day(s):
    2021-05-15
    2021-05-17
    Participant yrt1999 have non adherent day(s) in the following day(s):
    2021-05-12
    2021-05-15
    2021-05-17
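
The first group of messages lists days inside a study phase on which a participant logged nothing. A minimal sketch of that kind of check, assuming a phase is an inclusive date range, is shown below. `days_without_logs` is a hypothetical helper rather than a treets function, and the set of logged dates is made up to mirror the second phase of participant yrt1999.

```python
from datetime import date, timedelta


def days_without_logs(start_day, end_day, logged_days):
    """Return every date in [start_day, end_day] that has no log entry."""
    missing = []
    current = start_day
    while current <= end_day:
        if current not in logged_days:
            missing.append(current)
        current += timedelta(days=1)
    return missing


# Hypothetical log dates for a phase running from 2021-05-15 to 2021-05-18.
logged = {date(2021, 5, 15), date(2021, 5, 16), date(2021, 5, 17)}
missing = days_without_logs(date(2021, 5, 15), date(2021, 5, 18), logged)
print([d.isoformat() for d in missing])  # ['2021-05-18']
```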


```python
df
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>mCC_ID</th>
      <th>Participant_Study_ID</th>
      <th>Study Phase</th>
      <th>Intervention group (TRE or HABIT)</th>
      <th>Start_Day</th>
      <th>End_day</th>
      <th>Eating_Window_Start</th>
      <th>Eating_Window_End</th>
      <th>phase_duration</th>
      <th>caloric_entries_num</th>
      <th>...</th>
      <th>logging_day_counts</th>
      <th>%_logging_day_counts</th>
      <th>good_logging_days</th>
      <th>%_good_logging_days</th>
      <th>good_window_days</th>
      <th>%_good_window_days</th>
      <th>outside_window_days</th>
      <th>%_outside_window_days</th>
      <th>adherent_days</th>
      <th>%_adherent_days</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>yrt1999</td>
      <td>2</td>
      <td>S-REM</td>
      <td>TRE</td>
      <td>2021-05-12</td>
      <td>2021-05-14</td>
      <td>00:00:00</td>
      <td>23:59:00</td>
      <td>3 days</td>
      <td>7</td>
      <td>...</td>
      <td>3</td>
      <td>100.0%</td>
      <td>2.0</td>
      <td>66.67%</td>
      <td>3.0</td>
      <td>100.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
      <td>2.0</td>
      <td>66.67%</td>
    </tr>
    <tr>
      <th>1</th>
      <td>yrt1999</td>
      <td>2</td>
      <td>T3-INT</td>
      <td>TRE</td>
      <td>2021-05-15</td>
      <td>2021-05-18</td>
      <td>08:00:00</td>
      <td>18:00:00</td>
      <td>4 days</td>
      <td>8</td>
      <td>...</td>
      <td>3</td>
      <td>75.0%</td>
      <td>2.0</td>
      <td>50.0%</td>
      <td>1.0</td>
      <td>25.0%</td>
      <td>2.0</td>
      <td>50.0%</td>
      <td>1.0</td>
      <td>25.0%</td>
    </tr>
    <tr>
      <th>2</th>
      <td>yrt2000</td>
      <td>3</td>
      <td>T3-INT</td>
      <td>TRE</td>
      <td>2021-05-12</td>
      <td>2021-05-14</td>
      <td>08:00:00</td>
      <td>16:00:00</td>
      <td>3 days</td>
      <td>0</td>
      <td>...</td>
      <td>0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
    </tr>
    <tr>
      <th>3</th>
      <td>yrt2000</td>
      <td>3</td>
      <td>T3-INT</td>
      <td>TRE</td>
      <td>2021-05-15</td>
      <td>2021-05-18</td>
      <td>08:00:00</td>
      <td>16:00:00</td>
      <td>4 days</td>
      <td>0</td>
      <td>...</td>
      <td>0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
      <td>0.0</td>
      <td>0.0%</td>
    </tr>
    <tr>
      <th>4</th>
      <td>yrt2001</td>
      <td>4</td>
      <td>T12-A</td>
      <td>TRE</td>
      <td>NaT</td>
      <td>NaT</td>
      <td>NaN</td>
      <td>NaN</td>
      <td>NaT</td>
      <td>0</td>
      <td>...</td>
      <td>0</td>
      <td>nan%</td>
      <td>NaN</td>
      <td>NaN</td>
      <td>NaN</td>
      <td>NaN</td>
      <td>NaN</td>
      <td>NaN</td>
      <td>NaN</td>
      <td>NaN</td>
    </tr>
  </tbody>
</table>
<p>5 rows × 32 columns</p>
</div>



Look at the statistics in the first row of the resulting dataset.

```python
df.iloc[0]
```




    mCC_ID                                           yrt1999
    Participant_Study_ID                                   2
    Study Phase                                        S-REM
    Intervention group (TRE or HABIT)                    TRE
    Start_Day                            2021-05-12 00:00:00
    End_day                              2021-05-14 00:00:00
    Eating_Window_Start                             00:00:00
    Eating_Window_End                               23:59:00
    phase_duration                           3 days 00:00:00
    caloric_entries_num                                    7
    medication_num                                         0
    water_num                                              0
    first_cal_avg                                   5.916667
    first_cal_std                                   2.240722
    last_cal_avg                                   19.666667
    last_cal_std                                   12.933323
    mean_daily_eating_window                           13.75
    std_daily_eating_window                        11.986972
    earliest_entry                                       4.5
    2.5%                                              4.5375
    97.5%                                            27.5625
    duration mid 95%                                  23.025
    logging_day_counts                                     3
    %_logging_day_counts                              100.0%
    good_logging_days                                    2.0
    %_good_logging_days                               66.67%
    good_window_days                                     3.0
    %_good_window_days                                100.0%
    outside_window_days                                  0.0
    %_outside_window_days                               0.0%
    adherent_days                                        2.0
    %_adherent_days                                   66.67%
    Name: 0, dtype: object
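
Several of the values in this row appear to be derived from others. The short check below reproduces three of them from the printed numbers. The relationships are inferred from the output shown here, not taken from the library source, so treat them as assumptions.

```python
# Values copied from the row printed above.
phase_days = 3
good_logging_days = 2.0
first_cal_avg = 5.916667
last_cal_avg = 19.666667
pct_2_5, pct_97_5 = 4.5375, 27.5625


def as_percent(count, total):
    """Format count / total the way the '%_...' columns are displayed."""
    return f"{round(100 * count / total, 2)}%"


# Assumption: percentages are taken relative to the phase duration in days.
print(as_percent(good_logging_days, phase_days))  # 66.67%

# Assumption: the mean eating window is last_cal_avg minus first_cal_avg.
print(round(last_cal_avg - first_cal_avg, 2))  # 13.75

# Assumption: 'duration mid 95%' is the 97.5% value minus the 2.5% value.
print(round(pct_97_5 - pct_2_5, 3))  # 23.025
```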



## Example: a quick data analysis of non-phased studies

Take a look at the original dataset.

```python
df = treets.file_loader('data/test_food_details.csv')
df.head(2)
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Unnamed: 0</th>
      <th>ID</th>
      <th>unique_code</th>
      <th>research_info_id</th>
      <th>desc_text</th>
      <th>food_type</th>
      <th>original_logtime</th>
      <th>foodimage_file_name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1340147</td>
      <td>7572733</td>
      <td>alqt14018795225</td>
      <td>150</td>
      <td>Water</td>
      <td>w</td>
      <td>2017-12-08 17:30:00+00:00</td>
      <td>NaN</td>
    </tr>
    <tr>
      <th>1</th>
      <td>1340148</td>
      <td>411111</td>
      <td>alqt14018795225</td>
      <td>150</td>
      <td>Coffee White</td>
      <td>b</td>
      <td>2017-12-09 00:01:00+00:00</td>
      <td>NaN</td>
    </tr>
  </tbody>
</table>
</div>



Preprocess the data to create features that further analysis might need, such as the float time and the week count since the first week.

```python
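# The last argument (4) appears to be the hour at which a new day starts; this is inferred from the output below.
df = treets.load_food_data(df, 'unique_code', 'original_logtime', 4)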
df.head(2)
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Unnamed: 0</th>
      <th>ID</th>
      <th>unique_code</th>
      <th>research_info_id</th>
      <th>desc_text</th>
      <th>food_type</th>
      <th>original_logtime</th>
      <th>date</th>
      <th>float_time</th>
      <th>time</th>
      <th>week_from_start</th>
      <th>year</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1340147</td>
      <td>7572733</td>
      <td>alqt14018795225</td>
      <td>150</td>
      <td>Water</td>
      <td>w</td>
      <td>2017-12-08 17:30:00+00:00</td>
      <td>2017-12-08</td>
      <td>17.500000</td>
      <td>17:30:00</td>
      <td>1</td>
      <td>2017</td>
    </tr>
    <tr>
      <th>1</th>
      <td>1340148</td>
      <td>411111</td>
      <td>alqt14018795225</td>
      <td>150</td>
      <td>Coffee White</td>
      <td>b</td>
      <td>2017-12-09 00:01:00+00:00</td>
      <td>2017-12-08</td>
      <td>24.016667</td>
      <td>00:01:00</td>
      <td>1</td>
      <td>2017</td>
    </tr>
  </tbody>
</table>
</div>
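
In the output above, the entry logged at 00:01 on 2017-12-09 is assigned the date 2017-12-08 and a `float_time` of 24.016667. That is consistent with a day that starts at 4 a.m., where earlier entries count toward the previous day. The sketch below reproduces this behaviour under that assumption. `shifted_date_and_float_time` and its `day_start_hour` parameter are hypothetical names, not the library's API.

```python
from datetime import datetime, timedelta


def shifted_date_and_float_time(timestamp, day_start_hour=4):
    """Return (date, hours) where times before day_start_hour belong to the previous day."""
    hours = timestamp.hour + timestamp.minute / 60 + timestamp.second / 3600
    if hours < day_start_hour:
        # Early-morning entries are treated as a late part of the previous day.
        return (timestamp - timedelta(days=1)).date(), hours + 24
    return timestamp.date(), hours


water = shifted_date_and_float_time(datetime(2017, 12, 8, 17, 30))
coffee = shifted_date_and_float_time(datetime(2017, 12, 9, 0, 1))
print(water[0].isoformat(), round(water[1], 6))    # 2017-12-08 17.5
print(coffee[0].isoformat(), round(coffee[1], 6))  # 2017-12-08 24.016667
```

Shifting the day boundary keeps a late-night snack in the same "day" as the dinner before it, which matters when computing the first and last meal of each day.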



Call the `summarize_data()` function to build a table with the summary statistics we want.

```python
df = treets.summarize_data(df, 'unique_code', 'float_time', 'date')
df.head(2)
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>unique_code</th>
      <th>num_days</th>
      <th>num_total_items</th>
      <th>num_f_n_b</th>
      <th>num_medications</th>
      <th>num_water</th>
      <th>first_cal_avg</th>
      <th>first_cal_std</th>
      <th>last_cal_avg</th>
      <th>last_cal_std</th>
      <th>eating_win_avg</th>
      <th>eating_win_std</th>
      <th>good_logging_count</th>
      <th>first_cal variation (90%-10%)</th>
      <th>last_cal variation (90%-10%)</th>
      <th>2.5%</th>
      <th>95%</th>
      <th>duration mid 95%</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>alqt1148284857</td>
      <td>13</td>
      <td>149</td>
      <td>96</td>
      <td>19</td>
      <td>34</td>
      <td>7.821795</td>
      <td>6.710717</td>
      <td>23.485897</td>
      <td>4.869082</td>
      <td>15.664103</td>
      <td>8.231201</td>
      <td>146</td>
      <td>2.966667</td>
      <td>9.666667</td>
      <td>4.535000</td>
      <td>26.813333</td>
      <td>22.636667</td>
    </tr>
    <tr>
      <th>1</th>
      <td>alqt14018795225</td>
      <td>64</td>
      <td>488</td>
      <td>484</td>
      <td>3</td>
      <td>1</td>
      <td>7.525781</td>
      <td>5.434563</td>
      <td>25.858594</td>
      <td>3.374839</td>
      <td>18.332813</td>
      <td>6.603913</td>
      <td>484</td>
      <td>13.450000</td>
      <td>3.100000</td>
      <td>4.183333</td>
      <td>27.438333</td>
      <td>23.416667</td>
    </tr>
  </tbody>
</table>
</div>



Look at the statistics in the first row of the resulting dataset.

```python
df.iloc[0]
```




    unique_code                      alqt1148284857
    num_days                                     13
    num_total_items                             149
    num_f_n_b                                    96
    num_medications                              19
    num_water                                    34
    first_cal_avg                          7.821795
    first_cal_std                          6.710717
    last_cal_avg                          23.485897
    last_cal_std                           4.869082
    eating_win_avg                        15.664103
    eating_win_std                         8.231201
    good_logging_count                          146
    first_cal variation (90%-10%)          2.966667
    last_cal variation (90%-10%)           9.666667
    2.5%                                      4.535
    95%                                   26.813333
    duration mid 95%                      22.636667
    Name: 0, dtype: object
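
The `2.5%` and `duration mid 95%` entries describe the span that covers the middle 95% of logging times. As a rough illustration, the sketch below computes percentiles with linear interpolation between sorted values and takes the difference between the 97.5th and 2.5th percentiles. The interpolation rule and the sample float times are assumptions chosen for the example; the library may compute these values differently.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Made-up float times (hours since midnight; values above 24 fall after midnight).
float_times = [4.5, 6.0, 8.0, 12.5, 19.0, 21.0, 28.0]

low = percentile(float_times, 2.5)
high = percentile(float_times, 97.5)
mid_95_duration = high - low
print(round(low, 3), round(high, 3), round(mid_95_duration, 3))
```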



## Clean text in food loggings

```python
# import the dataset
df = treets.file_loader('data/col_test_data/yrt*')
df.head(3)
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Unnamed: 0</th>
      <th>original_logtime</th>
      <th>desc_text</th>
      <th>food_type</th>
      <th>PID</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>0</td>
      <td>2021-05-12 02:30:00 +0000</td>
      <td>Milk</td>
      <td>b</td>
      <td>yrt1999</td>
    </tr>
    <tr>
      <th>1</th>
      <td>1</td>
      <td>2021-05-12 02:45:00 +0000</td>
      <td>Some Medication</td>
      <td>m</td>
      <td>yrt1999</td>
    </tr>
    <tr>
      <th>2</th>
      <td>2</td>
      <td>2021-05-12 04:45:00 +0000</td>
      <td>bacon egg</td>
      <td>f</td>
      <td>yrt1999</td>
    </tr>
  </tbody>
</table>
</div>



```python
treets.clean_loggings(df, 'desc_text', 'PID').head(3)
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>PID</th>
      <th>desc_text</th>
      <th>cleaned</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>yrt1999</td>
      <td>Milk</td>
      <td>[milk]</td>
    </tr>
    <tr>
      <th>1</th>
      <td>yrt1999</td>
      <td>Some Medication</td>
      <td>[medication]</td>
    </tr>
    <tr>
      <th>2</th>
      <td>yrt1999</td>
      <td>bacon egg</td>
      <td>[bacon, egg]</td>
    </tr>
  </tbody>
</table>
</div>



We can see that words are lowercased, modifiers are removed (second row), and entries are split into individual items (third row).
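
To make the three cleaning steps concrete, here is a small standard-library sketch that reproduces the `cleaned` column for the rows shown. It is not how `clean_loggings` is implemented: `clean_description` and the `MODIFIERS` word list are hypothetical, and the real function may rely on a proper NLP toolkit.

```python
import re

# Hypothetical list of modifier words to drop; the real library may use a different list.
MODIFIERS = {"some", "a", "an", "the", "of"}


def clean_description(text):
    """Lowercase the text, split it into words, and drop modifier words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in MODIFIERS]


for description in ["Milk", "Some Medication", "bacon egg"]:
    print(description, "->", clean_description(description))
```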

## Visualizations

```python
# import the dataset
df = treets.file_loader('data/test_food_details.csv')
df.head(2)
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Unnamed: 0</th>
      <th>ID</th>
      <th>unique_code</th>
      <th>research_info_id</th>
      <th>desc_text</th>
      <th>food_type</th>
      <th>original_logtime</th>
      <th>foodimage_file_name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1340147</td>
      <td>7572733</td>
      <td>alqt14018795225</td>
      <td>150</td>
      <td>Water</td>
      <td>w</td>
      <td>2017-12-08 17:30:00+00:00</td>
      <td>NaN</td>
    </tr>
    <tr>
      <th>1</th>
      <td>1340148</td>
      <td>411111</td>
      <td>alqt14018795225</td>
      <td>150</td>
      <td>Coffee White</td>
      <td>b</td>
      <td>2017-12-09 00:01:00+00:00</td>
      <td>NaN</td>
    </tr>
  </tbody>
</table>
</div>



Make a scatter plot of each person's breakfast time.

```python
# create required features for function first_cal_mean_with_error_bar()
df['original_logtime'] = pd.to_datetime(df['original_logtime'])
df['local_time'] = treets.find_float_time(df, 'original_logtime')
df['date'] = treets.find_date(df, 'original_logtime')

# call the function
treets.first_cal_mean_with_error_bar(df, 'unique_code', 'date', 'local_time')
```


![png](https://raw.githubusercontent.com/FleischerResearchLab/treets/master/docs/images/output_28_0.png)
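
The function name suggests that the plot shows, for each person, the mean time of the first caloric entry of the day with an error bar around it. The sketch below computes such a mean and spread with the standard library. The entries are invented, are assumed to contain caloric items only, and the use of the sample standard deviation is an assumption about how the error bars are defined.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up (person, date, float_time) entries, assumed to be caloric items only.
entries = [
    ("p1", "2021-05-12", 7.5), ("p1", "2021-05-12", 12.0),
    ("p1", "2021-05-13", 8.5), ("p1", "2021-05-13", 19.0),
    ("p2", "2021-05-12", 6.0), ("p2", "2021-05-13", 10.0),
]

# Step 1: keep the earliest entry for each person and day.
first_by_day = {}
for person, day, float_time in entries:
    key = (person, day)
    first_by_day[key] = min(float_time, first_by_day.get(key, float_time))

# Step 2: collect the daily first-entry times per person.
per_person = defaultdict(list)
for (person, _day), float_time in first_by_day.items():
    per_person[person].append(float_time)

# Step 3: mean and sample standard deviation per person.
summary = {person: (mean(times), stdev(times)) for person, times in per_person.items()}
for person, (avg, spread) in summary.items():
    print(person, round(avg, 2), round(spread, 2))
```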


Use swarmplot to visualize each person's eating time distribution.

```python
treets.swarmplot(df, 50, 'unique_code', 'date', 'local_time')
```


![png](https://raw.githubusercontent.com/FleischerResearchLab/treets/master/docs/images/output_30_0.png)




            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/FleischerResearchLab/treets/tree/master/",
    "name": "treets",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "circadian ryhthm",
    "author": "Qiwen Zhang, Jason Fleischer",
    "author_email": "owenzhang1999@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/e3/f9/e32d6b0f18c5f96af95bf15376c43dc4aa782f192223812daf587fbf0f48/treets-1.0.4.tar.gz",
    "platform": null,
    "description": "# TREETS\n> Time Restricted Eating ExperimenTS.\n\n\n## Install\n\n`pip install treets`\n\n## Example for a quick data analysis on phased studies.\n\n```python\nimport treets.core as treets\nimport pandas as pd\n```\n\nTake a brief look on the food logging dataset and the reference information sheet\n\n```python\ntreets.file_loader('data/col_test_data/yrt*').head(2)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Unnamed: 0</th>\n      <th>original_logtime</th>\n      <th>desc_text</th>\n      <th>food_type</th>\n      <th>PID</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>0</td>\n      <td>2021-05-12 02:30:00 +0000</td>\n      <td>Milk</td>\n      <td>b</td>\n      <td>yrt1999</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1</td>\n      <td>2021-05-12 02:45:00 +0000</td>\n      <td>Some Medication</td>\n      <td>m</td>\n      <td>yrt1999</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\n```python\npd.read_excel('data/col_test_data/toy_data_17May2021.xlsx').head(2)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>mCC_ID</th>\n      <th>Participant_Study_ID</th>\n      <th>Study Phase</th>\n      <th>Intervention group (TRE or HABIT)</th>\n      <th>Start_Day</th>\n      <th>End_day</th>\n      <th>Eating_Window_Start</th>\n      
<th>Eating_Window_End</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>yrt1999</td>\n      <td>2</td>\n      <td>S-REM</td>\n      <td>TRE</td>\n      <td>2021-05-12</td>\n      <td>2021-05-14</td>\n      <td>00:00:00</td>\n      <td>23:59:00</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>yrt1999</td>\n      <td>2</td>\n      <td>T3-INT</td>\n      <td>TRE</td>\n      <td>2021-05-15</td>\n      <td>2021-05-18</td>\n      <td>08:00:00</td>\n      <td>18:00:00</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\nCall summarize_data_with_experiment_phases() function to make the table that contains analytic information that we want.\n\n```python\ndf = treets.summarize_data_with_experiment_phases(treets.file_loader('data/col_test_data/yrt*')\\\n                      , pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx'))\n```\n\n    Participant yrt1999 didn't log any food items in the following day(s):\n    2021-05-18\n    Participant yrt2000 didn't log any food items in the following day(s):\n    2021-05-12\n    2021-05-13\n    2021-05-14\n    2021-05-15\n    2021-05-16\n    2021-05-17\n    2021-05-18\n    Participant yrt1999 have bad logging day(s) in the following day(s):\n    2021-05-12\n    2021-05-15\n    Participant yrt1999 have bad window day(s) in the following day(s):\n    2021-05-15\n    2021-05-17\n    Participant yrt1999 have non adherent day(s) in the following day(s):\n    2021-05-12\n    2021-05-15\n    2021-05-17\n\n\n```python\ndf\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>mCC_ID</th>\n      <th>Participant_Study_ID</th>\n      <th>Study Phase</th>\n      <th>Intervention 
group (TRE or HABIT)</th>\n      <th>Start_Day</th>\n      <th>End_day</th>\n      <th>Eating_Window_Start</th>\n      <th>Eating_Window_End</th>\n      <th>phase_duration</th>\n      <th>caloric_entries_num</th>\n      <th>...</th>\n      <th>logging_day_counts</th>\n      <th>%_logging_day_counts</th>\n      <th>good_logging_days</th>\n      <th>%_good_logging_days</th>\n      <th>good_window_days</th>\n      <th>%_good_window_days</th>\n      <th>outside_window_days</th>\n      <th>%_outside_window_days</th>\n      <th>adherent_days</th>\n      <th>%_adherent_days</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>yrt1999</td>\n      <td>2</td>\n      <td>S-REM</td>\n      <td>TRE</td>\n      <td>2021-05-12</td>\n      <td>2021-05-14</td>\n      <td>00:00:00</td>\n      <td>23:59:00</td>\n      <td>3 days</td>\n      <td>7</td>\n      <td>...</td>\n      <td>3</td>\n      <td>100.0%</td>\n      <td>2.0</td>\n      <td>66.67%</td>\n      <td>3.0</td>\n      <td>100.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n      <td>2.0</td>\n      <td>66.67%</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>yrt1999</td>\n      <td>2</td>\n      <td>T3-INT</td>\n      <td>TRE</td>\n      <td>2021-05-15</td>\n      <td>2021-05-18</td>\n      <td>08:00:00</td>\n      <td>18:00:00</td>\n      <td>4 days</td>\n      <td>8</td>\n      <td>...</td>\n      <td>3</td>\n      <td>75.0%</td>\n      <td>2.0</td>\n      <td>50.0%</td>\n      <td>1.0</td>\n      <td>25.0%</td>\n      <td>2.0</td>\n      <td>50.0%</td>\n      <td>1.0</td>\n      <td>25.0%</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>yrt2000</td>\n      <td>3</td>\n      <td>T3-INT</td>\n      <td>TRE</td>\n      <td>2021-05-12</td>\n      <td>2021-05-14</td>\n      <td>08:00:00</td>\n      <td>16:00:00</td>\n      <td>3 days</td>\n      <td>0</td>\n      <td>...</td>\n      <td>0</td>\n      <td>0.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n      <td>0.0</td>\n      
<td>0.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>yrt2000</td>\n      <td>3</td>\n      <td>T3-INT</td>\n      <td>TRE</td>\n      <td>2021-05-15</td>\n      <td>2021-05-18</td>\n      <td>08:00:00</td>\n      <td>16:00:00</td>\n      <td>4 days</td>\n      <td>0</td>\n      <td>...</td>\n      <td>0</td>\n      <td>0.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n      <td>0.0</td>\n      <td>0.0%</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>yrt2001</td>\n      <td>4</td>\n      <td>T12-A</td>\n      <td>TRE</td>\n      <td>NaT</td>\n      <td>NaT</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>NaT</td>\n      <td>0</td>\n      <td>...</td>\n      <td>0</td>\n      <td>nan%</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>NaN</td>\n    </tr>\n  </tbody>\n</table>\n<p>5 rows \u00d7 32 columns</p>\n</div>\n\n\n\nLook at resulting statistical information for the first row in the resulting dataset.\n\n```python\ndf.iloc[0]\n```\n\n\n\n\n    mCC_ID                                           yrt1999\n    Participant_Study_ID                                   2\n    Study Phase                                        S-REM\n    Intervention group (TRE or HABIT)                    TRE\n    Start_Day                            2021-05-12 00:00:00\n    End_day                              2021-05-14 00:00:00\n    Eating_Window_Start                             00:00:00\n    Eating_Window_End                               23:59:00\n    phase_duration                           3 days 00:00:00\n    caloric_entries_num                                    7\n    medication_num                                         0\n    water_num                                              0\n  
  first_cal_avg                                   5.916667\n    first_cal_std                                   2.240722\n    last_cal_avg                                   19.666667\n    last_cal_std                                   12.933323\n    mean_daily_eating_window                           13.75\n    std_daily_eating_window                        11.986972\n    earliest_entry                                       4.5\n    2.5%                                              4.5375\n    97.5%                                            27.5625\n    duration mid 95%                                  23.025\n    logging_day_counts                                     3\n    %_logging_day_counts                              100.0%\n    good_logging_days                                    2.0\n    %_good_logging_days                               66.67%\n    good_window_days                                     3.0\n    %_good_window_days                                100.0%\n    outside_window_days                                  0.0\n    %_outside_window_days                               0.0%\n    adherent_days                                        2.0\n    %_adherent_days                                   66.67%\n    Name: 0, dtype: object\n\n\n\n## Example for a quick data analysis on non-phased studies.\n\ntake a look at the original dataset\n\n```python\ndf = treets.file_loader('data/test_food_details.csv')\ndf.head(2)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Unnamed: 0</th>\n      <th>ID</th>\n      <th>unique_code</th>\n      <th>research_info_id</th>\n      <th>desc_text</th>\n      <th>food_type</th>\n      
<th>original_logtime</th>\n      <th>foodimage_file_name</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>1340147</td>\n      <td>7572733</td>\n      <td>alqt14018795225</td>\n      <td>150</td>\n      <td>Water</td>\n      <td>w</td>\n      <td>2017-12-08 17:30:00+00:00</td>\n      <td>NaN</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1340148</td>\n      <td>411111</td>\n      <td>alqt14018795225</td>\n      <td>150</td>\n      <td>Coffee White</td>\n      <td>b</td>\n      <td>2017-12-09 00:01:00+00:00</td>\n      <td>NaN</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\npreprocess the data to create features we might need in the furthur analysis such as float time, week count since the first week, etc.\n\n```python\ndf = treets.load_food_data(df,'unique_code', 'original_logtime',4)\ndf.head(2)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Unnamed: 0</th>\n      <th>ID</th>\n      <th>unique_code</th>\n      <th>research_info_id</th>\n      <th>desc_text</th>\n      <th>food_type</th>\n      <th>original_logtime</th>\n      <th>date</th>\n      <th>float_time</th>\n      <th>time</th>\n      <th>week_from_start</th>\n      <th>year</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>1340147</td>\n      <td>7572733</td>\n      <td>alqt14018795225</td>\n      <td>150</td>\n      <td>Water</td>\n      <td>w</td>\n      <td>2017-12-08 17:30:00+00:00</td>\n      <td>2017-12-08</td>\n      <td>17.500000</td>\n      <td>17:30:00</td>\n      <td>1</td>\n      <td>2017</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1340148</td>\n      <td>411111</td>\n      
<td>alqt14018795225</td>\n      <td>150</td>\n      <td>Coffee White</td>\n      <td>b</td>\n      <td>2017-12-09 00:01:00+00:00</td>\n      <td>2017-12-08</td>\n      <td>24.016667</td>\n      <td>00:01:00</td>\n      <td>1</td>\n      <td>2017</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\nCall summarize_data() function to make the table that contains analytic information that we want.\u00b6\n\n```python\ndf = treets.summarize_data(df, 'unique_code', 'float_time', 'date')\ndf.head(2)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>unique_code</th>\n      <th>num_days</th>\n      <th>num_total_items</th>\n      <th>num_f_n_b</th>\n      <th>num_medications</th>\n      <th>num_water</th>\n      <th>first_cal_avg</th>\n      <th>first_cal_std</th>\n      <th>last_cal_avg</th>\n      <th>last_cal_std</th>\n      <th>eating_win_avg</th>\n      <th>eating_win_std</th>\n      <th>good_logging_count</th>\n      <th>first_cal variation (90%-10%)</th>\n      <th>last_cal variation (90%-10%)</th>\n      <th>2.5%</th>\n      <th>95%</th>\n      <th>duration mid 95%</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>alqt1148284857</td>\n      <td>13</td>\n      <td>149</td>\n      <td>96</td>\n      <td>19</td>\n      <td>34</td>\n      <td>7.821795</td>\n      <td>6.710717</td>\n      <td>23.485897</td>\n      <td>4.869082</td>\n      <td>15.664103</td>\n      <td>8.231201</td>\n      <td>146</td>\n      <td>2.966667</td>\n      <td>9.666667</td>\n      <td>4.535000</td>\n      <td>26.813333</td>\n      <td>22.636667</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>alqt14018795225</td>\n      <td>64</td>\n    
  <td>488</td>\n      <td>484</td>\n      <td>3</td>\n      <td>1</td>\n      <td>7.525781</td>\n      <td>5.434563</td>\n      <td>25.858594</td>\n      <td>3.374839</td>\n      <td>18.332813</td>\n      <td>6.603913</td>\n      <td>484</td>\n      <td>13.450000</td>\n      <td>3.100000</td>\n      <td>4.183333</td>\n      <td>27.438333</td>\n      <td>23.416667</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\nLook at the statistical information for the first row of the resulting dataset.\n\n```python\ndf.iloc[0]\n```\n\n\n\n\n    unique_code                      alqt1148284857\n    num_days                                     13\n    num_total_items                             149\n    num_f_n_b                                    96\n    num_medications                              19\n    num_water                                    34\n    first_cal_avg                          7.821795\n    first_cal_std                          6.710717\n    last_cal_avg                          23.485897\n    last_cal_std                           4.869082\n    eating_win_avg                        15.664103\n    eating_win_std                         8.231201\n    good_logging_count                          146\n    first_cal variation (90%-10%)          2.966667\n    last_cal variation (90%-10%)           9.666667\n    2.5%                                      4.535\n    95%                                   26.813333\n    duration mid 95%                      22.636667\n    Name: 0, dtype: object\n\n\n\n## Clean text in food loggings\n\n```python\n# import the dataset\ndf = treets.file_loader('data/col_test_data/yrt*')\ndf.head(3)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr 
style=\"text-align: right;\">\n      <th></th>\n      <th>Unnamed: 0</th>\n      <th>original_logtime</th>\n      <th>desc_text</th>\n      <th>food_type</th>\n      <th>PID</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>0</td>\n      <td>2021-05-12 02:30:00 +0000</td>\n      <td>Milk</td>\n      <td>b</td>\n      <td>yrt1999</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1</td>\n      <td>2021-05-12 02:45:00 +0000</td>\n      <td>Some Medication</td>\n      <td>m</td>\n      <td>yrt1999</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>2</td>\n      <td>2021-05-12 04:45:00 +0000</td>\n      <td>bacon egg</td>\n      <td>f</td>\n      <td>yrt1999</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\n```python\ntreets.clean_loggings(df, 'desc_text', 'PID').head(3)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>PID</th>\n      <th>desc_text</th>\n      <th>cleaned</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>yrt1999</td>\n      <td>Milk</td>\n      <td>[milk]</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>yrt1999</td>\n      <td>Some Medication</td>\n      <td>[medication]</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>yrt1999</td>\n      <td>bacon egg</td>\n      <td>[bacon, egg]</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\nWe can see that words are lowercased, modifiers are removed (second row), and entries are split into individual items (third row).\n\n## Visualizations\n\n```python\n# import the dataset\ndf = treets.file_loader('data/test_food_details.csv')\ndf.head(2)\n```\n\n\n\n\n<div>\n<style scoped>\n    .dataframe tbody tr 
th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Unnamed: 0</th>\n      <th>ID</th>\n      <th>unique_code</th>\n      <th>research_info_id</th>\n      <th>desc_text</th>\n      <th>food_type</th>\n      <th>original_logtime</th>\n      <th>foodimage_file_name</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>1340147</td>\n      <td>7572733</td>\n      <td>alqt14018795225</td>\n      <td>150</td>\n      <td>Water</td>\n      <td>w</td>\n      <td>2017-12-08 17:30:00+00:00</td>\n      <td>NaN</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1340148</td>\n      <td>411111</td>\n      <td>alqt14018795225</td>\n      <td>150</td>\n      <td>Coffee White</td>\n      <td>b</td>\n      <td>2017-12-09 00:01:00+00:00</td>\n      <td>NaN</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\nMake a scatter plot of people's breakfast times.\n\n```python\n# create the features required by first_cal_mean_with_error_bar()\ndf['original_logtime'] = pd.to_datetime(df['original_logtime'])\ndf['local_time'] = treets.find_float_time(df, 'original_logtime')\ndf['date'] = treets.find_date(df, 'original_logtime')\n\n# call the function\ntreets.first_cal_mean_with_error_bar(df,'unique_code', 'date', 'local_time')\n```\n\n\n![png](https://raw.githubusercontent.com/FleischerResearchLab/treets/master/docs/images/output_28_0.png)\n\n\nUse swarmplot to visualize each person's eating time distribution.\n\n```python\ntreets.swarmplot(df, 50, 'unique_code', 'date', 'local_time')\n```\n\n\n![png](https://raw.githubusercontent.com/FleischerResearchLab/treets/master/docs/images/output_30_0.png)\n\n\n\n",
    "bugtrack_url": null,
    "license": "Apache Software License 2.0",
    "summary": "This library provides functions to analyze food logging data.",
    "version": "1.0.4",
    "split_keywords": [
        "circadian",
        "ryhthm"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a11bf98cec133a154548f96f8b43d11940da5023ba08e968a366d0f8d8ca3e85",
                "md5": "61c0edc5992ca2f4ad1f987806f0809f",
                "sha256": "6598b3df5ed56ae9430b3f6339f6e0591b847ce1607aaa575b5b755853285df6"
            },
            "downloads": -1,
            "filename": "treets-1.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "61c0edc5992ca2f4ad1f987806f0809f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 26095,
            "upload_time": "2023-03-14T22:46:44",
            "upload_time_iso_8601": "2023-03-14T22:46:44.210976Z",
            "url": "https://files.pythonhosted.org/packages/a1/1b/f98cec133a154548f96f8b43d11940da5023ba08e968a366d0f8d8ca3e85/treets-1.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e3f9e32d6b0f18c5f96af95bf15376c43dc4aa782f192223812daf587fbf0f48",
                "md5": "afd883609d4baa155c7f01be34c1c324",
                "sha256": "56387a4749272b7bff56a8526b4a1cb99f6d9ae6a328fcfdde0c4cbd808e35ce"
            },
            "downloads": -1,
            "filename": "treets-1.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "afd883609d4baa155c7f01be34c1c324",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 32088,
            "upload_time": "2023-03-14T22:46:46",
            "upload_time_iso_8601": "2023-03-14T22:46:46.467461Z",
            "url": "https://files.pythonhosted.org/packages/e3/f9/e32d6b0f18c5f96af95bf15376c43dc4aa782f192223812daf587fbf0f48/treets-1.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-14 22:46:46",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "treets"
}
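
The release metadata above lists a `sha256` digest for each distribution file. As a minimal sketch (not part of treets itself), a downloaded file could be checked against that digest with Python's standard `hashlib`. The helper names `sha256_of_file` and `matches_expected`, and the assumption that the wheel sits in the current directory, are illustrative only.

```python
import hashlib
from pathlib import Path

# Digest copied from the "sha256" field of the wheel entry in the metadata above.
EXPECTED_WHEEL_SHA256 = (
    "6598b3df5ed56ae9430b3f6339f6e0591b847ce1607aaa575b5b755853285df6"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA-256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    # Hypothetical local path: assumes the wheel was downloaded here.
    wheel = Path("treets-1.0.4-py3-none-any.whl")
    if wheel.exists():
        print(matches_expected(wheel, EXPECTED_WHEEL_SHA256))
```

Reading in chunks keeps memory use constant regardless of file size; the same helper works for the `.tar.gz` sdist with its own listed digest.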
        