pyadlml

Name	pyadlml
Version	0.0.8.0a0
home_page	https://github.com/tcsvn/pyadlml
Summary	Sklearn flavored library containing numerous Activity of Daily Livings datasets, preprocessing methods, visualizations and models.
upload_time	2023-06-23 08:38:12
maintainer
docs_url	None
author	Christian Meier
requires_python
license	MIT
keywords	activity of daily living
VCS
bugtrack_url
requirements	No requirements were recorded.

# Activities of Daily Living - Machine Learning
> Contains data preprocessing and visualization methods for ADL datasets.

![PyPI version](https://img.shields.io/pypi/v/pyadlml?style=flat-square)
![Download Stats](https://img.shields.io/pypi/dd/pyadlml?style=flat-square)
![Read the Docs (version)](https://img.shields.io/readthedocs/pyadlml/latest?style=flat-square)
![License](https://img.shields.io/pypi/l/pyadlml?style=flat-square)
<p align="center"><img width=95% src="https://github.com/tcsvn/pyadlml/blob/master/media/pyadlml_banner.png"></p>
Activities of Daily Living (ADLs), such as eating, working and sleeping, are recorded by inhabitants alongside Smart Home device readings. Predicting resident activities from the device event stream enables a range of applications, including automation, action recommendation and abnormal activity detection in the context of assisted living for elderly inhabitants. Pyadlml offers an easy way to fetch, visualize and preprocess common datasets.


## !! Disclaimer !!
The package is still an alpha version and under active development. 
As of now, do not expect anything to work! APIs are going to change, 
things will break and the documentation may lag behind. Nevertheless, feel 
free to take a look. The safest place to start is probably the API reference.

## Last (stable) Release
```sh 
$ pip install pyadlml
```
## Latest Development Changes
```sh
$ git clone https://github.com/tcsvn/pyadlml
$ cd pyadlml
$ pip install .
```

## Usage example


### Simple

```python
# Fetch dataset
from pyadlml.dataset import fetch_amsterdam
data = fetch_amsterdam()
df_devs, df_acts = data['devices'], data['activities']


# Plot the resident's activity density over one day
from pyadlml.plot import plot_activity_density
fig = plot_activity_density(df_acts)
fig.show()


# Create a vector representing the state of all Smart Home devices
# at a certain time and discretize the events into 20 second bins
from pyadlml.preprocessing import Event2Vec, LabelMatcher
e2v = Event2Vec(encode='state', dt='20s')
states = e2v.fit_transform(df_devs)

# Label each datum with the corresponding activity.
# When an event matches no activity set the activity to "other"
lbls = LabelMatcher(other=True).fit_transform(df_acts, states)

# Extract numpy arrays without timestamps (1st column)
X, y = states.values[:,1:], lbls.values[:,1:]

# Proceed with machine learning stuff 
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier()
clf.fit(X, y)
clf.score(X, y)
...
```
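
To make the state-vector encoding above more concrete, here is a minimal sketch in plain pandas. It is not pyadlml's implementation of `Event2Vec`: the column names (`time`, `device`, `value`), the helper `events_to_states` and the toy events are assumptions made for illustration, and keeping only the last state per bin is a simplification.

```python
import pandas as pd

# Hypothetical event stream: one row per device state change (1 = on, 0 = off).
events = pd.DataFrame({
    "time": pd.to_datetime([
        "2023-01-01 08:00:05",
        "2023-01-01 08:00:30",
        "2023-01-01 08:01:10",
        "2023-01-01 08:01:15",
    ]),
    "device": ["kitchen_door", "fridge", "kitchen_door", "fridge"],
    "value": [1, 1, 0, 0],
})


def events_to_states(events, dt="20s"):
    """Turn an event stream into one state vector per time bin of length dt."""
    # One column per device, one row per event timestamp.
    wide = events.pivot_table(
        index="time", columns="device", values="value", aggfunc="last"
    )
    # A device keeps its state until its next event.
    wide = wide.ffill()
    # Keep the last known state in each bin; empty bins inherit the previous bin.
    binned = wide.resample(dt).last().ffill()
    # Assumption: a device that has not fired yet is treated as "off".
    return binned.fillna(0).astype(int)


states = events_to_states(events, dt="20s")
print(states)
```

Each row of `states` is the vector of all device states at the end of one 20 second bin, which is the kind of input a classifier such as the decision tree above can consume.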

### Less simple


```python
from pyadlml.dataset import fetch_amsterdam
from pyadlml.constants import VALUE
from pyadlml.pipeline import Pipeline
from pyadlml.preprocessing import IndexEncoder, LabelMatcher, DropTimeIndex, \
                                  EventWindows, DropColumn
from pyadlml.model_selection import train_test_split
from pyadlml.model import DilatedModel
from pyadlml.dataset.torch import TorchDataset
from torch.utils.data import DataLoader
from torch.optim import Adam 
from torch.nn import functional as F

# Fetch data and split into train/val/test based on time rather than number of events
data = fetch_amsterdam()
X_train, X_val, X_test, y_train, y_val, y_test = train_test_split(
    data['devices'], data['activities'],
    split=(0.6, 0.2, 0.2), temporal=True,
)

# Formulate all preprocessing steps using a sklearn styled pipeline 
pipe = Pipeline([
    ('enc', IndexEncoder()),            # Encode devices strings with indices
    ('drop_obs', DropColumn(VALUE)),    # Disregard device observations
    ('lbl', LabelMatcher(other=True)),  # Generate labels y  
    ('drop_time', DropTimeIndex()),     # Remove timestamps for x and y
    ('windows', EventWindows(           # Create sequences S with a sliding window
                  rep='many-to-one',    # Only one label y_i per sequence S_i
                  window_size=16,       # Each sequence contains 16 events 
                  stride=2)),           # A sequence is created every 2 events
    ('passthrough', 'passthrough')      # Do not use a classifier in the pipeline
])

# Create a dataset to sample from
dataset = TorchDataset(X_train, y_train, pipe) 
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)
model = DilatedModel(
    n_features=14,       # Number of devices
    n_classes=8          # Number of activities
)
optimizer = Adam(model.parameters(), lr=3e-4)

# Minimal loop to overfit the data
for s in range(10000):
    for (Xb, yb) in train_loader:
        optimizer.zero_grad()
        logits = model(Xb)
        loss = F.cross_entropy(logits, yb)
        loss.backward()
        optimizer.step()

...
```
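
The sliding-window step in the pipeline above can be pictured with a short NumPy sketch. This is an illustration only, not the code behind `EventWindows`: the helper `event_windows` is hypothetical, and labelling each window with the label of its last event is an assumption about what "many-to-one" means here.

```python
import numpy as np


def event_windows(X, y, window_size=16, stride=2):
    """Cut an event sequence into overlapping windows with one label each.

    Every window holds `window_size` consecutive events and is labelled
    with the label of its last event (assumed many-to-one convention).
    """
    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) < window_size:
        raise ValueError("sequence is shorter than one window")
    # A new window starts every `stride` events.
    starts = range(0, len(X) - window_size + 1, stride)
    X_win = np.stack([X[s:s + window_size] for s in starts])
    y_win = np.array([y[s + window_size - 1] for s in starts])
    return X_win, y_win


# Toy data: 40 events with a single feature; labels equal the event index.
X = np.arange(40).reshape(40, 1)
y = np.arange(40)
X_win, y_win = event_windows(X, y, window_size=16, stride=2)
print(X_win.shape)  # (13, 16, 1)
```

With 40 events, a window of 16 and a stride of 2, the windows start at events 0, 2, ..., 24, which gives 13 sequences.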



_For more examples and usage instructions, please refer to the [documentation](https://pyadlml.readthedocs.io/en/latest/)._

## Features
  - Access to 14 Datasets 
  - Importing data from [Home Assistant](https://www.home-assistant.io/) or [Activity Assistant](https://github.com/tcsvn/activity-assistant)
  - Tools for data cleaning
    - Relabel activities and devices
    - Merge overlapping activities
    - Find and replace specific patterns in device signals
    - Interactive dashboard for data exploration
  - Various statistics and visualizations for devices, activities and their interaction
  - Preprocessing methods
    - Device encoder (index, state, changepoint, last_fired, ...)
    - Feature extraction (inter-event-times, intensity, time2vec, ...)
    - Sliding windows (event, temporal, explicit or (fuzzytime))
    - Many more ... 
  - Cross validation iterators and pipeline adapted for ADLs
    - LeaveKDayOutSplit, TimeSeriesSplit
    - Conditional transformer: YTransformer, XorYTransformer, ...
  - Online metrics to compare models regardless of resample frequency
    - Accuracy, TPR, PPV, ConfMat, Calibration
  - Ready to use models (TODO)
    - RNNs
    - WaveNet
    - Transformer
  - Translate datasets to sktime formats
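
As an illustration of the day-based cross-validation idea listed above (`LeaveKDayOutSplit`), the following sketch holds out whole days as the test set. It is written with pandas and NumPy only and does not reproduce pyadlml's class: the function name `leave_k_days_out`, its parameters and the random choice of held-out days are assumptions.

```python
import numpy as np
import pandas as pd


def leave_k_days_out(times, k=1, n_splits=3, seed=0):
    """Yield (train_idx, test_idx) pairs where the test set is k whole days."""
    # Map every timestamp to the midnight of its day.
    days = pd.DatetimeIndex(times).normalize()
    unique_days = days.unique()
    rng = np.random.default_rng(seed)
    for _ in range(n_splits):
        # Pick k distinct days to hold out for this split.
        held_out = rng.choice(len(unique_days), size=k, replace=False)
        test_mask = days.isin(unique_days[held_out])
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy example: four days of hourly timestamps.
times = pd.date_range("2023-01-01", periods=96, freq="h")
for train_idx, test_idx in leave_k_days_out(times, k=1, n_splits=2):
    print(len(train_idx), len(test_idx))
```

Splitting by day rather than by row keeps all events of a held-out day together, so a model is never evaluated on a day it has partly seen during training.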

### Supported Datasets
  - [x] Amsterdam [1]
  - [x] Aras [2]
  - [x] Casas Aruba (2011) [3]
  - [x] Casas Cairo [4]
  - [x] Casas Milan (2009) [4]
  - [x] Casas Tulum [4]
  - [x] Casas Kyoto (2010) [4]
  - [x] Kasteren House A,B,C [5]
  - [x] MITLab [6]
  - [x] UCI Adl Binary [8]
  - [ ] Chinokeeh [9]
  - [ ] Orange [TODO]

## Examples, benchmarks and replications
The project includes (TODO) a ranked model leaderboard evaluated on the cleaned dataset versions.
Additionally, here is a useful list of awesome references (TODO: include link) to papers
and repos related to ADLs and machine learning.


## Contributing 
1. Fork it (<https://github.com/tcsvn/pyadlml/fork>)
2. Create your feature branch (`git checkout -b feature/fooBar`)
3. Commit your changes (`git commit -am 'Add some fooBar'`)
4. Push to the branch (`git push origin feature/fooBar`)
5. Create a new Pull Request

## Related projects
  - [Activity Assistant](https://github.com/tcsvn/activity-assistant) - Recording and predicting ADLs within Home Assistant.
  - [scikit-learn](https://github.com/sklearn) - The main inspiration and the source of some borrowed code.

## Support 
[![Buy me a coffee][buy-me-a-coffee-shield]][buy-me-a-coffee]

## How to cite
If you use pyadlml in a publication, please consider citing the package:
```
@software{pyadlml,
  author = {Christian Meier},
  title = {PyADLMl - Machine Learning library for Activities of Daily Living},    
  url = {https://github.com/tcsvn/pyadlml},
  version = {0.0.22-alpha},
  date = {2023-01-03}
}
```

## Sources

#### Dataset


[1]: T.L.M. van Kasteren; A. K. Noulas; G. Englebienne and B.J.A. Kroese, Tenth International Conference on Ubiquitous Computing 2008  
[2]: H. Alemdar, H. Ertan, O.D. Incel, C. Ersoy, ARAS Human Activity Datasets in Multiple Homes with Multiple Residents, Pervasive Health, Venice, May 2013.  
[3,4]: WSU CASAS smart home project: D. Cook. Learning setting-generalized activity models for smart spaces. IEEE Intelligent Systems, 2011.  
[5]: Transferring Knowledge of Activity Recognition across Sensor networks. Eighth International Conference on Pervasive Computing. Helsinki, Finland, 2010.  
[6]: E. Munguia Tapia. Activity Recognition in the Home Setting Using Simple and Ubiquitous sensors. S.M Thesis  
[7]: Activity Recognition in Smart Home Environments using Hidden Markov Models. Bachelor Thesis. Uni Tuebingen.   
[8]: Ordonez, F.J.; de Toledo, P.; Sanchis, A. Activity Recognition Using Hybrid Generative/Discriminative Models on Home Environments Using Binary Sensors. Sensors 2013, 13, 5460-5477.  
[9]: D. Cook and M. Schmitter-Edgecombe, Assessing the quality of activities in a smart environment. Methods of information in Medicine, 2009

#### Software

TODO add software used in TPPs 

## License
MIT  © [tcsvn](http://deadlink)


[buy-me-a-coffee-shield]: https://img.shields.io/static/v1.svg?label=%20&message=Buy%20me%20a%20coffee&color=6f4e37&logo=buy%20me%20a%20coffee&logoColor=white
[buy-me-a-coffee]: https://www.buymeacoffee.com/tscvn



Raw data

{
    "_id": null,
    "home_page": "https://github.com/tcsvn/pyadlml",
    "name": "pyadlml",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "Activity of Daily Living",
    "author": "Christian Meier",
    "author_email": "account@meier-lossburg.de",
    "download_url": "https://files.pythonhosted.org/packages/87/56/956fc3d7fed6fc034350d71d8852b4b5538fb9c43dac4ef58de085c51172/pyadlml-0.0.8.0a0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Sklearn flavored library containing numerous Activity of Daily Livings datasets, preprocessing methods, visualizations and models.",
    "version": "0.0.8.0a0",
    "project_urls": {
        "Homepage": "https://github.com/tcsvn/pyadlml"
    },
    "split_keywords": [
        "activity",
        "of",
        "daily",
        "living"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "902b9676b1f8aa92be8146c7ad9a8b4c2aa735a84c95cb3f7aa84d42a88daf8b",
                "md5": "65dcd8f6eeec57f546f70b1d74cd1230",
                "sha256": "9d75f04c83d89344849cddd1b5cb1c1f2ca75f7123be19b50df296ead8a27053"
            },
            "downloads": -1,
            "filename": "pyadlml-0.0.8.0a0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "65dcd8f6eeec57f546f70b1d74cd1230",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 328241,
            "upload_time": "2023-06-23T08:38:09",
            "upload_time_iso_8601": "2023-06-23T08:38:09.922086Z",
            "url": "https://files.pythonhosted.org/packages/90/2b/9676b1f8aa92be8146c7ad9a8b4c2aa735a84c95cb3f7aa84d42a88daf8b/pyadlml-0.0.8.0a0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8756956fc3d7fed6fc034350d71d8852b4b5538fb9c43dac4ef58de085c51172",
                "md5": "1d18c95d88aabf4be00f2224e22a9c33",
                "sha256": "7ea675c152511d13bdb29c0491171b77f15cbcbed66246f1c8bda3d2aab08c75"
            },
            "downloads": -1,
            "filename": "pyadlml-0.0.8.0a0.tar.gz",
            "has_sig": false,
            "md5_digest": "1d18c95d88aabf4be00f2224e22a9c33",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 272889,
            "upload_time": "2023-06-23T08:38:12",
            "upload_time_iso_8601": "2023-06-23T08:38:12.438244Z",
            "url": "https://files.pythonhosted.org/packages/87/56/956fc3d7fed6fc034350d71d8852b4b5538fb9c43dac4ef58de085c51172/pyadlml-0.0.8.0a0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-23 08:38:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tcsvn",
    "github_project": "pyadlml",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pyadlml"
}
