# spike2py-preprocess

- Version: 0.1.14 (released 2023-10-14)
- Summary: Preprocess data with spike2py.
- Author: Martin Heroux
- License: GPLv3
- Requires Python: >=3.10
- Homepage: https://github.com/MartinHeroux/spike2py_preprocess

[![spike2py](https://raw.githubusercontent.com/MartinHeroux/spike2py_preprocess/master/spike2py_preprocess_icon_600x300.png)](https://github.com/MartinHeroux/spike2py)


[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](LICENSE)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg)](code_of_conduct.md)

**spike2py_preprocess** provides a simple way to batch (pre)process data with [spike2py](https://github.com/MartinHeroux/spike2py).

**spike2py_preprocess** can be used to batch read a series of `.mat` files and save them to `.pkl` files.
However, its real power is its ability to also preprocess the data, whether for a single trial, all trials from a subject, or all trials from a study.
Moreover, **spike2py_preprocess** can be used to extract only the relevant sections of data: simply add two Spike2 TextMarks to mark the section of data to be extracted.
More than one section can be extracted per trial.

### Trial

In Python:
```python
>>> from spike2py.trial import TrialInfo
>>> from spike2py_preprocess.trial import trial
>>> trial_info = TrialInfo(file="0004.mat",
...                        name='h_reflex_curve',
...                        subject_id='sub01',
...                        path_save_trial='./proc')
>>> trial(trial_info)
```

On the command line:
```bash
$ python -m spike2py_preprocess trial --help
$ python -m spike2py_preprocess trial <path_to_trial_info_json>
```

or simply:

```bash
$ spike2py_preprocess trial --help
$ spike2py_preprocess trial <path_to_trial_info_json>
```

Here, we need to point `spike2py_preprocess` to a valid JSON file containing the trial information.
The JSON file requires the following fields:

```json
{
  "file": "/home/maple/study/sub01/data/raw/sub01_DATA000_H_B.mat",
  "channels": ["FDI", "W_EXT", "stim"],
  "name": "biphasic_high_fq",
  "subject_id": "sub01",
  "path_save_trial": "/home/maple/study/sub01/data/proc"
}
```
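
The same JSON can also be driven from Python. Below is a minimal sketch, assuming the JSON keys map directly onto `TrialInfo` keyword arguments (including the optional `channels` field); the `trial_from_json` helper is hypothetical and not part of the package.

```python
import json
from pathlib import Path

from spike2py.trial import TrialInfo
from spike2py_preprocess.trial import trial


def trial_from_json(json_path):
    """Hypothetical helper: build a TrialInfo from a trial-info JSON file."""
    info = json.loads(Path(json_path).read_text())
    trial_info = TrialInfo(
        file=info["file"],
        channels=info.get("channels"),  # assumed optional: omit to keep all channels
        name=info["name"],
        subject_id=info["subject_id"],
        path_save_trial=info["path_save_trial"],
    )
    trial(trial_info)


trial_from_json("sub01_DATA000_H_B_trial_info.json")
```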

### Subject

In Python:
```python
>>> from spike2py_preprocess.subject import subject
>>> from pathlib import Path
>>> subject_folder = Path('sub01')
>>> subject(subject_folder)
```
On the command line:
```bash
$ python -m spike2py_preprocess subject --help
$ python -m spike2py_preprocess subject /home/maple/study/sub01
```

or simply:
```bash
$ spike2py_preprocess subject --help
$ spike2py_preprocess subject /home/maple/study/sub01
```
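
If you only want to preprocess a subset of subjects rather than a whole study, the documented `subject()` function can also be called in a loop; the paths below are illustrative.

```python
from pathlib import Path

from spike2py_preprocess.subject import subject

# Illustrative paths: preprocess two subject folders, one after the other.
for folder in ("sub01", "sub02"):
    subject(Path("/home/maple/study") / folder)
```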

### Study

In Python: 

```python
>>> from spike2py_preprocess.study import study
>>> from pathlib import Path
>>> study_folder = Path('great_study')
>>> study(study_folder)
```

On the command line:
```bash
$ python -m spike2py_preprocess study --help
$ python -m spike2py_preprocess study /home/maple/study/
```

or simply:
```bash
$ spike2py_preprocess study --help
$ spike2py_preprocess study /home/maple/study/
```

## Preprocess

You can specify the preprocessing settings to apply to one or more channels by including one or more `<level>_preprocess.json` files.

For a single trial, **spike2py_preprocess** looks for `<trialname.mat>_preprocess.json` in the same folder as the `.mat` file.

For all trials for a subject, **spike2py_preprocess** looks for `subject_preprocess.json` in the provided subject folder.

Finally, for all trials in a study, **spike2py_preprocess** looks for `study_preprocess.json` in the provided study folder.

### Controlling the preprocessing

By including `study_preprocess.json`, `subject_preprocess.json` and `<trialname.mat>_preprocess.json` files in a given file structure,
it is possible to provide a general preprocessing scheme that can be overridden for a given subject or a given trial.
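
A minimal sketch of this precedence (the most specific file wins) is shown below. It is not the package's actual lookup code, and it assumes the file-name conventions described in this README; note that the example folder tree later in this document names the trial-level file slightly differently (`preprocess_<trialname>.json`).

```python
from pathlib import Path


def find_preprocess_file(mat_file: Path):
    """Sketch only: return the most specific preprocess file that exists."""
    candidates = [
        mat_file.parent / f"{mat_file.name}_preprocess.json",  # trial level, next to the .mat file
        mat_file.parents[1] / "subject_preprocess.json",       # subject folder (one level above raw/)
        mat_file.parents[2] / "study_preprocess.json",         # study folder
    ]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return None  # no preprocessing: the data is simply read and saved to .pkl
```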

Below is an example of what could be included in a preprocess file. 
As you can see, each item is the name of a channel that exists in the `.mat` file. 

For each channel, one or more preprocessing steps are specified.
At present, the preprocessing steps that are possible are those included in the `spike2py` 
[SignalProcessing](https://spike2py.readthedocs.io/en/latest/pages/reference_guides.html#sig-proc-signalprocessing)
mixin class.

The keys are the preprocessing methods to call, and the values are the inputs to those methods (if required).

```json
{
  "Fdi": {
    "remove_mean": "",
    "lowpass": "cutoff=200"
  },
  "W_Ext": {
    "remove_mean": "",
    "bandstop": "cutoff = [49, 51]"
  },
  "Stim": {
    "lowpass": "cutoff=20, order=8"
  }
}

```
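
To illustrate how such a specification maps onto `spike2py` calls, here is a rough sketch, not spike2py_preprocess's actual implementation. It assumes a loaded trial exposes each channel as an attribute and that each channel provides the `SignalProcessing` methods (`remove_mean`, `lowpass`, `bandstop`, ...) under those names.

```python
import json
from pathlib import Path

import spike2py as s2p


def apply_preprocess(trial, spec_path):
    """Sketch only: apply a preprocess spec like the JSON above to a loaded trial."""
    spec = json.loads(Path(spec_path).read_text())
    for channel_name, steps in spec.items():
        channel = getattr(trial, channel_name)  # assumes channels are trial attributes
        for method_name, arg_string in steps.items():
            method = getattr(channel, method_name)
            # Quick-and-dirty parsing for the sketch: "cutoff=20, order=8"
            # becomes {"cutoff": 20, "order": 8}; an empty string means no arguments.
            kwargs = eval(f"dict({arg_string})") if arg_string else {}
            method(**kwargs)


trial = s2p.trial.load(file=Path("data.pkl"))
apply_preprocess(trial, "study_preprocess.json")
```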

#### IMPORTANT

Note that, in our example above, the channels were specified as `"channels": ["FDI", "W_EXT", "stim"]`,
but here they are specified as `Fdi`, `W_Ext`, and `Stim`. The reason for this is that researchers often use a wide
variety of styles to label channels, sometimes ALLCAPS, other times snake_case or camelCase, or a combination of several.
To standardise things, `spike2py` applies title case to each of the channel names. If you are not 100% certain what the
resulting channel names will be, simply run `spike2py_preprocess` with your preprocessing specified. This will produce a
pickle (`.pkl`) file that can be opened with `spike2py`.

```python
import spike2py as s2p
from pathlib import Path

tutorial = s2p.trial.load(file=Path('data.pkl'))
```

Now if you simply type `tutorial` and hit return, you will see information about the data, including the channel names.
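
If you would rather predict the standardised names than inspect the file, the conversion appears to match Python's `str.title()` for the example above; treat this as an assumption rather than a guarantee.

```python
# Assumption: spike2py's channel-name standardisation behaves like str.title().
for raw_name in ["FDI", "W_EXT", "stim"]:
    print(raw_name, "->", raw_name.title())
# FDI -> Fdi
# W_EXT -> W_Ext
# stim -> Stim
```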

## File structure

Below is an example of the required file/folder structure for **spike2py_preprocess**.

In the example, `sub02_DATA000_H_B.mat` has its own preprocessing details located in `preprocess_sub02_DATA000_H_B.json`.

Similarly, at the subject level, `sub02` has a `subject_preprocess.json` file. This means that all of their files (excluding `sub02_DATA000_H_B.mat`, which uses its own file) will be preprocessed in the same way.

Finally, because `sub01` does not include a dedicated `.json` file, their data would simply be read and saved as `.pkl` files if it were processed on its own.
However, if **spike2py_preprocess** were used to preprocess all trials in the study, trials from `sub01` would be preprocessed according to the details provided in `study_preprocess.json`.



### Study folder structure and required files

Below is an example study file structure. The study folder can have whatever name you like, `study1` in this case.
Similarly, the subject folders can have whatever names you like, but they should match the list of subjects you include
in the `study_info.json` file (see below for more details).

The raw data for each subject, in `.mat` format, **must** be located in a folder named `raw`.

```bash

study1/
├── study_info.json
├── study_preprocess.json
├── sub01
│   ├── raw
│   │   ├── sub01_DATA000_H_B_trial_info.json
│   │   ├── sub01_DATA000_H_B.mat
│   │   ├── sub01_DATA001_C_B.mat
│   │   ├── sub01_DATA002_C_M.mat
│   │   └── sub01_DATA003_H_M.mat
│   └── subject_info.json
└── sub02
    ├── raw
    │   ├── preprocess_sub02_DATA000_H_B.json
    │   ├── sub02_DATA000_H_B.mat
    │   ├── sub02_DATA001_C_B.mat
    │   ├── sub02_DATA002_C_M.mat
    │   └── sub02_DATA003_H_M.mat
    ├── subject_info.json
    └── subject_preprocess.json
```

## subject_info.json
This file contains details about the subject. Additional information can appear in this file, but at a minimum it requires
that "subject_id" be provided, as well as "trials", which contains the various trials to be processed for this subject.
For each trial, the minimum data required is "name" and "file". If "channels" is provided, only these channels will be 
included and preprocessed; if not provided, all channels will be included.

```json
{
  "subject_id": "sub01",
  "age": 50,
  "gender": "F",
  "trials": {
    "trial1": {
      "name": "conv_biphasic",
      "file": "sub01_001.mat"
    },
    "trial2": {
      "name": "khz_biphasic",
      "file": "sub01_002.mat",
      "channels": ["FDI", "W_EXT", "stim"]
    }
  }
}

```

## study_info.json
This file contains details about the study. Additional information can appear in this file, but at a minimum it requires
that "name" and "subjects" be provided. If "channels" is provided, only these channels will be included and preprocessed,
noting that this can be overridden by the "channels" specified for individual trials in `subject_info.json`.
```json
{
    "name": "TSS_H-reflex",
    "subjects": [
      "sub01",
      "sub02"
    ],
  "channels": ["FDI", "W_EXT", "stim"]
}

```
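
To picture how the "subjects" list drives study-level processing, here is a minimal sketch; the actual `study()` function may differ, and the study path is illustrative.

```python
import json
from pathlib import Path

from spike2py_preprocess.subject import subject

# Illustrative study folder; study_info.json lists the subject folders to process.
study_folder = Path("/home/maple/study")
study_info = json.loads((study_folder / "study_info.json").read_text())

for subject_id in study_info["subjects"]:
    subject(study_folder / subject_id)
```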

## Spike2 TextMarks

Please refer to the document entitled "How_to_add_TextMarks_in_Spike2.pdf" for a guide on how to add TextMarks in Spike2.

If you add two TextMarks with the same label (e.g. 'MVC'), the section of data between the two TextMarks will be extracted and saved to a `.pkl` file.
Many such pairs of TextMarks can be included in a trial.

If you have two related sections of data but want to exclude a middle section that is not useful or relevant,
you can add four TextMarks with the same label, two around each section of interest; the data from both sections will be concatenated and extracted.
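
The pairing and concatenation logic can be pictured with the following self-contained sketch, which is independent of spike2py's actual TextMark data structures; the function name and arguments are illustrative.

```python
import numpy as np


def extract_sections(times, values, mark_labels, mark_times):
    """Sketch only: slice out and join the data between pairs of same-label marks."""
    sections = {}
    for label in set(mark_labels):
        label_times = sorted(t for lab, t in zip(mark_labels, mark_times) if lab == label)
        pieces = []
        # Consecutive marks with the same label form pairs: 2 marks -> 1 section,
        # 4 marks -> 2 sections whose data are concatenated, and so on.
        for start, stop in zip(label_times[0::2], label_times[1::2]):
            mask = (times >= start) & (times <= stop)
            pieces.append(values[mask])
        sections[label] = np.concatenate(pieces)
    return sections
```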

Note that Spike2 TextMarks need to be added prior to batch exporting the trial to `.mat`.

## Installing

**spike2py_preprocess** is available on PyPI:

```console
$ python -m pip install spike2py_preprocess
```

**spike2py_preprocess** officially supports Python 3.10+.

## Contributing

Like this project? Want to help? We would love to have your contribution! Please see [CONTRIBUTING](CONTRIBUTING.md) to get started.

## Code of conduct

This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to [heroux.martin@gmail.com](mailto:heroux.martin@gmail.com).

## License

[GPLv3](./LICENSE)

            
