# bci-dataset
Python library for organizing multiple EEG datasets using HDF.
Supports EEGLAB data!

*This library was created as a tool for combining datasets across the major BCI paradigms, e.g. for deep learning.
## Installation
```
pip install bci-dataset
```
## How to Use
### Add EEG Data
#### Supported Formats
+ EEGLAB (.set)
  + Epoching (epoch splitting) in EEGLAB is required.
+ NumPy (ndarray)
#### Commonality
```python
from bci_dataset import DatasetUpdater

fpath = "./dataset.hdf"
fs = 500  # sampling rate
updater = DatasetUpdater(fpath, fs=fs)
updater.remove_hdf()  # delete the HDF file if it already exists
```
#### Add EEGLAB Data
```python
import numpy as np

labels = ["left", "right"]
eeglab_list = ["./sample.set"]  # path list of EEGLAB files

# add EEGLAB (.set) files
for fp in eeglab_list:
    updater.add_eeglab(fp, labels)
```
#### Add NumPy Data
```python
# dummy data
dummy_data = np.ones((12, 6000))  # channels × samples
dummy_indexes = [0, 1000, 2000, 3000, 4000, 5000]  # start index of each trial
dummy_labels = ["left", "right"] * 3  # label of each trial
dummy_size = 500  # samples per trial

updater.add_numpy(dummy_data, dummy_indexes, dummy_labels, dummy_size)
```
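For reference, the `(indexes, size)` pair describes how trials are carved out of the continuous array: each trial spans `size` samples starting at one of the given indexes. A minimal NumPy sketch of that slicing (for illustration only, not part of the library):

```python
import numpy as np

data = np.arange(12 * 6000).reshape(12, 6000)  # channels × samples
indexes = [0, 1000, 2000, 3000, 4000, 5000]    # start index of each trial
size = 500                                     # samples per trial

# each trial is data[:, start:start+size] → shape (12, 500)
trials = np.stack([data[:, s:s + size] for s in indexes])
print(trials.shape)  # (6, 12, 500)
```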
### Apply Preprocessing
If the "preprocess" method is executed again with the same group name, the existing group with that name is deleted before preprocessing runs.
```python
from sklearn.preprocessing import StandardScaler

def prepro_func(bx: np.ndarray):
    """Preprocessing example. bx: ch × samples."""
    x = bx[12:15, :]  # select a subset of channels
    return StandardScaler().fit_transform(x.T).T

updater.preprocess("custom", prepro_func)
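Any callable that takes and returns a ch × samples array should work here. As another illustration, a hypothetical common-average-reference function (an assumption for demonstration, not something shipped with the library) that could be registered the same way, e.g. as `updater.preprocess("car", prepro_car)`:

```python
import numpy as np

def prepro_car(bx: np.ndarray) -> np.ndarray:
    """Common average reference: subtract the mean across channels
    from every channel. bx is ch × samples, same contract as above."""
    return bx - bx.mean(axis=0, keepdims=True)

x = np.random.default_rng(0).normal(size=(12, 500))
y = prepro_car(x)  # per-sample channel mean of y is now ~0
```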
### Contents of HDF
Note that "dataset" in the tree below refers to an HDF5 dataset (the HDF dataset class).
```
hdf file
├ origin : group / raw data
│ ├ 1 : dataset
│ ├ 2 : dataset
│ ├ 3 : dataset
│ ├ 4 : dataset
│ ├ 5 : dataset
│ └ …
└ prepro : group / data after preprocessing
  ├ custom : group / "custom" is any group name
  │ ├ 1 : dataset
  │ ├ 2 : dataset
  │ ├ 3 : dataset
  │ ├ 4 : dataset
  │ ├ 5 : dataset
  │ └ …
  └ custom2 : group
    └ …omit (1,2,3,4,…)
```
+ Check the contents with software such as HDFView.
+ Use "h5py" or similar to read the HDF file.
```python
import h5py

with h5py.File(fpath, "r") as h5:
    fs = h5["prepro/custom"].attrs["fs"]
    dataset_size = h5["prepro/custom"].attrs["count"]
    dataset79 = h5["prepro/custom/79"][()]  # ch × samples
    dataset79_label = h5["prepro/custom/79"].attrs["label"]
```
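Assuming the datasets in a group are numbered 1 through `count` as in the tree above, the whole group can be collected into arrays for a training loop with plain h5py. A sketch (the file-building half is only a stand-in for a file produced by the library):

```python
import h5py
import numpy as np

# build a tiny file mimicking the layout above (demonstration only)
with h5py.File("demo.hdf", "w") as h5:
    grp = h5.create_group("prepro/custom")
    grp.attrs["fs"] = 500
    grp.attrs["count"] = 3
    for i in range(1, 4):
        ds = grp.create_dataset(str(i), data=np.zeros((3, 500)))
        ds.attrs["label"] = "left" if i % 2 else "right"

# collect every trial and its label
with h5py.File("demo.hdf", "r") as h5:
    grp = h5["prepro/custom"]
    n = int(grp.attrs["count"])
    X = np.stack([grp[str(i)][()] for i in range(1, n + 1)])
    y = [grp[str(i)].attrs["label"] for i in range(1, n + 1)]

print(X.shape)  # (3, 3, 500)
```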
### Merge Dataset
To merge, "dataset_name" must be set on each source.
If the channel order differs between datasets, it can be aligned by specifying ch_indexes.

**The source's preprocessing groups are not inherited; in other words, preprocess() must be executed again after the merge.**
Example: Merge source1 and source2 datasets
```python
target = DatasetUpdater("new_dataset.h5", fs=fs)
target.remove_hdf()  # reset hdf
s1 = DatasetUpdater("source1.h5", fs=fs, dataset_name="source1")
s2 = DatasetUpdater("source2.h5", fs=fs, dataset_name="source2")
s1_ch_indexes = [1, 60, 10, 5]  # channel indexes to use
target.merge_hdf(s1, ch_indexes=s1_ch_indexes)
target.merge_hdf(s2)
```
## Pull requests / Issues
If you need anything, feel free to open a pull request or an issue.