# aind-dynamic-foraging-data-utils
[License](LICENSE)

[semantic-release](https://github.com/semantic-release/semantic-release)
# Scope
Purpose: ingest NWB files and produce dataframes with the relevant information. The focus is dynamic foraging; other tasks can branch off and build task-specific utils.
Inputs are NWB files; outputs are dataframes (tidy and not).
Dependencies: xarray (which includes numpy and pandas), scikit-learn (which includes scipy), matplotlib.
# Installation
To use the software, in the root directory, run
```bash
pip install -e .
```
To develop the code, run
```bash
pip install -e .[dev]
```
# Usage
## Accessing data from an NWB file
To load an NWB file:
```python
import aind_dynamic_foraging_data_utils.nwb_utils as nwb_utils
nwb = nwb_utils.load_nwb_from_filename(<filepath>)
```
To extract a pandas dataframe of trials:
```python
df_trials = nwb_utils.create_df_trials(nwb)
```
To extract a pandas dataframe of events:
```python
df_events = nwb_utils.create_df_events(nwb)
```
To extract a pandas dataframe of photometry data:
```python
df_fip = nwb_utils.create_df_fip(nwb)
```
By default, all of these functions adjust timestamps so that t(0) is the time of the first go cue. To disable this behavior, pass `adjust_time=False`.
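For example, a minimal sketch that keeps the NWB's original session timestamps (the `df_trials_raw` name is ours; the `adjust_time` keyword is shared by all three `create_df_*` functions):
```python
# Keep raw session timestamps instead of re-zeroing at the first go cue.
df_trials_raw = nwb_utils.create_df_trials(nwb, adjust_time=False)
```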
## Time alignment tools
To align a data variable to a set of timepoints and create an event-triggered response, use the alignment module. For example, to align FIP data to each go cue:
```python
import aind_dynamic_foraging_data_utils.alignment as alignment

etr = alignment.event_triggered_response(
    df_fip.query('event == "<FIP channel>"'),
    "timestamps",
    "data",
    df_trials["goCue_start_time_in_session"].values,
    t_start=0,
    t_end=1,
    output_sampling_rate=40,
)
```
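The returned ETR is convenient for plotting. A minimal sketch, assuming the output is a tidy dataframe with a `time` column and one row per event-aligned sample, and that seaborn (>= 0.12, not a listed dependency of this package) is installed alongside matplotlib:
```python
import matplotlib.pyplot as plt
import seaborn as sns  # assumption: installed separately

# Average the aligned traces across go cues, with a standard-error band.
sns.lineplot(data=etr, x="time", y="data", errorbar="se")
plt.xlabel("time from go cue (s)")
plt.ylabel("data")
plt.show()
```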
## Code Ocean utilities
To attach data, you'll want to [create a token on Code Ocean](https://docs.codeocean.com/user-guide/code-ocean-api/authentication#to-create-an-access-token) with all read/write permissions, and make sure to attach the token to your capsule.
You should then be able to access the token via `os.getenv(token_name)`.
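For example (the variable name `CODEOCEAN_API_TOKEN` below is hypothetical; substitute whatever name you gave the token when attaching it to your capsule):
```python
import os

# Hypothetical token name; use the name configured on your capsule.
token = os.getenv("CODEOCEAN_API_TOKEN")
assert token is not None, "token not found in the capsule environment"
```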
### Get list of assets
To get a list of Code Ocean assets for a subject:
```python
import aind_dynamic_foraging_data_utils.code_ocean_utils as co
results = co.get_subject_assets(<subject_id>)
```
You can also require specific data modalities by passing a list:
```python
# FIP data
results = co.get_subject_assets(<subject_id>, modality=['fib'])
# FIP and behavior-videos
results = co.get_subject_assets(<subject_id>, modality=['fib','behavior-videos'])
# any modalities (default)
results = co.get_subject_assets(<subject_id>, modality=[])
```
Or supply an additional filter string:
```python
results = co.get_subject_assets(<subject_id>, extra_filter = <my docdb query string>)
```
Or filter by task type:
```python
results = co.get_subject_assets(<subject_id>, task=['Uncoupled Baiting', 'Coupled Baiting'])
```
### Attach data
The `code_ocean_asset_id` column gives the data asset IDs on Code Ocean; the `id` column is the docDB ID.
To attach a long list of data, simply call:
```python
co.attach_data(results['code_ocean_asset_id'].values)
results = co.add_data_asset_path(results)
```
Here `results` is the dataframe returned by `get_subject_assets`, and `code_ocean_asset_id` holds the 16-digit data asset IDs from Code Ocean.
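Attached assets are mounted under the capsule's `../data` directory (relative to the code folder), which is what the glob pattern in the next section assumes; a quick sanity check:
```python
import os

# List attached data assets to confirm the attach calls succeeded.
print(sorted(os.listdir("../data")))
```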
### Load data
To get the dataframes from the NWBs, call `get_all_df_for_nwb`:
```python
import glob

filename_sessions = glob.glob('../data/**/nwb/behavior*', recursive=True)
SAVED_LOC = '../scratch/dfs'
interested_channels = ['G_1_dff-poly', 'R_1_dff-poly', 'R_2_dff-poly']

get_all_df_for_nwb(filename_sessions, loc=SAVED_LOC, interested_channels=interested_channels)
```
Here `filename_sessions` lists the NWB file locations, `loc` is the folder where the dataframes will be saved, and `interested_channels` lists the channels to keep in `df_fip`.
All dataframes are saved per session, except `df_trials` (because some sessions' `df_trials` have two y coordinates for the lick tube and some have one).
To load the dataframes, use:
```python
import pandas as pd

df_sess = pd.read_csv(SAVED_LOC + '/df_sess.csv', index_col=False)
df_events = pd.read_csv(SAVED_LOC + '/df_events.csv', index_col=False)
df_trials = pd.read_csv(SAVED_LOC + '/df_trials.csv', index_col=0)
df_fip = pd.read_csv(SAVED_LOC + '/df_fip.csv', index_col=False)
```
To check which fitted models are already available for a session, call:
```python
check_avail_model_by_nwb_name('746345_2024-11-22_09-55-54.nwb')
```
where you input the name of the session (formatted as `<subject_ID>_<collection_date>_<collection_time>.nwb`; sometimes a prefix of `behavior_` is needed). Currently, the models fitted on all sessions should include:
`['QLearning_L2F1_softmax', 'QLearning_L1F1_CK1_softmax', 'WSLS', 'QLearning_L1F0_epsi', 'QLearning_L2F1_CK1_softmax']`
You can find out more about these models [here](https://foraging-behavior-browser.allenneuraldynamics-test.org/RL_model_playground#all-available-foragers).
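To check several sessions at once, a small sketch that loops over names derived from `filename_sessions` (the same name-munging used for `get_foraging_model_info` below):
```python
# Strip the directory and the optional 'behavior_' prefix to recover the
# <subject_ID>_<collection_date>_<collection_time>.nwb session name.
for filename in filename_sessions:
    nwb_name = filename.split('/')[-1].replace('behavior_', '')
    print(nwb_name, check_avail_model_by_nwb_name(nwb_name))
```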
To enrich `df_sess` and `df_trials` with the model information, use:
```python
nwb_name_for_models = [filename.split('/')[-1].replace('behavior_', '') for filename in filename_sessions]
SAVED_LOC = '../scratch/dfs'
get_foraging_model_info(df_trials, df_sess, nwb_name_for_models, loc=SAVED_LOC)
```
`df_trials` and `df_sess` are the dataframes created by `get_all_df_for_nwb`, and `nwb_name_for_models` is formatted the same way as for `check_avail_model_by_nwb_name`.
# Contributing
### Linters and testing
There are several libraries used to run linters, check documentation, and run tests.
- Please test your changes using the **coverage** library, which will run the tests and log a coverage report:
```bash
coverage run -m unittest discover && coverage report
```
- Use **interrogate** to check that modules, methods, etc. have been documented thoroughly:
```bash
interrogate .
```
- Use **flake8** to check that code is up to standards (no unused imports, etc.):
```bash
flake8 .
```
- Use **black** to automatically format the code into PEP standards:
```bash
black .
```
- Use **isort** to automatically sort import statements:
```bash
isort .
```
### Pull requests
For internal members, please create a branch. For external members, please fork the repository and open a pull request from the fork. We'll primarily use [Angular](https://github.com/angular/angular/blob/main/CONTRIBUTING.md#commit) style for commit messages. Roughly, they should follow the pattern:
```text
<type>(<scope>): <short summary>
```
where scope (optional) describes the packages affected by the code changes and type (mandatory) is one of the following (an example commit message follows the list):
- **build**: Changes that affect build tools or external dependencies (example scopes: pyproject.toml, setup.py)
- **ci**: Changes to our CI configuration files and scripts (examples: .github/workflows/ci.yml)
- **docs**: Documentation only changes
- **feat**: A new feature
- **fix**: A bugfix
- **perf**: A code change that improves performance
- **refactor**: A code change that neither fixes a bug nor adds a feature
- **test**: Adding missing tests or correcting existing tests
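For example, a documentation-only change to this README might be committed as:
```text
docs(readme): clarify the NWB loading example
```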
### Semantic Release
The table below, from [semantic release](https://github.com/semantic-release/semantic-release), shows which commit message gets you which release type when `semantic-release` runs (using the default configuration):
| Commit message | Release type |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------- |
| `fix(pencil): stop graphite breaking when too much pressure applied` | ~~Patch~~ Fix Release, Default release |
| `feat(pencil): add 'graphiteWidth' option` | ~~Minor~~ Feature Release |
| `perf(pencil): remove graphiteWidth option`<br><br>`BREAKING CHANGE: The graphiteWidth option has been removed.`<br>`The default graphite width of 10mm is always used for performance reasons.` | ~~Major~~ Breaking Release <br /> (Note that the `BREAKING CHANGE: ` token must be in the footer of the commit) |
### Documentation
To generate the rst source files for documentation, run
```bash
sphinx-apidoc -o doc_template/source/ src
```
Then to create the documentation HTML files, run
```bash
sphinx-build -b html doc_template/source/ doc_template/build/html
```
More info on sphinx installation can be found [here](https://www.sphinx-doc.org/en/master/usage/installation.html).