npyx

Name	npyx JSON
Version	4.1.1 JSON
	download
home_page	https://github.com/Npix-routines/NeuroPyxels
Summary	Python routines dealing with Neuropixels data.
upload_time	2024-11-26 15:38:31
maintainer	None
docs_url	None
author	Maxime Beau
requires_python	<3.12
license	None
keywords	neuropixels kilosort phy data analysis electrophysiology neuroscience
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            [![PyPI Version](https://img.shields.io/pypi/v/npyx.svg)](https://pypi.org/project/npyx/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5509733.svg)](https://doi.org/10.5281/zenodo.5509733)
[![License](https://img.shields.io/pypi/l/npyx.svg)](https://github.com/m-beau/NeuroPyxels/blob/master/LICENSE)
[![Downloads](https://static.pepy.tech/badge/npyx)](https://pepy.tech/project/npyx)
# NeuroPyxels: loading, processing and plotting Neuropixels data in Python</h1> <img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/NeuroPyxels_logo_final.png" width="150" title="Neuropyxels" alt="Neuropixels" align="right" vspace = "50">

[NeuroPyxels](https://github.com/m-beau/NeuroPyxels) (npyx) is a python library built for electrophysiologists using Neuropixels electrodes. This package results from the needs of a pythonist who really did not want to transition to MATLAB to work with Neuropixels: it features a suite of core utility functions for loading, processing and plotting Neuropixels data.

❓**Any questions or issues?**: [Create a github issue](https://github.com/Maxime-Beau/Neuropyxels/issues) to get support, or create a [pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request). Alternatively, you can email [us: maximebeaujeanroch047[at]gmail[dot]com](mailto:maximebeaujeanroch047@gmail.com). You can also use the [Neuropixels slack workgroup](neuropixelsgroup.slack.com).

- **[⬇️ Installation](#%EF%B8%8F-installation)**
- **[🤗 Support and citing ](#-support-and-citing)**
- **[🔍️ Documentation](#%EF%B8%8F-documentation)**
  - [💡 Design philosophy](#-design-philosophy)
  - [📁 Directory structure](#-directory-structure)
  - [👉 Common use cases](#-common-use-cases)
    - [Load recording metadata](#load-recording-metadata)
    - [Load synchronization channel](#load-synchronization-channel)
    - [Get good units from dataset](#get-good-units-from-dataset)
    - [Load spike times from unit u](#load-spike-times-from-unit-u)
    - [Load waveforms from unit u](#load-waveforms-from-unit-u)
    - [Compute auto/crosscorrelogram between 2 units](#compute-autocrosscorrelogram-between-2-units)
    - [Plot waveform and crosscorrelograms of unit u](#plot-correlograms-and-waveforms-from-unit-u)
    - [Preprocess your waveforms and spike trains](#preprocess-your-waveforms-drift-shift-matching-and-spike-trains-detect-periods-with-few-false-positivenegative)
    - [Plot chunk of raw data with overlaid units](#plot-chunk-of-raw-data-with-overlaid-units)
    - [Plot peri-stimulus time histograms across neurons and conditions](#plot-peri-stimulus-time-histograms-across-neurons-and-conditions)
    - [Merge datasets acquired on two probes simultaneously](#merge-datasets-acquired-on-two-probes-simultaneously)
  - [⭐ Bonus: matplotlib plot prettifier (mplp)](#-bonus-matplotlib-plot-prettifier)

## ⬇️ Installation:

We recommend using a conda environment. Pre-existing packages on a python installation might be incompatible with npyx and break your installation. You can find instructions on setting up a conda environment [here](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).

```bash
  conda create -n my_env python=3.11 £ # python 3.12 and above not supported
  conda activate my_env
  pip install npyx
  # optionally (see 'Dealing with cupy' section below):
  conda install -c conda-forge cupy cudatoolkit=11.0
  # test installation:
  python -c 'import npyx' # should not return any error
  ```

<details>
  <summary>Advanced installation</summary>

- if you want the very latest version:
  ```bash
  conda create -n my_env python=3.10
  conda activate my_env
  pip install git+https://github.com/m-beau/NeuroPyxels@master
  # optionally (see 'Dealing with cupy' section below):
  conda install -c conda-forge cupy cudatoolkit=11.0
  # test installation:
  python -c 'import npyx' # should not return any error
  ```

- If you want to edit npyx locally and eventually contribute:
  > 💡 Tip: in an ipython/jupyter session, use `%load_ext autoreload` then `%autoreload 2` to make your local edits active in your session without having to restart your kernel. Amazing for development.
  ```bash
  conda create -n my_env python=3.10
  conda activate my_env
  cd path/to/save_dir # any directory where your code will be accessible by your editor and safe. NOT downloads folder.
  git clone https://github.com/m-beau/NeuroPyxels
  cd NeuroPyxels
  pip install . # this will create an egg link to save_dir, which means that you do not need to reinstall the package each time you edit it (e.g. after pulling from github).
  # optionally (see 'Dealing with cupy' section below):
  conda install -c conda-forge cupy cudatoolkit=11.0
  # test installation:
  python -c 'import npyx' # should not return any error
  ```
  and pull every now and then:
  ```bash
  cd path/to/save_dir/NeuroPyxels
  git pull
  # And that's it, thanks to the egg link no need to reinstall the package!
  ```
</details>
</br>
Npyx supports Python >=3.7.

### Dealing with cupy (GPU shenanigans)
To run some preprocessing functions, you will need NVIDIA drivers and cuda-toolkit installed on your computer. It is a notorious source of bugs. To test your CUDA installation do the following:
```bash
nvidia-smi # Should show how much your GPU is being used right now
nvcc # This is the CUDA compiler
```
If it doesn't work, try up/downgrading the version of cudatoolkit installed:
```bash
# check the current version
conda activate my_env
conda list cudatoolkit
# E.g. install version 10.0
conda activate my_env
conda remove cupy, cudatoolkit
conda install -c conda-forge cupy cudatoolkit=10.0
```


### Test installation
You can use the built-in unit testing function 'test_npyx' to make sure that npyx core functions run smoothly, all at once.

```python
from npyx.testing import test_npyx

# any spike sorted recording compatible with phy
# (e.g. kilosort output)
dp = 'datapath/to/myrecording'
test_npyx(dp)

# if any test fails, re-run them with the following to print the error log, and try to fix it or post an issue on github:
test_npyx(dp, raise_error=True)
```
<span style="color:#1F45FC">

--- npyx version 2.3.4 unit testing initiated, on directory /media/maxime/AnalysisSSD/test_dataset_artefact... <br>

--- Successfully ran 'read_metadata' from npyx.inout. <br>
--- Successfully ran 'get_npix_sync' from npyx.inout. <br>
--- Successfully ran 'get_units' from npyx.gl. <br>
--- Successfully ran 'ids' from npyx.spk_t. <br>
--- Successfully ran 'trn' from npyx.spk_t. <br>
--- Successfully ran 'trn_filtered' from npyx.spk_t. <br>
--- Successfully ran 'wvf' from npyx.spk_wvf. <br>
--- Successfully ran 'wvf_dsmatch' from npyx.spk_wvf. <br>
--- Successfully ran 'get_peak_chan' from npyx.spk_wvf. <br>
--- Successfully ran 'templates' from npyx.spk_wvf. <br>
--- Successfully ran 'ccg' from npyx.corr. <br>
--- Successfully ran 'plot_wvf' from npyx.plot. <br>
--- Successfully ran 'plot_ccg' from npyx.plot. <br>
--- Successfully ran 'plot_raw' from npyx.plot. <br>

</span>

```
(bunch of plots...)
```
<details>
  <summary>:warning: Known installation issues</summary>

- **cannot import numba.core hence cannot import npyx** <br/>
Older versions of numba did not feature the .core submodule. If you get this error, you are probably running a too old version of numba. Make sure that you have installed npyx in a fresh conda environment if that happens to you. If you still get an error, check that numba is not installed in your root directory.

  ```# open new terminal
  pip uninstall numba
  conda activate my_env
  pip uninstall numba
  pip install numba
  ```
<br/>

- **core dumped when importing** <br/>
This seems to be an issue related to PyQt5 required by opencv (opencv-python).
Solution (from [post](https://stackoverflow.com/questions/71088095/opencv-could-not-load-the-qt-platform-plugin-xcb-in-even-though-it-was-fou)):
```
# activate npyx environment first
pip uninstall PyQt5
pip uninstall opencv-python
pip install opencv-python
# pip install other missing dependencies
```
Full log:
```
In [1]: from npyx import *
In [2]: QObject::moveToThread: Current thread (0x5622e1ea6800) is not the object's thread (0x5622e30e86f0).
Cannot move to target thread (0x5622e1ea6800)

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/maxime/miniconda3/envs/npyx/lib/python3.7/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.

Aborted (core dumped)
```
<br/>

- **I think I installed everything properly, but npyx is not found if I run 'python -c "import npyx" '!** <br/>
Typically:
```bash
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'npyx'
```
Make sure that the python installation that you are using is indeed the version of your new environment. <br/>
To do so, in your terminal, run "which python" on linux/mac or "where python" on windows: the output should be the path to the right environment e.g. "/home/.../anaconda/envs/npyx/bin/python". If it isn't, try to deactivate/reactivate your conda environment, or make sure you do not have conflicting python installations on your machine.

</details>

## 🤗 Support and citing 

If you find Neuropyxels useful in your work, we kindly request that you cite:

> Maxime Beau, Federico D'Agostino, Ago Lajko, Gabriela Martínez, Michael Häusser & Dimitar Kostadinov. NeuroPyxels: loading, processing and plotting Neuropixels data in python. Zenodo https://doi.org/10.5281/zenodo.5509733 (2021).

You can additionally star this repo using the top-right star button to help it gain more visibility.

Cheers!

## 🔍️ Documentation:

Npyx works with the data formatting employed by [SpikeGLX](https://billkarsh.github.io/SpikeGLX/) and [OpenEphys](https://open-ephys.org/neuropixels) (binary data and meta data) used in combination with [Phy](https://phy.readthedocs.io/en/latest/)-compatible spike-sorters ([Kilosort](https://github.com/MouseLand/Kilosort), [SpyKING CIRCUS](https://spyking-circus.readthedocs.io/en/latest/)...). <span style="color:pink">**Any dataset compatible with phy can also be analyzed with npyx, in essence.**</span>

### 💡 Design philosophy

- [Memoization](https://en.wikipedia.org/wiki/Memoization) (a.k.a. **caching**)

  <ins>Npyx is fast because it rarely computes the same thing twice by relying heavily on caching (memoization as purists like to call it)</ins> - in the background, it saves most relevant outputs (spike trains, waveforms, correlograms...) at **npix_dataset/npyxMemory**, from where they are simply reloaded if called again.

  An important argument controlling this behaviour is **`again`** (boolean), by default set to False: if True, NeuroPyxels cached functions will recompute their output rather than loading it from npyxMemory. This is important to be aware of this behaviour, as it can lead to mind boggling bugs. For instance, if you load a spike train then re-curate your dataset, e.g. by splitting unit 56 into 504 and 505, the train of the old 'unit 56' will still exist at kilosort_dataset/npyxMemory and you will remain able to load it even though the unit is gone!

Under the hood, NeuroPyxels caching is handled with another package of mine, **[cachecache](https://github.com/m-beau/cachecache)** (as of July 2024). Functions cached with cachecache's decorator (named `@npyx_cache` in NeuroPyxels) can have their caching behaviour altered at call time with three arguments: **again** (mentioned earlier), **cache_results**, and **cache_path**. If again is set to True (False by default), the results are recomputed rather than loaded from cache. If cache_results is set to False (True by default), the results will not be cached on disk and therefore will not take up any space. If cache_path is set to a string, the results will be cached in the directory specified by the string. If cache_path is set to None (default), the results will be cached in the default directory, which is `~/.NeuroPyxels` and can be customized by simply editing the `__cachedir__` directory inside `./npyx/CONFIG.py` (where NeuroPyxels is installed on your machine). For cached functions that use the `dp` argument, the default cache directory is `{dp}/npyxMemory`, not the path specified in `./npyx/CONFIG.py`.

 If you wish to grant your functions with the same caching capabilities, I wrote a very complete documentation in **[cachecache](https://github.com/m-beau/cachecache)**'s README. You can install it independently with `pip install cachecache`. 

  **General tip**: if your data loaded with NeuroPyxels seems incomprehensibly odd at times, try re-running the function with `again=True`. Joblib sometimes makes mistakes and I am yet to put my finger on what causes them - this is an ugly but quick and reliable fix.

- Ubiquitous arguments

  Most npyx functions take at least one input: **`dp`**, which is the path to your Neuropixels-phy dataset. You can find a [full description of the structure of such datasets](https://phy.readthedocs.io/en/latest/sorting_user_guide/#installation) on the phy documentation.

  Other typical parameters are: **`verbose`** (whether to print a bunch of informative messages, useful when debugging), **`saveFig`** (boolean) and **`saveDir`** (whether to save the figure in saveDir for plotting functions).

  Importantly, **`dp`** can also be the path to a **merged dataset**, generated with `npyx.merge_datasets()` - <ins>every function will run as smoothly on merged datasets as on any regular dataset</ins>. See below for more details.

- Minimal and modular reliance of spike-sorter output

  Every function requires the files `myrecording.ap.meta`/`myrecording.oebin` (metadata from SpikeGLX/OpenEphys), `params.py`, `spike_times.npy` and `spike_clusters.npy`.
  
  If you have started spike sorting, `cluster_groups.tsv` will also be required obviously (will be created filled with 'unsorted' groups if none is found).
  
  Then, specific functions will require specific files: loading waveforms with `npyx.spk_wvf.wvf` or extracting your sync channel with `npyx.io.get_npix_sync` require the raw data `myrecording.ap.bin`, `npyx.spk_wvf.templates` the files `templates.npy` and `spike_templates.npy`, and so on. This allows you to only transfer the strictly necassary files for your use case from a machine to the next: for instance, if you only want to make behavioural analysis of spike trains but do not care about the waveforms, you can run `get_npix_sync` on a first machine (which will generate a `sync_chan` folder containing extracted onsets/offsets from the sync channel(s)), then exclusively transfer the `dataset/sync_chan/` folder along with `spike_times.npy` and `spike_clusters.npy` (all very light files) on another computer and analyze your data there seemlessly.

### 📁 Directory structure

The **`dp`** parameter of all npyx functions must be the **absolute path to `myrecording`** below.

For SpikeGLX recordings:
```
myrecording/
  myrecording.ap.meta
  params.py
  spike_times.npy
  spike_clusters.npy
  cluster_groups.tsv # optional, if manually curated with phy
  myrecording.ap.bin # optional, if wanna plot waveforms

  # other kilosort/spyking circus outputs here
```
For Open-Ephys recordings:
```
myrecording/
  myrecording.oebin
  params.py
  spike_times.npy
  spike_clusters.npy
  cluster_groups.tsv # if manually curated with phy

  # other spikesorter outputs here

  continuous/
    Neuropix-PXI-100.somethingsomething (1, AP...)/
      continuous.dat # optional, if wanna plot waveforms
    Neuropix-PXI-100.somethingsomething (2, LFP...)/
      continuous.dat # optional, if want to plot LFP with plot_raw

  events/
    Neuropix-PXI-100.somethingsomething (1, AP...)/
      TTL somethingelse/
        timestamps.npy # optional, if need to get synchronyzation channel to load with get_npix_sync e.g. to merge datasets
    Neuropix-PXI-100.somethingsomething (2, LFP...)/
      TTL somethingelse/
        timestamps.npy # same timestamps for LFP channel
```

### 👉 Common use cases

**General tip**: if your data loaded with NeuroPyxels seems incomprehensibly odd at times, try re-running the function with `again=True`. Joblib sometimes makes mistakes and I am yet to put my finger on what causes them - this is an ugly but quick and reliable fix.

#### Load recording metadata

```python
from npyx import *

dp = 'datapath/to/myrecording'

# load contents of .lf.meta and .ap.meta or .oebin files as python dictionnary.
# The metadata of the high and lowpass filtered files are in meta['highpass'] and meta['lowpass']
# Quite handy to get probe version, sampling frequency, recording length etc
meta = read_metadata(dp) # works for spikeGLX (contents of .meta files) and open-ephys (contents of .oebin file)

```

#### Load synchronization channel
```python
from npyx.inout import get_npix_sync # star import is sufficient, but I like explicit imports!

# If SpikeGLX: slow the first time, then super fast
onsets, offsets = get_npix_sync(dp, filt_key='highpass') # works for spikeGLX (extracted from .ap.bin file) and open-ephys (/events/..AP/TTL/timestamps.npy)
# onsets/offsets are dictionnaries
# keys: ids of sync channel where a TTL was detected (0,1,2... for spikeGLX, name of TTL folders in events/..AP for openephys),
# values: times of up (onsets) or down (offsets) threshold crosses, in seconds.
```
#### Preprocess binary data
Makes a preprocessed copy of the binary file in dp, moves original binary file at dp/original_data
This will be as fast as literally copying your file, with a decent GPU!
```python
from npyx.inout import preprocess_binary_file # star import is sufficient, but I like explicit imports!

# can perform bandpass filtering (butterworth 3 nodes) and median subtraction (aka common average referenceing, CAR)
# in the future: ADC realignment (like CatGT), whitening, spatial filtering (experimental).
filtered_fname = preprocess_binary_file(dp, filt_key='ap', median_subtract=True, f_low=None, f_high=300, order=3, verbose=True)
```

#### Get good units from dataset
```python
from npyx.gl import get_units
units = get_units(dp, quality='good')
```
#### Load spike times from unit u
```python
from npyx.spk_t import trn
u=234
t = trn(dp, u) # gets all spikes from unit 234, in samples
```

#### Load waveforms from unit u
```python
from npyx.inout import read_spikeglx_meta
from npyx.spk_t import ids, trn
from npyx.spk_wvf import get_peak_chan, wvf, templates

# returns a random sample of 100 waveforms from unit 234, in uV, across 384 channels
waveforms = wvf(dp, u) # return array of shape (n_waves, n_samples, n_channels)=(100, 82, 384) by default
waveforms = wvf(dp, u, n_waveforms=1000, t_waveforms=90) # now 1000 random waveforms, 90 samples=3ms long

# Get the unit peak channel (channel with the biggest amplitude)
peak_chan = get_peak_chan(dp,u)
# extract the waveforms located on peak channel
w=waves[:,:,peak_chan]

# Extract waveforms of spikes occurring between
# 0-100s and 300-400s in the recording,
# because that's when your mouse sneezed
waveforms = wvf(dp, u, periods=[(0,100),(300,400)])

# alternatively, longer but more flexible:
fs=meta['highpass']['sampling_rate']
t=trn(dp,u)/fs # convert in s
# get ids of unit u: all spikes have a unique index in the dataset,
# which is their rank sorted by time (as in spike_times.npy)
u_ids = ids(dp,u)
ids=ids(dp,u)[(t>900)&(t<1000)]
mask = (t<100)|((t>300)&(t<400))
waves = wvf(dp, u, spike_ids=u_ids[mask])

# If you want to load the templates instead (faster and does not require binary file):
temp = templates(dp,u) # return array of shape (n_templates, 82, n_channels)
```

#### Compute auto/crosscorrelogram between 2 units
```python
from npyx.corr import ccg, ccg_stack

# returns ccg between 234 and 92 with a binsize of 0.2 and a window of 80
c = ccg(dp, [234,92], cbin=0.2, cwin=80)

# Only using spikes from the first and third minutes of recording
c = ccg(dp, [234,92], cbin=0.2, cwin=80, periods=[(0,60), (120,180)])

# better, compute a big stack of crosscorrelograms with a given name
# The first time, CCGs will be computed in parallel using all the available CPU cores
# and it will be saved in the background and, reloadable instantaneously in the future
source_units = [1,2,3,4,5]
target_units = [6,7,8,9,10]
c_stack = ccg_stack(dp, source_units, target_units, 0.2, 80, name='my_relevant_ccg_stack')
c_stack = ccg_stack(dp, name='my_relevant_ccg_stack') # will work to reaload in the future
```

#### Plot waveform and crosscorrelogram of unit u
```python
# all plotting functions return matplotlib figures
from npyx.plot import plot_wvf, get_peak_chan

u=234
# plot waveform, 2.8ms around templates center, on 16 channels around peak channel
# (the peak channel is found automatically, no need to worry about finding it)
fig = plot_wvf(dp, u, Nchannels=16, t_waveforms=2.8)

# But if you wished to get it, simply run
peakchannel = get_peak_chan(dp, u)
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/wvf.png" width="300"/>

```python
# plot ccg between 234 and 92
# as_grid also plot the autocorrelograms
fig = plot_ccg(dp, [u,92], cbin=0.2, cwin=80, as_grid=True)
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/ccg.png" width="400"/>

#### Preprocess your waveforms (drift-shift-matching) and spike trains (detect periods with few false positive/negative)
```python
# all plotting functions return matplotlib figures
from npyx.spk_wvf import wvf_dsmatch
from npyx.spk_t import trn_filtered

# wvf_dsmatch subselect 'best looking' waveforms
# by first matching them by drift state (Z, peak channel and XY, amplitude on peak channel)
# then shifting them around to realign them (using the crosscorr of its whole spatial footprint)
# on the plot, black is the original waveform as it would be plotted in phy,
# green is drift-matched, red is drift-shift matched
w_preprocessed = wvf_dsmatch(dp, u, plot_debug=True)
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/dsmatch_example1_driftmatch.png" width="500"/>
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/dsmatch_example1.png" width="350"/>

```python
# trn_filtered clips the recording in 10s (default) chunks
# and estimates the false positive/false negative spike sporting rates on such chunks
# before masking out spikes occurring inside 'bad chunks',
# defined as chunks with too high FP OR FN rates (5% and 5% by default)
t_preprocessed = trn_filtered(dp, u, plot_debug=True)
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/trnfiltered_example1.png" width="600"/>


#### Plot chunk of raw data with overlaid units
```python
units = [1,2,3,4,5,6]
channels = np.arange(70,250)
# raw data are whitened, high-pass filtered and median-subtracted by default - parameters are explicit below
plot_raw_units(dp, times=[0,0.130], units = units, channels = channels,
               colors=['orange', 'red', 'limegreen', 'darkgreen', 'cyan', 'navy'],
               lw=1.5, offset=450, figsize=(6,16), Nchan_plot=10,
               med_sub=1, whiten=1, hpfilt=1)
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/raw.png" width="400"/>

#### Plot peri-stimulus time histograms across neurons and conditions

```python
# Explore responses of 3 neurons to 4 categories of events:
fs=30000 # Hz
units=[1,2,3]
trains=[trn(dp,u)/fs for u in units] # make list of trains of 3 units
trains_str=units # can give specific names to units here, show on the left of each row
events=[licks, sneezes, visual_stimuli, auditory_stimuli] # get events corresponding to 4 conditions
events_str=['licking', 'sneezing', 'visual_stim', 'auditory_stim'] # can give specific names to events here, show above each column
events_col='batlow' # colormap from which the event colors will be drawn
fig=summary_psth(trains, trains_str, events, events_str, psthb=10, psthw=[-750,750],
                 zscore=0, bsl_subtract=False, bsl_window=[-3000,-750], convolve=True, gsd=2,
                 events_toplot=[0], events_col=events_col, trains_col_groups=trains_col_groups,
                 title=None, saveFig=0, saveDir='~/Downloads', _format='pdf',
                 figh=None, figratio=None, transpose=1,
                 as_heatmap=False,  vmin=None, center=None, vmax=None, cmap_str=None)
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/psth.png" width="600"/>

#### Merge datasets acquired on two probes simultaneously
```python
# The three recordings need to include the same sync channel.
from npyx.merger import merge_datasets
dps = ['same_folder/lateralprobe_dataset',
       'same_folder/medialprobe_dataset',
       'same_folder/anteriorprobe_dataset']
probenames = ['lateral','medial','anterior']
dp_dict = {p:dp for p, dp in zip(dps, probenames)}

# This will merge the 3 datasets (only relevant information, not the raw data) in a new folder at
# dp_merged: same_folder/merged_lateralprobe_dataset_medialprobe_dataset_anteriorprobe_dataset
# where all npyx functions can smoothly run.
# The only difference is that units now need to be called as floats,
# of format u.x (u=unit id, x=dataset id [0-2]).
# lateralprobe, medial probe and anteriorprobe x will be respectively 0,1 and 2.
dp_merged, datasets_table = merge_datasets(dp_dic)


--- Merged data (from 2 dataset(s)) will be saved here: /same_folder/merged_lateralprobe_dataset_medialprobe_dataset_anteriorprobe_dataset.

--- Loading spike trains of 2 datasets...

sync channel extraction directory found: /same_folder/lateralprobe_dataset/sync_chan
Data found on sync channels:
chan 2 (201 events).
chan 4 (16 events).
chan 5 (175 events).
chan 6 (28447 events).
chan 7 (93609 events).
Which channel shall be used to synchronize probes? >>> 7

sync channel extraction directory found: /same_folder/medialprobe_dataset/sync_chan
Data found on sync channels:
chan 2 (201 events).
chan 4 (16 events).
chan 5 (175 events).
chan 6 (28447 events).
chan 7 (93609 events).
Which channel shall be used to synchronize probes? >>> 7

sync channel extraction directory found: /same_folder/anteriorprobe_dataset/sync_chan
Data found on sync channels:
chan 2 (201 events).
chan 4 (16 events).
chan 5 (175 events).
chan 6 (28194 events).
chan 7 (93609 events).
Which channel shall be used to synchronize probes? >>> 7

--- Aligning spike trains of 2 datasets...
More than 50 sync signals found - for performance reasons, sub-sampling to 50 homogenoeously spaced sync signals to align data.
50 sync events used for alignement - start-end drift of -3080.633ms

--- Merged spike_times and spike_clusters saved at /same_folder/merged_lateralprobe_dataset_medialprobe_dataset_anteriorprobe_dataset.

--> Merge successful! Use a float u.x in any npyx function to call unit u from dataset x:
- u.0 for dataset lateralprobe_dataset,
- u.1 for dataset medialprobe_dataset,
- u.2 for dataset anteriorprobe_dataset.
```
<ins>Now any npyx function runs on the merged dataset!</ins>
Under the hood, it will create a `merged_dataset_dataset1_dataset2/npyxMemory` folder to save any data computed across dataframes, but will use the original `dataset1/npyxMemory` folder to save data related to this dataset exclusively (e.g. waveforms). Hence, there is no redundancy: space and time are saved.

This is also why <ins>it is primordial that you do not move your datatasets from their original paths after merging them</ins> - else, functions ran on merged_dataset1_dataset2 will not know where to go fetch the data! They refer to the paths in `merged_dataset_dataset1_dataset2/datasets_table.csv`. If you really need to, you can move your datasets but do not forget to edit this file accordingly.
```python
# These will work!
t = trn(dp_merged, 92.1) # get spikes of unit 92 in dataset 1 i.e. medialprobe
fig=plot_ccg(dp_merged,[10.0, 92.1, cbin=0.2, cwin=80]) # compute CCG between 2 units across datasets
```

PS - The spike times are aligned across datasets by modelling the drift between the clocks of the neuropixels headstages linearly: TTL probe 1 = a * TTL probe 1 + b (if a!=1, there is drift between the clocks), so spiketimes_probe2_aligned_to_probe1  = a * spiketimes_probe2 + b
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/ttl1-ttl2_1.png" width="600"/>
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/ttl1-ttl2_2.png" width="600"/>
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/error_dist.png" width="600"/>
<br/>

### ⭐ Bonus: matplotlib plot prettifier
```python
from npyx.plot import get_ncolors_cmap

# allows you to easily extract the (r,g,b) tuples from a matplotlib or crameri colormap
# to use them in other plots!
colors = get_ncolors_cmap('coolwarm', 10, plot=1)
colors = get_ncolors_cmap('viridis', 10, plot=1)
# in a jupyter notebook, will also plot he HTML colormap:
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/colormaps.png" width="600"/>

```python
from npyx.plot import mplp
import matplotlib.pyplot as plt

# mplp() will turn any matplotlib plot into something you can work with.
# fed up googling around and landing on stack overflow to tweak your figures?
# just read mplp parameters, they are self-explanatory!

df1 = pd.load("my_dataframe.csv")

# Seaborn figure (seaborn is simply a wrapper for matplotlib):
fig = plt.figure()
sns.scatterplot(data=df1,
                x='popsync', y='depth', hue='mean_popsync',
                palette='plasma', alpha=1, linewidth=1, edgecolor='black')
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/no_mplp.png" width="600"/>

```python
# Same figure, tweaked with mplp():
fig = plt.figure()
sns.scatterplot(data=df1,
                x='popsync', y='depth', hue='mean_popsync',
                palette='plasma', alpha=1, linewidth=1, edgecolor='black')
mplp(figsize=(3,3), title="My title", ylim=[-10,-2], xlim=[-40,60],
      xlabel = "My x label (rotated ticks)", ylabel="My y label",
      xtickrot=45,
      hide_legend=True, colorbar=True,
      vmin=df['mean_popsync'].min(), vmax=df['mean_popsync'].max(),
      cbar_w=0.03, cbar_h=0.4, clabel="My colorbar label\n(no more ugly legend!)", cmap="plasma",
      clabel_s=16, cticks_s=14, ticklab_s=16,
      saveFig=saveFig, saveDir=saveDir, figname = f"popsync_{pair}")
```
<img src="https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/mplp.png" width="600"/>

<br/>

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Npix-routines/NeuroPyxels",
    "name": "npyx",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.12",
    "maintainer_email": null,
    "keywords": "neuropixels, kilosort, phy, data analysis, electrophysiology, neuroscience",
    "author": "Maxime Beau",
    "author_email": "maximebeaujeanroch047@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/05/71/bac02862425faeba8cbb1ef0c3dd579a4bc2da8e853cad37ec6483173b4a/npyx-4.1.1.tar.gz",
    "platform": null,
    "description": "[![PyPI Version](https://img.shields.io/pypi/v/npyx.svg)](https://pypi.org/project/npyx/)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5509733.svg)](https://doi.org/10.5281/zenodo.5509733)\n[![License](https://img.shields.io/pypi/l/npyx.svg)](https://github.com/m-beau/NeuroPyxels/blob/master/LICENSE)\n[![Downloads](https://static.pepy.tech/badge/npyx)](https://pepy.tech/project/npyx)\n# NeuroPyxels: loading, processing and plotting Neuropixels data in Python</h1> <img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/NeuroPyxels_logo_final.png\" width=\"150\" title=\"Neuropyxels\" alt=\"Neuropixels\" align=\"right\" vspace = \"50\">\n\n[NeuroPyxels](https://github.com/m-beau/NeuroPyxels) (npyx) is a python library built for electrophysiologists using Neuropixels electrodes. This package results from the needs of a pythonist who really did not want to transition to MATLAB to work with Neuropixels: it features a suite of core utility functions for loading, processing and plotting Neuropixels data.\n\n\u2753**Any questions or issues?**: [Create a github issue](https://github.com/Maxime-Beau/Neuropyxels/issues) to get support, or create a [pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request). Alternatively, you can email [us: maximebeaujeanroch047[at]gmail[dot]com](mailto:maximebeaujeanroch047@gmail.com). You can also use the [Neuropixels slack workgroup](neuropixelsgroup.slack.com).\n\n- **[\u2b07\ufe0f Installation](#%EF%B8%8F-installation)**\n- **[\ud83e\udd17 Support and citing ](#-support-and-citing)**\n- **[\ud83d\udd0d\ufe0f Documentation](#%EF%B8%8F-documentation)**\n  - [\ud83d\udca1 Design philosophy](#-design-philosophy)\n  - [\ud83d\udcc1 Directory structure](#-directory-structure)\n  - [\ud83d\udc49 Common use cases](#-common-use-cases)\n    - [Load recording metadata](#load-recording-metadata)\n    - [Load synchronization channel](#load-synchronization-channel)\n    - [Get good units from dataset](#get-good-units-from-dataset)\n    - [Load spike times from unit u](#load-spike-times-from-unit-u)\n    - [Load waveforms from unit u](#load-waveforms-from-unit-u)\n    - [Compute auto/crosscorrelogram between 2 units](#compute-autocrosscorrelogram-between-2-units)\n    - [Plot waveform and crosscorrelograms of unit u](#plot-correlograms-and-waveforms-from-unit-u)\n    - [Preprocess your waveforms and spike trains](#preprocess-your-waveforms-drift-shift-matching-and-spike-trains-detect-periods-with-few-false-positivenegative)\n    - [Plot chunk of raw data with overlaid units](#plot-chunk-of-raw-data-with-overlaid-units)\n    - [Plot peri-stimulus time histograms across neurons and conditions](#plot-peri-stimulus-time-histograms-across-neurons-and-conditions)\n    - [Merge datasets acquired on two probes simultaneously](#merge-datasets-acquired-on-two-probes-simultaneously)\n  - [\u2b50 Bonus: matplotlib plot prettifier (mplp)](#-bonus-matplotlib-plot-prettifier)\n\n## \u2b07\ufe0f Installation:\n\nWe recommend using a conda environment. Pre-existing packages on a python installation might be incompatible with npyx and break your installation. You can find instructions on setting up a conda environment [here](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).\n\n```bash\n  conda create -n my_env python=3.11 \u00a3 # python 3.12 and above not supported\n  conda activate my_env\n  pip install npyx\n  # optionally (see 'Dealing with cupy' section below):\n  conda install -c conda-forge cupy cudatoolkit=11.0\n  # test installation:\n  python -c 'import npyx' # should not return any error\n  ```\n\n<details>\n  <summary>Advanced installation</summary>\n\n- if you want the very latest version:\n  ```bash\n  conda create -n my_env python=3.10\n  conda activate my_env\n  pip install git+https://github.com/m-beau/NeuroPyxels@master\n  # optionally (see 'Dealing with cupy' section below):\n  conda install -c conda-forge cupy cudatoolkit=11.0\n  # test installation:\n  python -c 'import npyx' # should not return any error\n  ```\n\n- If you want to edit npyx locally and eventually contribute:\n  > \ud83d\udca1 Tip: in an ipython/jupyter session, use `%load_ext autoreload` then `%autoreload 2` to make your local edits active in your session without having to restart your kernel. Amazing for development.\n  ```bash\n  conda create -n my_env python=3.10\n  conda activate my_env\n  cd path/to/save_dir # any directory where your code will be accessible by your editor and safe. NOT downloads folder.\n  git clone https://github.com/m-beau/NeuroPyxels\n  cd NeuroPyxels\n  pip install . # this will create an egg link to save_dir, which means that you do not need to reinstall the package each time you edit it (e.g. after pulling from github).\n  # optionally (see 'Dealing with cupy' section below):\n  conda install -c conda-forge cupy cudatoolkit=11.0\n  # test installation:\n  python -c 'import npyx' # should not return any error\n  ```\n  and pull every now and then:\n  ```bash\n  cd path/to/save_dir/NeuroPyxels\n  git pull\n  # And that's it, thanks to the egg link no need to reinstall the package!\n  ```\n</details>\n</br>\nNpyx supports Python >=3.7.\n\n### Dealing with cupy (GPU shenanigans)\nTo run some preprocessing functions, you will need NVIDIA drivers and cuda-toolkit installed on your computer. It is a notorious source of bugs. To test your CUDA installation do the following:\n```bash\nnvidia-smi # Should show how much your GPU is being used right now\nnvcc # This is the CUDA compiler\n```\nIf it doesn't work, try up/downgrading the version of cudatoolkit installed:\n```bash\n# check the current version\nconda activate my_env\nconda list cudatoolkit\n# E.g. install version 10.0\nconda activate my_env\nconda remove cupy, cudatoolkit\nconda install -c conda-forge cupy cudatoolkit=10.0\n```\n\n\n### Test installation\nYou can use the built-in unit testing function 'test_npyx' to make sure that npyx core functions run smoothly, all at once.\n\n```python\nfrom npyx.testing import test_npyx\n\n# any spike sorted recording compatible with phy\n# (e.g. kilosort output)\ndp = 'datapath/to/myrecording'\ntest_npyx(dp)\n\n# if any test fails, re-run them with the following to print the error log, and try to fix it or post an issue on github:\ntest_npyx(dp, raise_error=True)\n```\n<span style=\"color:#1F45FC\">\n\n--- npyx version 2.3.4 unit testing initiated, on directory /media/maxime/AnalysisSSD/test_dataset_artefact... <br>\n\n--- Successfully ran 'read_metadata' from npyx.inout. <br>\n--- Successfully ran 'get_npix_sync' from npyx.inout. <br>\n--- Successfully ran 'get_units' from npyx.gl. <br>\n--- Successfully ran 'ids' from npyx.spk_t. <br>\n--- Successfully ran 'trn' from npyx.spk_t. <br>\n--- Successfully ran 'trn_filtered' from npyx.spk_t. <br>\n--- Successfully ran 'wvf' from npyx.spk_wvf. <br>\n--- Successfully ran 'wvf_dsmatch' from npyx.spk_wvf. <br>\n--- Successfully ran 'get_peak_chan' from npyx.spk_wvf. <br>\n--- Successfully ran 'templates' from npyx.spk_wvf. <br>\n--- Successfully ran 'ccg' from npyx.corr. <br>\n--- Successfully ran 'plot_wvf' from npyx.plot. <br>\n--- Successfully ran 'plot_ccg' from npyx.plot. <br>\n--- Successfully ran 'plot_raw' from npyx.plot. <br>\n\n</span>\n\n```\n(bunch of plots...)\n```\n<details>\n  <summary>:warning: Known installation issues</summary>\n\n- **cannot import numba.core hence cannot import npyx** <br/>\nOlder versions of numba did not feature the .core submodule. If you get this error, you are probably running a too old version of numba. Make sure that you have installed npyx in a fresh conda environment if that happens to you. If you still get an error, check that numba is not installed in your root directory.\n\n  ```# open new terminal\n  pip uninstall numba\n  conda activate my_env\n  pip uninstall numba\n  pip install numba\n  ```\n<br/>\n\n- **core dumped when importing** <br/>\nThis seems to be an issue related to PyQt5 required by opencv (opencv-python).\nSolution (from [post](https://stackoverflow.com/questions/71088095/opencv-could-not-load-the-qt-platform-plugin-xcb-in-even-though-it-was-fou)):\n```\n# activate npyx environment first\npip uninstall PyQt5\npip uninstall opencv-python\npip install opencv-python\n# pip install other missing dependencies\n```\nFull log:\n```\nIn [1]: from npyx import *\nIn [2]: QObject::moveToThread: Current thread (0x5622e1ea6800) is not the object's thread (0x5622e30e86f0).\nCannot move to target thread (0x5622e1ea6800)\n\nqt.qpa.plugin: Could not load the Qt platform plugin \"xcb\" in \"/home/maxime/miniconda3/envs/npyx/lib/python3.7/site-packages/cv2/qt/plugins\" even though it was found.\nThis application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.\n\nAvailable platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.\n\nAborted (core dumped)\n```\n<br/>\n\n- **I think I installed everything properly, but npyx is not found if I run 'python -c \"import npyx\" '!** <br/>\nTypically:\n```bash\nTraceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\nModuleNotFoundError: No module named 'npyx'\n```\nMake sure that the python installation that you are using is indeed the version of your new environment. <br/>\nTo do so, in your terminal, run \"which python\" on linux/mac or \"where python\" on windows: the output should be the path to the right environment e.g. \"/home/.../anaconda/envs/npyx/bin/python\". If it isn't, try to deactivate/reactivate your conda environment, or make sure you do not have conflicting python installations on your machine.\n\n</details>\n\n## \ud83e\udd17 Support and citing \n\nIf you find Neuropyxels useful in your work, we kindly request that you cite:\n\n> Maxime Beau, Federico D'Agostino, Ago Lajko, Gabriela Mart\u00ednez, Michael H\u00e4usser & Dimitar Kostadinov. NeuroPyxels: loading, processing and plotting Neuropixels data in python. Zenodo https://doi.org/10.5281/zenodo.5509733 (2021).\n\nYou can additionally star this repo using the top-right star button to help it gain more visibility.\n\nCheers!\n\n## \ud83d\udd0d\ufe0f Documentation:\n\nNpyx works with the data formatting employed by [SpikeGLX](https://billkarsh.github.io/SpikeGLX/) and [OpenEphys](https://open-ephys.org/neuropixels) (binary data and meta data) used in combination with [Phy](https://phy.readthedocs.io/en/latest/)-compatible spike-sorters ([Kilosort](https://github.com/MouseLand/Kilosort), [SpyKING CIRCUS](https://spyking-circus.readthedocs.io/en/latest/)...). <span style=\"color:pink\">**Any dataset compatible with phy can also be analyzed with npyx, in essence.**</span>\n\n### \ud83d\udca1 Design philosophy\n\n- [Memoization](https://en.wikipedia.org/wiki/Memoization) (a.k.a. **caching**)\n\n  <ins>Npyx is fast because it rarely computes the same thing twice by relying heavily on caching (memoization as purists like to call it)</ins> - in the background, it saves most relevant outputs (spike trains, waveforms, correlograms...) at **npix_dataset/npyxMemory**, from where they are simply reloaded if called again.\n\n  An important argument controlling this behaviour is **`again`** (boolean), by default set to False: if True, NeuroPyxels cached functions will recompute their output rather than loading it from npyxMemory. This is important to be aware of this behaviour, as it can lead to mind boggling bugs. For instance, if you load a spike train then re-curate your dataset, e.g. by splitting unit 56 into 504 and 505, the train of the old 'unit 56' will still exist at kilosort_dataset/npyxMemory and you will remain able to load it even though the unit is gone!\n\nUnder the hood, NeuroPyxels caching is handled with another package of mine, **[cachecache](https://github.com/m-beau/cachecache)** (as of July 2024). Functions cached with cachecache's decorator (named `@npyx_cache` in NeuroPyxels) can have their caching behaviour altered at call time with three arguments: **again** (mentioned earlier), **cache_results**, and **cache_path**. If again is set to True (False by default), the results are recomputed rather than loaded from cache. If cache_results is set to False (True by default), the results will not be cached on disk and therefore will not take up any space. If cache_path is set to a string, the results will be cached in the directory specified by the string. If cache_path is set to None (default), the results will be cached in the default directory, which is `~/.NeuroPyxels` and can be customized by simply editing the `__cachedir__` directory inside `./npyx/CONFIG.py` (where NeuroPyxels is installed on your machine). For cached functions that use the `dp` argument, the default cache directory is `{dp}/npyxMemory`, not the path specified in `./npyx/CONFIG.py`.\n\n If you wish to grant your functions with the same caching capabilities, I wrote a very complete documentation in **[cachecache](https://github.com/m-beau/cachecache)**'s README. You can install it independently with `pip install cachecache`. \n\n  **General tip**: if your data loaded with NeuroPyxels seems incomprehensibly odd at times, try re-running the function with `again=True`. Joblib sometimes makes mistakes and I am yet to put my finger on what causes them - this is an ugly but quick and reliable fix.\n\n- Ubiquitous arguments\n\n  Most npyx functions take at least one input: **`dp`**, which is the path to your Neuropixels-phy dataset. You can find a [full description of the structure of such datasets](https://phy.readthedocs.io/en/latest/sorting_user_guide/#installation) on the phy documentation.\n\n  Other typical parameters are: **`verbose`** (whether to print a bunch of informative messages, useful when debugging), **`saveFig`** (boolean) and **`saveDir`** (whether to save the figure in saveDir for plotting functions).\n\n  Importantly, **`dp`** can also be the path to a **merged dataset**, generated with `npyx.merge_datasets()` - <ins>every function will run as smoothly on merged datasets as on any regular dataset</ins>. See below for more details.\n\n- Minimal and modular reliance of spike-sorter output\n\n  Every function requires the files `myrecording.ap.meta`/`myrecording.oebin` (metadata from SpikeGLX/OpenEphys), `params.py`, `spike_times.npy` and `spike_clusters.npy`.\n  \n  If you have started spike sorting, `cluster_groups.tsv` will also be required obviously (will be created filled with 'unsorted' groups if none is found).\n  \n  Then, specific functions will require specific files: loading waveforms with `npyx.spk_wvf.wvf` or extracting your sync channel with `npyx.io.get_npix_sync` require the raw data `myrecording.ap.bin`, `npyx.spk_wvf.templates` the files `templates.npy` and `spike_templates.npy`, and so on. This allows you to only transfer the strictly necassary files for your use case from a machine to the next: for instance, if you only want to make behavioural analysis of spike trains but do not care about the waveforms, you can run `get_npix_sync` on a first machine (which will generate a `sync_chan` folder containing extracted onsets/offsets from the sync channel(s)), then exclusively transfer the `dataset/sync_chan/` folder along with `spike_times.npy` and `spike_clusters.npy` (all very light files) on another computer and analyze your data there seemlessly.\n\n### \ud83d\udcc1 Directory structure\n\nThe **`dp`** parameter of all npyx functions must be the **absolute path to `myrecording`** below.\n\nFor SpikeGLX recordings:\n```\nmyrecording/\n  myrecording.ap.meta\n  params.py\n  spike_times.npy\n  spike_clusters.npy\n  cluster_groups.tsv # optional, if manually curated with phy\n  myrecording.ap.bin # optional, if wanna plot waveforms\n\n  # other kilosort/spyking circus outputs here\n```\nFor Open-Ephys recordings:\n```\nmyrecording/\n  myrecording.oebin\n  params.py\n  spike_times.npy\n  spike_clusters.npy\n  cluster_groups.tsv # if manually curated with phy\n\n  # other spikesorter outputs here\n\n  continuous/\n    Neuropix-PXI-100.somethingsomething (1, AP...)/\n      continuous.dat # optional, if wanna plot waveforms\n    Neuropix-PXI-100.somethingsomething (2, LFP...)/\n      continuous.dat # optional, if want to plot LFP with plot_raw\n\n  events/\n    Neuropix-PXI-100.somethingsomething (1, AP...)/\n      TTL somethingelse/\n        timestamps.npy # optional, if need to get synchronyzation channel to load with get_npix_sync e.g. to merge datasets\n    Neuropix-PXI-100.somethingsomething (2, LFP...)/\n      TTL somethingelse/\n        timestamps.npy # same timestamps for LFP channel\n```\n\n### \ud83d\udc49 Common use cases\n\n**General tip**: if your data loaded with NeuroPyxels seems incomprehensibly odd at times, try re-running the function with `again=True`. Joblib sometimes makes mistakes and I am yet to put my finger on what causes them - this is an ugly but quick and reliable fix.\n\n#### Load recording metadata\n\n```python\nfrom npyx import *\n\ndp = 'datapath/to/myrecording'\n\n# load contents of .lf.meta and .ap.meta or .oebin files as python dictionnary.\n# The metadata of the high and lowpass filtered files are in meta['highpass'] and meta['lowpass']\n# Quite handy to get probe version, sampling frequency, recording length etc\nmeta = read_metadata(dp) # works for spikeGLX (contents of .meta files) and open-ephys (contents of .oebin file)\n\n```\n\n#### Load synchronization channel\n```python\nfrom npyx.inout import get_npix_sync # star import is sufficient, but I like explicit imports!\n\n# If SpikeGLX: slow the first time, then super fast\nonsets, offsets = get_npix_sync(dp, filt_key='highpass') # works for spikeGLX (extracted from .ap.bin file) and open-ephys (/events/..AP/TTL/timestamps.npy)\n# onsets/offsets are dictionnaries\n# keys: ids of sync channel where a TTL was detected (0,1,2... for spikeGLX, name of TTL folders in events/..AP for openephys),\n# values: times of up (onsets) or down (offsets) threshold crosses, in seconds.\n```\n#### Preprocess binary data\nMakes a preprocessed copy of the binary file in dp, moves original binary file at dp/original_data\nThis will be as fast as literally copying your file, with a decent GPU!\n```python\nfrom npyx.inout import preprocess_binary_file # star import is sufficient, but I like explicit imports!\n\n# can perform bandpass filtering (butterworth 3 nodes) and median subtraction (aka common average referenceing, CAR)\n# in the future: ADC realignment (like CatGT), whitening, spatial filtering (experimental).\nfiltered_fname = preprocess_binary_file(dp, filt_key='ap', median_subtract=True, f_low=None, f_high=300, order=3, verbose=True)\n```\n\n#### Get good units from dataset\n```python\nfrom npyx.gl import get_units\nunits = get_units(dp, quality='good')\n```\n#### Load spike times from unit u\n```python\nfrom npyx.spk_t import trn\nu=234\nt = trn(dp, u) # gets all spikes from unit 234, in samples\n```\n\n#### Load waveforms from unit u\n```python\nfrom npyx.inout import read_spikeglx_meta\nfrom npyx.spk_t import ids, trn\nfrom npyx.spk_wvf import get_peak_chan, wvf, templates\n\n# returns a random sample of 100 waveforms from unit 234, in uV, across 384 channels\nwaveforms = wvf(dp, u) # return array of shape (n_waves, n_samples, n_channels)=(100, 82, 384) by default\nwaveforms = wvf(dp, u, n_waveforms=1000, t_waveforms=90) # now 1000 random waveforms, 90 samples=3ms long\n\n# Get the unit peak channel (channel with the biggest amplitude)\npeak_chan = get_peak_chan(dp,u)\n# extract the waveforms located on peak channel\nw=waves[:,:,peak_chan]\n\n# Extract waveforms of spikes occurring between\n# 0-100s and 300-400s in the recording,\n# because that's when your mouse sneezed\nwaveforms = wvf(dp, u, periods=[(0,100),(300,400)])\n\n# alternatively, longer but more flexible:\nfs=meta['highpass']['sampling_rate']\nt=trn(dp,u)/fs # convert in s\n# get ids of unit u: all spikes have a unique index in the dataset,\n# which is their rank sorted by time (as in spike_times.npy)\nu_ids = ids(dp,u)\nids=ids(dp,u)[(t>900)&(t<1000)]\nmask = (t<100)|((t>300)&(t<400))\nwaves = wvf(dp, u, spike_ids=u_ids[mask])\n\n# If you want to load the templates instead (faster and does not require binary file):\ntemp = templates(dp,u) # return array of shape (n_templates, 82, n_channels)\n```\n\n#### Compute auto/crosscorrelogram between 2 units\n```python\nfrom npyx.corr import ccg, ccg_stack\n\n# returns ccg between 234 and 92 with a binsize of 0.2 and a window of 80\nc = ccg(dp, [234,92], cbin=0.2, cwin=80)\n\n# Only using spikes from the first and third minutes of recording\nc = ccg(dp, [234,92], cbin=0.2, cwin=80, periods=[(0,60), (120,180)])\n\n# better, compute a big stack of crosscorrelograms with a given name\n# The first time, CCGs will be computed in parallel using all the available CPU cores\n# and it will be saved in the background and, reloadable instantaneously in the future\nsource_units = [1,2,3,4,5]\ntarget_units = [6,7,8,9,10]\nc_stack = ccg_stack(dp, source_units, target_units, 0.2, 80, name='my_relevant_ccg_stack')\nc_stack = ccg_stack(dp, name='my_relevant_ccg_stack') # will work to reaload in the future\n```\n\n#### Plot waveform and crosscorrelogram of unit u\n```python\n# all plotting functions return matplotlib figures\nfrom npyx.plot import plot_wvf, get_peak_chan\n\nu=234\n# plot waveform, 2.8ms around templates center, on 16 channels around peak channel\n# (the peak channel is found automatically, no need to worry about finding it)\nfig = plot_wvf(dp, u, Nchannels=16, t_waveforms=2.8)\n\n# But if you wished to get it, simply run\npeakchannel = get_peak_chan(dp, u)\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/wvf.png\" width=\"300\"/>\n\n```python\n# plot ccg between 234 and 92\n# as_grid also plot the autocorrelograms\nfig = plot_ccg(dp, [u,92], cbin=0.2, cwin=80, as_grid=True)\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/ccg.png\" width=\"400\"/>\n\n#### Preprocess your waveforms (drift-shift-matching) and spike trains (detect periods with few false positive/negative)\n```python\n# all plotting functions return matplotlib figures\nfrom npyx.spk_wvf import wvf_dsmatch\nfrom npyx.spk_t import trn_filtered\n\n# wvf_dsmatch subselect 'best looking' waveforms\n# by first matching them by drift state (Z, peak channel and XY, amplitude on peak channel)\n# then shifting them around to realign them (using the crosscorr of its whole spatial footprint)\n# on the plot, black is the original waveform as it would be plotted in phy,\n# green is drift-matched, red is drift-shift matched\nw_preprocessed = wvf_dsmatch(dp, u, plot_debug=True)\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/dsmatch_example1_driftmatch.png\" width=\"500\"/>\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/dsmatch_example1.png\" width=\"350\"/>\n\n```python\n# trn_filtered clips the recording in 10s (default) chunks\n# and estimates the false positive/false negative spike sporting rates on such chunks\n# before masking out spikes occurring inside 'bad chunks',\n# defined as chunks with too high FP OR FN rates (5% and 5% by default)\nt_preprocessed = trn_filtered(dp, u, plot_debug=True)\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/trnfiltered_example1.png\" width=\"600\"/>\n\n\n#### Plot chunk of raw data with overlaid units\n```python\nunits = [1,2,3,4,5,6]\nchannels = np.arange(70,250)\n# raw data are whitened, high-pass filtered and median-subtracted by default - parameters are explicit below\nplot_raw_units(dp, times=[0,0.130], units = units, channels = channels,\n               colors=['orange', 'red', 'limegreen', 'darkgreen', 'cyan', 'navy'],\n               lw=1.5, offset=450, figsize=(6,16), Nchan_plot=10,\n               med_sub=1, whiten=1, hpfilt=1)\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/raw.png\" width=\"400\"/>\n\n#### Plot peri-stimulus time histograms across neurons and conditions\n\n```python\n# Explore responses of 3 neurons to 4 categories of events:\nfs=30000 # Hz\nunits=[1,2,3]\ntrains=[trn(dp,u)/fs for u in units] # make list of trains of 3 units\ntrains_str=units # can give specific names to units here, show on the left of each row\nevents=[licks, sneezes, visual_stimuli, auditory_stimuli] # get events corresponding to 4 conditions\nevents_str=['licking', 'sneezing', 'visual_stim', 'auditory_stim'] # can give specific names to events here, show above each column\nevents_col='batlow' # colormap from which the event colors will be drawn\nfig=summary_psth(trains, trains_str, events, events_str, psthb=10, psthw=[-750,750],\n                 zscore=0, bsl_subtract=False, bsl_window=[-3000,-750], convolve=True, gsd=2,\n                 events_toplot=[0], events_col=events_col, trains_col_groups=trains_col_groups,\n                 title=None, saveFig=0, saveDir='~/Downloads', _format='pdf',\n                 figh=None, figratio=None, transpose=1,\n                 as_heatmap=False,  vmin=None, center=None, vmax=None, cmap_str=None)\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/psth.png\" width=\"600\"/>\n\n#### Merge datasets acquired on two probes simultaneously\n```python\n# The three recordings need to include the same sync channel.\nfrom npyx.merger import merge_datasets\ndps = ['same_folder/lateralprobe_dataset',\n       'same_folder/medialprobe_dataset',\n       'same_folder/anteriorprobe_dataset']\nprobenames = ['lateral','medial','anterior']\ndp_dict = {p:dp for p, dp in zip(dps, probenames)}\n\n# This will merge the 3 datasets (only relevant information, not the raw data) in a new folder at\n# dp_merged: same_folder/merged_lateralprobe_dataset_medialprobe_dataset_anteriorprobe_dataset\n# where all npyx functions can smoothly run.\n# The only difference is that units now need to be called as floats,\n# of format u.x (u=unit id, x=dataset id [0-2]).\n# lateralprobe, medial probe and anteriorprobe x will be respectively 0,1 and 2.\ndp_merged, datasets_table = merge_datasets(dp_dic)\n\n\n--- Merged data (from 2 dataset(s)) will be saved here: /same_folder/merged_lateralprobe_dataset_medialprobe_dataset_anteriorprobe_dataset.\n\n--- Loading spike trains of 2 datasets...\n\nsync channel extraction directory found: /same_folder/lateralprobe_dataset/sync_chan\nData found on sync channels:\nchan 2 (201 events).\nchan 4 (16 events).\nchan 5 (175 events).\nchan 6 (28447 events).\nchan 7 (93609 events).\nWhich channel shall be used to synchronize probes? >>> 7\n\nsync channel extraction directory found: /same_folder/medialprobe_dataset/sync_chan\nData found on sync channels:\nchan 2 (201 events).\nchan 4 (16 events).\nchan 5 (175 events).\nchan 6 (28447 events).\nchan 7 (93609 events).\nWhich channel shall be used to synchronize probes? >>> 7\n\nsync channel extraction directory found: /same_folder/anteriorprobe_dataset/sync_chan\nData found on sync channels:\nchan 2 (201 events).\nchan 4 (16 events).\nchan 5 (175 events).\nchan 6 (28194 events).\nchan 7 (93609 events).\nWhich channel shall be used to synchronize probes? >>> 7\n\n--- Aligning spike trains of 2 datasets...\nMore than 50 sync signals found - for performance reasons, sub-sampling to 50 homogenoeously spaced sync signals to align data.\n50 sync events used for alignement - start-end drift of -3080.633ms\n\n--- Merged spike_times and spike_clusters saved at /same_folder/merged_lateralprobe_dataset_medialprobe_dataset_anteriorprobe_dataset.\n\n--> Merge successful! Use a float u.x in any npyx function to call unit u from dataset x:\n- u.0 for dataset lateralprobe_dataset,\n- u.1 for dataset medialprobe_dataset,\n- u.2 for dataset anteriorprobe_dataset.\n```\n<ins>Now any npyx function runs on the merged dataset!</ins>\nUnder the hood, it will create a `merged_dataset_dataset1_dataset2/npyxMemory` folder to save any data computed across dataframes, but will use the original `dataset1/npyxMemory` folder to save data related to this dataset exclusively (e.g. waveforms). Hence, there is no redundancy: space and time are saved.\n\nThis is also why <ins>it is primordial that you do not move your datatasets from their original paths after merging them</ins> - else, functions ran on merged_dataset1_dataset2 will not know where to go fetch the data! They refer to the paths in `merged_dataset_dataset1_dataset2/datasets_table.csv`. If you really need to, you can move your datasets but do not forget to edit this file accordingly.\n```python\n# These will work!\nt = trn(dp_merged, 92.1) # get spikes of unit 92 in dataset 1 i.e. medialprobe\nfig=plot_ccg(dp_merged,[10.0, 92.1, cbin=0.2, cwin=80]) # compute CCG between 2 units across datasets\n```\n\nPS - The spike times are aligned across datasets by modelling the drift between the clocks of the neuropixels headstages linearly: TTL probe 1 = a * TTL probe 1 + b (if a!=1, there is drift between the clocks), so spiketimes_probe2_aligned_to_probe1  = a * spiketimes_probe2 + b\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/ttl1-ttl2_1.png\" width=\"600\"/>\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/ttl1-ttl2_2.png\" width=\"600\"/>\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/error_dist.png\" width=\"600\"/>\n<br/>\n\n### \u2b50 Bonus: matplotlib plot prettifier\n```python\nfrom npyx.plot import get_ncolors_cmap\n\n# allows you to easily extract the (r,g,b) tuples from a matplotlib or crameri colormap\n# to use them in other plots!\ncolors = get_ncolors_cmap('coolwarm', 10, plot=1)\ncolors = get_ncolors_cmap('viridis', 10, plot=1)\n# in a jupyter notebook, will also plot he HTML colormap:\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/colormaps.png\" width=\"600\"/>\n\n```python\nfrom npyx.plot import mplp\nimport matplotlib.pyplot as plt\n\n# mplp() will turn any matplotlib plot into something you can work with.\n# fed up googling around and landing on stack overflow to tweak your figures?\n# just read mplp parameters, they are self-explanatory!\n\ndf1 = pd.load(\"my_dataframe.csv\")\n\n# Seaborn figure (seaborn is simply a wrapper for matplotlib):\nfig = plt.figure()\nsns.scatterplot(data=df1,\n                x='popsync', y='depth', hue='mean_popsync',\n                palette='plasma', alpha=1, linewidth=1, edgecolor='black')\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/no_mplp.png\" width=\"600\"/>\n\n```python\n# Same figure, tweaked with mplp():\nfig = plt.figure()\nsns.scatterplot(data=df1,\n                x='popsync', y='depth', hue='mean_popsync',\n                palette='plasma', alpha=1, linewidth=1, edgecolor='black')\nmplp(figsize=(3,3), title=\"My title\", ylim=[-10,-2], xlim=[-40,60],\n      xlabel = \"My x label (rotated ticks)\", ylabel=\"My y label\",\n      xtickrot=45,\n      hide_legend=True, colorbar=True,\n      vmin=df['mean_popsync'].min(), vmax=df['mean_popsync'].max(),\n      cbar_w=0.03, cbar_h=0.4, clabel=\"My colorbar label\\n(no more ugly legend!)\", cmap=\"plasma\",\n      clabel_s=16, cticks_s=14, ticklab_s=16,\n      saveFig=saveFig, saveDir=saveDir, figname = f\"popsync_{pair}\")\n```\n<img src=\"https://raw.githubusercontent.com/m-beau/NeuroPyxels/master/images/mplp.png\" width=\"600\"/>\n\n<br/>\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Python routines dealing with Neuropixels data.",
    "version": "4.1.1",
    "project_urls": {
        "Homepage": "https://github.com/Npix-routines/NeuroPyxels"
    },
    "split_keywords": [
        "neuropixels",
        " kilosort",
        " phy",
        " data analysis",
        " electrophysiology",
        " neuroscience"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "71003f3fb0d3b0597bfdfc62c17d23eb79353ce8a83a37a73ce1dfd2e71a88b3",
                "md5": "a40b3f5d1e9f6c1ef50c402e6b816671",
                "sha256": "fca7ffea0363c4ccc862f5d489a74de752e4a00e31c75a62061f341643816aad"
            },
            "downloads": -1,
            "filename": "npyx-4.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a40b3f5d1e9f6c1ef50c402e6b816671",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.12",
            "size": 362698,
            "upload_time": "2024-11-26T15:38:30",
            "upload_time_iso_8601": "2024-11-26T15:38:30.048884Z",
            "url": "https://files.pythonhosted.org/packages/71/00/3f3fb0d3b0597bfdfc62c17d23eb79353ce8a83a37a73ce1dfd2e71a88b3/npyx-4.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0571bac02862425faeba8cbb1ef0c3dd579a4bc2da8e853cad37ec6483173b4a",
                "md5": "2196db754efe23279aa1777cedc17c72",
                "sha256": "37909b65c548a084c484573f5dfa0afcb617465358b4afbf416782bf82e3c93e"
            },
            "downloads": -1,
            "filename": "npyx-4.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "2196db754efe23279aa1777cedc17c72",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.12",
            "size": 364876,
            "upload_time": "2024-11-26T15:38:31",
            "upload_time_iso_8601": "2024-11-26T15:38:31.790336Z",
            "url": "https://files.pythonhosted.org/packages/05/71/bac02862425faeba8cbb1ef0c3dd579a4bc2da8e853cad37ec6483173b4a/npyx-4.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-26 15:38:31",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Npix-routines",
    "github_project": "NeuroPyxels",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "npyx"
}

Maxime Beau