:Name: arfx
:Version: 2.7.1
:Summary: Advanced Recording Format Tools
:Upload time: 2025-01-03 23:41:19
:Requires Python: >=3.8
:License: BSD 3-Clause License
:Requirements: arf, ewave, h5py, natsort, numpy, tqdm

arfx
====

|ProjectStatus|_ |Version|_ |BuildStatus|_ |License|_ |PythonVersions|_

.. |ProjectStatus| image:: https://www.repostatus.org/badges/latest/active.svg
.. _ProjectStatus: https://www.repostatus.org/#active

.. |Version| image:: https://img.shields.io/pypi/v/arfx.svg
.. _Version: https://pypi.python.org/pypi/arfx/

.. |BuildStatus| image:: https://github.com/melizalab/arfx/actions/workflows/python-package.yml/badge.svg
.. _BuildStatus: https://github.com/melizalab/arfx/actions/workflows/python-package.yml

.. |License| image:: https://img.shields.io/pypi/l/arfx.svg
.. _License: https://opensource.org/license/bsd-3-clause/

.. |PythonVersions| image:: https://img.shields.io/pypi/pyversions/arfx.svg
.. _PythonVersions: https://pypi.python.org/pypi/arfx/

**arfx** is a family of command-line tools for copying sampled data in
and out of ARF containers. ARF (https://github.com/melizalab/arf) is an
open, portable file format for storing behavioral and neural data, based
on `HDF5 <http://www.hdfgroup.org/HDF5>`__.

installation
------------

.. code:: bash

   pip install arfx

or from source:

.. code:: bash

   python setup.py install

use
---

The general syntax is ``arfx operation [options] files``, modeled on
``tar``. Operations are as follows:

-  **-A:** copy data from one container to another
-  **-c:** create a new container
-  **-r:** append data to the container
-  **-t:** list contents of the container
-  **-x:** extract entries from the container
-  **-d:** delete entries from the container

Options specify the target ARF file, verbosity, automatic naming schemes, and
any metadata to be stored in the entry. Some important options include:

-  **-f FILE:** use ARF file FILE
-  **-v:** verbose output
-  **-n NAME:** name entries sequentially, using NAME as the base
-  **-k key=value:** add metadata to the entries
-  **-T DATATYPE:** specify the type of data

input files
~~~~~~~~~~~

**arfx** can read sampled data from ``pcm``, ``wave``, ``npy`` and
``mda`` files. Support for additional file formats can be added as
plugins (see `extending arfx`_).

When adding data to an ARF container (``-c`` and ``-r`` modes), the
input files are specified on the command line, and added in the order
given. By default, entries are given the same name as the input file,
minus the extension; however, if the input file has more than one entry,
they are given an additional numerical extension. To override this, the
``-n`` flag can be used to specify the base name; all entries are given
sequential names based on this.
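
The naming rules above can be sketched in Python. This is an illustration, not arfx's actual code: the function name ``entry_names`` and the exact form of the numerical suffix (an underscore plus a zero-padded index) are assumptions made for the example.

```python
from pathlib import Path


def entry_names(input_file, nentries=1, base=None, start=0):
    """Return illustrative entry names for one input file.

    The suffix format is hypothetical; only the overall rule follows the text.
    """
    if base is not None:
        # -n NAME: every entry gets a sequential name built from the base
        return [f"{base}_{start + i:03d}" for i in range(nentries)]
    stem = Path(input_file).stem  # file name minus directory and extension
    if nentries == 1:
        return [stem]
    # more than one entry in the file: add a numerical extension
    return [f"{stem}_{i:03d}" for i in range(nentries)]


single = entry_names("song.wav")
multi = entry_names("data/rec.pcm", nentries=2)
named = entry_names("song.wav", base="trial", start=3)
```

The ``start`` parameter stands in for a counter that continues across input files when a base name is given.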

The ``-n, -a, -e, -p, -s, -T`` options are used to store information
about the data being added to the file. The DATATYPE argument can be the
numerical code or enumeration code (run ``arfx --help-datatypes`` for a
list), and indicates the type of data in the entries. All of the entries
created in a single run of arfx are given these values. The ``-u``
option tells arfx not to compress the data, which can speed up I/O
operations slightly.

Currently only one sampled dataset per entry is supported. Clearly this
does not encompass many use cases, but **arfx** is intended as a simple
tool. More specialized import procedures can be easily written in Python
using the ``arf`` library.

output files
~~~~~~~~~~~~

The entries to be extracted (in ``-x`` mode) can be specified by name.
If no names are specified, all the entries are extracted. All sampled
datasets in each entry are extracted as separate channels, because they
may have different sampling rates. Event datasets are not extracted.

By default the output files will be in ``wave`` format and will have
names with the format ``entry_channel.wav``. The ``-n`` argument can be
used to customize the names and file format of the output files. The
argument must be a template in the format defined by the `Python string
module <http://docs.python.org/library/string.html#format-specification-mini-language>`__.
Supported field names include ``entry``, ``channel``, and ``index``, as
well as the names of any HDF5 attributes stored on the entry or channel.
The extension of the output template is used to determine the file
format. Currently only ``wave`` is supported, but additional formats may
be supplied as plugins (see below).
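
As a rough illustration of how such a template expands, the snippet below applies Python's ``str.format`` to a set of fields. The field values, the ``experimenter`` attribute, and the assumption that ``index`` is an integer are all hypothetical; only the field names ``entry``, ``channel``, and ``index`` come from the text above.

```python
from pathlib import Path

# Documented field names plus one made-up entry attribute (experimenter).
fields = {
    "entry": "bird_0001",
    "channel": "pcm_000",
    "index": 3,
    "experimenter": "abc",
}

# A template equivalent to the default naming described above.
default_like = "{entry}_{channel}.wav".format(**fields)

# A custom template using an attribute and a zero-padded index.
custom = "{experimenter}_{entry}_{index:04d}.wav".format(**fields)

# The extension of the expanded name is what selects the output format.
extension = Path(custom).suffix

print(default_like)
print(custom)
print(extension)
```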

The metadata options are ignored when extracting files; any metadata
present in the ARF container that is also supported by the target
container is copied.

other operations
~~~~~~~~~~~~~~~~

As with ``tar``, the ``-t`` operation will list the contents of the
archive. Each entry/channel is listed on a separate line in path
notation.

The ``-A`` flag is used to copy the contents of one ARF file to another.
The entries are copied without modification from the source ARF file(s)
to the target container.

The ``-d`` (delete) operation uses the same syntax as the extract
operation, but instead of extracting the entries, they are deleted.
Because of limitations in the underlying HDF5 library, this does not
free up the space, so the file is repacked unless the ``-P`` option is
set.

The ``-U`` (update) operation can be used to add or update attributes of
entries.

The ``--write-attr`` operation can be used to store the contents of text
files in top-level attributes. The attributes have the name
``user_<filename>``. The ``--read-attr`` operation can be used to read
out those attributes. This is useful when data collection programs
generate log or settings files that you want to store in the ARF file.

other utilities
---------------

This package comes with a few additional scripts that do fairly specific
operations.

arfx-split
~~~~~~~~~~

This script is used to reorganize very large recordings, possibly
contained in multiple files, into manageable chunks. Each new entry is
given an updated timestamp and attributes from the source entries.
Currently, no effort is made to splice data across entries or files.
This may result in some short entries. Only sampled datasets are
processed.
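
The consequence of not splicing across entries can be seen in a small sketch. This is not the script's implementation: the helper names and the way the updated timestamp is computed (entry timestamp plus the chunk's offset in seconds) are assumptions.

```python
def chunk_bounds(n_samples, chunk_size):
    """Yield (start, stop) sample indices that cover a single entry."""
    for start in range(0, n_samples, chunk_size):
        # the last chunk is cut at the end of the entry, so it may be short
        yield start, min(start + chunk_size, n_samples)


def chunk_timestamp(entry_timestamp, start, sampling_rate):
    """Timestamp in seconds of a chunk that begins at sample ``start``."""
    return entry_timestamp + start / sampling_rate


bounds = list(chunk_bounds(2500, 1000))
stamp = chunk_timestamp(100.0, 2000, 1000)
print(bounds)
print(stamp)
```

Because each entry is chunked independently, the 500-sample remainder becomes its own short entry instead of being joined to the start of the next source entry.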

arfx-oephys
~~~~~~~~~~~

Converts the output of an `open-ephys <https://open-ephys.org/>`_ recording into an ARF file. open-ephys stores its data in a large, complex directory tree; this script walks the tree and stores the data in an appropriately timestamped entry in the ARF file. It has not been tested with data from outside our lab. Example invocation::

   arfx-oephys -T EXTRAC_HP -k experimenter=smm3rc -k bird=C194 -k pen=1 -k site=1 -k protocol=chorus -f C194_1_1.arf C194_2023-10-16_16-30-54_chorus/

We typically run this command before starting spike sorting to create a copy of the recording for archival.

arfx-collect-sampled
~~~~~~~~~~~~~~~~~~~~

This script is used to export data into a flat binary structure. It collects
sampled data across channels and entries into a single 2-D array. The output can
be stored in a multichannel wav file or in a raw binary ``dat`` format (N
samples by M channels), which is used by a wide variety of spike-sorting tools.
We use this script if we ever have to re-sort a recording after deleting the
original raw recording.
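
The N-samples-by-M-channels layout means the file holds one sample for each channel in turn, then the next sample for each channel, and so on. The sketch below reads such a file back into rows using only the standard library. The sample type (native-endian signed 16-bit, typecode ``"h"``) and the helper name ``read_dat`` are assumptions; check what your recording and spike sorter actually use.

```python
import array
import os
import tempfile


def read_dat(path, n_channels, typecode="h"):
    """Read an interleaved raw binary file into a list of per-sample rows."""
    samples = array.array(typecode)
    with open(path, "rb") as fp:
        samples.frombytes(fp.read())
    # consecutive groups of n_channels values form one time point
    return [
        samples[i:i + n_channels].tolist()
        for i in range(0, len(samples), n_channels)
    ]


# Round-trip demo: 3 samples by 2 channels.
demo = array.array("h", [1, 10, 2, 20, 3, 30])
with tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as fp:
    fp.write(demo.tobytes())
rows = read_dat(fp.name, n_channels=2)
os.remove(fp.name)
print(rows)
```

For real recordings a memory-mapped NumPy array is usually more practical than a list of lists; the standard-library version is used here only to keep the example dependency-free.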

arfx-select
~~~~~~~~~~~

This is a pretty specialized script that takes in a table of segments defined by entry name and start/stop time and copies them to a new ARF file. It's usually better to just write analysis code to directly access the desired data from the original file, but it can be useful as a first stage in exporting small segments of a recording to wave files for sharing or depositing.

extending arfx
--------------

Additional formats for reading and writing can be added using the Python
setuptools plugin system. Plugins must be registered in the ``arfx.io``
entry point group, with a name corresponding to the extension of the
file format handled by the plugin.
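
To check which plugins are registered in an environment, the entry point group can be inspected with the standard library. This is a sketch of one way to do it, not a description of how **arfx** loads plugins; the helper name ``list_io_plugins`` is made up for the example.

```python
from importlib.metadata import entry_points


def list_io_plugins(group="arfx.io"):
    """Return {entry point name: 'module:object'} for the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


plugins = list_io_plugins()
print(plugins)
```

The result is empty if no package in the environment registers anything in the group.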

An arfx IO plugin is a class with the following required methods and properties:

``__init__(path, mode, **attributes)``: Opens the file at ``path``. The
``mode`` argument specifies whether the file is opened for reading
(``r``), writing (``w``), or appending (``a``). Must throw an
``IOError`` if the file does not exist or cannot be created, and a
``ValueError`` if the specified value for ``mode`` is not supported. The
additional ``attributes`` arguments specify metadata to be stored in the
file when created. **arfx** will pass all attributes of the channel and
entry (e.g., ``channels``, ``sampling_rate``, ``units``, and
``datatype``) when opening a file for writing. This method may issue a
``ValueError`` if the caller fails to set a required attribute, or
attempts to set an attribute inconsistent with the data format.
Unsupported attributes should be ignored.

``read()``: Reads the contents of the opened file and returns the data
in a format suitable for storage in an ARF file. Specifically, it must
be an acceptable type for the ``arf.entry.add_data()`` method (see
https://github.com/melizalab/arf for documentation).

``write(data)``: Writes data to the file. Must issue an ``IOError`` if
the file is opened in the wrong mode, and ``TypeError`` if the data
format is not correct for the file format.

``timestamp``: A readable property giving the time point of the data.
The value may be a scalar indicating the number of seconds since the
epoch, or a two-element sequence giving the number of seconds and
microseconds since the epoch. If this property is writable it will be
set by **arfx** when writing data.

``sampling_rate``: A property indicating the sampling rate of the data
in the file (or current entry), in units of Hz.
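
The required interface can be illustrated with a minimal sketch. Everything specific here is hypothetical: the class name ``TextSampleFile``, the one-sample-per-line text format, the default sampling rate, and the use of the file modification time as the timestamp. ``read()`` returns a plain list only to keep the example free of dependencies; a real plugin would need to return a type that ``arf.entry.add_data()`` accepts, presumably a NumPy array.

```python
import os
import tempfile
import time


class TextSampleFile:
    """Hypothetical plugin for a one-sample-per-line text format."""

    def __init__(self, path, mode="r", sampling_rate=20000, **attributes):
        if mode not in ("r", "w"):
            raise ValueError(f"unsupported mode: {mode}")
        if mode == "r" and not os.path.exists(path):
            raise IOError(f"no such file: {path}")
        if mode == "w":
            # create the file now so an uncreatable path fails with IOError
            with open(path, "w"):
                pass
        self.path = path
        self.mode = mode
        self._sampling_rate = sampling_rate
        self._timestamp = os.path.getmtime(path) if mode == "r" else time.time()
        # remaining attributes are unsupported by this format and ignored

    @property
    def sampling_rate(self):
        return self._sampling_rate

    @property
    def timestamp(self):
        return self._timestamp

    @timestamp.setter
    def timestamp(self, value):
        self._timestamp = value

    def read(self):
        if self.mode != "r":
            raise IOError("file is not open for reading")
        with open(self.path) as fp:
            return [float(line) for line in fp if line.strip()]

    def write(self, data):
        if self.mode != "w":
            raise IOError("file is not open for writing")
        try:
            lines = [f"{float(x)!r}\n" for x in data]
        except (TypeError, ValueError) as err:
            raise TypeError("data must be a 1-D sequence of numbers") from err
        with open(self.path, "w") as fp:
            fp.writelines(lines)


# Round trip through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    TextSampleFile(path, "w", sampling_rate=1000, units="mV").write([1, 2.5, -3])
    reader = TextSampleFile(path, "r", sampling_rate=1000)
    samples = reader.read()
    rate = reader.sampling_rate
```

The unsupported ``units`` attribute is silently dropped, and an unsupported mode such as ``a`` raises ``ValueError`` before the file is touched.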

The class may also define the following properties. If a property is not
defined, it is assumed to have the default value given below.

``nentries``: A readable property indicating the number of entries in
the file. Default value is 1.

``entry``: A readable and writable integer-valued property corresponding
to the index of the currently active entry in the file. Active means
that the ``read()`` and ``write()`` methods will affect only that entry.
Default is 0, and **arfx** will not attempt to change the property if
``nentries`` is 1.
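
The defaults described above suggest how a caller might iterate over a plugin's entries. The sketch below is an assumption about the calling pattern, not code taken from **arfx**; ``read_all_entries`` and the two demo classes are hypothetical.

```python
def read_all_entries(handle):
    """Read every entry from a plugin-like object, honoring the defaults."""
    nentries = getattr(handle, "nentries", 1)  # missing property means 1
    results = []
    for index in range(nentries):
        if nentries > 1:
            # the entry property is only set for multi-entry files
            handle.entry = index
        results.append(handle.read())
    return results


class OneEntry:
    """Defines neither nentries nor entry."""

    def read(self):
        return [1, 2, 3]


class TwoEntries:
    nentries = 2

    def __init__(self):
        self.entry = 0

    def read(self):
        return [self.entry] * 2


single = read_all_entries(OneEntry())
multi = read_all_entries(TwoEntries())
print(single)
print(multi)
```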

version information
-------------------

**arfx** uses semantic versioning and is synchronized with the
major/minor version numbers of the arf package specification.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "arfx",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "C Daniel Meliza <dan@meliza.org>",
    "keywords": null,
    "author": null,
    "author_email": "C Daniel Meliza <dan@meliza.org>",
    "download_url": "https://files.pythonhosted.org/packages/d7/6a/801b6bb3477f208289c3ef9feff3e0f113ef49f145e9662da62615b83fd3/arfx-2.7.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD 3-Clause License",
    "summary": "Advanced Recording Format Tools",
    "version": "2.7.1",
    "project_urls": {
        "Homepage": "https://github.com/melizalab/arfx"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "42620204efcb4afb7b849d80b90e9270aa0629312f5341b030cad3beda536e78",
                "md5": "dccdc3007886122814da4aa441d1128a",
                "sha256": "60ce9673ec5fb4543d4e05ddde5a6a4592805496026737f8490f50cbd75bae8b"
            },
            "downloads": -1,
            "filename": "arfx-2.7.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "dccdc3007886122814da4aa441d1128a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 48340,
            "upload_time": "2025-01-03T23:41:17",
            "upload_time_iso_8601": "2025-01-03T23:41:17.530872Z",
            "url": "https://files.pythonhosted.org/packages/42/62/0204efcb4afb7b849d80b90e9270aa0629312f5341b030cad3beda536e78/arfx-2.7.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d76a801b6bb3477f208289c3ef9feff3e0f113ef49f145e9662da62615b83fd3",
                "md5": "4452920a6360d07e42f4fdab6dc73565",
                "sha256": "b6fe18e7b0330d9dd56ee5ac8b72120dd3f06ab6cd71eb76f8deb0e8bb46b1d0"
            },
            "downloads": -1,
            "filename": "arfx-2.7.1.tar.gz",
            "has_sig": false,
            "md5_digest": "4452920a6360d07e42f4fdab6dc73565",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 36746,
            "upload_time": "2025-01-03T23:41:19",
            "upload_time_iso_8601": "2025-01-03T23:41:19.179763Z",
            "url": "https://files.pythonhosted.org/packages/d7/6a/801b6bb3477f208289c3ef9feff3e0f113ef49f145e9662da62615b83fd3/arfx-2.7.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-03 23:41:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "melizalab",
    "github_project": "arfx",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "arf",
            "specs": [
                [
                    "==",
                    "2.6.4"
                ]
            ]
        },
        {
            "name": "ewave",
            "specs": [
                [
                    "==",
                    "1.0.7"
                ]
            ]
        },
        {
            "name": "h5py",
            "specs": [
                [
                    "==",
                    "3.9.0"
                ]
            ]
        },
        {
            "name": "natsort",
            "specs": [
                [
                    "==",
                    "8.4.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "1.24.4"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    "==",
                    "4.66.3"
                ]
            ]
        }
    ],
    "lcname": "arfx"
}
