:Name: rdata
:Version: 0.11.2
:Summary: Read R datasets from Python.
:Upload time: 2024-03-04 12:39:07
:Requires Python: >=3.9
:License: MIT License Copyright (c) 2018 Rdata developers. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
:Keywords: rdata, r, dataset

rdata
=====

|build-status| |docs| |coverage| |repostatus| |versions| |pypi| |conda| |zenodo| |pyOpenSci|

Read R datasets from Python.

..
	Github does not support include in README for dubious security reasons, so
	we copy-paste instead. Also Github does not understand Sphinx directives.
	.. include:: docs/index.rst
	.. include:: docs/simpleusage.rst

The package rdata offers a lightweight way to import R datasets/objects stored
in the ".rda" and ".rds" formats into Python.
Its main advantages are:

- It is a pure Python implementation, with no dependencies on the R language or
  related libraries.
  Thus, it can be used anywhere Python is supported, including the web
  using `Pyodide <https://pyodide.org/>`__.
- It attempts to support all R objects that can be meaningfully translated.
  Unlike other solutions, you are not limited to importing dataframes or
  data with a particular structure.
- It allows users to easily customize the conversion of R classes to Python
  ones.
  Does your data use custom R classes?
  Worry no longer, as it is possible to define custom conversions to the Python
  classes of your choosing.
- It has a permissive license (MIT). Unlike other packages that depend
  on R libraries and thus need to adhere to the GPL license, you can use rdata
  as a dependency in MIT, BSD, or even closed-source projects.
	
Installation
============

rdata is on PyPI and can be installed using :code:`pip`:

.. code::

   pip install rdata

It is also available for :code:`conda` using the :code:`conda-forge` channel:

.. code::

   conda install -c conda-forge rdata
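
After either installation route, one way to confirm which version the
interpreter sees is to query the package metadata through the standard library.
This is a minimal sketch; the helper name :code:`installed_version` is made up
for this example and is not part of rdata:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "rdata") -> Optional[str]:
    """Return the installed version of ``package``, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if rdata is installed, otherwise None.
print(installed_version())
```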
   
Installing the develop version
------------------------------

The current version from the develop branch can be installed with:

.. code::

   pip install git+https://github.com/vnmabus/rdata.git@develop

Documentation
=============

The documentation of rdata is hosted on
`ReadTheDocs <https://rdata.readthedocs.io/>`__.

Examples
========

Examples of use are available in
`ReadTheDocs <https://rdata.readthedocs.io/en/stable/auto_examples/>`__.
	
Simple usage
============

Read an R dataset
-----------------

The usual way to read an R dataset is the following:

.. code:: python

    import rdata

    converted = rdata.read_rda(rdata.TESTDATA_PATH / "test_vector.rda")
    converted
    
which results in

.. code::

    {'test_vector': array([1., 2., 3.])}

Under the hood, this is equivalent to the following code:

.. code:: python

    import rdata

    parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH / "test_vector.rda")
    converted = rdata.conversion.convert(parsed)
    converted
    
This consists of two steps:

#. First, the file is parsed using the function
   `rdata.parser.parse_file <https://rdata.readthedocs.io/en/latest/modules/rdata.parser.parse_file.html>`__.
   This provides a literal description of the
   file contents as a hierarchy of Python objects representing the basic R
   objects. This step is unambiguous and always the same.
#. Then, each object must be converted to an appropriate Python object. In this
   step there is often more than one reasonable Python type for a given R
   object. Thus, we provide a default
   `rdata.conversion.convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert.html>`__
   routine, which tries to select Python objects that preserve as much
   information as possible from the original R object. For custom R classes,
   it is also possible to specify conversion routines to Python objects.
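
To see why the second step involves a choice, consider missing values. R
encodes a missing integer (:code:`NA_integer_`) as the smallest 32-bit integer,
and Python has no single obvious counterpart. The toy sketch below shows two
defensible conversions of the same raw values. The function names and the
sample data are invented for illustration, and this is not how rdata itself is
implemented:

```python
import math

# R's in-memory encoding of a missing integer (NA_integer_).
R_NA_INTEGER = -2**31


def to_optional_ints(raw):
    """Choice 1: keep integers and represent NA as None."""
    return [None if value == R_NA_INTEGER else value for value in raw]


def to_floats(raw):
    """Choice 2: switch to floats and represent NA as NaN."""
    return [math.nan if value == R_NA_INTEGER else float(value) for value in raw]


# Hypothetical raw contents of an R integer vector c(1L, NA, 3L).
raw_values = [1, R_NA_INTEGER, 3]
print(to_optional_ints(raw_values))
print(to_floats(raw_values))
```

The first choice preserves the integer type but yields a heterogeneous list;
the second stays numeric but changes the type. A default converter has to pick
one such trade-off, which is why the conversion step is configurable.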
   
Convert custom R classes
------------------------

The basic
`convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert.html>`__
routine only constructs a
`SimpleConverter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html>`__
object and calls its
`convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html#rdata.conversion.SimpleConverter.convert>`__
method. All arguments of
`convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert.html>`__
are directly passed to the
`SimpleConverter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html>`__
initialization method.
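
Based on that description, the relationship can be sketched as follows. The
helper name :code:`convert_explicitly` is hypothetical, and the sketch assumes
that the options are passed as keyword arguments:

```python
def convert_explicitly(parsed, **kwargs):
    """Build a SimpleConverter from the options and convert the parsed data."""
    # Imported inside the function so that merely defining this helper
    # does not require rdata to be importable.
    from rdata.conversion import SimpleConverter

    # Keyword arguments (for example constructor_dict) configure the converter.
    converter = SimpleConverter(**kwargs)
    return converter.convert(parsed)
```

Under these assumptions, :code:`convert_explicitly(parsed, constructor_dict=d)`
should behave like :code:`rdata.conversion.convert(parsed, constructor_dict=d)`.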

It is possible, although not trivial, to make a custom
`Converter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.Converter.html>`__
object to change the way in which the
basic R objects are transformed to Python objects. However, a more common
situation is that one does not want to change how basic R objects are
converted, but instead wants to provide conversions for specific R classes.
This can be done by passing a dictionary to the
`SimpleConverter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html>`__
initialization method. Its keys are the names of R classes and its values are
callables that convert an R object of that class to a Python object. By
default, the dictionary used
is
`DEFAULT_CLASS_MAP <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.DEFAULT_CLASS_MAP.html>`__,
which can convert commonly used R classes such as
`data.frame <https://www.rdocumentation.org/packages/base/topics/data.frame>`__
and `factor <https://www.rdocumentation.org/packages/base/topics/factor>`__.
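
The idea of such a class map can be illustrated with a small stand-alone
dispatcher. This is only a conceptual sketch: the function
:code:`convert_with_class_map` and the sample data are invented here, and
rdata's actual lookup logic may differ. It relies on the fact that an R object
can carry several class names, ordered from most to least specific:

```python
def convert_with_class_map(obj, attrs, class_map):
    """Apply the first constructor registered for any of the object's R classes."""
    for r_class in attrs.get("class", []):
        constructor = class_map.get(r_class)
        if constructor is not None:
            return constructor(obj, attrs)
    # No class is registered: leave the object unchanged.
    return obj


# Keys are R class names; values are callables taking (obj, attrs).
class_map = {
    "factor": lambda obj, attrs: [attrs["levels"][i - 1] for i in obj],
}

print(convert_with_class_map([2, 1], {"class": ["factor"], "levels": ["x", "y"]}, class_map))
print(convert_with_class_map([2, 1], {"class": ["unknown"]}, class_map))
```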

As an example, here is how we would implement a conversion routine for the
factor class to
`bytes <https://docs.python.org/3/library/stdtypes.html#bytes>`__
objects, instead of the default conversion to
Pandas
`Categorical <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html#pandas.Categorical>`__ objects:

.. code:: python

    import rdata

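    def factor_constructor(obj, attrs):
        # R factor codes are 1-based indices into the levels; any code
        # that is not positive (such as a missing value) becomes None.
        values = [bytes(attrs['levels'][i - 1], 'utf8')
                  if i > 0 else None for i in obj]

        return values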

    new_dict = {
        **rdata.conversion.DEFAULT_CLASS_MAP,
        "factor": factor_constructor
    }

    converted = rdata.read_rda(
        rdata.TESTDATA_PATH / "test_dataframe.rda",
        constructor_dict=new_dict,
    )
    converted
    
which has the following result:

.. code::

    {'test_dataframe':   class  value
        1     b'a'      1
        2     b'b'      2
        3     b'b'      3}
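
Because a constructor is an ordinary callable taking :code:`(obj, attrs)`, it
can be tried out on plain Python values before it is wired into a class map.
The sketch below does this for a variant that returns strings. The function
name :code:`factor_to_strings` and the stand-in inputs are assumptions made for
this example; the values rdata actually passes may be NumPy arrays rather than
lists:

```python
# R's encoding of a missing integer (NA_integer_), used here as a sample NA code.
R_NA_INTEGER = -2**31


def factor_to_strings(obj, attrs):
    """Convert R factor codes to their level names, mapping NA to None."""
    levels = attrs["levels"]
    # Codes are 1-based; anything that is not positive is treated as missing.
    return [levels[code - 1] if code > 0 else None for code in obj]


# Plain Python stand-ins for what a converter would pass in.
codes = [1, 2, 2, R_NA_INTEGER]
attributes = {"levels": ["a", "b"]}
print(factor_to_strings(codes, attributes))
```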
    
Additional examples
===================

Additional examples illustrating the functionality of this package can be
found in the
`ReadTheDocs documentation <https://rdata.readthedocs.io/en/latest/auto_examples/index.html>`__.


.. |build-status| image:: https://github.com/vnmabus/rdata/actions/workflows/main.yml/badge.svg?branch=master
    :alt: build status
    :scale: 100%
    :target: https://github.com/vnmabus/rdata/actions/workflows/main.yml

.. |docs| image:: https://readthedocs.org/projects/rdata/badge/?version=latest
    :alt: Documentation Status
    :scale: 100%
    :target: https://rdata.readthedocs.io/en/latest/?badge=latest
    
.. |coverage| image:: http://codecov.io/github/vnmabus/rdata/coverage.svg?branch=develop
    :alt: Coverage Status
    :scale: 100%
    :target: https://codecov.io/gh/vnmabus/rdata/branch/develop

.. |repostatus| image:: https://www.repostatus.org/badges/latest/active.svg
   :alt: Project Status: Active – The project has reached a stable, usable state and is being actively developed.
   :target: https://www.repostatus.org/#active

.. |versions| image:: https://img.shields.io/pypi/pyversions/rdata
   :alt: PyPI - Python Version
   :scale: 100%
    
.. |pypi| image:: https://badge.fury.io/py/rdata.svg
    :alt: PyPI version
    :scale: 100%
    :target: https://pypi.python.org/pypi/rdata/

.. |conda| image:: https://anaconda.org/conda-forge/rdata/badges/version.svg
    :alt: Conda version
    :scale: 100%
    :target: https://anaconda.org/conda-forge/rdata

.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.6382237.svg
    :alt: Zenodo DOI
    :scale: 100%
    :target: https://doi.org/10.5281/zenodo.6382237
    
.. |pyOpenSci| image:: https://tinyurl.com/y22nb8up
    :alt: pyOpenSci: Peer reviewed
    :scale: 100%
    :target: https://github.com/pyOpenSci/software-submission/issues/144

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "rdata",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "Carlos Ramos Carre\u00f1o <vnmabus@gmail.com>",
    "keywords": "rdata,r,dataset",
    "author": "",
    "author_email": "Carlos Ramos Carre\u00f1o <vnmabus@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/47/07/97936fdd91fb71b4d48e0f72da65e35b40f992819ddf793abf390dc0f06e/rdata-0.11.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "summary": "Read R datasets from Python.",
    "version": "0.11.2",
    "project_urls": {
        "documentation": "https://rdata.readthedocs.io",
        "homepage": "https://github.com/vnmabus/rdata",
        "repository": "https://github.com/vnmabus/rdata"
    },
    "split_keywords": [
        "rdata",
        "r",
        "dataset"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "df0b56f33362cb4e4319e7de8dff31ea1f27517df8f4087066bc946b2272324d",
                "md5": "a7c3b853b047e16643ba2bd1138a174a",
                "sha256": "d819241bcec2aaaf5d267256cbdbcbe4fcbfae66b605e7a34980049f80521450"
            },
            "downloads": -1,
            "filename": "rdata-0.11.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a7c3b853b047e16643ba2bd1138a174a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 46478,
            "upload_time": "2024-03-04T12:39:05",
            "upload_time_iso_8601": "2024-03-04T12:39:05.558833Z",
            "url": "https://files.pythonhosted.org/packages/df/0b/56f33362cb4e4319e7de8dff31ea1f27517df8f4087066bc946b2272324d/rdata-0.11.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "470797936fdd91fb71b4d48e0f72da65e35b40f992819ddf793abf390dc0f06e",
                "md5": "abccd933dc71996425e4241e78927008",
                "sha256": "86f50312f97569c656f01d6dc343b920ded0ccf884a31decfb670cbef80bab39"
            },
            "downloads": -1,
            "filename": "rdata-0.11.2.tar.gz",
            "has_sig": false,
            "md5_digest": "abccd933dc71996425e4241e78927008",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 38531,
            "upload_time": "2024-03-04T12:39:07",
            "upload_time_iso_8601": "2024-03-04T12:39:07.358941Z",
            "url": "https://files.pythonhosted.org/packages/47/07/97936fdd91fb71b4d48e0f72da65e35b40f992819ddf793abf390dc0f06e/rdata-0.11.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-04 12:39:07",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "vnmabus",
    "github_project": "rdata",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "rdata"
}
        