datarig

Name	datarig
Version	1.0.0
Summary	Simplified Downloading from Data Repositories with RESTful APIS
upload_time	2023-04-27 16:35:39
requires_python	>=3.9
keywords	data repositories rest packaging

<h1 align="center">
    <img src="https://github.com/mscaudill/datarig/blob/master/docs/imgs/logo.png" 
    style="width:700px;height:auto;"/>
</h1>

<p align="center">
  <a href="https://github.com/mscaudill/datarig/blob/master/LICENSE"><img
    src="https://img.shields.io/badge/License-BSD%203--Clause-teal" 
    alt="DataRig is released under the BSD 3-Clause license." />
  </a>
  <a href="https://github.com/mscaudill/datarig/tree/master#Dependencies"><img 
    src="https://img.shields.io/pypi/pyversions/datarig?logo=python&logoColor=gold" 
    alt="Python versions supported." />
  </a>
  <a href="https://github.com/mscaudill/datarig/actions/workflows/test.yml"><img 
    src="https://img.shields.io/github/actions/workflow/status/mscaudill/datarig/test.yml?label=CI&logo=github" 
    alt="DataRig's test status" />
  </a>
 <a href="https://github.com/mscaudill/datarig/pulls"><img 
    src="https://img.shields.io/badge/PRs-welcome-F8A3A3"
    alt="Pull requests welcome!" />
  </a>
</p>

<p align="center"  style="font-size: 20px">
<a href="#Features">Features</a>   |  
<a href="#Installation">Installation</a>   |  
<a href="#Dependencies">Dependencies</a>   |  
<a href="#Documentation">Documentation</a>   |  
<a href="#Attribution">Attribution</a>   |  
<a href="#Contributions">Contributions</a>   |  
<a href="#Issues">Issues</a>   |  
<a href="#Acknowledgements">Acknowledgements</a> 
</p>

# Features
Providing large testing and demo data alongside your package releases is
challenging for two reasons. First, code repositories have strict limits on file
sizes. Second, you don't want your users to wait forever to download your cool
package because you've included large data files. If you're a Python developer
and have hit these issues, then <b><a href=https://github.com/mscaudill/datarig
target=_blank>DataRig</a></b> is for you. DataRig allows you to
move data from web-based repositories into your users' local directories
post-installation. This "just-in-time" data fetching is perfect for users who
want to test your package or run its demos.
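
To make the "just-in-time" idea concrete, here is a minimal sketch of a fetch-if-missing helper. It is not part of DataRig: `ensure_local` and its `fetch` argument are hypothetical names, and `fetch` stands in for whatever downloader you use.

```python
from pathlib import Path
from typing import Callable


def ensure_local(name: str, directory: Path,
                 fetch: Callable[[Path, str], None]) -> Path:
    """Return the local path of `name`, calling `fetch` only if it is missing.

    `fetch(directory, name)` is a hypothetical callable that must write the
    file `name` into `directory`; it stands in for a real downloader.
    """
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / name
    if not target.exists():
        # The data is downloaded on first use, not at install time.
        fetch(directory, name)
    if not target.exists():
        raise FileNotFoundError(f"fetch did not create {target}")
    return target
```

A demo or test can then call `ensure_local(...)` before loading its data, so the download happens once and later runs reuse the local copy. A DataRig record's `download` method could plausibly be wrapped as the `fetch` callable, assuming it accepts the target directory as an argument.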

# Installation
DataRig can be installed into your project's environment using pip:

1. Activate the virtual or conda environment of your package
```Shell
$ source <YOUR_ENV>/bin/activate # python virtual environment
```

```Shell
$ conda activate <YOUR_ENV> # conda environment
```

2. Install DataRig to your active environment
```Shell
(<YOUR_ENV>)$ pip install datarig
```
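
If you'd like to confirm the install from Python, the sketch below uses only the standard library. The helper names `installed_version` and `python_is_supported` are made up for this example; the 3.9 minimum comes from the package's stated Python requirement.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(minimum=(3, 9)) -> bool:
    """Check the running interpreter against a minimum (major, minor) version."""
    return sys.version_info[:2] >= minimum


# Prints None for the version if datarig is not installed in this environment.
print("datarig version:", installed_version("datarig"))
print("Python >= 3.9:", python_is_supported())
```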

# Dependencies

DataRig is super lightweight, requiring just <b>Python <span>&#8805;</span>
3.9</b> and the requests library, available here:

<table>

<tr>
    <th>package</th>
    <th>pypi</th>
    <th>conda</th>
  </tr>

<tr>
    <td><a href="https://requests.readthedocs.io/en/latest/" 
        target=_blank>requests</a></td>
    <td>https://pypi.org/project/requests/</td>
    <td align='center'><span>&#10003;</span></td>
  </tr>

</table>

# Documentation
Using DataRig to access a repository is simple. Just build a <b>Record</b>
instance and all the data will be at your fingertips. Here's how to do it for
a sample Zenodo repository:
```Shell
$ ipython
```
```python
>>> from datarig import Zenodo
>>> # set the url to the api endpoint url for the record id 7868945
>>> url = 'http://zenodo.org/api/records/7868945'
>>> record = Zenodo(url)
```
This record contains all of the repository's information stored as attributes.
To see everything at once, just print the record.
```python
>>> print(record)
```
You will see a datasets attribute with a list of Dataset objects. These Datasets
contain the name, URL, size and file type of the datasets that can be
downloaded from the repository record. Let's print each of them.
```python
>>> for dset in record.datasets:
...     print(dset)
```
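
One way to picture such a descriptor is as a small metadata-only record. The sketch below is a hypothetical stand-in, not DataRig's actual `Dataset` class: the `FileInfo` name, its field names, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for a dataset descriptor: metadata only, no data."""
    name: str
    url: str
    size: int  # size in bytes
    kind: str  # file type, e.g. "npy"


def find_by_name(infos: Iterable[FileInfo], name: str) -> Optional[FileInfo]:
    """Return the first descriptor whose name matches, or None."""
    return next((info for info in infos if info.name == name), None)


# Made-up values for illustration; the URLs and sizes are placeholders.
infos = [
    FileInfo("sample_arr.npy", "https://example.org/files/sample_arr.npy", 1024, "npy"),
    FileInfo("notes.txt", "https://example.org/files/notes.txt", 120, "txt"),
]
match = find_by_name(infos, "sample_arr.npy")
print(match)
```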
Notice that a Dataset instance describes the data but does not contain the
actual data. To get the data to your machine, you call the record's
'download' method. Let's get help for this method before calling it.
```python
>>> help(record.download)
```
To call this method we need a directory to place the downloaded data, the name
of the dataset to download, the amount of memory to use during downloading
(chunksize) and a boolean indicating whether the download should be streamed to
disk. Streaming is usually the right choice since the files you will download
are likely large. Let's download the "sample_arr.npy" file from this record into
your current working directory.
```python
>>> # directory=None is used here to save into the current working directory
>>> record.download(directory=None, name='sample_arr.npy')
```

That's it! You've just downloaded a dataset from a Zenodo record :sunglasses:
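
If you're curious what streaming with a chunksize means in practice, here is a minimal standard-library sketch. It is not DataRig's implementation, which this README does not show: `stream_to_disk` is a hypothetical helper, its default chunksize is arbitrary, and an in-memory buffer stands in for a network response.

```python
import io
import tempfile
from pathlib import Path
from typing import BinaryIO


def stream_to_disk(source: BinaryIO, target: Path, chunksize: int = 2048) -> int:
    """Copy `source` to `target` in chunks of at most `chunksize` bytes.

    Returns the number of bytes written. Only one chunk is held in memory
    at a time, which is the point of streaming large files.
    """
    written = 0
    with open(target, "wb") as outfile:
        while True:
            chunk = source.read(chunksize)
            if not chunk:  # an empty read means the source is exhausted
                break
            outfile.write(chunk)
            written += len(chunk)
    return written


# Demo: an in-memory buffer plays the role of the remote file.
with tempfile.TemporaryDirectory() as tmp:
    payload = b"x" * 5000
    count = stream_to_disk(io.BytesIO(payload), Path(tmp) / "demo.bin", chunksize=1024)
    print(count)
```

With the requests library, the analogous pattern is to open the response with `stream=True` and loop over `response.iter_content(chunk_size=...)`, writing each chunk to the open file.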


# Attribution
If you find DataRig useful, please cite the Zenodo archive of this repository.

If you really like DataRig, you can also star the <a
href=https://github.com/mscaudill/datarig>repository</a> 
<span>&#11088;</span>!

# Contributions
Contributions are what make open source fun, and we would love for you to
contribute. Please check out our [contribution guide](
https://github.com/mscaudill/datarig/blob/master/.github/CONTRIBUTING.md)
to get started.

# Issues

DataRig provides custom issue templates for filing bugs, requesting
feature enhancements, suggesting documentation changes, or just asking
questions. *Ready to discuss?* File an issue <a
href=https://github.com/mscaudill/datarig/issues/new/choose>here</a>. 

# Acknowledgements

**This work is generously supported through the Ting Tsung and Wei Fong Chao 
Foundation and the National Institute of Neurological Disorders and Stroke 
(Grant 2R01 NS100738-05A1).**

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "datarig",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "data,repositories,REST,packaging",
    "author": "",
    "author_email": "Matthew Caudill <mscaudill@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/ca/d5/aea409a3404889dbc7d028784c0fced4c023e92373ce5850af8bb9d1fcf4/datarig-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Simplified Downloading from Data Repositories with RESTful APIS",
    "version": "1.0.0",
    "split_keywords": [
        "data",
        "repositories",
        "rest",
        "packaging"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ff9ef7e5107b02c3c39de9a8b7df02484db2723a1b273fef9bfdc80bdae9f2ce",
                "md5": "fe54c14a849489c7a4403b519c6424b8",
                "sha256": "91216753b6b97fdc07dc47e56bd2b3f134f7fa0edec2eb0d7cef5b8e30c38c16"
            },
            "downloads": -1,
            "filename": "datarig-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fe54c14a849489c7a4403b519c6424b8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 9329,
            "upload_time": "2023-04-27T16:35:37",
            "upload_time_iso_8601": "2023-04-27T16:35:37.643133Z",
            "url": "https://files.pythonhosted.org/packages/ff/9e/f7e5107b02c3c39de9a8b7df02484db2723a1b273fef9bfdc80bdae9f2ce/datarig-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cad5aea409a3404889dbc7d028784c0fced4c023e92373ce5850af8bb9d1fcf4",
                "md5": "096549c3d6cae1276cba5b8ecc5af754",
                "sha256": "a5c69d642503201bd833e2749a2282ef68bc0cc8b5890a3c44cce81cef01f1a4"
            },
            "downloads": -1,
            "filename": "datarig-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "096549c3d6cae1276cba5b8ecc5af754",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 12528,
            "upload_time": "2023-04-27T16:35:39",
            "upload_time_iso_8601": "2023-04-27T16:35:39.467614Z",
            "url": "https://files.pythonhosted.org/packages/ca/d5/aea409a3404889dbc7d028784c0fced4c023e92373ce5850af8bb9d1fcf4/datarig-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-27 16:35:39",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "datarig"
}