# taigapy
![Run tests](https://github.com/broadinstitute/taigapy/workflows/Run%20tests/badge.svg)
Python client for fetching datafiles from and creating/updating datasets in [Taiga](https://github.com/broadinstitute/taiga).
See [taigr](https://github.com/broadinstitute/taigr) for the R client.
## Table of Contents
- [Quickstart](#quickstart)
  - [Prerequisites](#prerequisites)
  - [Installing](#installing)
  - [Usage](#usage)
    - [Get datafile as dataframe](#get-datafile-as-dataframe)
    - [Download file](#download-file)
    - [Create dataset](#create-dataset)
    - [Update dataset](#update-dataset)
    - [Get dataset metadata](#get-dataset-metadata)
- [Support](#support)
- [Development](#development)
  - [Setup](#setup)
  - [Running Tests](#running-tests)
  - [Publishing Taigapy](#publishing-taigapy)
## Quickstart
### Prerequisites
First, you need to get your authorization token so the client library can make requests on your behalf. Go to https://cds.team/taiga/token/ and click on the "Copy" button to copy your token. Paste your token in a file at `~/.taiga/token`.
```bash
mkdir -p ~/.taiga/
echo YOUR_TOKEN_HERE > ~/.taiga/token
```
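If you prefer to set the token up from Python, the same two steps can be sketched as below. The helper `write_taiga_token` is hypothetical and not part of taigapy; restricting the file permissions is an added precaution, not something the client is known to require.

```python
from pathlib import Path


def write_taiga_token(token: str, home: Path = Path.home()) -> Path:
    """Write `token` to <home>/.taiga/token and return the file path."""
    token_dir = home / ".taiga"
    # exist_ok avoids an error if the directory was created earlier
    token_dir.mkdir(parents=True, exist_ok=True)
    token_path = token_dir / "token"
    # Strip stray whitespace picked up when pasting the token
    token_path.write_text(token.strip() + "\n")
    # Assumption: owner-only permissions are a precaution, not a taigapy requirement
    token_path.chmod(0o600)
    return token_path
```

Calling `write_taiga_token("YOUR_TOKEN_HERE")` writes to `~/.taiga/token`; pass `home=` to target a different directory, for example in tests.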
### Installing
Use the package manager [pip](https://pip.pypa.io/en/stable/) to install taigapy.
```bash
pip install taigapy
```
### Usage
See [docs](docs/) for the complete documentation.
#### Get datafile as dataframe
Get a NumericMatrix/HDF5 or TableCSV/Columnar file from Taiga as a [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/frame.html).
```python
from taigapy import TaigaClient
tc = TaigaClient() # The import and this line can be merged into one with `from taigapy import default_tc as tc`
df = tc.get("achilles-v2-4-6.4/data") # df is a pandas DataFrame with data from the file 'data' in version 4 of the dataset 'achilles-v2-4-6'
```
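The ID string in the example above packs three pieces of information: dataset name, version, and file name. As a rough illustration of that layout, the hypothetical helper below (not part of taigapy) splits an ID of the form `name.version/file`. It only covers the form shown here; taigapy may accept other forms, such as the file-less `achilles-v2-4-6.4` used for metadata.

```python
from typing import NamedTuple


class TaigaId(NamedTuple):
    dataset_name: str
    version: int
    file_name: str


def parse_taiga_id(taiga_id: str) -> TaigaId:
    """Split 'dataset-name.version/file' into its three parts."""
    dataset_and_version, _, file_name = taiga_id.partition("/")
    # rpartition splits on the last '.', so the version is the trailing number
    dataset_name, _, version = dataset_and_version.rpartition(".")
    if not dataset_name or not version.isdigit() or not file_name:
        raise ValueError(f"expected 'name.version/file', got {taiga_id!r}")
    return TaigaId(dataset_name, int(version), file_name)


print(parse_taiga_id("achilles-v2-4-6.4/data"))
# TaigaId(dataset_name='achilles-v2-4-6', version=4, file_name='data')
```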
#### Download file
Download the raw file from Taiga (plaintext for `Raw` files, CSV otherwise).
```python
from taigapy import default_tc as tc
path = tc.download_to_cache("achilles-v2-4-6.4/data") # path is the local path to the downloaded CSV
```
#### Create dataset
Create a new dataset in the folder with id `folder_id`, with local files `upload_files`, virtual files `add_taiga_ids`, and Google Cloud Storage files `add_gcs_files`.
```python
from taigapy import default_tc as tc
new_dataset_id = tc.create_dataset(
    "dataset_name",
    dataset_description="description", # optional (but recommended)
    upload_files=[
        {
            "path": "path/to/file",
            "name": "name of file in dataset", # optional, will use file name if not provided
            "format": "Raw", # or "NumericMatrixCSV" or "TableCSV"
            "encoding": "utf-8" # optional (but recommended), will use iso-8859-1 if not provided
        }
    ],
    add_taiga_ids=[
        {
            "taiga_id": "achilles-v2-4-6.4/data",
            "name": "name in new dataset" # optional, will use name in referenced dataset if not provided (required if there is a name collision)
        }
    ],
    add_gcs_files=[
        {
            "gcs_path": "gs://bucket_name/file_name.extension",
            "name": "name of file in dataset",
        }
    ],
    folder_id="folder_id", # optional, will default to your home folder if not provided
)
```
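When building many `upload_files` entries, a small helper can catch a mistyped format before anything is sent to Taiga. The sketch below is hypothetical (`make_upload_file` and `ALLOWED_FORMATS` are not taigapy names) and assumes the three formats listed in the example above are the only valid ones.

```python
from typing import Dict, Optional

# Assumption: these are the formats named in the example above
ALLOWED_FORMATS = {"Raw", "NumericMatrixCSV", "TableCSV"}


def make_upload_file(
    path: str,
    file_format: str,
    name: Optional[str] = None,
    encoding: str = "utf-8",
) -> Dict[str, str]:
    """Build one `upload_files` entry, rejecting unknown formats early."""
    if file_format not in ALLOWED_FORMATS:
        raise ValueError(
            f"unknown format {file_format!r}; expected one of {sorted(ALLOWED_FORMATS)}"
        )
    entry = {"path": path, "format": file_format, "encoding": encoding}
    # "name" is optional; leave it out so the file name is used instead
    if name is not None:
        entry["name"] = name
    return entry


upload_files = [
    make_upload_file("path/to/matrix.csv", "NumericMatrixCSV", name="matrix"),
    make_upload_file("path/to/notes.txt", "Raw"),
]
```

The resulting list can be passed as the `upload_files` argument shown above.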
#### Update dataset
Create a new version of the dataset with permaname `dataset_permaname`, with local files `upload_files`, virtual files `add_taiga_ids`, and Google Cloud Storage files `add_gcs_files`.
```python
from taigapy import default_tc as tc
new_dataset_id = tc.update_dataset(
    "dataset_permaname",
    changes_description="description",
    upload_files=[
        {
            "path": "path/to/file",
            "name": "name of file in dataset", # optional, will use file name if not provided
            "format": "Raw", # or "NumericMatrixCSV" or "TableCSV"
            "encoding": "utf-8" # optional (but recommended), will use iso-8859-1 if not provided
        }
    ],
    add_taiga_ids=[
        {
            "taiga_id": "achilles-v2-4-6.4/data",
            "name": "name in new dataset" # optional, will use name in referenced dataset if not provided (required if there is a name collision)
        }
    ],
    add_gcs_files=[
        {
            "gcs_path": "gs://bucket_name/file_name.extension",
            "name": "name of file in dataset",
        }
    ],
    add_all_existing_files=True, # If True, will add all files from the base dataset version, except files with the same names as those in upload_files or add_taiga_ids
)
```
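The name-collision rule described in the `add_all_existing_files` comment can be illustrated in plain Python. This is only a model of the rule as stated in that comment, not taigapy's implementation; the function name and the ordering of the result are assumptions made for the example.

```python
from typing import Iterable, List


def files_in_new_version(
    existing_names: Iterable[str],
    upload_names: Iterable[str],
    taiga_id_names: Iterable[str],
) -> List[str]:
    """Model which file names end up in the new version when
    add_all_existing_files=True."""
    upload_names = list(upload_names)
    taiga_id_names = list(taiga_id_names)
    # Existing files are dropped when a new file reuses their name
    replaced = set(upload_names) | set(taiga_id_names)
    carried_over = [name for name in existing_names if name not in replaced]
    return carried_over + upload_names + taiga_id_names


print(files_in_new_version(["data", "readme"], ["data"], ["annotations"]))
# ['readme', 'data', 'annotations']
```

Here the existing `data` file is replaced by the uploaded one, while `readme` is carried over unchanged.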
#### Get dataset metadata
Get metadata about a dataset or dataset version. See the returned fields in the [TaigaClient API](docs/TaigaClient%20API.md#returns-4).
```python
from taigapy import default_tc as tc
metadata = tc.get_dataset_metadata("achilles-v2-4-6.4")
```
## Support
Please [open an issue](https://github.com/broadinstitute/taigapy/issues) if you find a bug, or email yejia@broadinstitute.org for general assistance.
## Development
### Setup
In an environment with Python 3.6, run `sh setup.sh` to set up requirements and git hooks.
Run `python setup.py develop`.
### Running Tests
The fetch tests (i.e. for `get`, `download_to_cache`, `get_dataset_metadata`, etc.) run against the production Taiga server. The create and update dataset tests run against your locally hosted Taiga.
To run the fetch tests, run `pytest`.
To run all the tests, [set up Taiga locally](https://github.com/broadinstitute/taiga#installing), then run `pytest --runlocal`.
### Publishing Taigapy
To create a new version, please update the version number in `taigapy/__init__.py` and `git tag` the commit with that version number. Push the tags to GitHub and create a new release with the tag. Update the [changelog](CHANGELOG.md) with the changes.
Publish a new version of taigapy to PyPI by executing `publish_new_taigapy_pypi.sh`, which will do the following:
1. `rm -r dist/`
2. `python setup.py bdist_wheel --universal`
3. `twine upload dist/*`
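
Before publishing, it can help to confirm that the git tag and the version in `taigapy/__init__.py` agree. The sketch below is a hypothetical pre-release check, not part of the repository; it assumes the version is stored as a `__version__ = "..."` assignment and that tags may carry an optional leading `v`.

```python
import re
from pathlib import Path


def read_version(init_path: Path) -> str:
    """Return the version string assigned to __version__ in `init_path`."""
    # Assumption: the file contains a line like __version__ = "3.3.5"
    match = re.search(
        r"""^__version__\s*=\s*["']([^"']+)["']""",
        init_path.read_text(),
        re.MULTILINE,
    )
    if match is None:
        raise ValueError(f"no __version__ assignment found in {init_path}")
    return match.group(1)


def tag_matches_version(tag: str, version: str) -> bool:
    """Compare a git tag with a version, ignoring one leading 'v'."""
    normalized = tag[1:] if tag.startswith("v") else tag
    return normalized == version
```

For example, `tag_matches_version("v3.3.5", read_version(Path("taigapy/__init__.py")))` would be expected to return `True` once the file and tag have both been updated to 3.3.5.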
Raw data
{
"_id": null,
"home_page": "",
"name": "taigapy",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "Remi Marenco",
"author_email": "rmarenco@broadinstitute.org",
"download_url": "",
"platform": null,
"bugtrack_url": null,
"license": "Creative Commons Attribution-Noncommercial-Share Alike license",
"summary": "",
"version": "3.3.5",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c30cd871ffd94fbf716f18e327f1ea3ea7e2842febff9efcb269abc68b0b1d50",
"md5": "6e59fe2e9e4a50fbe5e262e14d7e25a5",
"sha256": "0c5baba3e0cb1604a71d0243b63c82025d28b6d70584c15f50482e9be2b1bebb"
},
"downloads": -1,
"filename": "taigapy-3.3.5-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "6e59fe2e9e4a50fbe5e262e14d7e25a5",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 24376,
"upload_time": "2023-04-14T15:17:45",
"upload_time_iso_8601": "2023-04-14T15:17:45.550502Z",
"url": "https://files.pythonhosted.org/packages/c3/0c/d871ffd94fbf716f18e327f1ea3ea7e2842febff9efcb269abc68b0b1d50/taigapy-3.3.5-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-04-14 15:17:45",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "taigapy"
}