<h1 align="center">
PyStow
</h1>
<p align="center">
<a href="https://github.com/cthoyt/pystow/actions">
<img src="https://github.com/cthoyt/pystow/workflows/Tests/badge.svg" alt="Build status" height="20" />
</a>
<a href="https://pypi.org/project/pystow">
<img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/pystow">
</a>
<a href='https://opensource.org/licenses/MIT'>
<img src='https://img.shields.io/badge/License-MIT-blue.svg' alt='License'/>
</a>
<a href='https://pystow.readthedocs.io/en/latest/?badge=latest'>
<img src='https://readthedocs.org/projects/pystow/badge/?version=latest' alt='Documentation Status' />
</a>
<a href="https://zenodo.org/badge/latestdoi/318194121">
<img src="https://zenodo.org/badge/318194121.svg" alt="DOI">
</a>
<a href="https://github.com/psf/black">
<img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black">
</a>
</p>
👜 Easily pick a place to store data for your Python code.
## 🚀 Getting Started
Get a directory for your application.
```python
import pystow
# Get a directory (as a pathlib.Path) for ~/.data/pykeen
pykeen_directory = pystow.join('pykeen')
# Get a subdirectory (as a pathlib.Path) for ~/.data/pykeen/experiments
pykeen_experiments_directory = pystow.join('pykeen', 'experiments')
# You can go as deep as you want
pykeen_deep_directory = pystow.join('pykeen', 'experiments', 'a', 'b', 'c')
```
If you reuse the same directory structure a lot, you can save it in a module:
```python
import pystow
pykeen_module = pystow.module("pykeen")
# Access the module's directory with .base
assert pystow.join("pykeen") == pystow.module("pykeen").base
# Get a subdirectory (as a pathlib.Path) for ~/.data/pykeen/experiments
pykeen_experiments_directory = pykeen_module.join('experiments')
# You can go as deep as you want past the original "pykeen" module
pykeen_deep_directory = pykeen_module.join('experiments', 'a', 'b', 'c')
```
Get a file path for your application by adding the `name` keyword argument. This is made explicit so PyStow knows which
parent directories to automatically create. This works with `pystow` or any module you create with `pystow.module`.
```python
import pystow
# Get a file path (as a pathlib.Path) for ~/.data/indra/database/database.tsv
indra_database_path = pystow.join('indra', 'database', name='database.tsv')
```
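To make the split between directory parts and the `name` argument concrete, here is a minimal standard-library sketch of the behavior described above. The helper `join_sketch` and its `base` parameter are hypothetical and are not part of PyStow; the sketch only illustrates why the file name is passed separately from the directories that need to be created.

```python
import tempfile
from pathlib import Path
from typing import Optional


def join_sketch(base: Path, *parts: str, name: Optional[str] = None) -> Path:
    """Hypothetical stand-in for the behavior described above (not PyStow's code)."""
    directory = base.joinpath(*parts)
    # Every positional part is treated as a directory and created if missing
    directory.mkdir(parents=True, exist_ok=True)
    # The optional name is appended as a file name; the file itself is not created
    return directory if name is None else directory / name


with tempfile.TemporaryDirectory() as tmp:
    path = join_sketch(Path(tmp), "indra", "database", name="database.tsv")
    parent_exists = path.parent.is_dir()
    file_exists = path.exists()
```

In this sketch, the parent directory exists after the call while the file does not, so the caller is free to write to the returned path.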
Ensure a file from the internet is available in your application's directory:
```python
import pystow
url = 'https://raw.githubusercontent.com/pykeen/pykeen/master/src/pykeen/datasets/nations/test.txt'
path = pystow.ensure('pykeen', 'datasets', 'nations', url=url)
```
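The idea behind "ensuring" a file can be sketched with the standard library alone: download once, then reuse the local copy. The function `ensure_sketch` below is hypothetical, and deriving the local file name from the last URL segment is an assumption made for illustration rather than a statement about PyStow's internals.

```python
import tempfile
import urllib.request
from pathlib import Path


def ensure_sketch(directory: Path, url: str) -> Path:
    """Hypothetical download-if-missing helper (not PyStow's implementation)."""
    directory.mkdir(parents=True, exist_ok=True)
    # Assumption: the local file name is the last segment of the URL
    path = directory / url.rstrip("/").rsplit("/", 1)[-1]
    if not path.exists():
        urllib.request.urlretrieve(url, path)
    return path


with tempfile.TemporaryDirectory() as tmp:
    # A file:// URL stands in for a remote resource so the sketch runs offline
    source = Path(tmp) / "source" / "test.txt"
    source.parent.mkdir()
    source.write_text("example\n")
    cache = Path(tmp) / "cache"
    first = ensure_sketch(cache, source.as_uri())
    source.unlink()  # the second call must be served from the local copy
    second = ensure_sketch(cache, source.as_uri())
    content = second.read_text()
```

The second call succeeds even though the source has been deleted, because the cached copy already exists.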
Ensure a tabular data file from the internet and load it for usage (requires `pip install pandas`):
```python
import pystow
import pandas as pd
url = 'https://raw.githubusercontent.com/pykeen/pykeen/master/src/pykeen/datasets/nations/test.txt'
df: pd.DataFrame = pystow.ensure_csv('pykeen', 'datasets', 'nations', url=url)
```
Ensure a comma-separated tabular data file from the internet and load it for usage (requires `pip install pandas`):
```python
import pystow
import pandas as pd
url = 'https://raw.githubusercontent.com/cthoyt/pystow/main/tests/resources/test_1.csv'
df: pd.DataFrame = pystow.ensure_csv('pykeen', 'datasets', 'nations', url=url, read_csv_kwargs=dict(sep=","))
```
Ensure an RDF file from the internet and load it for usage (requires `pip install rdflib`):
```python
import pystow
import rdflib
url = 'https://ftp.expasy.org/databases/rhea/rdf/rhea.rdf.gz'
rdf_graph: rdflib.Graph = pystow.ensure_rdf('rhea', url=url)
```
Also see `pystow.ensure_excel()`, `pystow.ensure_rdf()`, `pystow.ensure_zip_df()`, and `pystow.ensure_tar_df()`.
If your data comes with a lot of different files in an archive,
you can ensure the archive is downloaded and get specific files from it:
```python
import numpy as np
import pystow
url = "https://cloud.enterprise.informatik.uni-leipzig.de/index.php/s/LHPbMCre7SLqajB/download/MultiKE_D_Y_15K_V1.zip"
# the path inside the archive to the file you want
inner_path = "MultiKE/D_Y_15K_V1/721_5fold/1/20210219183115/ent_embeds.npy"
with pystow.ensure_open_zip("kiez", url=url, inner_path=inner_path) as file:
    emb = np.load(file)
```
Also see `pystow.module.ensure_open_lzma()`, `pystow.module.ensure_open_tarfile()` and `pystow.module.ensure_open_gz()`.
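For readers unfamiliar with reading a single member out of an archive, the following sketch shows the underlying technique with the standard-library `zipfile` module only. It does not use PyStow, and the archive name and member paths are made up for the example.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "example.zip"
    # Build a small archive so the sketch is self-contained
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("data/a.txt", "first file\n")
        archive.writestr("data/b.txt", "second file\n")

    # Open only the requested member; the archive is never extracted to disk
    inner_path = "data/b.txt"
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(inner_path) as file:
            text = file.read().decode("utf-8")
```

The member is opened as a binary file-like object, which is why it can be handed directly to loaders such as `np.load` in the example above.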
## ⚙️️ Configuration
By default, data is stored in the `$HOME/.data` directory, so the `<app>` app will create the
`$HOME/.data/<app>` folder.
If you want to use a folder name other than `.data` inside the home directory, you can set the `PYSTOW_NAME`
environment variable. For example, if you set `PYSTOW_NAME=mydata`, then the following code for the `pykeen` app will
create the `$HOME/mydata/pykeen/` directory:
```python
import os
import pystow
# Only for demonstration purposes. You should set environment
# variables either in your .bashrc or on the command line.
os.environ['PYSTOW_NAME'] = 'mydata'
# Get a directory (as a pathlib.Path) for ~/mydata/pykeen
pykeen_directory = pystow.join('pykeen')
```
If you want to specify a completely custom directory that isn't relative to your home directory, you can set
the `PYSTOW_HOME` environment variable. For example, if you set `PYSTOW_HOME=/usr/local/`, then the following code for
the `pykeen` app will create the `/usr/local/pykeen/` directory:
```python
import os
import pystow
# Only for demonstration purposes. You should set environment
# variables either in your .bashrc or on the command line.
os.environ['PYSTOW_HOME'] = '/usr/local/'
# Get a directory (as a pathlib.Path) for /usr/local/pykeen
pykeen_directory = pystow.join('pykeen')
```
Note: if you set `PYSTOW_HOME`, then `PYSTOW_NAME` is disregarded.
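The precedence between the two variables can be summarized in a short sketch. The function `resolve_base_sketch` is hypothetical and only mirrors the rules stated above; unlike PyStow, it does not create any directories, and it takes the environment as a plain mapping so that the rules can be tried without changing real environment variables.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_base_sketch(app: str, environ: Mapping[str, str] = os.environ) -> Path:
    """Hypothetical illustration of the precedence described above."""
    home = environ.get("PYSTOW_HOME")
    if home:
        # PYSTOW_HOME wins and PYSTOW_NAME is disregarded
        return Path(home) / app
    # Otherwise the folder lives in the home directory, named .data by default
    name = environ.get("PYSTOW_NAME", ".data")
    return Path.home() / name / app


default_path = resolve_base_sketch("pykeen", {})
named_path = resolve_base_sketch("pykeen", {"PYSTOW_NAME": "mydata"})
custom_path = resolve_base_sketch(
    "pykeen", {"PYSTOW_HOME": "/usr/local/", "PYSTOW_NAME": "mydata"}
)
```

Here `default_path` points inside `.data` in the home directory, `named_path` inside `mydata`, and `custom_path` ignores `PYSTOW_NAME` entirely.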
### X Desktop Group (XDG) Compatibility
While PyStow's main goal is to make application data less opaque and less
hidden, some users might want to use the
[XDG specifications](http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html)
for storing their app data.
If you set the environment variable `PYSTOW_USE_APPDIRS` to `true` or `True`, then the
[`appdirs`](https://pypi.org/project/appdirs/) package will be used to choose
the base directory based on the `user data dir` option. This can still be
overridden by `PYSTOW_HOME`.
## 🚀 Installation
The most recent release can be installed from
[PyPI](https://pypi.org/project/pystow/) with:
```bash
$ pip install pystow
```
Note, as of v0.3.0, Python 3.6 isn't officially supported (its
end-of-life was in December 2021). For the time being, `pystow` might still
work on py36, but this is only coincidental.
The most recent code and data can be installed directly from GitHub with:
```bash
$ pip install git+https://github.com/cthoyt/pystow.git
```
To install in development mode, use the following:
```bash
$ git clone https://github.com/cthoyt/pystow.git
$ cd pystow
$ pip install -e .
```
## ⚖️ License
The code in this package is licensed under the MIT License.
Raw data
{
"_id": null,
"home_page": "https://github.com/cthoyt/pystow",
"name": "pystow",
"maintainer": "Charles Tapley Hoyt",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "cthoyt@gmail.com",
"keywords": "caching, file management",
"author": "Charles Tapley Hoyt",
"author_email": "cthoyt@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/a5/9a/7677a2b8101da65f30d49144910088168ea3acd1ecbe4ee989dfae1f41bc/pystow-0.5.4.tar.gz",
"platform": null,
"description": "<h1 align=\"center\">\n PyStow\n</h1>\n\n<p align=\"center\">\n <a href=\"https://github.com/cthoyt/pystow/actions\">\n <img src=\"https://github.com/cthoyt/pystow/workflows/Tests/badge.svg\" alt=\"Build status\" height=\"20\" />\n </a>\n\n <a href=\"https://pypi.org/project/pystow\">\n <img alt=\"PyPI - Python Version\" src=\"https://img.shields.io/pypi/pyversions/pystow\">\n </a>\n\n <a href='https://opensource.org/licenses/MIT'>\n <img src='https://img.shields.io/badge/License-MIT-blue.svg' alt='License'/>\n </a>\n\n <a href='https://pystow.readthedocs.io/en/latest/?badge=latest'>\n <img src='https://readthedocs.org/projects/pystow/badge/?version=latest' alt='Documentation Status' />\n </a>\n\n <a href=\"https://zenodo.org/badge/latestdoi/318194121\">\n <img src=\"https://zenodo.org/badge/318194121.svg\" alt=\"DOI\">\n </a>\n\n <a href=\"https://github.com/psf/black\">\n <img src=\"https://img.shields.io/badge/code%20style-black-000000.svg\" alt=\"Code style: black\">\n </a>\n</p>\n\n\ud83d\udc5c Easily pick a place to store data for your python code.\n\n## \ud83d\ude80 Getting Started\n\nGet a directory for your application.\n\n```python\nimport pystow\n\n# Get a directory (as a pathlib.Path) for ~/.data/pykeen\npykeen_directory = pystow.join('pykeen')\n\n# Get a subdirectory (as a pathlib.Path) for ~/.data/pykeen/experiments\npykeen_experiments_directory = pystow.join('pykeen', 'experiments')\n\n# You can go as deep as you want\npykeen_deep_directory = pystow.join('pykeen', 'experiments', 'a', 'b', 'c')\n```\n\nIf you reuse the same directory structure a lot, you can save them in a module:\n\n```python\nimport pystow\n\npykeen_module = pystow.module(\"pykeen\")\n\n# Access the module's directory with .base\nassert pystow.join(\"pykeen\") == pystow.module(\"pykeen\").base\n\n# Get a subdirectory (as a pathlib.Path) for ~/.data/pykeen/experiments\npykeen_experiments_directory = pykeen_module.join('experiments')\n\n# You can go as deep as you want 
past the original \"pykeen\" module\npykeen_deep_directory = pykeen_module.join('experiments', 'a', 'b', 'c')\n```\n\nGet a file path for your application by adding the `name` keyword argument. This is made explicit so PyStow knows which\nparent directories to automatically create. This works with `pystow` or any module you create with `pystow.module`.\n\n```python\nimport pystow\n\n# Get a directory (as a pathlib.Path) for ~/.data/indra/database.tsv\nindra_database_path = pystow.join('indra', 'database', name='database.tsv')\n```\n\nEnsure a file from the internet is available in your application's directory:\n\n```python\nimport pystow\n\nurl = 'https://raw.githubusercontent.com/pykeen/pykeen/master/src/pykeen/datasets/nations/test.txt'\npath = pystow.ensure('pykeen', 'datasets', 'nations', url=url)\n```\n\nEnsure a tabular data file from the internet and load it for usage (requires `pip install pandas`):\n\n```python\nimport pystow\nimport pandas as pd\n\nurl = 'https://raw.githubusercontent.com/pykeen/pykeen/master/src/pykeen/datasets/nations/test.txt'\ndf: pd.DataFrame = pystow.ensure_csv('pykeen', 'datasets', 'nations', url=url)\n```\n\nEnsure a comma-separated tabular data file from the internet and load it for usage (requires `pip install pandas`):\n\n```python\nimport pystow\nimport pandas as pd\n\nurl = 'https://raw.githubusercontent.com/cthoyt/pystow/main/tests/resources/test_1.csv'\ndf: pd.DataFrame = pystow.ensure_csv('pykeen', 'datasets', 'nations', url=url, read_csv_kwargs=dict(sep=\",\"))\n```\n\nEnsure a RDF file from the internet and load it for usage (requires `pip install rdflib`)\n\n```python\nimport pystow\nimport rdflib\n\nurl = 'https://ftp.expasy.org/databases/rhea/rdf/rhea.rdf.gz'\nrdf_graph: rdflib.Graph = pystow.ensure_rdf('rhea', url=url)\n```\n\nAlso see `pystow.ensure_excel()`, `pystow.ensure_rdf()`, `pystow.ensure_zip_df()`, and `pystow.ensure_tar_df()`.\n\nIf your data comes with a lot of different files in an archive,\nyou can 
ensure the archive is downloaded and get specific files from it:\n\n```python\nimport numpy as np\nimport pystow\n\nurl = \"https://cloud.enterprise.informatik.uni-leipzig.de/index.php/s/LHPbMCre7SLqajB/download/MultiKE_D_Y_15K_V1.zip\"\n# the path inside the archive to the file you want\ninner_path = \"MultiKE/D_Y_15K_V1/721_5fold/1/20210219183115/ent_embeds.npy\"\nwith pystow.ensure_open_zip(\"kiez\", url=url, inner_path=inner_path) as file:\n emb = np.load(file)\n```\n\nAlso see `pystow.module.ensure_open_lzma()`, `pystow.module.ensure_open_tarfile()` and `pystow.module.ensure_open_gz()`.\n\n## \u2699\ufe0f\ufe0f Configuration\n\nBy default, data is stored in the `$HOME/.data` directory. By default, the `<app>` app will create the\n`$HOME/.data/<app>` folder.\n\nIf you want to use an alternate folder name to `.data` inside the home directory, you can set the `PYSTOW_NAME`\nenvironment variable. For example, if you set `PYSTOW_NAME=mydata`, then the following code for the `pykeen` app will\ncreate the `$HOME/mydata/pykeen/` directory:\n\n```python\nimport os\nimport pystow\n\n# Only for demonstration purposes. You should set environment\n# variables either with your .bashrc or in the command line REPL.\nos.environ['PYSTOW_NAME'] = 'mydata'\n\n# Get a directory (as a pathlib.Path) for ~/mydata/pykeen\npykeen_directory = pystow.join('pykeen')\n```\n\nIf you want to specify a completely custom directory that isn't relative to your home directory, you can set\nthe `PYSTOW_HOME` environment variable. For example, if you set `PYSTOW_HOME=/usr/local/`, then the following code for\nthe `pykeen` app will create the `/usr/local/pykeen/` directory:\n\n```python\nimport os\nimport pystow\n\n# Only for demonstration purposes. 
You should set environment\n# variables either with your .bashrc or in the command line REPL.\nos.environ['PYSTOW_HOME'] = '/usr/local/'\n\n# Get a directory (as a pathlib.Path) for /usr/local/pykeen\npykeen_directory = pystow.join('pykeen')\n```\n\nNote: if you set `PYSTOW_HOME`, then `PYSTOW_NAME` is disregarded.\n\n### X Desktop Group (XDG) Compatibility\n\nWhile PyStow's main goal is to make application data less opaque and less\nhidden, some users might want to use the\n[XDG specifications](http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html)\nfor storing their app data.\n\nIf you set the environment variable `PYSTOW_USE_APPDIRS` to `true` or `True`, then the\n[`appdirs`](https://pypi.org/project/appdirs/) package will be used to choose\nthe base directory based on the `user data dir` option. This can still be\noverridden by `PYSTOW_HOME`.\n\n## \ud83d\ude80 Installation\n\nThe most recent release can be installed from\n[PyPI](https://pypi.org/project/pystow/) with:\n\n```bash\n$ pip install pystow\n```\n\nNote, as of v0.3.0, Python 3.6 isn't officially supported (its\nend-of-life was in December 2021). For the time being, `pystow` might still\nwork on py36, but this is only coincidental.\n\nThe most recent code and data can be installed directly from GitHub with:\n\n```bash\n$ pip install git+https://github.com/cthoyt/pystow.git\n```\n\nTo install in development mode, use the following:\n\n```bash\n$ git clone git+https://github.com/cthoyt/pystow.git\n$ cd pystow\n$ pip install -e .\n```\n\n## \u2696\ufe0f License\n\nThe code in this package is licensed under the MIT License.\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Easily pick a place to store data for your python package.",
"version": "0.5.4",
"project_urls": {
"Bug Tracker": "https://github.com/cthoyt/pystow/issues",
"Download": "https://github.com/cthoyt/pystow/releases",
"Homepage": "https://github.com/cthoyt/pystow"
},
"split_keywords": [
"caching",
" file management"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "da756b5b2817f28344112fd01163fbe2df4906850a235badcf6a7b730da784db",
"md5": "2de69b7fe72917d84c2715b94b5f6ce2",
"sha256": "c377cc9fff11127007e60eb5c4dc18f2ffd986c0d0cec27134cdcd4c805bc7d8"
},
"downloads": -1,
"filename": "pystow-0.5.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2de69b7fe72917d84c2715b94b5f6ce2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 32575,
"upload_time": "2024-04-01T19:38:33",
"upload_time_iso_8601": "2024-04-01T19:38:33.687682Z",
"url": "https://files.pythonhosted.org/packages/da/75/6b5b2817f28344112fd01163fbe2df4906850a235badcf6a7b730da784db/pystow-0.5.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "a59a7677a2b8101da65f30d49144910088168ea3acd1ecbe4ee989dfae1f41bc",
"md5": "4a28960235761a72ac2eeafc7ad71e59",
"sha256": "2692180cb405bd77259bee6c7f4db545d10e81939980064730609f21750567ff"
},
"downloads": -1,
"filename": "pystow-0.5.4.tar.gz",
"has_sig": false,
"md5_digest": "4a28960235761a72ac2eeafc7ad71e59",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 45662,
"upload_time": "2024-04-01T19:38:35",
"upload_time_iso_8601": "2024-04-01T19:38:35.891192Z",
"url": "https://files.pythonhosted.org/packages/a5/9a/7677a2b8101da65f30d49144910088168ea3acd1ecbe4ee989dfae1f41bc/pystow-0.5.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-01 19:38:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "cthoyt",
"github_project": "pystow",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"tox": true,
"lcname": "pystow"
}