housekeeper


- Name: housekeeper
- Version: 4.13.0
- Home page: https://github.com/Clinical-Genomics/housekeeper
- Summary: Housekeeper takes care of files
- Upload time: 2024-04-30 07:26:15
- Author: Robin Andeer
- License: MIT
- Keywords: housekeeper, development
            
# Housekeeper
![Housekeeper tests][github-url] [![Coverage Status][coveralls-image]][coveralls-url] [![CodeFactor][codefactor-image]][codefactor-url] [![Code style: black][black-image]][black-url]

### Store, tag, fetch, and archive files with ease 🗃

**Housekeeper** is a tool that aims to provide:

- a backend for storing versioned bundles of files
- different interfaces (Python, CLI, REST) for fetching files based on tags
- a way to back up and retrieve bundles from long-term storage

### Todo

- [ ] re-implement the archive/encryption interface [@ingkebil]
- [ ] handle clean up of expired bundles [@robinandeer]
- [ ] expand the CLI with `get` command etc. [@robinandeer]

## Installation

Housekeeper is written in Python 3.6+ and is available on the [Python Package Index][pypi] (PyPI).

```bash
pip install housekeeper
```

If you would like to install the latest development version:

```bash
git clone https://github.com/Clinical-Genomics/housekeeper
cd housekeeper
pip install --editable .
```

## Contributing

Housekeeper uses the GitHub flow branching model as described in our [development manual][development manual].

## Documentation

### Command line interface

#### Config file

Housekeeper reads a simple YAML config file. The following options are supported:

```yaml
---
database: mysql+pymysql://userName:passWord@domain.com/database
root: /path/to/root/dir
```

The `root` option sets the directory in which Housekeeper stores the files it includes.
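To make the shape of the `database` value easier to read, here is a minimal sketch that splits the example URL into its parts using only the Python standard library. It assumes the value follows the common `dialect+driver://user:password@host/database` layout; the helper name `describe_database_url` is made up for illustration and is not part of Housekeeper.

```python
from urllib.parse import urlsplit

# Example URL copied from the config above; the credentials are placeholders.
DATABASE_URL = "mysql+pymysql://userName:passWord@domain.com/database"


def describe_database_url(url: str) -> dict:
    """Split a database URL into named parts (illustrative helper, not Housekeeper API)."""
    parts = urlsplit(url)
    # The scheme is assumed to look like "dialect+driver"; the driver is optional.
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }


info = describe_database_url(DATABASE_URL)
for key, value in info.items():
    print(f"{key}: {value}")
```

A quick check like this can help spot a misplaced `@` or `/` before the URL is handed to the command line.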

#### Command: `init`

Set up (or reset) the database. This creates all the tables in the database. You can reset an existing database by using the `--reset` option.

```bash
housekeeper --database "sqlite:///hk.sqlite3" init
Success! New tables: bundle, file, file_tag_link, tag, version
```

#### Command: `include`

Include (hard-link) all files of an existing bundle version into Housekeeper and the `root` path.

```bash
housekeeper include myBundle
```

This only works if the bundle has a single version that can be "imported". To import a specific version of a bundle, use the `--version` option.
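For readers unfamiliar with hard links, the sketch below shows what hard-linking a file into a root directory means at the filesystem level. It is not Housekeeper's implementation; the helper `include_file` and the file names are hypothetical, and the example assumes a filesystem that supports hard links.

```python
import os
import tempfile
from pathlib import Path


def include_file(source: Path, root: Path) -> Path:
    """Hard-link `source` into `root` and return the new path (hypothetical helper)."""
    root.mkdir(parents=True, exist_ok=True)
    target = root / source.name
    # os.link adds a second directory entry for the same file data.
    # Both paths must live on the same filesystem.
    os.link(source, target)
    return target


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "sample.txt"
    source.write_text("example data\n")
    root = Path(workdir) / "root"

    linked = include_file(source, root)
    # Both names now refer to the same underlying file.
    same_file = os.path.samefile(source, linked)
    link_count = source.stat().st_nlink
    print(same_file, link_count)
```

Because a hard link shares the data with the original, no extra disk space is used for the content, and the data stays available under the root path even if the original name is later removed.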

#### Command: `delete files`

Delete files that are no longer on disk like this:
`housekeeper delete files --tag fastq --notondisk`

Remove all bam files before a certain date:
`housekeeper delete files --tag bam --before 2017-06-15`

Remove fastq files from a flowcell:
`housekeeper delete files --tag fastq --tag H0HKKALXX`

It will always ask for confirmation unless you add `--yes`:
`housekeeper delete files --bundle sillyfish --yes`

If you provide neither `--tag` nor `--bundle`, which would amount to deleting everything, the command refuses to run.
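The filtering rules described above can be sketched in plain Python. This is only an illustration of the behaviour as documented here, not Housekeeper's code: the `FileRecord` class, the `select_for_deletion` function, and the bundle name `angrybird` are made up, and it assumes that several tags must all match and that `--before` compares against the date a file was added.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass
class FileRecord:
    """Hypothetical stand-in for a tracked file; not Housekeeper's own model."""

    path: str
    bundle: str
    added: date
    tags: Set[str] = field(default_factory=set)


def select_for_deletion(
    files: Iterable[FileRecord],
    tags: Iterable[str] = (),
    bundle: Optional[str] = None,
    before: Optional[date] = None,
) -> List[FileRecord]:
    """Return the records that match every given filter."""
    wanted = set(tags)
    # Guard: without a tag or a bundle the filters would match every file.
    if not wanted and bundle is None:
        raise ValueError("refusing to select all files: pass a tag or a bundle")
    selected = []
    for record in files:
        if not wanted <= record.tags:
            continue  # the record lacks at least one requested tag
        if bundle is not None and record.bundle != bundle:
            continue
        if before is not None and record.added >= before:
            continue  # assumption: "before" compares against the date added
        selected.append(record)
    return selected


files = [
    FileRecord("a.fastq.gz", "sillyfish", date(2017, 5, 1), {"fastq", "H0HKKALXX"}),
    FileRecord("b.fastq.gz", "sillyfish", date(2017, 7, 1), {"fastq"}),
    FileRecord("c.bam", "angrybird", date(2017, 6, 1), {"bam"}),
]

flowcell_fastq = select_for_deletion(files, tags=["fastq", "H0HKKALXX"])
old_bam = select_for_deletion(files, tags=["bam"], before=date(2017, 6, 15))
print([record.path for record in flowcell_fastq])
print([record.path for record in old_bam])
```

Raising an error when no filter is given mirrors the safeguard described above: an empty filter set is treated as a mistake rather than as a request to delete everything.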

[pypi]: https://pypi.python.org/pypi/housekeeper/
[coveralls-url]: https://coveralls.io/r/Clinical-Genomics/housekeeper
[coveralls-image]: https://img.shields.io/coveralls/Clinical-Genomics/housekeeper.svg?style=flat-square
[github-url]: https://github.com/Clinical-Genomics/housekeeper/workflows/Housekeeper%20tests/badge.svg
[development manual]: http://www.clinicalgenomics.se/development/dev/githubflow/
[codefactor-image]: https://www.codefactor.io/repository/github/clinical-genomics/housekeeper/badge
[codefactor-url]: https://www.codefactor.io/repository/github/clinical-genomics/housekeeper
[black-image]: https://img.shields.io/badge/code%20style-black-000000.svg
[black-url]: https://github.com/psf/black

            

        