# DCIC-wrangling
This is a collection of scripts and Jupyter notebooks that can be helpful when performing many data wrangling tasks. Most of the tools are specific to 4D Nucleome (4DN) wrangling needs, but they may be modified to be more generally useful for certain tasks.
## Install
The package is managed with `poetry` and can be installed using `make`, `poetry`, or `pip`.

From the dcicwrangling directory - `make build`

If you already have poetry installed - `poetry install`

Or to pip install from PyPI - `pip install dcicwrangling`

All dependencies are installed by default. If for some reason you don't want to install the `pytest` packages or `invoke` (used to launch notebooks), you can run `poetry install --no-dev`, but this is not recommended.
## Usage
### Jupyter notebooks
There is a collection of commonly used Jupyter notebooks in the `notebooks/useful_notebooks` directory. You can start a Jupyter notebook server locally by running `invoke notebook` from the top level directory. This should launch the server and open a browser page where the notebooks can be accessed.
**IMPORTANT!** - You should create your own folder in the `notebooks` directory named `Yourname_scripts`. This folder is where you should create, access and run your notebooks. If you want to start with one of the notebooks in the `useful_notebooks` directory, please create a copy and move it to your own folder. This keeps the repository clean and organized. Please **DO NOT** run notebooks in the `useful_notebooks` directory and commit the results to the repository.
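The copy step described above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the function name `copy_notebook` and its parameters are hypothetical, and it assumes it is run from the top level directory so that `notebooks/useful_notebooks` resolves correctly.

```python
import shutil
from pathlib import Path


def copy_notebook(notebook_name, your_name, notebooks_dir="notebooks"):
    """Copy a notebook from useful_notebooks into a personal folder.

    Hypothetical helper for illustration only; it is not shipped with
    dcicwrangling.
    """
    notebooks_dir = Path(notebooks_dir)
    source = notebooks_dir / "useful_notebooks" / notebook_name

    # The personal folder follows the Yourname_scripts naming convention.
    personal_dir = notebooks_dir / f"{your_name}_scripts"
    personal_dir.mkdir(parents=True, exist_ok=True)

    destination = personal_dir / notebook_name
    if destination.exists():
        # Refuse to overwrite a notebook you may already have edited.
        raise FileExistsError(f"{destination} already exists")

    # Copy rather than move, so the shared original stays untouched.
    shutil.copy2(source, destination)
    return destination
```

For example, `copy_notebook("some_notebook.ipynb", "Alice")` would create `notebooks/Alice_scripts/some_notebook.ipynb`, where `some_notebook.ipynb` stands in for whichever notebook you want to start from.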
### Scripts
The `scripts` directory contains some useful command line scripts. They can be run from the top level directory using `python scripts/script_name --options`. Using `--help` shows the available options. In general, modified versions and bespoke scripts should not be committed to the repository; if they must be committed, use a separate non-master branch.
As scripts are developed and refined, `tool.poetry.scripts` directives can be added to facilitate script usage - see the `pyproject.toml` file for an example.
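To illustrate how a script can expose both `--help` output and an entry point that a `tool.poetry.scripts` directive can reference, here is a minimal sketch. The module path `scripts/example_script.py`, the command name `example-script`, and the arguments `item_ids` and `--dry-run` are all hypothetical and do not correspond to an actual script in this repository.

```python
import argparse

# Hypothetical pyproject.toml entry that would expose main() as a command:
#
#   [tool.poetry.scripts]
#   example-script = "scripts.example_script:main"


def build_parser():
    """Build the argument parser; argparse adds --help automatically."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example wrangling script."
    )
    parser.add_argument("item_ids", nargs="+", help="IDs of the items to process")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="report what would be done without making changes",
    )
    return parser


def main(argv=None):
    """Entry point; argv=None makes argparse read sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    action = "Would process" if args.dry_run else "Processing"
    for item_id in args.item_ids:
        print(f"{action} {item_id}")
    return 0
```

Keeping the logic in a `main()` function means the same code can be called by an installed command or, with the usual `if __name__ == "__main__":` guard added at the bottom of the file, run directly with `python scripts/example_script.py`.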