[![PyPI version](https://img.shields.io/pypi/v/blackbricks.svg?logo=pypi&logoColor=FFE873)](https://pypi.org/project/blackbricks/)
[![Downloads](https://pepy.tech/badge/blackbricks)](https://pepy.tech/project/blackbricks)
[![Downloads per month](https://pepy.tech/badge/blackbricks/month)](https://pepy.tech/project/blackbricks/month)
[![License](https://img.shields.io/pypi/l/blackbricks)](LICENSE)
[![Code style: Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
# Blackbricks
A formatting tool for your Databricks notebooks.
- Python cells are formatted with [black](https://github.com/psf/black)
- SQL cells are formatted with [sqlparse](https://github.com/andialbrecht/sqlparse)
## Table of Contents
* [Installation](#installation)
* [Usage](#usage)
* [Version control integration](#version-control-integration)
* [Contributing](#contributing)
* [FAQ](#faq)
* [Breaking changes](#breaking-changes)
## Installation
While you can use `pip` directly, you should prefer using [pipx](https://pypa.github.io/pipx/).
```bash
$ pipx install blackbricks
```
You will probably also want to install `databricks-cli`, so that you can use `blackbricks` directly on your notebooks stored in Databricks.
``` bash
$ pipx install databricks-cli
$ databricks configure # Required in order to use `blackbricks` on remote notebooks.
```
## Usage
You can use `blackbricks` on Python notebook files stored locally, or directly on the notebooks stored in Databricks.
For the most part, `blackbricks` operates very similarly to `black`.
``` bash
$ blackbricks notebook1.py notebook2.py # Formats both notebooks.
$ blackbricks notebook_directory/ # Formats every notebook under the directory (recursively).
```
An important difference is that `blackbricks` will ignore any file that does not contain the `# Databricks notebook
source` header on the first line. Databricks adds this line to all Python notebooks. This means you can happily run
`blackbricks` on a directory with both notebooks and regular Python files, and `blackbricks` won't touch the latter.
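The header check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the actual `blackbricks` implementation; the names `NOTEBOOK_HEADER`, `looks_like_notebook` and `find_notebooks` are hypothetical.

```python
from pathlib import Path
from typing import List

# Header that Databricks writes on the first line of Python notebooks.
NOTEBOOK_HEADER = "# Databricks notebook source"


def looks_like_notebook(path: Path) -> bool:
    """Return True if the file's first line is the Databricks notebook header."""
    try:
        with path.open(encoding="utf-8") as handle:
            first_line = handle.readline()
    except (OSError, UnicodeDecodeError):
        # Unreadable or non-text files are simply not treated as notebooks.
        return False
    return first_line.strip() == NOTEBOOK_HEADER


def find_notebooks(directory: Path) -> List[Path]:
    """Recursively collect the files under `directory` that carry the header."""
    return sorted(
        p for p in directory.rglob("*") if p.is_file() and looks_like_notebook(p)
    )
```

Calling `find_notebooks(Path("notebook_directory"))` on a mixed directory would return only the notebook files, which mirrors why regular Python files are left alone.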
If you specify the `-r` or `--remote` flag, `blackbricks` will work directly on your notebooks stored in Databricks.
``` bash
$ blackbricks --remote /Users/username/notebook.py
$ blackbricks --remote /Repos/username/repo-name/notebook.py
```
### Full usage
```text
$ poetry run blackbricks --help
Usage: blackbricks [OPTIONS] [FILENAMES]...
Formatting tool for Databricks python notebooks.
Python cells are formatted using `black`, and SQL cells are formatted by `sqlparse`.
Local files (without the `--remote` option):
- Only files that look like Databricks (Python) notebooks will be processed. That is,
they must start with the header `# Databricks notebook source`
- If you specify a directory as one of the file names, all files in that directory will
be added, including any subdirectory.
Remote files (with the `--remote` option):
- Make sure you have installed the Databricks CLI (``pip install databricks_cli``)
- Make sure you have configured at least one profile (`databricks configure`). Check the
file `~/.databrickscfg` if you are not sure.
- File paths should start with `/`. Otherwise they are interpreted as relative to
`/Users/username`, where `username` is the username specified in the Databricks profile
used.
╭─ Arguments ────────────────────────────────────────────────────────────────────────────╮
│ filenames [FILENAMES]... Path to the notebook(s) to format. [default: None] │
╰────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --remote -r If this option is used, │
│ all filenames are treated │
│ as paths to notebooks on │
│ your Databricks host (i.e. │
│ not local files). │
│ --profile -p NAME If using --remote, which │
│ Databricks profile to use. │
│ [default: DEFAULT] │
│ --line-length INTEGER How many characters per │
│ line to allow. │
│ [default: 88] │
│ --sql-upper --no-sql-upper SQL keywords should be │
│ UPPERCASE or lowercase. │
│ [default: sql-upper] │
│ --check Don't write the files │
│ back, just return the │
│ status. Return code 0 │
│ means nothing would │
│ change. │
│ --diff Don't write the files │
│ back, just output a diff │
│ for each file on stdout. │
│ --version Display version │
│ information and exit. │
│ --help Show this message and │
│ exit. │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```
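If you want to use `--check` from a script (for example in CI), a minimal sketch could look like the following. It relies only on the `--check` and `--line-length` options listed above and on the documented meaning of return code 0; the helper names `build_check_command` and `notebooks_are_formatted` are hypothetical, and `blackbricks` is assumed to be on your `PATH`.

```python
import subprocess
from typing import List, Sequence


def build_check_command(paths: Sequence[str], line_length: int = 88) -> List[str]:
    """Build a `blackbricks --check` invocation for the given paths."""
    return ["blackbricks", "--check", f"--line-length={line_length}", *paths]


def notebooks_are_formatted(paths: Sequence[str], line_length: int = 88) -> bool:
    """Run blackbricks in check mode; return code 0 means nothing would change."""
    result = subprocess.run(build_check_command(paths, line_length), check=False)
    return result.returncode == 0
```

A CI step could then call `notebooks_are_formatted(["notebook_directory/"])` and fail the build when it returns `False`.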
## Version control integration
Use [pre-commit](https://pre-commit.com). Add a `.pre-commit-config.yaml` file
to your repo with the following content (changing/removing the `args` as you
wish):
```yaml
repos:
- repo: https://github.com/inspera/blackbricks
  rev: 1.0.0
  hooks:
  - id: blackbricks
    args: [--line-length=120]
```
Set the `rev` attribute to the most recent version of `blackbricks`.
The `args` are optional and can be used to set any of the `blackbricks` options.
## Contributing
If you find blackbricks useful, feel free to say so with a star. If you think it is utterly broken, you are more than
welcome to contribute improvements. Please open an issue first to discuss what you want added or fixed, unless you are
just adding tests; in that case your pull request is extremely likely to be merged right away.
## FAQ
### Can I disable SQL formatting?
Sure! Certain SQL statements might not be parsed and indented properly by `sqlparse`, and the result can be jumbled
formatting. You can disable SQL formatting for a cell by adding `-- nofmt` to the very first line of a cell:
```sql
%sql -- nofmt
select this,
sql_will, -- be kept just
like_this
from if_that_is.what_you_need
```
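To make the rule concrete, here is a rough Python sketch of how such a marker could be detected. The exact matching logic used by `blackbricks` is not documented here, so this assumes the simplest reading: the cell's first line starts with `%sql` and contains `-- nofmt`. The function name `sql_formatting_disabled` is hypothetical.

```python
NOFMT_MARKER = "-- nofmt"


def sql_formatting_disabled(cell: str) -> bool:
    """Return True if the cell's very first line carries the `-- nofmt` marker."""
    # Only the first line counts; a marker further down the cell is ignored.
    first_line = cell.split("\n", 1)[0]
    return first_line.startswith("%sql") and NOFMT_MARKER in first_line
```

Note that a `-- nofmt` comment on any later line would not disable formatting under this reading.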
### How do I use `blackbricks` on my Databricks notebooks?
First, make sure you have set up `databricks-cli` on your system (see [installation](#installation)), and that you have
at least one profile set up in `~/.databrickscfg`. As an example:
```cfg
# File: ~/.databrickscfg
[DEFAULT]
host = https://dbc-b23456-a1243.cloud.databricks.com/
username = username@example.com
password = dapi12345678901234567890
[OTHERPROFILE]
host = https://dbc-c54321-d234.cloud.databricks.com
username = name.user@example.com
password = dapi09876543211234567890
```
You should use [access tokens](https://docs.databricks.com/dev-tools/api/latest/authentication.html) instead of your actual password.
You can then do:
``` bash
$ blackbricks --remote /Users/username@example.com/notebook.py # Uses DEFAULT profile.
$ blackbricks --remote notebook.py # Equivalent to the above.
$ blackbricks --remote --profile OTHERPROFILE /Users/name.user@example.com/notebook.py
$ blackbricks --remote --profile OTHERPROFILE notebook.py # Equivalent to the above.
$ blackbricks --remote /Repos/username@example.com/repo-name/notebook.py # Targeting notebook in a Repo
```
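The "equivalent" commands above follow from how relative remote paths are resolved: paths that do not start with `/` are interpreted relative to `/Users/username`, where `username` comes from the profile in use. A small Python sketch of that rule, using a hypothetical helper name `resolve_remote_path` (this is not the `blackbricks` source):

```python
import posixpath


def resolve_remote_path(path: str, username: str) -> str:
    """Resolve a --remote path: absolute paths are kept, others go under /Users/<username>."""
    if path.startswith("/"):
        return path
    # Workspace paths always use forward slashes, hence posixpath.
    return posixpath.join("/Users", username, path)
```

With the `DEFAULT` profile from the example, `resolve_remote_path("notebook.py", "username@example.com")` gives `/Users/username@example.com/notebook.py`, while a `/Repos/...` path is passed through unchanged.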
### Can you run blackbricks while using Databricks in the browser?
No. See https://github.com/inspera/blackbricks/issues/27 for why.
However, Databricks now allows you to [format your notebooks with black directly](https://docs.databricks.com/notebooks/notebooks-use.html#format-code-cells).
### I get an error: `TypeError: init() got an unexpected keyword argument 'no_args_is_help'`
This means you had an old version of `click` installed from before, and your installation didn't upgrade it
automatically. Updating your installation should do the trick, e.g. `pip install -U blackbricks` or similar depending on
your installation method of choice.
### Shell commands like `!ls` throw an error
See https://github.com/inspera/blackbricks/issues/21.
## Breaking changes
### Version policy
Style choices made by `blackbricks` will follow semantic versioning, with changes that cause differences resulting in
new major versions. Such changes will be kept to an absolute minimum, with none currently planned.
Style choices made by `black` (responsible for 95% of the formatting in a notebook) will not follow the same strict
semantic versioning. This is because `black` itself does not use semver, but instead provides a [year-based
policy](https://black.readthedocs.io/en/stable/the_black_code_style/index.html#stability-policy). `blackbricks` will
make a _minor_ version increase when it upgrades black to a new year. Such a bump should be made once the new year's
release of `black` is available. Feel free to open an issue if this has not been done yet.
### Breaking changes with version 2.0
Notebooks will be terminated with a `\n` starting with version `2.0.0`. This harmonizes EOF handling and should be much
less annoying in practice than prior versions. This causes a diff on _any_ notebook that was previously formatted with
`blackbricks<2.0.0`.
Also, the deprecated and non-functional flag for two-space indentation has been removed, and providing it is now an error.
### Breaking changes with version 1.0
Earlier versions of blackbricks applied a patched version of black in order to allow two-space indentation. This was
done because Databricks used two-space indentation, and did not allow you to change that.
Since then, Databricks has added the option to choose. Because you can now choose, blackbricks re-joins black in being
uncompromising, and since version 1.0 you can no longer choose anything but four-space indentation.
If you _must_ keep using two-space indentation, then stick to versions `<1.0`.