databricks-mosaic-gdal


* Name: databricks-mosaic-gdal
* Version: 3.4.3.post5
* Home page: https://github.com/databrickslabs/mosaic/tree/main/modules/python/gdal_package
* Summary: GDAL install with Java Bindings for Databricks Runtime 11+
* Upload time: 2023-04-04 12:37:25
* Author: Databricks
* Requires Python: >=3.7.0
* License: Databricks License
* Requirements: No requirements were recorded.

# PyPI Project 'databricks-mosaic-gdal' Overview

> Current version is 3.4.3 (to match GDAL).

This is a filetree-based (rather than apt-based) drop-in packaging of GDAL with Java Bindings for Ubuntu 20.04 (Focal Fossa), which is used by [Databricks Runtime](https://docs.databricks.com/release-notes/runtime/releases.html) (DBR) 11.3+.

1. `gdal-3.4.3-filetree.tar.xz` is ~50MB and is extracted with `tar -xf gdal-3.4.3-filetree.tar.xz -C /`
2. `gdal-3.4.3-symlinks.tar.xz` is ~19MB and is extracted with `tar -xhf gdal-3.4.3-symlinks.tar.xz -C /`
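
Because both archives unpack to root (`/`), it can be useful to look at what they contain before extracting. The following is a minimal sketch using only Python's standard `tarfile` module. The helper name `summarize_tarball` and the local file path in the usage comment are assumptions for illustration and are not part of this package.

```python
import tarfile
from collections import Counter


def summarize_tarball(path):
    """Count archive members by kind without extracting anything."""
    counts = Counter()
    # "r:xz" opens an LZMA-compressed tar archive for reading.
    with tarfile.open(path, "r:xz") as archive:
        for member in archive:
            if member.issym() or member.islnk():
                counts["links"] += 1
            elif member.isdir():
                counts["dirs"] += 1
            elif member.isfile():
                counts["files"] += 1
            else:
                counts["other"] += 1
    return dict(counts)


# Hypothetical usage, assuming the tarball was copied to the working directory:
# summarize_tarball("gdal-3.4.3-filetree.tar.xz")
```

Reading the member list this way does not write anything to disk, so it is safe to run on a local machine.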

An [init script](https://docs.databricks.com/clusters/init-scripts.html)-based approach is provided at [mosaic-gdal-3.4.3-filetree-init.sh](https://github.com/databrickslabs/mosaic/blob/main/modules/python/gdal_package/databricks-mosaic-gdal/resources/scripts/mosaic-gdal-3.4.3-filetree-init.sh) for managing installation of the tarballs across a DBR cluster, with the package itself installed from PyPI.

Starting in version `3.4.3.post5`, unpacking the tarballs without an init script is being tested: install the package as a [cluster library](https://docs.databricks.com/libraries/cluster-libraries.html#cluster-installed-library) rather than as a notebook-scoped library. __Use the init script if you run into issues.__

__Requirements:__

* The GDAL tarballs will not be installed unless running on Databricks Runtime >= 11.3.
* This has the added benefit of preventing an accidental install on your local machine, which would be bad since the tarballs unpack to root (`/`).
* The package is deployed to PyPI without a wheel (WHL) to avoid unnecessary duplication of the resource files and to trigger the `setup.py` build on install.
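
The runtime guard described above can be sketched as follows. This is not the package's actual `setup.py` logic, which is not shown here. It assumes the runtime version is exposed through the `DATABRICKS_RUNTIME_VERSION` environment variable, and the helper name `is_supported_dbr` is hypothetical.

```python
import os
import re


def is_supported_dbr(version, minimum=(11, 3)):
    """Return True if a version string such as "11.3" is at least `minimum`."""
    if not version:
        return False  # variable unset: most likely not a Databricks cluster
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


# Assumption: the runtime version is exposed as DATABRICKS_RUNTIME_VERSION.
if is_supported_dbr(os.environ.get("DATABRICKS_RUNTIME_VERSION")):
    print("DBR >= 11.3 detected")
else:
    print("Not on DBR >= 11.3; the tarballs should not be unpacked")
```

Comparing `(major, minor)` tuples avoids the string-comparison pitfall where `"9.1"` would sort after `"11.3"`.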

__Notes:__

* This is a very specific packaging of GDAL + dependencies which removes any libraries that are already provided by DBR, so it __will not be useful outside Databricks.__
* It additionally includes GDAL shared objects (`.so`) for the Java Bindings, the GDAL 3.4.3 Python bindings, and a tweak for OSGEO, as currently supplied by the [UbuntuGIS PPA](https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-unstable)-based init script [install-gdal-databricks.sh](https://github.com/databrickslabs/mosaic/blob/main/src/main/resources/scripts/install-gdal-databricks.sh) provided by Mosaic. __This install replaces the existing Mosaic approach, so choose one or the other.__
* The GDAL JAR for 3.4 is not included, but it is provided by Mosaic itself and added to your Databricks cluster as part of the [enable_gdal](https://databrickslabs.github.io/mosaic/usage/install-gdal.html#enable-gdal-for-a-notebook) call made when configuring Mosaic for GDAL. Separately, the JAR could be added as a [cluster-installed library](https://docs.databricks.com/libraries/cluster-libraries.html#cluster-installed-library), e.g. through Maven coordinates `org.gdal:gdal:3.4.0` from [mvnrepository](https://mvnrepository.com/artifact/org.gdal/gdal/3.4.0).
* Don't use versions below `3.4.3.post5`. __This is still being developed / refined.__
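
To check the minimum-version note programmatically, one option is to compare the installed version against `3.4.3.post5`. The sketch below uses a deliberately simplified parser that only understands `X.Y.Z` and `X.Y.Z.postN` strings, not the full PEP 440 grammar. The helper names are hypothetical, and `importlib.metadata` requires Python 3.8 or later.

```python
import re
from importlib import metadata


def parse_release(version):
    """Parse "X.Y.Z" or "X.Y.Z.postN" into a comparable tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:\.post(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    major, minor, patch, post = match.groups()
    # A release without a post segment sorts before any post-release.
    return (int(major), int(minor), int(patch), int(post) if post is not None else -1)


def meets_minimum(version, minimum="3.4.3.post5"):
    """Return True if `version` is at least `minimum`."""
    return parse_release(version) >= parse_release(minimum)


def installed_version(name="databricks-mosaic-gdal"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("databricks-mosaic-gdal is not installed")
else:
    print(found, "ok" if meets_minimum(found) else "is below 3.4.3.post5")
```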

__Install:__

_Install 'databricks-mosaic-gdal' from PyPI as a [cluster library](https://docs.databricks.com/libraries/cluster-libraries.html#cluster-installed-library) rather than as a notebook-scoped library._

_Install 'databricks-mosaic' from PyPI in a notebook or as a cluster library (handles JARs as well):_

```
# install in notebook e.g.
%pip install databricks-mosaic
```

_Then you can initialize:_

```
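# `spark` and `dbutils` are the globals provided by a Databricks notebook
import mosaic as mos
mos.enable_mosaic(spark, dbutils)
# Configure Mosaic for GDAL (this also adds the GDAL JAR, per the notes above)
mos.enable_gdal(spark)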
```
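
After initialization, a quick sanity check is to confirm that the GDAL Python bindings import and report the expected release. This is a minimal sketch rather than part of the Mosaic API. The helper name `gdal_release_name` is hypothetical, and it returns `None` instead of raising when the `osgeo` bindings are not installed.

```python
def gdal_release_name():
    """Return the GDAL release string, or None if the bindings are unavailable."""
    try:
        from osgeo import gdal
    except ImportError:
        return None
    return gdal.VersionInfo("RELEASE_NAME")


print(gdal_release_name())
```

On a cluster where this package installed correctly, the reported release would be expected to match the packaged GDAL version.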

_Check the Mosaic [GDAL Installation Guide](https://databrickslabs.github.io/mosaic/usage/install-gdal.html#) for updated instructions, expected around April 2023._

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/databrickslabs/mosaic/tree/main/modules/python/gdal_package",
    "name": "databricks-mosaic-gdal",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Databricks",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/7c/c4/80b1a791050d4a9254ad471f829e332599f6d80335bfa2aa09605b164463/databricks-mosaic-gdal-3.4.3.post5.tar.gz",
    "platform": null,
    "description": "# PyPI Project 'databricks-mosaic-gdal' Overview\n\n> Current version is 3.4.3 (to match GDAL).\n\nThis is a filetree (vs apt based) drop-in packaging of GDAL with Java Bindings for Ubuntu 20.04 (Focal Fossa) which is used by [Databricks Runtime](https://docs.databricks.com/release-notes/runtime/releases.html) (DBR) 11.3+. \n\n1. `gdal-3.4.3-filetree.tar.xz` is ~50MB - it is extracted with `tar -xf gdal-3.4.3-filetree.tar.xz -C /`\n2. `gdal-3.4.3.-symlinks.tar.xz` is ~19MB - it is extracted with `tar -xhf gdal-3.4.3-symlinks.tar.xz -C /`\n\nAn [init script](https://docs.databricks.com/clusters/init-scripts.html) based approach is provided at [gdal-3.4.3-filetree-init.sh](https://github.com/databrickslabs/mosaic/blob/main/modules/python/gdal_package/databricks-mosaic-gdal/resources/scripts/mosaic-gdal-3.4.3-filetree-init.sh) for managing install of the tarballs across a DBR cluster with install from PyPI.\n\nStarting in version `3.4.3.post5`, handling tarball unpacking without requiring an init script through [cluster library](https://docs.databricks.com/libraries/cluster-libraries.html#cluster-installed-library) vs a notebook scoped is being tested. 
__Use the init script if you run into issues.__\n\n __Requirements:__\n\n* This will not install the GDAL tarballs if not on Databricks Runtime >= 11.3\n* Added benefit of not accidentally installing on your local machine which would be bad since it unpacks to root (`/`)\n* This is deployed to pypi without a wheel (WHL) to avoid uneccessary duplication of the resource files and to trigger `setup.py` building on install\n\n __Notes:__\n\n* This is a very specific packaging for GDAL + dependencies which removes any libraries that are already provided by DBR, so it __will not be not useful outside Databricks.__\n* It additionally includes GDAL shared objects (`.so`) for Java Bindings, GDAL 3.4.3 Python bindings, and tweak for OSGEO as currently supplied by [UbuntuGIS PPA](https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-unstable) based init script [install-gdal-databricks.sh](https://github.com/databrickslabs/mosaic/blob/main/src/main/resources/scripts/install-gdal-databricks.sh) provided by Mosaic. __This install replaces the existing way on Mosaic, so choose one or the other.__\n* The GDAL JAR for 3.4 is not included but is provided by Mosaic itself and added to your Databricks cluster as part of the [enable_gdal](https://databrickslabs.github.io/mosaic/usage/install-gdal.html#enable-gdal-for-a-notebook) called when configuring Mosaic for GDAL. Separately, the JAR could be added as a [cluster-installed library](https://docs.databricks.com/libraries/cluster-libraries.html#cluster-installed-library), e.g. 
through Maven coordinates `org.gdal:gdal:3.4.0` from [mvnrepository](https://mvnrepository.com/artifact/org.gdal/gdal/3.4.0).\n* Don't use versions below `3.4.3.post5` - __This is still being developed / refined.__\n\n__Install:__\n\n_Install 'databricks-mosaic-gdal' from PyPI as a [cluster library](https://docs.databricks.com/libraries/cluster-libraries.html#cluster-installed-library) vs a notebook scoped._\n\n_Install 'databricks-mosaic' from PyPI in a notebook or as a cluster library (handles JARs as well):_\n\n```\n# install in notebook e.g.\n%pip install databricks-mosaic\n```\n\n_Then you can initialize:_\n\n```\nimport mosaic as mos\nmos.enable_mosaic(spark, dbutils)\nmos.enable_gdal(spark)\n```\n\n_Check Mosaic [GDAL Installation Guide](https://databrickslabs.github.io/mosaic/usage/install-gdal.html#) for updated instructions on/around APR 2023._\n",
    "bugtrack_url": null,
    "license": "Databricks License",
    "summary": "GDAL install with Java Bindings for Databricks Runtime 11+",
    "version": "3.4.3.post5",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7cc480b1a791050d4a9254ad471f829e332599f6d80335bfa2aa09605b164463",
                "md5": "e9f2023deab61ba459f0882c4574cd2e",
                "sha256": "a39edf2e02400ff470a133a8737a354ee93dca031dca3af78a547e37bac3b080"
            },
            "downloads": -1,
            "filename": "databricks-mosaic-gdal-3.4.3.post5.tar.gz",
            "has_sig": false,
            "md5_digest": "e9f2023deab61ba459f0882c4574cd2e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7.0",
            "size": 71421132,
            "upload_time": "2023-04-04T12:37:25",
            "upload_time_iso_8601": "2023-04-04T12:37:25.148934Z",
            "url": "https://files.pythonhosted.org/packages/7c/c4/80b1a791050d4a9254ad471f829e332599f6d80335bfa2aa09605b164463/databricks-mosaic-gdal-3.4.3.post5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-04 12:37:25",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "databricks-mosaic-gdal"
}
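
The `sha256` digest recorded above can be used to verify a downloaded copy of the sdist. The following is a minimal sketch using only `hashlib`. The helper names and the local file path in the usage comment are assumptions, while the expected digest is copied from the raw data.

```python
import hashlib

# Digest copied from the "digests" entry in the raw data above.
EXPECTED_SHA256 = "a39edf2e02400ff470a133a8737a354ee93dca031dca3af78a547e37bac3b080"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a ~70MB archive is never fully held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA-256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


# Hypothetical usage, assuming the sdist was downloaded to the working directory:
# matches_expected("databricks-mosaic-gdal-3.4.3.post5.tar.gz")
```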
        