This library provides bindings for the Domino APIs. It ships with the Domino Standard Environment (DSE).
See this documentation for details about the APIs:
- [Latest public Domino
APIs](https://docs.dominodatalab.com/en/latest/api_guide/8c929e/domino-public-apis/)
- [Legacy APIs](https://dominodatalab.github.io/api-docs/)
The latest released version of `python-domino` is `1.4.3`.
# Version compatibility matrix
The `python-domino` library is compatible with different versions of
Domino:
| Domino Versions | Python-Domino |
|-----------------|:-----------------------------------------------------------------------------------------------------:|
| 3.6.x or lower | [0.3.5](https://github.com/dominodatalab/python-domino/archive/0.3.5.zip) |
| 4.1.0 or higher | [1.0.0](https://github.com/dominodatalab/python-domino/archive/refs/tags/1.0.0.zip) or Higher |
| 5.3.0 or higher | [1.2.0](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.2.0.zip) or Higher |
| 5.5.0 or higher | [1.2.2](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.2.2.zip) or Higher |
| 5.10.0 or higher | [1.3.1](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.3.1.zip) or Higher |
| 5.11.0 or higher | [1.4.1](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.4.1.zip) or Higher |
| 6.0.0 or higher | [1.4.3](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.4.3.zip) or Higher |
# Development
The current `python-domino` is based on Python 3.9, which is therefore recommended for development. `Pipenv` is also recommended to manage the dependencies.
To use the Python binding in a Domino workbook session, include `dominodatalab` in your project's requirements.txt file.
This makes the Python binding available for each new workbook session (or batch run) started within the project.
To install dependencies from `setup.py` for development:

    pipenv install -e ".[dev]"

Use the same process for Airflow and data:

    pipenv install -e ".[data]" ".[airflow]"
# Set up the connection
You can set up the connection by creating a new instance of `Domino`:

    _class_ Domino(project, api_key=None, host=None, domino_token_file=None, auth_token=None, api_proxy=None)
- *project:* A project identifier (in the form of ownerusername/projectname).
- *api_proxy:* (Optional) Location of the Domino API reverse proxy as host:port.
If set, this proxy is used to intercept any Domino API requests and insert an authentication token.
_This is the preferred method of authentication_.
Alternatively, set the `DOMINO_API_PROXY` environment variable.
In Domino 5.4.0 or later, this variable is set inside a Domino run container.
NOTE: This mechanism does not work when connecting to an HTTPS endpoint; it is meant to be used inside Domino runs.
- *api_key:* (Optional) An API key to authenticate with.
If not provided, the library expects to find one in the `DOMINO_USER_API_KEY` environment variable.
If you are using the Python package in code that is already running in Domino, the `DOMINO_USER_API_KEY` variable is set automatically to be the key for the user who started the run.
- *host:* (Optional) A host URL.
If not provided, the library expects to find one in the `DOMINO_API_HOST` environment variable.
- *domino_token_file:* (Optional) Path to the Domino token file
containing the auth token.
If not provided, the library expects to find one in the `DOMINO_TOKEN_FILE` environment variable.
If you are using the Python package in code that is already running in Domino, the `DOMINO_TOKEN_FILE` variable is set automatically to be the token file for the user who started the run.
- *auth_token:* (Optional) Authentication token.
## Authentication
Domino looks for the authentication method in the following order and uses the first one it finds:
1. `api_proxy`
2. `auth_token`
3. `domino_token_file`
4. `api_key`
5. `DOMINO_API_PROXY`
6. `DOMINO_TOKEN_FILE`
7. `DOMINO_USER_API_KEY`
The API proxy is the preferred method of authentication.
See
[Use the API Proxy to Authenticate Calls to the Domino API](https://docs.dominodatalab.com/en/latest/user_guide/40b91f/domino-api/).
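The lookup order above can be illustrated with a small sketch. This is not the library's implementation: `resolve_auth_method` is a hypothetical helper that only mirrors the documented precedence, with explicit constructor arguments checked before environment variables.

```python
import os

# Documented precedence: constructor arguments first, then environment variables.
ARGUMENT_ORDER = ("api_proxy", "auth_token", "domino_token_file", "api_key")
ENV_ORDER = ("DOMINO_API_PROXY", "DOMINO_TOKEN_FILE", "DOMINO_USER_API_KEY")


def resolve_auth_method(env=None, **kwargs):
    """Return the name of the first authentication source that is set.

    Hypothetical helper for illustration only; it is not part of python-domino.
    """
    env = os.environ if env is None else env
    for name in ARGUMENT_ORDER:
        if kwargs.get(name):
            return name
    for name in ENV_ORDER:
        if env.get(name):
            return name
    return None


# An explicit api_key argument wins over DOMINO_API_PROXY from the environment.
print(resolve_auth_method(env={"DOMINO_API_PROXY": "localhost:8899"}, api_key="abc"))
```

The practical consequence is that passing `api_key` explicitly prevents the preferred proxy from being used even when `DOMINO_API_PROXY` is set.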
## Additional environment variables
- `DOMINO_LOG_LEVEL`
The default log level is `INFO`.
You can change the log level by setting `DOMINO_LOG_LEVEL`, for example to `DEBUG`.
- `DOMINO_VERIFY_CERTIFICATE`
For testing purposes and issues with SSL certificates, set `DOMINO_VERIFY_CERTIFICATE` to `false`.
Be sure to unset this variable when not in use.
- `DOMINO_MAX_RETRIES`
The default number of retries is 4.
This variable determines the number of attempts the request session makes after a `ConnectionError`.
For details on how the retry count and backoff factor affect the maximum timeout and error durations, see the [urllib3 `Retry` documentation](https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html#module-urllib3.util.retry).
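As a rough illustration of the defaults listed above, the sketch below reads the three variables from an environment mapping. `read_client_settings` is a hypothetical helper, not part of `python-domino` (the library reads these variables itself), and the rule that only the value `false` disables certificate verification is an assumption.

```python
import os


def read_client_settings(env=None):
    """Collect the optional settings described above, using the documented defaults.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return {
        "log_level": env.get("DOMINO_LOG_LEVEL", "INFO"),
        # Assumption: only the value "false" (any letter case) disables verification.
        "verify_certificate": env.get("DOMINO_VERIFY_CERTIFICATE", "true").lower() != "false",
        "max_retries": int(env.get("DOMINO_MAX_RETRIES", "4")),
    }


print(read_client_settings(env={"DOMINO_LOG_LEVEL": "DEBUG"}))
```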
# Methods
## Budgets and Billing Tags
See
[`example_budget_manager.py`](https://github.com/dominodatalab/python-domino/blob/release-1.4.3/examples/example_budget_manager.py)
for example code.
### budget_defaults_list()
Get a list of the available default budgets with their assigned limits, if any.
Requires Admin permission.
### budget_defaults_update(budget_label, budget_limit)
Update a default budget by budget label.
Requires Admin permission.
- *budget_label:* (Required) Label of the budget to update, for example `BillingTag` or `Organization`.
- *budget_limit:* (Required) New budget quota to assign to the default label.
### budget_overrides_list()
Get a list of the available budget overrides with their assigned limits.
Requires Admin permission.
### budget_override_create(budget_label, budget_id, budget_limit)
Create a budget override based on a budget label, that is, a billing tag, organization, or project.
The ID of the object is used as the budget ID.
Requires Admin permission.
- *budget_label:* Label of the budget to be updated.
- *budget_id:* ID of the project or organization to use as the new budget override ID.
- *budget_limit:* Budget quota to assign to the override.
### budget_override_update(budget_label, budget_id, budget_limit)
Update a budget override based on the budget label and budget ID.
Requires Admin permission.
- *budget_label:* Label of the budget to be updated.
- *budget_id:* ID of the budget override to be updated.
- *budget_limit:* New budget quota to assign to the override.
### budget_override_delete(budget_id)
Delete an existing budget override.
Requires Admin permission.
- *budget_id:* ID of the budget override to be deleted.
### budget_alerts_settings()
Get the current budget alerts settings.
Requires Admin permission.
### budget_alerts_settings_update(alerts_enabled, notify_org_owner)
Update the current budget alerts settings to enable or disable budget notifications
and to set whether organization owners are notified about project notifications.
Requires Admin permission.
- *alerts_enabled:* Whether to enable or disable notifications.
- *notify_org_owner:* Whether to notify organization owners when projects reach the threshold.
### budget_alerts_targets_update(targets)
Update the current budget alerts settings with additional email targets per budget label.
Requires Admin permission.
- *targets:* dictionary of budget labels and list of email addresses
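A minimal sketch of preparing the `targets` dictionary follows. `build_alert_targets` is a hypothetical helper, the email addresses are placeholders, and the labels are the examples named earlier in this section; the exact set of labels your deployment accepts is an assumption to verify.

```python
def build_alert_targets(recipients):
    """Group (budget_label, email) pairs into {label: [emails]} without duplicates.

    Hypothetical helper for illustration only.
    """
    targets = {}
    for label, email in recipients:
        emails = targets.setdefault(label, [])
        if email not in emails:
            emails.append(email)
    return targets


targets = build_alert_targets([
    ("BillingTag", "finance@example.com"),
    ("Organization", "ops@example.com"),
    ("BillingTag", "finance@example.com"),  # duplicate pair, ignored
])
print(targets)
# With a connected client (requires Admin permission):
# domino.budget_alerts_targets_update(targets)
```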
### billing_tags_list_active()
Get a list of active billing tags.
Requires Admin permission.
### billing_tags_create(tags_list)
Create a list of active billing tags.
Requires Admin permission.
- *tags_list:* List of names of the billing tags to create.
### active_billing_tag_by_name(name)
Get detailed information about an active or archived billing tag.
Requires Admin permission.
- *name:* Name of an existing billing tag.
### billing_tag_archive(name)
Archive an active billing tag.
Requires Admin permission.
- *name:* Name of the existing billing tag to archive.
### billing_tag_settings()
Get the current billing tag settings.
Requires Admin permission.
### billing_tag_settings_mode()
Get the current billing tag settings mode.
Requires Admin permission.
### billing_tag_settings_mode_update(mode)
Update the current billing tag settings mode.
Requires Admin permission.
- *mode:* New mode for the billing tag settings (see `BillingTagSettingMode`).
### project_billing_tag(project_id)
Get the billing tag assigned to a particular project by project ID.
Requires Admin permission.
- *project_id:* ID of the project whose assigned billing tag you want to find.
### project_billing_tag_update(billing_tag, project_id)
Update a project's billing tag with a new billing tag.
Requires Admin permission.
- *billing_tag:* Billing tag to assign to the project.
- *project_id:* ID of the project to assign the billing tag to.
### project_billing_tag_reset(project_id)
Remove the billing tag from a specified project.
Requires Admin permission.
- *project_id:* ID of the project whose billing tag field is reset.
### projects_by_billing_tag(billing_tag, offset, page_size, name_filter, sort_by, sort_order, missing_tag_only=False)
Get a list of projects filtered by billing tag.
Requires Admin permission.
- *billing_tag:* Billing tag string to filter projects by.
- *offset:* The index of the start of the page, where checkpointProjectId is index 0.
If the offset is negative, the project it points to is the end of the page.
- *page_size:* The number of records to return per page.
- *name_filter:* Matches projects by name substring.
- *sort_by:* (Optional) Field to sort the projects on.
- *sort_order:* (Optional) Whether to sort in asc or desc order.
- *missing_tag_only:* (Optional) Whether to return only projects with a missing tag.
### project_billing_tag_bulk_update(projects_tag)
Update the billing tags of multiple projects in bulk.
Requires Admin permission.
- *projects_tag:* Dictionary of project IDs and billing tags.
## Projects
See
[`example_projects_usage.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/example_projects_usage.py)
for example code.
### project_create_v4(project_name, owner_id, owner_username, description, collaborators, tags, billing_tag, visibility=PUBLIC)
Create a project using the v4 endpoints, which allow more optional fields.
- *project_name:* (Required) The name of the project.
- *owner_id:* (Optional) User ID of the owner of the new project. You must be an admin to create projects for other users.
Use either `owner_id` or `owner_username`; you do not need both. Defaults to the current `owner_username`.
- *owner_username:* (Optional) Username of the owner of the new project. You must be an admin to create projects for other users.
Use either `owner_id` or `owner_username`; you do not need both. Defaults to the current `owner_username`.
- *description:* (Optional) Description of the project.
- *collaborators:* (Optional) List of collaborators to add to the project.
- *tags:* (Optional) List of tags to add to the project.
- *billing_tag:* (Optional unless the billing tag settings mode is Required) Active billing tag to add to the project for governance.
- *visibility:* (Optional) Project visibility. Defaults to Public.
### project_create(project_name, owner_username=None)
Create a new project with given project name.
- *project_name:* The name of the project.
- *owner_username:* (Optional) The owner username for the project.
This parameter is useful when you need to create a project under an
organization.
### collaborators_get()
Get the list of collaborators on a project.
### collaborators_add(username_or_email, message="")
Add collaborators to a project.
- *username_or_email:* Name or email of the Domino user to add as
collaborator to the current project.
- *message:* (Optional) Message related to the user’s role or purpose
in the project.
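Because `collaborators_add` takes one user per call, adding several users means looping. The sketch below assumes `client` is an already connected `Domino` instance; `add_collaborators` is a hypothetical wrapper, and since the exceptions the library raises are not documented here, it catches `Exception` broadly.

```python
def add_collaborators(client, users, message=""):
    """Call collaborators_add once per user and return the users that failed.

    Hypothetical wrapper for illustration; `client` is a connected Domino instance.
    """
    failed = []
    for user in users:
        try:
            client.collaborators_add(user, message=message)
        except Exception as exc:  # exact exception types are not documented here
            failed.append((user, str(exc)))
    return failed


# Usage with a connected client:
# failed = add_collaborators(domino, ["alice", "bob@example.com"], message="Reviewer")
```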
## Project tags
Project tags are an easy way to add freeform metadata to a project. Tags
help colleagues and consumers organize and find the Domino projects that
interest them. Tags can be used to describe the subject explored by a
project, the packages and libraries it uses, or the source of the data
within.
See
[`example_projects_usage.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/example_projects_usage.py)
for example code.
### tags_list(\*project_id)
List a project’s tags.
- *project_id:* The project identifier.
### tag_details(tag_id)
Get details about a tag.
- *tag_id:* The tag identifier.
### tags_add(tags, \*project_id)
Create a tag, if it does not exist, and add it to a project.
- *tags (list):* One or more tag names.
- *project_id:* (Defaults to current project ID) The project
identifier.
### tag_get_id(tag_name, \*project_id)
Get the tag ID using the tag string name.
- *tag_name (string):* The tag name.
- *project_id:* (Defaults to current project id) The project ID.
### tags_remove(tag_name, project_id=None)
Remove a tag from a project.
- *tag_name (string):* The tag name.
- *project_id:* (Defaults to current project id) The project ID.
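The tag methods can be combined, for example to swap one tag for another. This sketch assumes `client` is a connected `Domino` instance and relies on the default project ID; `replace_tag` is a hypothetical helper, not a library method.

```python
def replace_tag(client, old_name, new_name):
    """Swap one tag for another on the client's current project.

    Hypothetical helper for illustration. The new tag is added first so the
    project is never left without either tag.
    """
    client.tags_add([new_name])   # creates the tag if it does not exist
    client.tags_remove(old_name)


# Usage with a connected client:
# replace_tag(domino, "draft", "reviewed")
```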
## Executions
See these code example files:
- [`start_run_and_check_status.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/start_run_and_check_status.py)
- [`export_runs.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/export_runs.py)
### runs_list()
List the executions on the selected project.
### runs_start(command, isDirect, commitId, title, tier, publishApiEndpoint)
Start a new execution on the selected project.
- *command:* The command to execute, as an array of strings where
members of the array represent arguments of the command. For
example: `["main.py", "hi mom"]`
- *isDirect:* (Optional) Whether this command should be passed
directly to a shell.
- *commitId:* (Optional) The `commitId` to launch from. If not
provided, the project launches from the latest commit.
- *title:* (Optional) A title for the execution.
- *tier:* (Optional) The hardware tier to use for the execution. This
is the human-readable name of the hardware tier, such as "Free",
"Small", or "Medium". If not provided, the project’s default tier is
used.
- *publishApiEndpoint:* (Optional) Whether to publish an API endpoint
from the resulting output.
### runs_start_blocking(command, isDirect, commitId, title, tier, publishApiEndpoint, poll_freq=5, max_poll_time=6000)
Start a new execution on the selected project and make a blocking
request that waits until the job is finished.
- *command:* The command to execute, as an array of strings where
members of the array represent arguments of the command. For
example: `["main.py", "hi mom"]`
- *isDirect:* (Optional) Whether this command should be passed
directly to a shell.
- *commitId:* (Optional) The `commitId` to launch from. If not
provided, the project launches from the latest commit.
- *title:* (Optional) A title for the execution.
- *tier:* (Optional) The hardware tier to use for the execution. If
not provided, the project’s default tier is used.
- *publishApiEndpoint:* (Optional) Whether to publish an API endpoint
from the resulting output.
- *poll_freq:* (Optional) Number of seconds between polling of the
Domino server for status of the task that is running.
- *max_poll_time:* (Optional) Maximum number of seconds to wait for
a task to complete. If this threshold is exceeded, an exception is
raised.
- *retry_count:* (Optional) Maximum number of polling retries (in
case of transient HTTP errors). If this threshold is exceeded, an
exception is raised.
### run_stop(runId, saveChanges=True)
Stop an existing execution in the selected project.
- *runId:* String that identifies the execution.
- *saveChanges:* (Defaults to True) If false, execution results are
discarded.
### runs_stdout(runId)
Get `stdout` emitted by a particular execution.
- *runId:* String that identifies the execution.
## Files and blobs
See these code example files:
- [`upload_file.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/upload_file.py)
- [`upload_and_run_file_and_download_results.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/upload_and_run_file_and_download_results.py)
### files_list(commitId, path)
List the files in a folder in the Domino project.
- *commitId:* The `commitId` to list files from.
- *path:* (Defaults to "/") The path to list from.
### files_upload(path, file)
Upload a Python file object into the specified path inside the project.
See `examples/upload_file.py` for an example. All parameters are
required.
- *path:* The path to save the file to. For example, `/README.md`
writes to the root directory of the project while
`/data/numbers.csv` saves the file to a sub folder named `data`. If
the specified folder does not yet exist, it is created.
- *file:* A Python file object. For example:
`f = open("authors.txt","rb")`
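A minimal sketch of an upload that closes the file handle afterwards is shown below. It assumes `client` is a connected `Domino` instance; `upload_local_file` is a hypothetical helper, not a library method.

```python
def upload_local_file(client, local_path, project_path):
    """Upload a local file to the given path inside the project.

    Hypothetical helper for illustration. The file is opened in binary mode
    and closed automatically when the upload call returns.
    """
    with open(local_path, "rb") as handle:
        return client.files_upload(project_path, handle)


# Usage with a connected client:
# upload_local_file(domino, "authors.txt", "/data/authors.txt")
```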
### blobs_get(key)
*_Deprecated_* Retrieve a file from the Domino server by blob key. Use `blobs_get_v2(path, commit_id, project_id)` instead.
- *key:* The key of the file to fetch from the blob server.
### blobs_get_v2(path, commit_id, project_id)
Retrieve a file from the Domino server in a project from its path and commit id.
- *path:* The path to the file in the Domino project.
- *commit_id:* ID of the commit to retrieve the file from.
- *project_id:* ID of the project to retrieve the file from.
## Apps
### app_publish(unpublishRunningApps=True, hardwareTierId=None)
Publish an app within a project, or republish an existing app.
- *unpublishRunningApps:* (Defaults to True) Check for an active app
instance in the current project and unpublish it before
re/publishing.
- *hardwareTierId:* (Optional) Launch the app on the specified
hardware tier.
### app_unpublish()
Stop the running app in the project.
## Jobs
### job_start(command, commit_id=None, hardware_tier_name=None, environment_id=None, on_demand_spark_cluster_properties=None, compute_cluster_properties=None, external_volume_mounts=None, title=None)
Start a new job (execution) in the project.
- *command (string):* Command to execute in Job. For example:
`domino.job_start(command="main.py arg1 arg2")`
- *commit_id (string):* (Optional) The `commitId` to launch from. If
not provided, the job launches from the latest commit.
- *hardware_tier_name (string):* (Optional) The name of the hardware
tier to launch the job in. If not provided, the project’s default tier is
used.
- *environment_id (string):* (Optional) The environment ID with which
to launch the job. If not provided, the project’s default
environment is used.
- *on_demand_spark_cluster_properties (dict):* (Optional) On-demand
Spark cluster properties. The following properties can be
provided in the Spark cluster:

        {
            "computeEnvironmentId": "<Environment ID configured with spark>"
            "executorCount": "<Number of Executors in cluster>"
                (optional defaults to 1)
            "executorHardwareTierId": "<Hardware tier ID for Spark Executors>"
                (optional defaults to last used historically if available)
            "masterHardwareTierId": "<Hardware tier ID for Spark master>"
                (optional defaults to last used historically if available)
            "executorStorageMB": "<Executor's storage in MB>"
                (optional defaults to 0; 1GB is 1000MB Here)
        }

- *compute_cluster_properties (dict):* (Optional) The
compute-cluster properties definition contains parameters for
launching any Domino supported compute cluster for a job. Use this
to launch a job that uses a compute cluster instead of the
deprecated `on_demand_spark_cluster_properties` field. If
`on_demand_spark_cluster_properties` and
`compute_cluster_properties` are both present,
`on_demand_spark_cluster_properties` is ignored.
`compute_cluster_properties` contains the following fields:

        {
            "clusterType": <string, one of "Ray", "Spark", "Dask", "MPI">,
            "computeEnvironmentId": <string, The environment ID for the cluster's nodes>,
            "computeEnvironmentRevisionSpec": <one of "ActiveRevision", "LatestRevision",
                {"revisionId":"<environment_revision_id>"} (optional)>,
            "masterHardwareTierId": <string, the Hardware tier ID for the cluster's master node (required unless clusterType is MPI)>,
            "workerCount": <number, the total workers to spawn for the cluster>,
            "workerHardwareTierId": <string, The Hardware tier ID for the cluster workers>,
            "workerStorage": <{ "value": <number>, "unit": <one of "GiB", "MB"> },
                The disk storage size for the cluster's worker nodes (optional)>
            "maxWorkerCount": <number, The max number of workers allowed. When
                this configuration exists, autoscaling is enabled for the cluster and
                "workerCount" is interpreted as the min number of workers allowed in the cluster
                (optional)>
        }

- *external_volume_mounts (List\[string\]):* (Optional) External
volume mount IDs to mount to execution. If not provided, the job
launches with no external volumes mounted.
- *title (string):* (Optional) Title for the job.
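The `compute_cluster_properties` dictionary can be assembled and checked before calling `job_start`. The sketch below covers only a subset of the fields described above; `build_compute_cluster_properties` is a hypothetical helper, and its validation rules are taken from the field descriptions rather than from the library's code.

```python
VALID_CLUSTER_TYPES = {"Ray", "Spark", "Dask", "MPI"}


def build_compute_cluster_properties(
    cluster_type,
    environment_id,
    worker_count,
    worker_hardware_tier_id,
    master_hardware_tier_id=None,
    max_worker_count=None,
):
    """Build a compute_cluster_properties dict from the documented fields.

    Hypothetical helper for illustration only.
    """
    if cluster_type not in VALID_CLUSTER_TYPES:
        raise ValueError(f"clusterType must be one of {sorted(VALID_CLUSTER_TYPES)}")
    if cluster_type != "MPI" and master_hardware_tier_id is None:
        raise ValueError("masterHardwareTierId is required unless clusterType is MPI")

    properties = {
        "clusterType": cluster_type,
        "computeEnvironmentId": environment_id,
        "workerCount": worker_count,
        "workerHardwareTierId": worker_hardware_tier_id,
    }
    if master_hardware_tier_id is not None:
        properties["masterHardwareTierId"] = master_hardware_tier_id
    if max_worker_count is not None:
        # With maxWorkerCount set, workerCount is read as the minimum worker count.
        if max_worker_count < worker_count:
            raise ValueError("maxWorkerCount must not be smaller than workerCount")
        properties["maxWorkerCount"] = max_worker_count
    return properties


# Usage with a connected client (the IDs are placeholders):
# props = build_compute_cluster_properties("Ray", "<environment id>", 2, "<worker tier id>",
#                                          master_hardware_tier_id="<master tier id>")
# domino.job_start(command="main.py arg1 arg2", compute_cluster_properties=props)
```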
### job_stop(job_id, commit_results=True)
Stop the Job (execution) in the project.
- *job_id (string):* Job identifier.
- *commit_results (boolean):* (Defaults to `true`) If `false`, the
job results are not committed.
### job_status(job_id)
Get the status of a job.
- *job_id (string):* Job identifier.
### job_start_blocking(poll_freq=5, max_poll_time=6000, \*\*kwargs)
Start a job and poll until the job is finished. Additionally, this
method supports all the parameters in the `job_start` method.
- *poll_freq:* Poll frequency interval in seconds.
- *max_poll_time:* Max poll time in seconds.
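Since `job_start_blocking` forwards its extra keyword arguments to `job_start`, the job parameters and the polling parameters can be passed in one call. The sketch assumes `client` is a connected `Domino` instance; `start_job_and_wait` is a hypothetical wrapper.

```python
def start_job_and_wait(client, command, hardware_tier_name=None, poll_freq=5, max_poll_time=6000):
    """Start a job and poll until it finishes.

    Hypothetical wrapper for illustration; `client` is a connected Domino instance.
    """
    # Arguments for job_start; optional ones are only sent when provided.
    job_kwargs = {"command": command}
    if hardware_tier_name is not None:
        job_kwargs["hardware_tier_name"] = hardware_tier_name
    return client.job_start_blocking(
        poll_freq=poll_freq,
        max_poll_time=max_poll_time,
        **job_kwargs,
    )


# Usage with a connected client:
# start_job_and_wait(domino, "main.py arg1 arg2", poll_freq=15, max_poll_time=3600)
```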
## Datasets
A Domino dataset is a collection of files that are available in user
executions as a filesystem directory. A dataset always reflects the most
recent version of the data. You can modify the contents of a dataset
through the Domino UI or through workload executions.
See [Domino
Datasets](https://docs.dominodatalab.com/en/latest/user_guide/0a8d11/datasets-overview/)
for more details, and
[`example_dataset.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/example_dataset.py)
for example code.
### datasets_list(project_id=None)
Provide a JSON list of all the available datasets.
- *project_id (string):* (Defaults to None) The project identifier.
Each project can hold up to 5 datasets.
### datasets_ids(project_id)
List the IDs of the datasets for a particular project.
- *project_id:* The project identifier.
### datasets_names(project_id)
List the names of the datasets for a particular project.
- *project_id:* The project identifier.
### datasets_details(dataset_id)
Provide details about a dataset.
- *dataset_id:* The dataset identifier.
### datasets_create(dataset_name, dataset_description)
Create a new dataset.
- *dataset_name:* Name of the new dataset. NOTE: The name must be
unique.
- *dataset_description:* Description of the dataset.
### datasets_update_details(dataset_id, dataset_name=None, dataset_description=None)
Update a dataset’s name or description.
- *dataset_id:* The dataset identifier.
- *dataset_name:* (Optional) New name of the dataset.
- *dataset_description:* (Optional) New description of the dataset.
### datasets_remove(dataset_ids)
Delete a set of datasets.
- *dataset_ids (list\[string\]):* List of IDs of the datasets to
delete. NOTE: Datasets are first marked for deletion, then deleted
after a grace period (15 minutes, configurable). A Domino admin may
also need to complete this process before the name can be reused.
### datasets_upload_files(dataset_id, local_path_to_file_or_directory, file_upload_setting, max_workers, target_chunk_size, target_relative_path)
Uploads a file or entire directory to a dataset.
- *dataset_id:* The dataset identifier.
- *local_path_to_file_or_directory:* The path to the file or directory in local machine.
- *file_upload_setting:* The setting to resolve naming conflict, must be one of `Overwrite`, `Rename`, `Ignore` (default).
- *max_workers:* The max amount of threads (default: 10).
- *target_chunk_size:* The max chunk size for multipart upload (default: 8MB).
- *target_relative_path:* The path on the dataset to upload the file or directory to. Note that the path must exist or the upload will fail.
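A short sketch of an upload call that checks the conflict setting first is shown below. It assumes `client` is a connected `Domino` instance and that the parameters can be passed by the names listed above; `upload_to_dataset` is a hypothetical wrapper.

```python
UPLOAD_SETTINGS = ("Overwrite", "Rename", "Ignore")


def upload_to_dataset(client, dataset_id, local_path, setting="Ignore", target_relative_path=None):
    """Upload a file or directory to a dataset after validating the conflict setting.

    Hypothetical wrapper for illustration; `client` is a connected Domino instance.
    """
    if setting not in UPLOAD_SETTINGS:
        raise ValueError(f"file_upload_setting must be one of {UPLOAD_SETTINGS}")
    kwargs = {
        "dataset_id": dataset_id,
        "local_path_to_file_or_directory": local_path,
        "file_upload_setting": setting,
    }
    if target_relative_path is not None:
        # The target path must already exist in the dataset, or the upload fails.
        kwargs["target_relative_path"] = target_relative_path
    return client.datasets_upload_files(**kwargs)


# Usage with a connected client (the dataset ID is a placeholder):
# upload_to_dataset(domino, "<dataset id>", "data/numbers.csv", setting="Overwrite")
```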
# Airflow
The `python-domino` client comes bundled with an
[Operator](https://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html)
for use with [Apache Airflow](https://airflow.apache.org/) as an extra.
When installing the client from PyPI, add the `airflow` flag to extras:

    pip install "dominodatalab[airflow]"

Similarly, when installing the client from GitHub, use the following
command:

    pip install -e git+https://github.com/dominodatalab/python-domino.git@1.0.6#egg="dominodatalab[airflow]"

See also
[example_airflow_dag.py](https://github.com/dominodatalab/python-domino/blob/release-1.2.1/examples/example_airflow_dag.py)
for example code.
## DominoOperator
    from domino.airflow import DominoOperator

Allows a user to schedule Domino executions via Airflow. Follows the
same function signature as `domino.runs_start` with two extra arguments:
- `startup_delay: Optional[int] = 10` | Add a startup delay to your job, useful if you want to delay execution until after other work finishes.
- `include_setup_log: Optional[bool] = True` | Determine whether or not to publish the setup log of the job as the log prefix before `stdout`.
## DominoSparkOperator
    from domino.airflow import DominoSparkOperator

Allows a user to schedule Domino executions via the v4 API, which
supports `onDemandSparkClusters`. Follows the same function signature as
`domino.job_start`, with the addition of `startup_delay` from above.
# Example
    from domino import Domino

    # By and large your commands will run against a single project,
    # so you must specify the full project name
    domino = Domino("chris/canon")

    # List all runs in the project, most-recently queued first
    all_runs = domino.runs_list()['data']
    latest_100_runs = all_runs[0:100]
    print(latest_100_runs)

    # All runs have a commitId (the snapshot of the project when the
    # run starts) and, if the run completed, an "outputCommitId"
    # (the snapshot of the project after the run completed)
    most_recent_run = all_runs[0]
    commitId = most_recent_run['outputCommitId']

    # List the files in the output commit, showing only the entries
    # under the results directory. If path is not provided, or if you
    # pass path="/", all files in the project are listed
    files = domino.files_list(commitId, path='results/')['data']
    for file in files:
        print(file['path'], '->', file['url'])
    print(files)

    # Get the content (i.e. blob) for the file you're interested in.
    # blobs_get returns a connection rather than the content, because
    # the content can get quite large and it's up to you how you want
    # to handle it
    print(domino.blobs_get(files[0]['key']).read())

    # Start a run of file main.py using the latest copy of that file
    domino.runs_start(["main.py", "arg1", "arg2"])

    # Start a "direct" command
    domino.runs_start(["echo 'Hello, World!'"], isDirect=True)

    # Start a run of a specific commit
    domino.runs_start(["main.py"], commitId="aabbccddee")

# Manual installation
Because `python-domino` ships with the DSE, normally you do not need to install it.
This section provides instructions for installing it in another environment or updating it to a newer version.
Starting from version `1.0.6`, `python-domino` is available on PyPI as `dominodatalab`:

    pip install dominodatalab

If you are adding install instructions for `python-domino` to your [Domino Environment](https://support.dominodatalab.com/hc/en-us/articles/115000392643-Compute-Environment-Management) Dockerfile Instructions field, you must add `RUN` to the beginning:

    RUN pip install dominodatalab

To install a specific version of the library from PyPI, such as `1.0.6`:

    pip install dominodatalab==1.0.6

To install a specific version of the library from GitHub, such as
`1.0.6`:

    pip install https://github.com/dominodatalab/python-domino/archive/1.0.6.zip

# License
This library is made available under the Apache 2.0 License. This is an
open-source project of [Domino Data Lab](https://www.dominodatalab.com).
Raw data
{
"_id": null,
"home_page": "https://github.com/dominodatalab/python-domino",
"name": "dominodatalab",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9.0",
"maintainer_email": null,
"keywords": "Domino Data Lab, API",
"author": "Domino Data Lab",
"author_email": "support@dominodatalab.com",
"download_url": "https://files.pythonhosted.org/packages/f0/02/235fdeb7f5674e8bd454d7d133377463b90fbfe6b7302ee4654f42c4945d/dominodatalab-1.4.3.tar.gz",
"platform": null,
"description": "This library provides bindings for the Domino APIs. It ships with the Domino Standard Environment (DSE).\n\nSee this documentation for details about the APIs:\n\n- [Latest public Domino\n APIs](https://docs.dominodatalab.com/en/latest/api_guide/8c929e/domino-public-apis/)\n\n- [Legacy APIs](https://dominodatalab.github.io/api-docs/)\n\nThe latest released version of `python-domino` is `1.4.3`.\n\n# Version compatibility matrix\n\nThe `python-domino` library is compatible with different versions of\nDomino:\n\n| Domino Versions | Python-Domino |\n|-----------------|:-----------------------------------------------------------------------------------------------------:|\n| 3.6.x or lower | [0.3.5](https://github.com/dominodatalab/python-domino/archive/0.3.5.zip) |\n| 4.1.0 or higher | [1.0.0](https://github.com/dominodatalab/python-domino/archive/refs/tags/1.0.0.zip) or Higher |\n| 5.3.0 or higher | [1.2.0](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.2.0.zip) or Higher |\n| 5.5.0 or higher | [1.2.2](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.2.2.zip) or Higher |\n| 5.10.0 or higher | [1.3.1](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.3.1.zip) or Higher |\n| 5.11.0 or higher | [1.4.1](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.4.1.zip) or Higher |\n| 6.0.0 or higher | [1.4.3](https://github.com/dominodatalab/python-domino/archive/refs/tags/Release-1.4.3.zip) or Higher |\n\n# Development\n\nThe current `python-domino` is based on Python 3.9, which is therefore recommended for development. 
`Pipenv` is also recommended to manage the dependencies.\n\nTo use the Python binding in a Domino workbook session, include `dominodatalab` in your project's requirements.txt file.\nThis makes the Python binding available for each new workbook session (or batch run) started within the project.\n\nTo install dependencies from `setup.py` for development:\n\n pipenv install -e \".[dev]\"\n\nUse the same process for Airflow and data:\n\n pipenv install -e \".[data]\" \".[airflow]\"\n\n# Set up the connection\n\nYou can set up the connection by creating a new instance of `Domino`:\n\n _class_ Domino(project, api_key=None, host=None, domino_token_file=None, auth_token=None)\n\n- *project:* A project identifier (in the form of ownerusername/projectname).\n \n- *api_proxy:* (Optional) Location of the Domino API reverse proxy as host:port.\n\n If set, this proxy is used to intercept any Domino API requests and insert an authentication token.\n _This is the preferred method of authentication_. \n Alternatively, set the `DOMINO_API_PROXY` environment variable.\n In Domino 5.4.0 or later, this variable is set inside a Domino run container.\n\n NOTE: This mechanism does not work when connecting to an HTTPS endpoint; it is meant to be used inside Domino runs.\n\n- *api_key:* (Optional) An API key to authenticate with. \n\n If not provided, the library expects to find one in the `DOMINO_USER_API_KEY` environment variable.\n If you are using the Python package in code that is already running in Domino, the `DOMINO_API_USER_KEY` variable is set automatically to be the key for the user who started the run.\n\n- *host:* (Optional) A host URL. \n\n If not provided, the library expects to find one in the `DOMINO_API_HOST` environment variable.\n\n- *domino_token_file:* (Optional) Path to the Domino token file\n containing the auth token. 
\n\n If not provided, the library expects to find one in the `DOMINO_TOKEN_FILE` environment variable.\n If you are using Python package in code that is already running in Domino, the `DOMINO_TOKEN_FILE` is set automatically to be the token file for the user who started the run.\n\n- *auth_token:* (Optional) Authentication token.\n\n## Authentication\n\nDomino looks for the authentication method in the following order and uses the first one it finds:\n\n1. `api_proxy`\n2. `auth_token`\n3. `domino_token_file`\n4. `api_key`\n5. `DOMINO_API_PROXY`\n6. `DOMINO_TOKEN_FILE`\n7. `DOMINO_USER_API_KEY`\n\nThe API proxy is the preferred method of authentication.\nSee \n[Use the API Proxy to Authenticate Calls to the Domino API](https://docs.dominodatalab.com/en/latest/user_guide/40b91f/domino-api/).\n\n## Additional environment variables\n\n- `DOMINO_LOG_LEVEL`\n \n The default log level is `INFO`.\n You can change the log level by setting `DOMINO_LOG_LEVEL`, for example to `DEBUG`.\n\n- `DOMINO_VERIFY_CERTIFICATE`\n \n For testing purposes and issues with SSL certificates, set `DOMINO_VERIFY_CERTIFICATE` to `false`. 
\n Be sure to unset this variable when not in use.\n\n- `DOMINO_MAX_RETRIES`\n \n The default number of retries is 4.\n This variable determines the number of attempts for the request session in case of a `ConnectionError`.\n For details on how the retry count and backoff factor affect the maximum request timeout and error durations, see the [urllib3 `Retry` documentation](https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html#module-urllib3.util.retry).\n\n# Methods\n\n## Budgets and Billing Tags\n\nSee\n[`example_budget_manager.py`](https://github.com/dominodatalab/python-domino/blob/release-1.4.3/examples/example_budget_manager.py)\nfor example code.\n\n### budget_defaults_list()\n\nGet a list of the available default budgets with the assigned limits (if any).\nRequires Admin permission.\n\n### budget_defaults_update(budget_label, budget_limit)\n\nUpdate default budgets by BudgetLabel.\nRequires Admin roles.\n\n- *budget_label:* (required) label of the budget to be updated, for example `BillingTag` or `Organization`\n\n- *budget_limit:* (required) new budget quota to assign to default label\n\n### budget_overrides_list()\n\nGet a list of the available budget overrides with the assigned limits.\nRequires Admin permission.\n\n### budget_override_create(budget_label, budget_id, budget_limit)\n\nCreate budget overrides based on BudgetLabels, that is, BillingTags, Organization, or Projects.\nThe object id is used as the budget id.\nRequires Admin roles.\n\n- *budget_label:* label of budget to be updated\n\n- *budget_id:* id of project or organization to be used as new budget override id.\n\n- *budget_limit:* budget quota to assign to override\n\n### budget_override_update(budget_label, budget_id, budget_limit)\n\nUpdate budget overrides based on BudgetLabel and budget id.\nRequires Admin roles.\n\n- *budget_label:* label of budget to be updated\n\n- *budget_id:* id of budget override to be updated.\n\n- *budget_limit:* new budget quota to assign to override\n\n### budget_override_delete(budget_id)\n\nDelete an existing budget override.\nRequires Admin roles.\n\n- 
*budget_id:* id of budget override to be deleted.\n\n### budget_alerts_settings()\n\nGet the current budget alerts settings.\nRequires Admin permission.\n\n### budget_alerts_settings_update(alerts_enabled, notify_org_owner)\n\nUpdate the current budget alerts settings to enable or disable budget notifications\nand to set whether org owners are notified on project notifications.\nRequires Admin permission.\n\n- *alerts_enabled:* whether to enable or disable notifications.\n\n- *notify_org_owner:* whether to notify organization owners when projects reach the threshold.\n\n### budget_alerts_targets_update(targets)\n\nUpdate the current budget alerts settings with additional email targets per budget label.\nRequires Admin permission.\n\n- *targets:* dictionary of budget labels and list of email addresses\n\n### billing_tags_list_active()\n\nGet a list of active billing tags.\nRequires Admin permission.\n\n### billing_tags_create(tags_list)\n\nCreate a list of active billing tags.\nRequires Admin permission.\n\n- *tags_list:* list of billing tag names to be created\n\n### active_billing_tag_by_name(name)\n\nGet detailed info on an active or archived billing tag.\nRequires Admin permission.\n\n- *name:* name of existing billing tag\n\n### billing_tag_archive(name)\n\nArchive an active billing tag.\nRequires Admin permission.\n\n- *name:* name of existing billing tag to archive\n\n### billing_tag_settings()\n\nGet the current billing tag settings.\nRequires Admin permission.\n\n### billing_tag_settings_mode()\n\nGet the current billing tag settings mode.\nRequires Admin permission.\n\n### billing_tag_settings_mode_update(mode)\n\nUpdate the current billing tag settings mode.\nRequires Admin permission.\n\n- *mode:* new mode to set the billing tag settings (see BillingTagSettingMode)\n\n### project_billing_tag(project_id)\n\nGet the billing tag assigned to a particular project by project id.\nRequires Admin permission.\n\n- *project_id:* id of the project to find assigned billing tag\n\n### 
project_billing_tag_update(billing_tag, project_id)\n\nUpdate a project's billing tag with a new billing tag.\nRequires Admin permission.\n\n- *billing_tag:* billing tag to assign to a project\n\n- *project_id:* id of the project to assign a billing tag\n\n### project_billing_tag_reset(project_id)\n\nRemove a billing tag from a specified project.\nRequires Admin permission.\n\n- *project_id:* id of the project to reset billing tag field\n\n### projects_by_billing_tag(billing_tag, offset, page_size, name_filter, sort_by, sort_order, missing_tag_only=False)\n\nGet a paginated list of projects filtered by billing tag.\nRequires Admin permission.\n\n- *billing_tag:* billing tag string to filter projects by\n\n- *offset:* The index of the start of the page, where checkpointProjectId is index 0.\n If the offset is negative, the project it points to is the end of the page.\n\n- *page_size:* The number of records to return per page.\n\n- *name_filter:* matches projects by name substring\n\n- *sort_by:* (Optional) field to sort the projects on\n\n- *sort_order:* (Optional) Whether to sort in asc or desc order\n\n- *missing_tag_only:* (Optional) determine whether to only return projects with missing tag\n\n### project_billing_tag_bulk_update(projects_tag)\n\nUpdate projects' billing tags in bulk.\nRequires Admin permission.\n\n- *projects_tag:* dictionary of project_id and billing_tags\n\n\n## Projects\n\nSee\n[`example_projects_usage.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/example_projects_usage.py)\nfor example code.\n\n### project_create_v4(project_name, owner_id, owner_username, description, collaborators, tags, billing_tag, visibility=PUBLIC)\n\nCreate a project using the newer v4 endpoints, which allow more optional fields.\n\n- *project_name:* (required) The name of the project.\n\n- *owner_id:* (Optional) user id of the owner of the new project to be created (must be admin to create projects for other users).\n owner_id or 
owner_username can be used, both are not needed (Defaults to current owner_username)\n\n- *owner_username:* (Optional) username of the owner of the new project to be created (must be admin to create projects for other users)\n owner_id or owner_username can be used, both are not needed (Defaults to current owner_username)\n\n- *description:* (Optional) description of the project\n\n- *collaborators:* (Optional) list of collaborators to be added to the project\n\n- *tags:* (Optional) list of tags to add to project\n\n- *billing_tag:* (Optional unless billingTag settings mode is Required) active billing tag to be added to projects for governance\n\n- *visibility:* (Optional) (Defaults to Public) project visibility \n\n### project_create(project_name, owner_username=None)\n\nCreate a new project with given project name.\n\n- *project_name:* The name of the project.\n\n- *owner_username:* (Optional) The owner username for the project.\n This parameter is useful when you need to create a project under an\n organization.\n\n### collaborators_get()\n\nGet the list of collaborators on a project.\n\n### collaborators_add(username_or_email, message=\"\")\n\nAdd collaborators to a project.\n\n- *username_or_email:* Name or email of the Domino user to add as\n collaborator to the current project.\n\n- *message:* Optional - Message related to the user\u2019s role or purpose\n to the project.\n\n## Project tags\n\nProject tags are an easy way to add freeform metadata to a project. Tags\nhelp colleagues and consumers organize and find the Domino projects that\ninterest them. 
Tags can be used to describe the subject explored by a\nproject, the packages and libraries it uses, or the source of the data\nwithin.\n\nSee\n[`example_projects_usage.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/example_projects_usage.py)\nfor example code.\n\n### tags_list(\\*project_id)\n\nList a project\u2019s tags.\n\n- *project_id:* The project identifier.\n\n### tag_details(tag_id)\n\nGet details about a tag.\n\n- *tag_id:* The tag identifier.\n\n### tags_add(tags, \\*project_id)\n\nCreate a tag, if it does not exist, and add it to a project.\n\n- *tags (list):* One or more tag names.\n\n- *project_id:* (Defaults to current project ID) The project\n identifier.\n\n### tag_get_id(tag_name, \\*project_id)\n\nGet the tag ID using the tag string name.\n\n- *tag_name (string):* The tag name.\n\n- *project_id:* (Defaults to current project id) The project ID.\n\n### tags_remove(tag_name, project_id=None)\n\nRemove a tag from a project.\n\n- *tag_name (string):* The tag name.\n\n- *project_id:* (Defaults to current project id) The project ID.\n\n## Executions\n\nSee these code example files:\n\n- [`start_run_and_check_status.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/start_run_and_check_status.py)\n\n- [`export_runs.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/export_runs.py)\n\n### runs_list()\n\nList the executions on the selected project.\n\n### runs_start(command, isDirect, commitId, title, tier, publishApiEndpoint)\n\nStart a new execution on the selected project.\n\n- *command:* The command to execute as an array of strings where\n members of the array represent arguments of the command. For\n example: `[\"main.py\", \"hi mom\"]`\n\n- *isDirect:* (Optional) Whether this command should be passed\n directly to a shell.\n\n- *commitId:* (Optional) The `commitId` to launch from. If not
If not\n provided, the project launches from the latest commit.\n\n- *title:* (Optional) A title for the execution.\n\n- *tier:* (Optional) The hardware tier to use for the execution. This\n is the human-readable name of the hardware tier, such as \"Free\",\n \"Small\", or \"Medium\". If not provided, the project\u2019s default tier is\n used.\n\n- *publishApiEndpoint:* (Optional) Whether to publish an API endpoint\n from the resulting output.\n\n### runs_start_blocking(command, isDirect, commitId, title, tier, publishApiEndpoint, poll_freq=5, max_poll_time=6000)\n\nStart a new execution on the selected project and make a blocking\nrequest that waits until the job is finished.\n\n- *command:* The command to execute as an array of strings where\n members of the array represent arguments of the command. For\n example: `[\"main.py\", \"hi mom\"]`\n\n- *isDirect:* (Optional) Whether this command should be passed\n directly to a shell.\n\n- *commitId:* (Optional) The `commitId` to launch from. If not\n provided, the project launches from the latest commit.\n\n- *title:* (Optional) A title for the execution.\n\n- *tier:* (Optional) The hardware tier to use for the execution. If not\n provided, the project\u2019s default tier is used.\n\n- *publishApiEndpoint:* (Optional) Whether to publish an API endpoint\n from the resulting output.\n\n- *poll_freq:* (Optional) Number of seconds between polling of the\n Domino server for status of the task that is running.\n\n- *max_poll_time:* (Optional) Maximum number of seconds to wait for\n a task to complete. If this threshold is exceeded, an exception is\n raised.\n\n- *retry_count:* (Optional) Maximum number of polling retries (in\n case of transient HTTP errors). 
If this threshold is exceeded, an\n exception is raised.\n\n### run_stop(runId, saveChanges=True):\n\nStop an existing execution in the selected project.\n\n- *runId:* String that identifies the execution.\n\n- *saveChanges:* (Defaults to True) If false, execution results are\n discarded.\n\n### runs_stdout(runId)\n\nGet `stdout` emitted by a particular execution.\n\n- *runId:* string that identifies the execution\n\n## Files and blobs\n\nSee these code example files:\n\n- [`upload_file.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/upload_file.py)\n\n- [`upload_and_run_file_and_download_results.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/upload_and_run_file_and_download_results.py)\n\n### files_list(commitId, path)\n\nList the files in a folder in the Domino project.\n\n- *commitId:* The `commitId` to list files from.\n\n- *path:* (Defaults to \"/\") The path to list from.\n\n### files_upload(path, file)\n\nUpload a Python file object into the specified path inside the project.\nSee `examples/upload_file.py` for an example. All parameters are\nrequired.\n\n- *path:* The path to save the file to. For example, `/README.md`\n writes to the root directory of the project while\n `/data/numbers.csv` saves the file to a sub folder named `data`. If\n the specified folder does not yet exist, it is created.\n\n- *file:* A Python file object. For example:\n `f = open(\"authors.txt\",\"rb\")`\n\n### blobs_get(key)\n\n*_Deprecated_* Retrieve a file from the Domino server by blob key. 
Use blobs_get_v2(path, commit_id, project_id) instead.\n\n- *key:* The key of the file to fetch from the blob server.\n\n### blobs_get_v2(path, commit_id, project_id)\n\nRetrieve a file from the Domino server in a project from its path and commit id.\n\n- *path:* The path to the file in the Domino project.\n- *commit_id:* ID of the commit to retrieve the file from.\n- *project_id:* ID of the project to retrieve the file from.\n\n## Apps\n\n### app_publish(unpublishRunningApps=True, hardwareTierId=None)\n\nPublish an app within a project, or republish an existing app.\n\n- *unpublishRunningApps:* (Defaults to True) Check for an active app\n instance in the current project and unpublish it before\n re/publishing.\n\n- *hardwareTierId:* (Optional) Launch the app on the specified\n hardware tier.\n\n### app_unpublish()\n\nStop the running app in the project.\n\n## Jobs\n\n### job_start(command, commit_id=None, hardware_tier_name=None, environment_id=None, on_demand_spark_cluster_properties=None, compute_cluster_properties=None, external_volume_mounts=None, title=None):\n\nStart a new job (execution) in the project.\n\n- *command (string):* Command to execute in Job. For example:\n `domino.job_start(command=\"main.py arg1 arg2\")`\n\n- *commit_id (string):* (Optional) The `commitId` to launch from. If\n not provided, the job launches from the latest commit.\n\n- *hardware_tier_name (string):* (Optional) The hardware tier NAME\n to launch job in. If not provided, the project\u2019s default tier is\n used.\n\n- *environment_id (string):* (Optional) The environment ID with which\n to launch the job. If not provided, the project\u2019s default\n environment is used.\n\n- *on_demand_spark_cluster_properties (dict):* (Optional) On\n demand spark cluster properties. 
The following properties can be\n provided in the Spark cluster:\n\n {\n \"computeEnvironmentId\": \"<Environment ID configured with spark>\"\n \"executorCount\": \"<Number of Executors in cluster>\"\n (optional defaults to 1)\n \"executorHardwareTierId\": \"<Hardware tier ID for Spark Executors>\"\n (optional defaults to last used historically if available)\n \"masterHardwareTierId\": \"<Hardware tier ID for Spark master>\"\n (optional defaults to last used historically if available)\n \"executorStorageMB\": \"<Executor's storage in MB>\"\n (optional defaults to 0; 1GB is 1000MB Here)\n }\n\n- *compute_cluster_properties (dict):* (Optional) The\n compute-cluster properties definition contains parameters for\n launching any Domino supported compute cluster for a job. Use this\n to launch a job that uses a compute-cluster instead of the\n deprecated `on_demand_spark_cluster_properties` field. If\n `on_demand_spark_cluster_properties` and\n `compute_cluster_properties` are both present,\n `on_demand_spark_cluster_properties` is ignored.\n `compute_cluster_properties` contains the following fields:\n\n {\n \"clusterType\": <string, one of \"Ray\", \"Spark\", \"Dask\", \"MPI\">,\n \"computeEnvironmentId\": <string, The environment ID for the cluster's nodes>,\n \"computeEnvironmentRevisionSpec\": <one of \"ActiveRevision\", \"LatestRevision\",\n {\"revisionId\":\"<environment_revision_id>\"} (optional)>,\n \"masterHardwareTierId\": <string, the Hardware tier ID for the cluster's master node (required unless clusterType is MPI)>,\n \"workerCount\": <number, the total workers to spawn for the cluster>,\n \"workerHardwareTierId\": <string, The Hardware tier ID for the cluster workers>,\n \"workerStorage\": <{ \"value\": <number>, \"unit\": <one of \"GiB\", \"MB\"> },\n The disk storage size for the cluster's worker nodes (optional)>\n \"maxWorkerCount\": <number, The max number of workers allowed. 
When\n this configuration exists, autoscaling is enabled for the cluster and\n \"workerCount\" is interpreted as the min number of workers allowed in the cluster\n (optional)>\n }\n\n- *external_volume_mounts (List\\[string\\]):* (Optional) External\n volume mount IDs to mount to the execution. If not provided, the job\n launches with no external volumes mounted.\n\n- *title (string):* (Optional) Title for the job.\n\n### job_stop(job_id, commit_results=True):\n\nStop the Job (execution) in the project.\n\n- *job_id (string):* Job identifier.\n\n- *commit_results (boolean):* (Defaults to `true`) If `false`, the\n job results are not committed.\n\n### job_status(job_id):\n\nGet the status of a job.\n\n- *job_id (string):* Job identifier.\n\n### job_start_blocking(poll_freq=5, max_poll_time=6000, \\*\\*kwargs):\n\nStart a job and poll until the job is finished. Additionally, this\nmethod supports all the parameters in the `job_start` method.\n\n- *poll_freq:* Poll frequency interval in seconds.\n\n- *max_poll_time:* Max poll time in seconds.\n\n## Datasets\n\nA Domino dataset is a collection of files that are available in user\nexecutions as a filesystem directory. A dataset always reflects the most\nrecent version of the data. 
You can modify the contents of a dataset\nthrough the Domino UI or through workload executions.\n\nSee [Domino\nDatasets](https://docs.dominodatalab.com/en/latest/user_guide/0a8d11/datasets-overview/)\nfor more details, and\n[`example_dataset.py`](https://github.com/dominodatalab/python-domino/blob/release-1.3.1/examples/example_dataset.py)\nfor example code.\n\n### datasets_list(project_id=None)\n\nProvide a JSON list of all the available datasets.\n\n- *project_id (string):* (Defaults to None) The project identifier.\n Each project can hold up to 5 datasets.\n\n### datasets_ids(project_id)\n\nList the IDs of the datasets for a particular project.\n\n- *project_id:* The project identifier.\n\n### datasets_names(project_id)\n\nList the names of the datasets for a particular project.\n\n- *project_id:* The project identifier.\n\n### datasets_details(dataset_id)\n\nProvide details about a dataset.\n\n- *dataset_id:* The dataset identifier.\n\n### datasets_create(dataset_name, dataset_description)\n\nCreate a new dataset.\n\n- *dataset_name:* Name of the new dataset. NOTE: The name must be\n unique.\n\n- *dataset_description:* Description of the dataset.\n\n### datasets_update_details(dataset_id, dataset_name=None, dataset_description=None)\n\nUpdate a dataset\u2019s name or description.\n\n- *dataset_id:* The dataset identifier.\n\n- *dataset_name:* (Optional) New name of the dataset.\n\n- *dataset_description:* (Optional) New description of the dataset.\n\n### datasets_remove(dataset_ids)\n\nDelete a set of datasets.\n\n- *dataset_ids (list\\[string\\]):* List of IDs of the datasets to\n delete. NOTE: Datasets are first marked for deletion, then deleted\n after a grace period (15 minutes, configurable). 
A Domino admin may\n also need to complete this process before the name can be reused.\n\n### datasets_upload_files(dataset_id, local_path_to_file_or_directory, file_upload_setting, max_workers, target_chunk_size, target_relative_path)\n\nUploads a file or entire directory to a dataset.\n\n- *dataset_id:* The dataset identifier.\n- *local_path_to_file_or_directory:* The path to the file or directory in local machine.\n- *file_upload_setting:* The setting to resolve naming conflict, must be one of `Overwrite`, `Rename`, `Ignore` (default).\n- *max_workers:* The max amount of threads (default: 10).\n- *target_chunk_size:* The max chunk size for multipart upload (default: 8MB).\n- *target_relative_path:* The path on the dataset to upload the file or directory to. Note that the path must exist or the upload will fail.\n \n\n# Airflow\n\nThe `python-domino` client comes bundled with an\n[Operator](https://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html)\nfor use with [Apache Airflow](https://airflow.apache.org/) as an extra.\n\nWhen installing the client from PyPI, add the `airflow` flag to extras:\n\n pip install \"dominodatalab[airflow]\"\n\nSimilarly, when installing the client from GitHub, use the following\ncommand:\n\n pip install -e git+https://github.com/dominodatalab/python-domino.git@1.0.6#egg=\"dominodatalab[airflow]\"\n\nSee also\n[example_airflow_dag.py](https://github.com/dominodatalab/python-domino/blob/release-1.2.1/examples/example_airflow_dag.py)\nfor example code.\n\n## DominoOperator\n\n from domino.airflow import DominoOperator\n\nAllows a user to schedule Domino executions via Airflow. 
Follows the\nsame function signature as `domino.runs_start` with two extra arguments:\n\n- `startup_delay: Optional[int] = 10` | Add a startup delay to your job, useful if you want to delay execution until after other work finishes.\n- `include_setup_log: Optional[bool] = True` | Determine whether or not to publish the setup log of the job as the log prefix before `stdout`.\n\n## DominoSparkOperator\n\n from domino.airflow import DominoSparkOperator\n\nAllows a user to schedule Domino executions via the v4 API, which\nsupports `onDemandSparkClusters`. Follows the same function signature as\n`domino.job_start`, with the addition of `startup_delay` from above.\n\n# Example\n\n from domino import Domino\n\n # By and large your commands will run against a single project,\n # so you must specify the full project name\n domino = Domino(\"chris/canon\")\n\n # List all runs in the project, most-recently queued first\n all_runs = domino.runs_list()['data']\n\n latest_100_runs = all_runs[0:100]\n\n print(latest_100_runs)\n\n # all runs have a commitId (the snapshot of the project when the\n # run starts) and, if the run completed, an \"outputCommitId\"\n # (the snapshot of the project after the run completed)\n most_recent_run = all_runs[0]\n\n commitId = most_recent_run['outputCommitId']\n\n # list all the files in the output commit ID -- only showing the\n # entries under the results directory. If path is not provided,\n # all files in the project are listed. Or you can say path=\"/\" to\n # list all files\n files = domino.files_list(commitId, path='results/')['data']\n\n for file in files:\n     print(file['path'], '->', file['url'])\n\n print(files)\n\n # Get the content (i.e. 
blob) for the file you're interested in.\n # blobs_get returns a connection rather than the content, because\n # the content can get quite large and it's up to you how you want\n # to handle it\n print(domino.blobs_get(files[0]['key']).read())\n\n # Start a run of file main.py using the latest copy of that file\n domino.runs_start([\"main.py\", \"arg1\", \"arg2\"])\n\n # Start a \"direct\" command\n domino.runs_start([\"echo 'Hello, World!'\"], isDirect=True)\n\n # Start a run of a specific commit\n domino.runs_start([\"main.py\"], commitId=\"aabbccddee\")\n\n# Manual installation\n\nBecause `python-domino` ships with the DSE, normally you do not need to install it. \nThis section provides instructions for installing it in another environment or updating it to a newer version.\n\nStarting from version `1.0.6`, `python-domino` is available on PyPI as `dominodatalab`:\n\n pip install dominodatalab\n\nIf you are adding install instructions for `python-domino` to your [Domino Environment](https://support.dominodatalab.com/hc/en-us/articles/115000392643-Compute-Environment-Management) Dockerfile Instructions field, you must add `RUN` to the beginning:\n\n RUN pip install dominodatalab\n\nTo install a specific version of the library from PyPI, such as `1.0.6`:\n\n pip install dominodatalab==1.0.6\n\nTo install a specific version of the library from GitHub, such as\n`1.0.6`:\n\n pip install https://github.com/dominodatalab/python-domino/archive/1.0.6.zip\n\n# License\n\nThis library is made available under the Apache 2.0 License. This is an\nopen-source project of [Domino Data Lab](https://www.dominodatalab.com).\n",
"bugtrack_url": null,
"license": "Apache Software License (Apache 2.0)",
"summary": "Python bindings for the Domino API",
"version": "1.4.3",
"project_urls": {
"Download": "https://github.com/dominodatalab/python-domino/archive/release-1.4.3.zip",
"Homepage": "https://github.com/dominodatalab/python-domino"
},
"split_keywords": [
"domino data lab",
" api"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "102dacb16de21787678898427f064b9e647b50c7470ed6dcb1c8226e855e32ad",
"md5": "01b7dfefbd6e4e9a079858c92384a47b",
"sha256": "162406c365d626f2b5415e5d64f17b5825a644667081680aef5bbaf269519135"
},
"downloads": -1,
"filename": "dominodatalab-1.4.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "01b7dfefbd6e4e9a079858c92384a47b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9.0",
"size": 129249,
"upload_time": "2024-10-09T18:59:01",
"upload_time_iso_8601": "2024-10-09T18:59:01.102877Z",
"url": "https://files.pythonhosted.org/packages/10/2d/acb16de21787678898427f064b9e647b50c7470ed6dcb1c8226e855e32ad/dominodatalab-1.4.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f002235fdeb7f5674e8bd454d7d133377463b90fbfe6b7302ee4654f42c4945d",
"md5": "c752b21c49ca4cbcee8b9eec4741be58",
"sha256": "370ef3f6c66f386638f38bdcc49d2ff4b36e00627ff8bd4086b26f3e6caee2ba"
},
"downloads": -1,
"filename": "dominodatalab-1.4.3.tar.gz",
"has_sig": false,
"md5_digest": "c752b21c49ca4cbcee8b9eec4741be58",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9.0",
"size": 106135,
"upload_time": "2024-10-09T18:59:02",
"upload_time_iso_8601": "2024-10-09T18:59:02.490114Z",
"url": "https://files.pythonhosted.org/packages/f0/02/235fdeb7f5674e8bd454d7d133377463b90fbfe6b7302ee4654f42c4945d/dominodatalab-1.4.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-09 18:59:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dominodatalab",
"github_project": "python-domino",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"tox": true,
"lcname": "dominodatalab"
}