datarobot-drum


- Name: datarobot-drum
- Version: 1.10.20
- Home page: http://datarobot.com
- Summary: DRUM - develop, test and deploy custom models
- Upload time: 2024-02-06 16:46:03
- Author: DataRobot
- Requires Python: >=3.4,<3.12
- License: Apache License, Version 2.0

## About
The DataRobot Model Runner (DRUM) is a tool that allows you to work locally with Python, R, and Java custom models.
It can be used to verify that a custom model can run and make predictions before it is uploaded to DataRobot.
However, this testing is only for development purposes. DataRobot recommends that any model you wish to deploy should also be tested in the Custom Model Workshop.

DRUM can also:
- run performance and memory usage testing for models.
- perform model validation tests, e.g., checking model functionality on corner cases, such as null value imputation.
- run models in a Docker container.

***DataRobot DRUM is only tested to support Linux/macOS operating systems. To run DRUM on Windows 10, use WSL (Windows Subsystem for Linux).***

## Communication
- open an issue in the [DRUM GitHub repository](https://github.com/datarobot/datarobot-user-models/issues).

## Custom inference models quickstart guide
View examples [here](https://github.com/datarobot/datarobot-user-models#quickstart).

## Custom tasks
View examples [here](https://github.com/datarobot/datarobot-user-models#training_model_folder).

## Installation

### Prerequisites:
All models:
- Install the dependencies needed to run your code.
- If you are using a drop-in environment found in this repo, you must pip install these dependencies.

Python models:
- Check https://pypi.org/project/datarobot-drum/ for supported Python versions.

Java models:
- JRE >= 11.

R models:
- Python >= 3.6.
- The R framework must be installed.
- DRUM uses the `rpy2` package (by default the latest version is installed) to run R.
You may need to adjust the **rpy2** and **pandas** versions for compatibility.

To install DRUM with Python/Java models support:  
```pip install datarobot-drum```

To install DRUM with R support:  
```pip install datarobot-drum[R]```

### Autocompletion
DRUM supports autocompletion based on the `argcomplete` package. Additional configuration is required to use it:
- run `activate-global-python-argcomplete --user`; this should create the file `~/.bash_completion.d/python-argcomplete`,
- source the created file by adding `source ~/.bash_completion.d/python-argcomplete` to your `~/.bashrc` or another profile-related file, according to your system.

If global completion is not completing your script, bash may have registered a default completion function:
- run `complete | grep drum`; if the output contains `complete -F _minimal <some_line_containing_drum>`,
- run `complete -r <some_line_containing_drum>`.

For more information and troubleshooting visit the [argcomplete](https://pypi.org/project/argcomplete/) information page.


## Usage
Help:  
```drum --help```

### Operations
- [score](#score)
- [fit](#fit)
- [perf-test](#perf)
- [validation](#validation)
- [server](#server)
- [new](#new)


### Code Directory *--code-dir*
The *--code-dir* (code directory) argument is required in all commands and should point to a folder which contains your model artifacts and any other code needed for DRUM to run your model. For example, if you're running DRUM from **testdir** with a test input file at the root and your model in a subdirectory called **model**, you would enter:

`drum score --code-dir ./model/ --input ./testfile.csv`

#### Additional model code dependencies
The code dir may contain a `requirements.txt` file listing the dependency packages required by the code. This is supported for Python and R models only.

**Format of the requirements.txt file**
* for Python: pip requirements file format
* for R: one package per line

DRUM attempts to install the dependencies only when running with the [`--docker`](#docker) option.

### Model template generation
<a name="new"></a>
DRUM can generate a code folder template containing a `custom` file.  
`drum new model --code-dir ~/user_code_dir/ --language r`  
This command creates a folder with a `custom.py/R` file and a short description: `README.md`.

### Batch scoring mode
<a name="score"></a>
#### Run a binary classification custom model
Make batch predictions with a binary classification model. Optionally, specify an output file. Otherwise, predictions are returned to the command line:  
```drum score --code-dir ~/user_code_dir/ --input 10k.csv --target-type binary --positive-class-label yes --negative-class-label no --output 10k-results.csv --verbose```

#### Run a regression custom model
Make batch predictions with a regression model:  
```drum score --code-dir ~/user_code_dir/ --input fast-iron.csv --target-type regression --verbose```
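
When scoring is driven from a script, it can help to assemble the command as an argument list instead of a shell string. The following is a minimal sketch that uses only the flags shown in the two examples above; the helper name `build_score_command` and its parameters are hypothetical and not part of DRUM.

```python
import shlex
from typing import List, Optional


def build_score_command(
    code_dir: str,
    input_path: str,
    target_type: str,
    positive_label: Optional[str] = None,
    negative_label: Optional[str] = None,
    output_path: Optional[str] = None,
    verbose: bool = False,
) -> List[str]:
    """Assemble a `drum score` argument list (hypothetical helper)."""
    cmd = [
        "drum", "score",
        "--code-dir", code_dir,
        "--input", input_path,
        "--target-type", target_type,
    ]
    # Class labels appear only in the binary classification example.
    if positive_label is not None:
        cmd += ["--positive-class-label", positive_label]
    if negative_label is not None:
        cmd += ["--negative-class-label", negative_label]
    # Without --output, predictions are returned to the command line.
    if output_path is not None:
        cmd += ["--output", output_path]
    if verbose:
        cmd.append("--verbose")
    return cmd


regression_cmd = build_score_command(
    "~/user_code_dir/", "fast-iron.csv", "regression", verbose=True
)
# Print the command in a copy-pasteable form.
print(shlex.join(regression_cmd))
```

With DRUM installed, the list could be passed to `subprocess.run(cmd, check=True)`. Note that `~` is not expanded without a shell, so `os.path.expanduser` may be needed for paths like the one above.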

### Testing model performance
<a name="perf"></a>
You can test how the model performs and get its latency times and memory usage.  
In this mode, the model is started with a prediction server. Different request combinations are submitted to it.
After it completes, it returns a report.  
```drum perf-test --code-dir ~/user_code_dir/ --input 10k.csv --target-type binary --positive-class-label yes --negative-class-label no```  
Report example:
```
samples   iters    min     avg     max    used (MB)   total (MB)
============================================================================
Test case         1     100   0.028   0.030   0.054     306.934    31442.840
Test case        10     100   0.030   0.034   0.069     307.375    31442.840
Test case       100      10   0.036   0.038   0.045     307.512    31442.840
Test case      1000      10   0.042   0.047   0.058     308.258    31442.840
Test case    100000       1   0.674   0.674   0.674     330.902    31442.840
50MB file    838861       1   5.206   5.206   5.206     453.121    31442.840
```
For more feature options, see:
```drum perf-test --help```
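
If the report needs to be compared across runs, it can be turned into structured rows. This is a minimal sketch that assumes the column layout of the sample report above (a free-text case label followed by seven numeric columns); the actual layout may differ between DRUM versions, and `parse_perf_report` is a hypothetical helper, not part of DRUM.

```python
from typing import Dict, List, Union

# Two rows copied from the sample report above.
SAMPLE_REPORT = """\
samples   iters    min     avg     max    used (MB)   total (MB)
============================================================================
Test case         1     100   0.028   0.030   0.054     306.934    31442.840
50MB file    838861       1   5.206   5.206   5.206     453.121    31442.840
"""


def parse_perf_report(text: str) -> List[Dict[str, Union[str, int, float]]]:
    """Parse data rows of a perf-test report; header and separator are skipped."""
    rows = []
    for line in text.splitlines():
        # The label may contain spaces, so split the 7 numeric columns from the right.
        parts = line.rsplit(None, 7)
        if len(parts) != 8:
            continue  # separator line or anything too short to be a data row
        label, samples, iters, t_min, t_avg, t_max, used, total = parts
        try:
            rows.append({
                "case": label.strip(),
                "samples": int(samples),
                "iters": int(iters),
                "min": float(t_min),
                "avg": float(t_avg),
                "max": float(t_max),
                "used_mb": float(used),
                "total_mb": float(total),
            })
        except ValueError:
            continue  # header line: its columns are not numeric
    return rows


for row in parse_perf_report(SAMPLE_REPORT):
    print(row["case"], row["samples"], row["avg"], row["used_mb"])
```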

### Model validation checks
<a name="validation"></a>
You can validate the model against a set of checks.
Running these checks is highly recommended, as they are performed in DataRobot before the model can be deployed.

List of checks:
- null value imputation: each feature of the provided dataset is set to missing and fed to the model.

To run:
```drum validation --code-dir ~/user_code_dir/ --input 10k.csv --target-type binary --positive-class-label yes --negative-class-label no```
Sample report:
```
Validation check results
Test case         Status
==============================
Null value imputation   PASSED
```
If a check fails, more information is provided.
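
To gate a script on the validation result, the status column can be read programmatically. The sketch below assumes the layout of the sample report above (check name followed by a status, after a line of `=` characters) and does not handle the extra output printed on failure, whose format is not shown here. `parse_validation_report` is a hypothetical helper.

```python
from typing import Dict

# Copied from the sample report above.
SAMPLE_VALIDATION_REPORT = """\
Validation check results
Test case         Status
==============================
Null value imputation   PASSED
"""


def parse_validation_report(text: str) -> Dict[str, str]:
    """Map each check name to its status, reading rows after the '=' separator."""
    results = {}
    seen_separator = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) == {"="}:
            seen_separator = True
            continue
        if not seen_separator or not stripped:
            continue
        # The check name may contain spaces; the status is the last token.
        parts = stripped.rsplit(None, 1)
        if len(parts) == 2:
            results[parts[0]] = parts[1]
    return results


statuses = parse_validation_report(SAMPLE_VALIDATION_REPORT)
not_passed = [name for name, status in statuses.items() if status != "PASSED"]
print(statuses, not_passed)
```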

### Runtime Parameters
Runtime parameters are created by the user via the custom model's version create routes (e.g.,
the DataRobot web UI, the DataRobot client, etc.). These runtime parameters can then be loaded into the
custom model using the `RuntimeParameters` class, as follows:

```
from datarobot_drum import RuntimeParameters

def load_model(code_dir):
    url = RuntimeParameters.get("URL_PARAM_1")
    aws_credential = RuntimeParameters.get("AWS_CREDENTIAL_PARAM_1")
    ...
```

During testing and debugging in a local development environment, the user can write the runtime
parameter values into a YAML file and provide it as an input to the `drum` utility. The YAML file
can have any name ending with .yaml and should follow the example layout below:

```
URL_PARAM_1: http://any-desired-location/
AWS_CREDENTIAL_PARAM_1:
    # See the REST API documentation for details on all supported credential types:
    #     https://docs.datarobot.com/en/docs/api/reference/public-api/credentials.html#properties_3
    credentialType: s3
    awsAccessKeyId: ABDEFGHIJK...
    awsSecretAccessKey: asdjDFSDJafslkjsdDLKGDSDlkjlkj...
    awsSessionToken: null
```

For credential type parameters, the value matches the Credentials REST API payload.
For a complete example, see the following [model template with runtime parameters](https://github.com/datarobot/datarobot-user-models/blob/master/model_templates/python3_sklearn_runtime_params/README.md).

To use the file when running the `drum` utility:

```
drum score --runtime-params-file <filepath> --code-dir ~/user_code_dir/ --target-type <target type> --input dataset.csv
```
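
For local testing, the parameters file can also be generated from a script. The following is a minimal sketch that writes the two-level layout shown in the example above using plain string formatting. It assumes simple values that need no YAML quoting or escaping; for anything else a YAML library is the safer choice. `render_runtime_params` is a hypothetical helper, and the parameter names are the ones read by the `load_model` example.

```python
def render_runtime_params(params: dict) -> str:
    """Render flat or one-level-nested parameters in the layout shown above."""

    def scalar(value) -> str:
        # None is written as YAML null; everything else is written as-is.
        return "null" if value is None else str(value)

    lines = []
    for name, value in params.items():
        if isinstance(value, dict):
            # Credential-type parameter: nested key/value pairs.
            lines.append(f"{name}:")
            for key, item in value.items():
                lines.append(f"    {key}: {scalar(item)}")
        else:
            lines.append(f"{name}: {scalar(value)}")
    return "\n".join(lines) + "\n"


example = render_runtime_params({
    "URL_PARAM_1": "http://any-desired-location/",
    "AWS_CREDENTIAL_PARAM_1": {
        "credentialType": "s3",
        "awsAccessKeyId": "<your-access-key-id>",          # placeholder
        "awsSecretAccessKey": "<your-secret-access-key>",  # placeholder
        "awsSessionToken": None,
    },
})
print(example)
```

The resulting text can be saved under any name ending in `.yaml` and passed with `--runtime-params-file`.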

### Prediction server mode
<a name="server"></a>

DRUM can also run as a prediction server. To do so, provide a server address argument:  
```drum server --code-dir ~/user_code_dir --target-type regression --address localhost:6789```

The DRUM prediction server provides the following routes. You may provide the environment variable `URL_PREFIX`. Note that URLs must end with `/`.  
For the complete API specification in OpenAPI 3.0 format, see [drum_server_api.yaml](https://raw.githubusercontent.com/datarobot/datarobot-user-models/master/custom_model_runner/drum_server_api.yaml); you can also open it rendered in the [Swagger Editor](https://editor.swagger.io/?url=https://raw.githubusercontent.com/datarobot/datarobot-user-models/master/custom_model_runner/drum_server_api.yaml).

* Status routes:   
The GET routes **URL_PREFIX/** and **URL_PREFIX/ping/** show the server status, that is, whether the server is alive.  
Example: GET http://localhost:6789/  
Response:
```json
   {"message": "OK"}
```

* Health route:  
The GET route **URL_PREFIX/health/** shows functional health, e.g., whether the model is loaded and functioning properly.  
Example: GET http://localhost:6789/health/  
Response:
  * Success:
    ```json
    {"message": "OK"}
    ```
  * Error:
    ```json
    {
      "message": "ERROR: \n\nRunning environment language: Python.\n Failed loading hooks from [/tmp/model/python3_sklearn/custom.py] : No module named 'andas'"
    }
    ```
  
* Info route:  
The GET route **URL_PREFIX/info/** shows information about the running model (metadata, paths, predictor type, etc.).  
Example: GET http://localhost:6789/info/  
Response:
  ```json
  {
     "codeDir": "/tmp/model/python3_sklearn",
     "drumServer": "flask",
     "drumVersion": "1.5.3",
     "language": "python",
     "modelMetadata": {
       "environmentID": "5e8c889607389fe0f466c72d",
       "inferenceModel": {
         "targetName": "Grade 2014"
       },
       "modelID": "5f1f15a4d6111f01cb7f91fd",
       "name": "regression model",
       "targetType": "regression",
       "type": "inference",
       "validation": {
         "input": "../../../tests/testdata/juniors_3_year_stats_regression.csv"
       }
     },
     "predictor": "scikit-learn",
     "targetType": "regression"
  }
  ```

* Statistics route:  
The GET route **URL_PREFIX/stats/** shows running model statistics (memory).  
Example: GET http://localhost:6789/stats/  
`mem_info::drum_rss` represents the sum of the `drum_info::mem` values.  
Response:
    ```json
  {
      "drum_info": [{
          "cmdline": [
              "/tmp/drum_tests_virtual_environment/bin/python3",
              "/tmp/drum_tests_virtual_environment/bin/drum",
              "server",
              "--code-dir",
              "/tmp/model/python3_sklearn",
              "--target-type",
              "regression",
              "--address",
              "localhost:6789",
              "--with-error-server",
              "--show-perf"
          ],
          "mem": 256.71484375,
          "pid": 342391
      }],
      "mem_info": {
          "avail": 17670.828125,
          "container_limit": null,
          "container_max_used": null,
          "container_used": null,
          "drum_rss": 256.71484375,
          "free": 312.33203125,
          "nginx_rss": 0,
          "total": 31442.73046875
      },
      "time_info": {
        "run_predictor_total": {
          "avg": 0.0165,
          "max": 0.023,
          "min": 0.013
        }
      }
  }
    ```  

* Capabilities route:  
The GET route **URL_PREFIX/capabilities/** shows the payload formats supported by the running model.  
Example: GET http://localhost:6789/capabilities/

* Structured predictions routes:   
The POST routes **URL_PREFIX/predict/** and **URL_PREFIX/predictions/** return predictions on data.  
Example: POST http://localhost:6789/predict/; POST http://localhost:6789/predictions/  
For these routes data can be posted in two ways:
  * as a form data parameter with a <key:value> pair, where:  
key = `X`  
value = name of a file in `csv/arrow/mtx` format that contains the inference data.
  * as binary data; in case of `arrow` or `mtx` formats, mimetype `application/x-apache-arrow-stream` or `text/mtx` must be set.
   
* Structured transform route (for Python predictor only):   
The POST route **URL_PREFIX/transform/** returns transformed data.  
Example: POST http://localhost:6789/transform/  
For this route data can be posted in two ways:
  * as a form data parameter with a <key:value> pair, where:  
key = `X`  
value = name of a file in `csv/arrow/mtx` format that contains the inference data.

    Optionally, a second key, `y`, can be passed with value = a second filename containing target data.

    If `y` is passed, the route returns both `X.transformed` and `y.transformed` keys, along with `out.format`,
    which indicates the format of the transformed X output and takes a value of `csv`,
    `sparse`, or `arrow`. `y.transformed` is never sparse.

    An `arrow_version` key may also be passed if you want to use the `arrow` format for `X.transformed` or `y.transformed`.
    It is used to ensure that the endpoint returns data that can be opened by the caller's version of arrow. Without this
    key, all dense data returned defaults to csv format.

  * as binary data; in case of `arrow` or `mtx` formats, mimetype `application/x-apache-arrow-stream` or `text/mtx` must be set.
  
  
* Unstructured predictions routes:  
The POST routes **URL_PREFIX/predictUnstructured/** and **URL_PREFIX/predictionsUnstructured/** return predictions on data.  
Example: POST http://localhost:6789/predictUnstructured/; POST http://localhost:6789/predictionsUnstructured/  
For these routes data is posted as binary data. Provide the mimetype and charset so that the data is handled properly.
For more detailed information, see [here](https://github.com/datarobot/datarobot-user-models#unstructured_inference_models).  
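
The routes above can be exercised from a small client. The following is a minimal sketch using only the Python standard library: it builds route URLs with the required trailing `/`, prepares a binary POST for the structured predictions route, and checks the relationship between `drum_rss` and the `drum_info` entries in a stats payload. All function names are hypothetical, `base_url` stands for wherever your server is reachable, and the `text/csv` default mimetype is an assumption (the text above names only the `arrow` and `mtx` mimetypes).

```python
import json
import urllib.request


def route_url(base_url: str, route: str = "") -> str:
    """Join base URL and route name; the result always ends with '/'."""
    base = base_url.rstrip("/")
    route = route.strip("/")
    return f"{base}/{route}/" if route else f"{base}/"


def get_json(url: str, timeout: float = 5.0) -> dict:
    """GET a route and decode its JSON body (requires a running server)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def build_predict_request(
    base_url: str, payload: bytes, mimetype: str = "text/csv"
) -> urllib.request.Request:
    """Prepare a binary-data POST to the predict route without sending it."""
    return urllib.request.Request(
        route_url(base_url, "predict"),
        data=payload,
        headers={"Content-Type": mimetype},  # text/csv is an assumed default
        method="POST",
    )


def drum_rss_matches(stats: dict, tolerance: float = 1e-6) -> bool:
    """Check that mem_info.drum_rss equals the sum of drum_info[*].mem."""
    total = sum(process["mem"] for process in stats["drum_info"])
    return abs(total - stats["mem_info"]["drum_rss"]) <= tolerance


# Pure demonstrations that need no running server.
print(route_url("http://localhost:6789", "health"))
request = build_predict_request("http://localhost:6789", b"a,b\n1,2\n")
print(request.get_method(), request.full_url)
```

Against a running server, `get_json(route_url(base_url, "info"))` would fetch the info payload, and `urllib.request.urlopen(request)` would send the prepared prediction request.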

#### Starting DRUM as a prediction server in production mode
The DRUM prediction server can be started in *production* mode, which uses nginx and uwsgi as the backend.
This provides better stability and scalability: depending on how many CPUs are available, several workers are started to serve predictions.  
The *--max-workers* parameter can be used to limit the number of workers.  
E.g. ```drum server --code-dir ~/user_code_dir --address localhost:6789 --production --max-workers 2```

> Note: `uwsgi` is an extra dependency for DRUM. Install it using: `pip install datarobot-drum[uwsgi]` or `pip install uwsgi`  
> Note: *Production* mode may not be available on Windows-based systems out of the box, as uwsgi installation requires special handling.
> A Docker container based Linux environment can be used in such cases.

### Fit mode
<a name="fit"></a>
> Note: Running fit inside of DataRobot is currently in alpha. Check back soon for the opportunity
to test out this functionality yourself.

DRUM can run your training model to make sure it can produce a trained model artifact before
adding the training model into DataRobot.

You can try this out on our sklearn classifier model template with this command:

```
drum fit --code-dir task_templates/3_pipelines/python3_sklearn_binary --target-type binary --target Species --input \
tests/testdata/iris_binary_training.csv --output . --positive-class-label Iris-setosa \
--negative-class-label Iris-versicolor
```
> Note: If you don't provide class labels, DataRobot tries to autodetect the labels for you.

You can also use DRUM on regression datasets, and soon you will also be able to provide row weights. Check out the ```drum fit --help``` output for further details.



### Running inside a docker container
<a name="docker"></a>
In every mode, DRUM can be run inside a Docker container by providing the option ```--docker <image_name/directory_path>```.
The container should provide the environment required to perform the desired action.
DRUM must be installed as a part of this environment.  
The following is an example of how to run DRUM inside a container:  
```drum score --code-dir ~/user_code_dir/ --target-type <target type> --input dataset.csv --docker <container_name>```  
```drum perf-test --code-dir ~/user_code_dir/ --target-type <target type> --input dataset.csv --docker <container_name>```

Alternatively, the argument passed through the `--docker` flag may be a directory containing the unbuilt contents
of an image. The DRUM tool will then attempt to build an image using this directory and run your model inside
the newly built image.

If the argument passed to `--docker` is a Docker context directory and the code dir contains a `requirements.txt` dependencies file, DRUM will try to install the packages during the image build.  
To skip dependency installation, use the `--skip-deps-install` flag.

## Drum Push
Starting in version 1.1.4, DRUM includes a new verb called `push`. When you run
`drum push -cd /dirtopush/`, the contents of that directory are submitted as a custom model
to DataRobot. However, for this to work, you must create two types of configuration.
1. **DataRobot client configuration**
`push` relies on correct global configuration of the client to access a DataRobot server.
There are two options for supplying this configuration: through environment variables or through
a config file which is read by the DataRobot client. Both options include an endpoint
and an API token to authenticate the requests.

* Option 1: Environment variables.
    Example:
    ```
    export DATAROBOT_ENDPOINT=https://app.datarobot.com/api/v2
    export DATAROBOT_API_TOKEN=<yourtoken>
    ```
* Option 2: Create this file, which we check for: `~/.config/datarobot/drconfig.yaml`  
    Example:
    ```
    endpoint: https://app.datarobot.com/api/v2
    token: <yourtoken>
    ```
2. **Model Metadata** `push` also relies on a metadata file, which is parsed by DRUM to create
the correct sort of model in DataRobot. This metadata file includes quite a few options. You can
[read about those options](https://github.com/datarobot/datarobot-user-models/blob/master/MODEL-METADATA.md) or [see an example](https://github.com/datarobot/datarobot-user-models/blob/master/model_templates/python3_sklearn/model-metadata.yaml).
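
Before running `drum push`, it can be useful to confirm that one of the two client configuration options from step 1 is in place. The sketch below only reports which options are present. It does not validate the endpoint or token, and it makes no claim about which option the DataRobot client prefers when both exist. `available_config_sources` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

# Names and path taken from the two options described above.
ENV_VARS = ("DATAROBOT_ENDPOINT", "DATAROBOT_API_TOKEN")
CONFIG_PATH = Path.home() / ".config" / "datarobot" / "drconfig.yaml"


def available_config_sources(
    env: Optional[Mapping[str, str]] = None,
    config_path: Path = CONFIG_PATH,
) -> List[str]:
    """Return which client configuration options appear to be set up."""
    env = os.environ if env is None else env
    sources = []
    # Option 1: both environment variables are set to non-empty values.
    if all(env.get(name) for name in ENV_VARS):
        sources.append("environment")
    # Option 2: the config file exists (its contents are not checked here).
    if Path(config_path).is_file():
        sources.append("config file")
    return sources


found = available_config_sources()
print(found if found else "no DataRobot client configuration found")
```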

            

Raw data

            {
    "_id": null,
    "home_page": "http://datarobot.com",
    "name": "datarobot-drum",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.4,<3.12",
    "maintainer_email": "",
    "keywords": "",
    "author": "DataRobot",
    "author_email": "support@datarobot.com",
    "download_url": "",
    "platform": null,
    "description": "## About\nThe DataRobot Model Runner (DRUM) is a tool that allows you to work locally with Python, R, and Java custom models.\nIt can be used to verify that a custom model can run and make predictions before it is uploaded to DataRobot.\nHowever, this testing is only for development purposes. DataRobot recommends that any model you wish to deploy should also be tested in the Custom Model Workshop.\n\nDRUM can also:\n- run performance and memory usage testing for models.\n- perform model validation tests, e.g., checking model functionality on corner cases, like null values imputation.\n- run models in a Docker container.\n\n***DataRobot DRUM is only tested to support Linux/macOS operating systems. To run DRUM in Windows 10 please use WSL (Windows Subsystem for Linux).***\n\n## Communication\n- open an issue in the [DRUM GitHub repository](https://github.com/datarobot/datarobot-user-models/issues).\n\n## Custom inference models quickstart guide\nView examples [here](https://github.com/datarobot/datarobot-user-models#quickstart).\n\n## Custom tasks\nView examples [here](https://github.com/datarobot/datarobot-user-models#training_model_folder).\n\n## Installation\n\n### Prerequisites:\nAll models:\n- Install the dependencies needed to run your code.\n- If you are using a drop-in environment found in this repo, you must pip install these dependencies.\n\nPython models:\n- Check https://pypi.org/project/datarobot-drum/ for supported Python versions.\n\nJava models:\n- JRE >= 11.\n\nR models:\n- Python >= 3.6.\n- The R framework must be installed.\n- DRUM uses the `rpy2` package (by default the latest version is installed) to run R.\nYou may need to adjust the **rpy2** and **pandas** versions for compatibility.\n\nTo install DRUM with Python/Java models support:  \n```pip install datarobot-drum```\n\nTo install DRUM with R support:  \n```pip install datarobot-drum[R]```\n\n### Autocompletion\nDRUM supports autocompletion based on the `argcomplete` 
package. Additional configuration is required to use it:\n- run `activate-global-python-argcomplete --user`; this should create a file: `~/.bash_completion.d/python-argcomplete`,\n- source created file `source ~/.bash_completion.d/python-argcomplete` in your `~/.bashrc` or another profile-related file according to your system.\n\nIf global completion is not completing your script, bash may have registered a default completion function:\n- run `complete | grep drum`; if there is an output `complete -F _minimal <some_line_containing_drum>` do\n- `complete -r <some_line_containing_drum>`\n\nFor more information and troubleshooting visit the [argcomplete](https://pypi.org/project/argcomplete/) information page.\n\n\n## Usage\nHelp:  \n```**drum** -help```\n\n### Operations\n- [score](#score)\n- [fit](#fit)\n- [perf-test](#perf)\n- [validation](#validation)\n- [server](#server)\n- [new](#new)\n\n\n### Code Directory *--code-dir*\nThe *--code-dir* (code directory) argument is required in all commands and should point to a folder which contains your model artifacts and any other code needed for DRUM to run your model. For example, if you're running DRUM from **testdir** with a test input file at the root and your model in a subdirectory called **model**, you would enter:\n\n`drum score --code-dir ./model/ --input ./testfile.csv`\n\n#### Additional model code dependencies\nCode dir may contain a `requirements.txt` file, listing dependency packages which are required by code. Only Python and R models are supported.\n\n**Format of requirements.txt file**\n* for Python: pip requrements file format\n* for R: a package per line\n\nDRUM will attempt to install dependencies only when running with [`--docker`](#docker) option.\n\n### Model template generation\n<a name=\"new\"></a>\nDRUM can help you to generate a code folder template with the `custom` file described above.  
\n`drum new model --code-dir ~/user_code_dir/ --language r`  \nThis command creates a folder with a `custom.py/R` file and a short description: `README.md`.\n\n### Batch scoring mode\n<a name=\"score\"></a>\n#### Run a binary classification custom model\nMake batch predictions with a binary classification model. Optionally, specify an output file. Otherwise, predictions are returned to the command line:  \n```drum score --code-dir ~/user_code_dir/ --input 10k.csv --target-type binary --positive-class-label yes --negative-class-label no --output 10k-results.csv --verbose```\n\n#### Run a regression custom model\nMake batch predictions with a regression model:  \n```drum score --code-dir ~/user_code_dir/ --input fast-iron.csv --target-type regression --verbose```\n\n### Testing model performance\n<a name=\"perf\"></a>\nYou can test how the model performs and get its latency times and memory usage.  \nIn this mode, the model is started with a prediction server. Different request combinations are submitted to it.\nAfter it completes, it returns a report.  
\n```drum perf-test --code-dir ~/user_code_dir/ --input 10k.csv --target-type binary --positive-class-label yes --negative-class-label no```  \nReport example:\n```\nsamples   iters    min     avg     max    used (MB)   total (MB)\n============================================================================\nTest case         1     100   0.028   0.030   0.054     306.934    31442.840\nTest case        10     100   0.030   0.034   0.069     307.375    31442.840\nTest case       100      10   0.036   0.038   0.045     307.512    31442.840\nTest case      1000      10   0.042   0.047   0.058     308.258    31442.840\nTest case    100000       1   0.674   0.674   0.674     330.902    31442.840\n50MB file    838861       1   5.206   5.206   5.206     453.121    31442.840\n```\nFor more feature options, see:\n```drum perf-test --help```\n\n### Model validation checks\n<a name=\"validation\"></a>\nYou can validate the model on a set of various checks.\nIt is highly recommended running these checks, as they are performed in DataRobot before the model can be deployed.\n\nList of checks:\n- null values imputation: each feature of the provided dataset is set to missing and fed to the model.\n\nTo run:\n```drum validation --code-dir ~/user_code_dir/ --input 10k.csv --target-type binary --positive-class-label yes --negative-class-label no```\nSample report:\n```\nValidation check results\nTest case         Status\n==============================\nNull value imputation   PASSED\n```\nIn case of check failure more information will be provided.\n\n### Runtime Parameters\nRuntime parameters are created by the user via the custom model's version create routes (.e.g\nDataRobot WEB UI, DataRobot client, etc.). 
These runtime parameters can then be loaded into the\ncustom model using the `RuntimeParameters` class, as follows:\n\n```\nfrom datarobot_drum import RuntimeParameters\n\ndef load_model(code_dir):\n    url = RuntimeParameters.get(\"URL_PARAM_1\")\n    aws_credential = RuntimeParameters.get(\"AWS_CREDENTIAL_PARAM_1\")\n    ...\n```\n\nDuring testing and debugging in a local development environment, the user can write the runtime\nparameter values into a YAML file and provide it as an input to the `drum` utility. The YAML file\ncan have any name ending with .yaml and should follow the example layout below:\n\n```\nURL_PARAM_1: http://any-desired-location/\nAWS_CRED_PARAM_1:\n    # See the REST API documentation for details on all supported credential types:\n    #     https://docs.datarobot.com/en/docs/api/reference/public-api/credentials.html#properties_3\n    credentialType: s3\n    awsAccessKeyId: ABDEFGHIJK...\n    awsSecretAccessKey: asdjDFSDJafslkjsdDLKGDSDlkjlkj...\n    awsSessionToken: null\n```\n\nFor credential type parameters, the value matches the Credentials REST API payload.\nFor a complete example, see the following [model template with runtime parameters](https://github.com/datarobot/datarobot-user-models/blob/master/model_templates/python3_sklearn_runtime_params/README.md).\n\nAnd here is how you use it when running the drum utility:\n\n```\ndrum score --runtime-params-file <filepath> --code-dir ~/user_code_dir/ --target-type <target type> --input dataset.csv\n```\n\n### Prediction server mode\n<a name=\"server\"></a>\n\nDRUM can also run as a prediction server. To do so, provide a server address argument:  \n```drum server --code-dir ~/user_code_dir --target-type regression --address localhost:6789```\n\nThe DRUM prediction server provides the following routes. You may provide the environment variable URL_PREFIX. Note that URLs must end with /.  
\nFor complete API specification in Openapi 3.0 format check here [drum_server_api.yaml](https://raw.githubusercontent.com/datarobot/datarobot-user-models/master/custom_model_runner/drum_server_api.yaml), you can also open it rendered in the [Swagger Editor](https://editor.swagger.io/?url=https://raw.githubusercontent.com/datarobot/datarobot-user-models/master/custom_model_runner/drum_server_api.yaml).\n\n* Status routes:   \nA GET **URL_PREFIX/** and **URL_PREFIX/ping/** routes, shows server status - if the server is alive.  \nExample: GET http://localhost:6789/  \nResponse:\n```json\n   {\"message\": \"OK\"}\n```\n\n* Health route:  \nA GET **URL_PREFIX/health/** route, shows functional health. E.g. model is loaded and functioning properly.  \nExample: GET http://localhost:6789/health/  \nResponse:\n  * Success:\n    ```json\n    {\"message\": \"OK\"}\n    ```\n  * Error:\n    ```json\n    {\n      \"message\": \"ERROR: \\n\\nRunning environment language: Python.\\n Failed loading hooks from [/tmp/model/python3_sklearn/custom.py] : No module named 'andas'\"\n    }\n    ```\n  \n* Info route:  \nA GET **URL_PREFIX/info/** route, shows information about running model (metadata, paths, predictor type, etc.).  
\nExample: GET http://localhost:6789/info/  \nResponse:\n  ```json\n  {\n     \"codeDir\": \"/tmp/model/python3_sklearn\",\n     \"drumServer\": \"flask\",\n     \"drumVersion\": \"1.5.3\",\n     \"language\": \"python\",\n     \"modelMetadata\": {\n       \"environmentID\": \"5e8c889607389fe0f466c72d\",\n       \"inferenceModel\": {\n         \"targetName\": \"Grade 2014\"\n       },\n       \"modelID\": \"5f1f15a4d6111f01cb7f91fd\",\n       \"name\": \"regression model\",\n       \"targetType\": \"regression\",\n       \"type\": \"inference\",\n       \"validation\": {\n         \"input\": \"../../../tests/testdata/juniors_3_year_stats_regression.csv\"\n       }\n     },\n     \"predictor\": \"scikit-learn\",\n     \"targetType\": \"regression\"\n  }\n        ```  \n\n* Statistics route:  \nA GET **URL_PREFIX/stats/** route, shows running model statistics (memory).  \nExample: GET http://localhost:6789/stats/  \n`mem_info::drum_rss` represent a sum of `drum_info::mem` values.  \nResponse:\n    ```json\n  {\n      \"drum_info\": [{\n          \"cmdline\": [\n              \"/tmp/drum_tests_virtual_environment/bin/python3\",\n              \"/tmp/drum_tests_virtual_environment/bin/drum\",\n              \"server\",\n              \"--code-dir\",\n              \"/tmp/model/python3_sklearn\",\n              \"--target-type\",\n              \"regression\",\n              \"--address\",\n              \"localhost:6789\",\n              \"--with-error-server\",\n              \"--show-perf\"\n          ],\n          \"mem\": 256.71484375,\n          \"pid\": 342391\n      }],\n      \"mem_info\": {\n          \"avail\": 17670.828125,\n          \"container_limit\": null,\n          \"container_max_used\": null,\n          \"container_used\": null,\n          \"drum_rss\": 256.71484375,\n          \"free\": 312.33203125,\n          \"nginx_rss\": 0,\n          \"total\": 31442.73046875\n      },\n      \"time_info\": {\n        \"run_predictor_total\": {\n          
\"avg\": 0.0165,\n          \"max\": 0.023,\n          \"min\": 0.013\n        }\n      }\n  }\n    ```  \n\n* Capabilities route:  \nA GET **URL_PREFIX/capabilities/** route, shows payload formats supported by the running model.  \nExample: GET http://localhost:6789/capabilities/\n\n* Structured predictions routes:   \nA POST **URL_PREFIX/predict/** and **URL_PREFIX/predictions/** routes, which returns predictions on data.  \nExample: POST http://localhost:6789/predict/; POST http://localhost:6789/predictions/  \nFor these routes data can be posted in two ways:\n  * as form data parameter with a <key:value> pair, where:  \nkey = X  \nvalue = filename of the `csv/arrow/mtx` format, that contains the inference data.\n  * as binary data; in case of `arrow` or `mtx` formats, mimetype `application/x-apache-arrow-stream` or `text/mtx` must be set.\n   \n* Structured transform route (for Python predictor only):   \nA POST **URL_PREFIX/transform/** route, which returns transformed data.  \nExample: POST http://localhost:6789/transform/;  \nFor this route data can be posted in two ways:\n  * as form data parameter with a <key:value> pair, where:  \nkey = `X`.  \nvalue = filename of the `csv/arrow/mtx` format, that contains the inference data.\n \n    optionally a second key, `y`, can be passed with value = a second filename containing target data. \n    \n    if `y` is passed, the route will return both `X.transformed` and `y.transformed` keys, along with `out.format`\n     indicating the format of the transformed X output. This will take a value of `csv`, \n    `sparse` or `arrow`. `y.transformed` is never sparse.\n    \n    an `arrow_version` key may also be passed if you desire to use `arrow` format for `X.transformed` or `y.transformed`.\n    this is used to ensure that the endpoint returns data that can be opened by the caller's version of arrow. 
Without this\n    key, all dense data returned will default to csv format.\n\n  * as binary data; for the `arrow` or `mtx` formats, the mimetype `application/x-apache-arrow-stream` or `text/mtx` must be set.\n  \n  \n* Unstructured predictions routes:  \nPOST **URL_PREFIX/predictUnstructured/** and **URL_PREFIX/predictionsUnstructured/** routes, which return predictions on data.  \nExample: POST http://localhost:6789/predictUnstructured/; POST http://localhost:6789/predictionsUnstructured/  \nFor these routes, data is posted as binary data. Provide the mimetype and charset so the data is handled properly.\nFor more detailed information, see [here](https://github.com/datarobot/datarobot-user-models#unstructured_inference_models).  \n\n#### Starting drum as prediction server in production mode.\nThe DRUM prediction server can be started in *production* mode, which uses nginx and uwsgi as the backend.\nThis provides better stability and scalability: depending on how many CPUs are available, several workers are started to serve predictions.  \nThe *--max-workers* parameter can be used to limit the number of workers.  \nE.g. ```drum server --code-dir ~/user_code_dir --address localhost:6789 --production --max-workers 2```\n\n> Note: `uwsgi` is an extra dependency for DRUM. Install it using: `pip install datarobot-drum[uwsgi]` or `pip install uwsgi`  \n> Note: *Production* mode may not be available on Windows-based systems out of the box, as uwsgi installation requires special handling.\n> A Docker container-based Linux environment can be used in such cases.\n\n### Fit mode\n<a name=\"fit\"></a>\n> Note: Running fit inside of DataRobot is currently in alpha. 
Check back soon for the opportunity\nto test out this functionality yourself.\n\nDRUM can run your training model to make sure it can produce a trained model artifact before\nadding the training model into DataRobot.\n\nYou can try this out on our sklearn classifier model template with this command:\n\n```\ndrum fit --code-dir task_templates/3_pipelines/python3_sklearn_binary --target-type binary --target Species --input \\\ntests/testdata/iris_binary_training.csv --output . --positive-class-label Iris-setosa \\\n--negative-class-label Iris-versicolor\n```\n> Note: If you don't provide class labels, DataRobot tries to autodetect the labels for you.\n\nYou can also use DRUM on regression datasets, and soon you will also be able to provide row weights. Check out the ```drum fit --help``` output for further details.\n\n\n\n### Running inside a docker container\n<a name=\"docker\"></a>\nIn every mode, DRUM can be run inside a docker container by providing the option ```--docker <image_name/directory_path>```.\nThe container should provide the environment required to perform the desired action.\nDRUM must be installed as a part of this environment.  \nThe following is an example of how to run DRUM inside a container:  \n```drum score --code-dir ~/user_code_dir/ --target-type <target type> --input dataset.csv --docker <container_name>```  \n```drum perf-test --code-dir ~/user_code_dir/ --target-type <target type> --input dataset.csv --docker <container_name>```\n\nAlternatively, the argument passed through the `--docker` flag may be a directory containing the unbuilt contents\nof an image. The DRUM tool will then attempt to build an image using this directory and run your model inside\nthe newly built image.\n\nIf the argument passed to `--docker` is a docker context directory, and the code dir contains a `requirements.txt` dependencies file, DRUM will try to install the packages during the image build.  \nTo skip dependency installation, you can use the `--skip-deps-install` flag. 
\n\n## Drum Push\nStarting in version 1.1.4, drum includes a new verb called `push`. When the user runs\n`drum push -cd /dirtopush/`, the contents of that directory are submitted as a custom model\nto DataRobot. However, for this to work, you must create two types of configuration.\n1. **DataRobot client configuration**\n`push` relies on correct global configuration of the client to access a DataRobot server.\nThere are two options for supplying this configuration: through environment variables or through\na config file that is read by the DataRobot client. Both of these options include an endpoint\nand an API token to authenticate the requests.\n\n* Option 1: Environment variables.\n    Example:\n    ```\n    export DATAROBOT_ENDPOINT=https://app.datarobot.com/api/v2\n    export DATAROBOT_API_TOKEN=<yourtoken>\n    ```\n* Option 2: Create this file, which we check for: `~/.config/datarobot/drconfig.yaml`  \n    Example:\n    ```\n    endpoint: https://app.datarobot.com/api/v2\n    token: <yourtoken>\n    ```\n2. **Model Metadata** `push` also relies on a metadata file, which is parsed by DRUM to create\nthe correct sort of model in DataRobot. This metadata file includes quite a few options. You can\n[read about those options](https://github.com/datarobot/datarobot-user-models/blob/master/MODEL-METADATA.md) or [see an example](https://github.com/datarobot/datarobot-user-models/blob/master/model_templates/python3_sklearn/model-metadata.yaml).\n",
    "bugtrack_url": null,
    "license": "Apache License, Version 2.0",
    "summary": "DRUM - develop, test and deploy custom models",
    "version": "1.10.20",
    "project_urls": {
        "Changelog": "https://github.com/datarobot/datarobot-user-models/blob/master/custom_model_runner/CHANGELOG.md",
        "Homepage": "http://datarobot.com",
        "Source": "https://github.com/datarobot/datarobot-user-models"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8dd397a70b2755d1f47b5a35c0da68181cf7965561343e74e2c1ec42aea245bd",
                "md5": "05e661cd0c8e6dcf302441960a11c26f",
                "sha256": "6c6e01794015fb998a26c4376df00ad0457c1c5335dbeb98be9d57e96bed633a"
            },
            "downloads": -1,
            "filename": "datarobot_drum-1.10.20-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "05e661cd0c8e6dcf302441960a11c26f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.4,<3.12",
            "size": 10494522,
            "upload_time": "2024-02-06T16:46:03",
            "upload_time_iso_8601": "2024-02-06T16:46:03.702374Z",
            "url": "https://files.pythonhosted.org/packages/8d/d3/97a70b2755d1f47b5a35c0da68181cf7965561343e74e2c1ec42aea245bd/datarobot_drum-1.10.20-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-06 16:46:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "datarobot",
    "github_project": "datarobot-user-models",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "datarobot-drum"
}
        