[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![PyPI version fury.io](https://badge.fury.io/py/qgate-sln-mlrun.svg)](https://pypi.python.org/pypi/qgate-sln-mlrun/)
![coverage](https://github.com/george0st/qgate-sln-mlrun/blob/master/coverage.svg?raw=true)
![GitHub commit activity](https://img.shields.io/github/commit-activity/w/george0st/qgate-sln-mlrun)
![GitHub release](https://img.shields.io/github/v/release/george0st/qgate-sln-mlrun)
# QGate-Sln-MLRun
A quality gate for the MLRun (and Iguazio) solution. The main aims of the project are:
- independent quality tests (functional, integration, acceptance, ... tests)
- deeper quality checks before full rollout/use in company environments
- identification of possible compatibility issues (if any)
- external and independent test coverage
- etc.
The tests use these key components: the MLRun solution (see **[GIT mlrun](https://github.com/mlrun/mlrun)**),
the sample meta-data model (see **[GIT qgate-model](https://github.com/george0st/qgate-model)**), and this project.
## Test scenarios
The quality gate covers these test scenarios (✅ done, ✔ in-progress, ❌ planned):
- **Project**
  - ✅ TS101: Create project(s)
  - ✅ TS102: Delete project(s)
- **Feature set**
  - ✅ TS201: Create feature set(s)
  - ✅ TS202: Create feature set(s) & Ingest from DataFrame source (one step)
  - ✅ TS203: Create feature set(s) & Ingest from CSV source (one step)
  - ✅ TS204: Create feature set(s) & Ingest from Parquet source (one step)
  - ❌ TS205: Create feature set(s) & Ingest from SQL source (one step)
  - ❌ TS206: Create feature set(s) & Ingest from Kafka source (one step)
  - ❌ TS207: Create feature set(s) & Ingest from HTTP source (one step)
- **Ingest data**
  - ✅ TS301: Ingest data to feature set(s) from DataFrame source
  - ✅ TS302: Ingest data to feature set(s) from CSV source
  - ✅ TS303: Ingest data to feature set(s) from Parquet source
  - ✔ TS304: Ingest data to feature set(s) from SQL source
  - ❌ TS305: Ingest data to feature set(s) from Kafka source
  - ❌ TS306: Ingest data to feature set(s) from HTTP source
- **Feature vector**
  - ✅ TS401: Create feature vector(s)
- **Get data from vector**
  - ✅ TS501: Get data from off-line feature vector(s)
  - ✅ TS502: Get data from on-line feature vector(s)
- **Pipelines**
  - ❌ TS601: Simple pipeline for DataFrame source
  - ❌ TS602: Simple pipeline for CSV source
  - ❌ TS603: Simple pipeline for Parquet source
  - ❌ TS604: Complex pipeline for DataFrame source
  - ❌ TS605: Complex pipeline for CSV source
  - ❌ TS606: Complex pipeline for Parquet source
- **Build model**
  - ✅ TS701: Build CART model
  - ❌ TS702: Build XGBoost model
  - ❌ TS703: Build DNN model
- **Serve model**
  - ✅ TS801: Serving score from CART
  - ❌ TS802: Serving score from XGBoost
  - ❌ TS803: Serving score from DNN
NOTE: Each test scenario contains additional specific test cases (e.g. with different
targets for feature sets, etc.).
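For a quick overview of coverage, the status markers used in the scenario list can be tallied. The following is a minimal sketch and not part of the project; the function name `tally_statuses` and the three sample lines are illustrative assumptions.

```python
from collections import Counter

# Status markers as defined in the legend of this README.
STATUS = {"✅": "done", "✔": "in-progress", "❌": "planned"}


def tally_statuses(lines):
    """Count test scenarios per status from README-style list lines."""
    counts = Counter()
    for line in lines:
        # Drop indentation and the leading list dash before checking the marker.
        text = line.strip().lstrip("-").strip()
        for marker, label in STATUS.items():
            if text.startswith(marker):
                counts[label] += 1
                break
    return counts


# Three lines copied from the "Ingest data" group, used only as sample input.
sample = [
    "- ✅ TS301: Ingest data to feature set(s) from DataFrame source",
    "- ✔ TS304: Ingest data to feature set(s) from SQL source",
    "- ❌ TS305: Ingest data to feature set(s) from Kafka source",
]
print(dict(tally_statuses(sample)))
```

Lines without a status marker (such as group headers) are simply ignored, so the whole list can be passed in unchanged.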
## Test inputs/outputs
The quality gate tests these inputs/outputs (✅ done, ✔ in-progress, ❌ planned):
- Outputs (targets)
  - ✅ RedisTarget, ✔ SQLTarget/MySQL, ✔ SQLTarget/Postgres, ✅ KafkaTarget
  - ✅ ParquetTarget, ✅ CSVTarget
  - ✅ File system, ❌ S3, ❌ BlobStorage
- Inputs (sources)
  - ✅ Pandas/DataFrame, ❌ SQLSource/MySQL, ❌ SQLSource/Postgres, ❌ KafkaSource
  - ✔ ParquetSource, ✅ CSVSource
  - ✅ File system, ❌ S3, ❌ BlobStorage
For the full list, see the supported [sources/targets from MLRun](https://docs.mlrun.org/en/latest/feature-store/sources-targets.html).
## Sample of outputs
![Sample of outputs](https://github.com/george0st/qgate-sln-mlrun/blob/master/assets/imgs/qgt-mlrun-samples.png?raw=true)
See the reports in their original form:
- all DONE - [HTML](https://htmlpreview.github.io/?https://github.com/george0st/qgate-sln-mlrun/blob/master/docs/samples/outputs/qgt-mlrun-sample.html),
  [TXT](https://github.com/george0st/qgate-sln-mlrun/blob/master/docs/samples/outputs/qgt-mlrun-sample.txt?raw=true)
- with ERR - [HTML](https://htmlpreview.github.io/?https://github.com/george0st/qgate-sln-mlrun/blob/master/docs/samples/outputs/qgt-mlrun-sample-err.html),
  [TXT](https://github.com/george0st/qgate-sln-mlrun/blob/master/docs/samples/outputs/qgt-mlrun-sample-err.txt?raw=true)
## Usage
You can easily use this solution in four steps:
1. Download the content of these two Git repositories to your local environment
   - [qgate-sln-mlrun](https://github.com/george0st/qgate-sln-mlrun)
   - [qgate-model](https://github.com/george0st/qgate-model)
2. Update the file `qgate-sln-mlrun.env` from qgate-model
   - Update the variables for MLRun/Iguazio, see `MLRUN_DBPATH`, `V3IO_USERNAME`, `V3IO_ACCESS_KEY`, `V3IO_API`
     - setting `V3IO_*` is needed only for an Iguazio installation (not for pure free MLRun)
   - Update the variables for QGate, see `QGATE_*` (basic description directly in `*.env`)
     - detailed setup in [configuration](./docs/configuration.md)
3. Run from `qgate-sln-mlrun`
   - **python main.py**
4. See the outputs (location is based on `QGATE_OUTPUT` in configuration)
   - './output/qgt-mlrun-<date_time>.html'
   - './output/qgt-mlrun-<date_time>.txt'
Precondition: You have an MLRun or Iguazio solution available (MLRun is part of Iguazio),
see the official [installation steps](https://docs.mlrun.org/en/latest/install.html), or go directly to the installation for [Desktop Docker](https://docs.mlrun.org/en/latest/install/local-docker.html).
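Before running `main.py`, it can help to confirm that the variables from step 2 are actually set. The following is a minimal sketch using only the standard library, not the project's own loader. It assumes the `.env` file uses plain `KEY=VALUE` lines; the helper names (`parse_env_text`, `missing_variables`) and the relative path in the example are hypothetical.

```python
from pathlib import Path

# Variables named in step 2; V3IO_* apply only to Iguazio installations.
MLRUN_VARS = ["MLRUN_DBPATH"]
IGUAZIO_VARS = ["V3IO_USERNAME", "V3IO_ACCESS_KEY", "V3IO_API"]


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; skip blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_variables(values, iguazio=False):
    """Return the required variable names that are absent or empty."""
    required = MLRUN_VARS + (IGUAZIO_VARS if iguazio else [])
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    # Hypothetical location; adjust to where qgate-model was downloaded.
    env_file = Path("../qgate-model/qgate-sln-mlrun.env")
    if env_file.exists():
        settings = parse_env_text(env_file.read_text(encoding="utf-8"))
        print("Missing:", missing_variables(settings) or "none")
    else:
        print(f"{env_file} not found")
```

Pass `iguazio=True` to `missing_variables` when testing against an Iguazio installation, so that the `V3IO_*` variables are checked as well.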
## Tested with
The project was tested with these MLRun versions (see [change log](https://docs.mlrun.org/en/latest/change-log/index.html)):
- **MLRun** (in Desktop Docker)
  - MLRun 1.7.0 (plan 05/2024)
  - MLRun 1.6.3 (plan 04/2024), 1.6.2, 1.6.1, 1.6.0
  - MLRun 1.5.2, 1.5.1, 1.5.0
  - MLRun 1.4.1, 1.3.0
- **Iguazio** (k8s, on-prem, VM on VMware)
  - Iguazio 3.5.3 (with MLRun 1.4.1)
  - Iguazio 3.5.1 (with MLRun 1.3.0)
NOTE: Currently, only the latest MLRun/Iguazio versions are valid for testing
(these tests do not maintain backward compatibility).
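To see whether a local environment matches one of the versions listed above, the installed `mlrun` package version can be compared against that list. This is a minimal sketch, not part of the project; the set below copies the non-planned versions from this section, and the function names are illustrative assumptions.

```python
from importlib import metadata

# Versions listed under "Tested with" (entries marked as planned are excluded).
TESTED_MLRUN = {
    "1.6.2", "1.6.1", "1.6.0",
    "1.5.2", "1.5.1", "1.5.0",
    "1.4.1", "1.3.0",
}


def check_mlrun_version(installed, tested=TESTED_MLRUN):
    """Return a short status message for an installed MLRun version."""
    if installed is None:
        return "mlrun is not installed"
    if installed in tested:
        return f"mlrun {installed} is in the tested list"
    return f"mlrun {installed} is not in the tested list"


def installed_mlrun_version():
    """Look up the installed mlrun version, or None if the package is absent."""
    try:
        return metadata.version("mlrun")
    except metadata.PackageNotFoundError:
        return None


print(check_mlrun_version(installed_mlrun_version()))
```

A version in the list only means it was tested at some point; per the note above, the current tests target the latest versions.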
## Others
- **To-Do**, the list of expected/future improvements, [see](./docs/todo_list.md)
- **Applied limits**, the list of applied limits, [see](./docs/applied-limits.md)
- **How can you test the solution?**, use a Linux environment or
Windows with WSL2 ([see](./docs/testing.md) the step-by-step tutorial)
- **MLRun/Iguazio, the key changes in a nutshell**, [see](./docs/mlrun-iguazio-release-notes.md)
Raw data
{
"_id": null,
"home_page": null,
"name": "qgate-sln-mlrun",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "Jiri Steuer <steuer.jiri@gmail.com>",
"keywords": "testing, data-science, machine-learning, quality-assurance, quality-assessment, iguazio, mlrun, mlops, quality-gate, feature-store",
"author": null,
"author_email": "Jiri Steuer <steuer.jiri@gmail.com>",
"download_url": null,
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "The quality gate for testing MLRun/Iguazio solution.",
"version": "0.2.1",
"project_urls": {
"homepage": "https://github.com/george0st/qgate-sln-mlrun/",
"repository": "https://pypi.org/project/qgate-sln-mlrun/"
},
"split_keywords": [
"testing",
" data-science",
" machine-learning",
" quality-assurance",
" quality-assessment",
" iguazio",
" mlrun",
" mlops",
" quality-gate",
" feature-store"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c0abfea115c7fa44cca9113e0f3e77377b91717e85d47b55b08eb89890d3f6c0",
"md5": "dd64e3a26757bc5bbb1937f6b513fb38",
"sha256": "76761eb270e7bcba1a1a31b58c01ef0bf539881aac5be950a60cf0785e4f3df2"
},
"downloads": -1,
"filename": "qgate_sln_mlrun-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "dd64e3a26757bc5bbb1937f6b513fb38",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 40779,
"upload_time": "2024-04-20T11:04:45",
"upload_time_iso_8601": "2024-04-20T11:04:45.420539Z",
"url": "https://files.pythonhosted.org/packages/c0/ab/fea115c7fa44cca9113e0f3e77377b91717e85d47b55b08eb89890d3f6c0/qgate_sln_mlrun-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-20 11:04:45",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "george0st",
"github_project": "qgate-sln-mlrun",
"travis_ci": false,
"coveralls": true,
"github_actions": false,
"requirements": [
{
"name": "mlrun",
"specs": [
[
"==",
"1.6.2"
]
]
},
{
"name": "python-dotenv",
"specs": [
[
"~=",
"0.17.0"
]
]
},
{
"name": "jinja2",
"specs": [
[
"~=",
"3.1"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
"==",
"1.4.0"
]
]
},
{
"name": "redis",
"specs": [
[
"~=",
"5.0"
]
]
},
{
"name": "sqlalchemy",
"specs": [
[
"~=",
"1.4"
]
]
},
{
"name": "cryptography",
"specs": [
[
"~=",
"42.0"
]
]
},
{
"name": "pymysql",
"specs": [
[
"~=",
"1.1"
]
]
},
{
"name": "psycopg2",
"specs": [
[
"~=",
"2.9"
]
]
},
{
"name": "kafka-python",
"specs": [
[
"~=",
"2.0"
]
]
},
{
"name": "avro",
"specs": [
[
"~=",
"1.11"
]
]
}
],
"lcname": "qgate-sln-mlrun"
}