Name | snowflake-ml-python
Version | 1.7.1
Summary | The machine learning client library that is used for interacting with Snowflake to build machine learning solutions.
upload_time | 2024-11-05 19:23:45
author_email | "Snowflake, Inc" <support@snowflake.com>
download_url | https://files.pythonhosted.org/packages/7c/28/cce9078ff355fea75b660c16c1054297590ababca437d48f409f1151f702/snowflake_ml_python-1.7.1.tar.gz
home_page | None
maintainer | None
docs_url | None
author | None
requires_python | <3.12,>=3.9
license | Apache License, Version 2.0 (full text: http://www.apache.org/licenses/LICENSE-2.0). Copyright (c) 2012-2023 Snowflake Computing, Inc.
keywords | None
VCS | None
bugtrack_url | None
requirements |
absl-py
accelerate
anyio
boto3
build
cachetools
catboost
cloudpickle
coverage
cryptography
flask-cors
flask
fsspec
httpx
importlib_resources
inflection
joblib
jsonschema
lightgbm
mlflow
moto
mypy
networkx
numpy
packaging
pandas
peft
protobuf
psutil
pyarrow
pytest-rerunfailures
pytest-xdist
pytest
pytimeparse
pyyaml
retrying
ruamel.yaml
s3fs
scikit-learn
scipy
sentence-transformers
sentencepiece
shap
snowflake-connector-python
snowflake-snowpark-python
sphinx
sqlparse
starlette
tensorflow
tokenizers
toml
torch
torchdata
transformers
types-PyYAML
types-cachetools
types-protobuf
types-requests
types-toml
typing-extensions
werkzeug
xgboost
# Snowpark ML
Snowpark ML is a set of tools including SDKs and underlying infrastructure to build and deploy machine learning models.
With Snowpark ML, you can pre-process data, train, manage and deploy ML models all within Snowflake, using a single SDK,
and benefit from Snowflake’s proven performance, scalability, stability and governance at every stage of the Machine
Learning workflow.
## Key Components of Snowpark ML
The Snowpark ML Python SDK provides a number of APIs to support each stage of an end-to-end Machine Learning development
and deployment process, and includes two key components.
### Snowpark ML Development
[Snowpark ML Development](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index#snowpark-ml-development)
provides a collection of Python APIs enabling efficient ML model development directly in Snowflake:
1. Modeling API (`snowflake.ml.modeling`) for data preprocessing, feature engineering, and model training in Snowflake.
This includes the `snowflake.ml.modeling.preprocessing` module for scalable data transformations on large data sets
utilizing the compute resources of underlying Snowpark Optimized High Memory Warehouses, and a large collection of ML
model development classes based on sklearn, xgboost, and lightgbm (see the sketch after this list).
1. Framework Connectors: Optimized, secure, and performant data provisioning for PyTorch and TensorFlow frameworks in
their native data loader formats.
1. FileSet API: FileSet provides a Python fsspec-compliant API for materializing data into a Snowflake internal stage
from a query or Snowpark DataFrame, along with a number of convenience APIs.
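To make the modeling API concrete, here is a minimal sketch (assuming an existing Snowpark `session` and a table `MY_TRAINING_DATA` with columns `FEATURE_1`, `FEATURE_2`, and `LABEL`; all names are illustrative):

```python
# Preprocess and train entirely inside Snowflake, sklearn-style.
from snowflake.ml.modeling.preprocessing import StandardScaler
from snowflake.ml.modeling.xgboost import XGBClassifier

df = session.table("MY_TRAINING_DATA")

# Scale features; the computation is pushed down to the warehouse.
scaler = StandardScaler(
    input_cols=["FEATURE_1", "FEATURE_2"],
    output_cols=["FEATURE_1_SCALED", "FEATURE_2_SCALED"],
)
df_scaled = scaler.fit(df).transform(df)

# Train an XGBoost classifier directly on the Snowpark DataFrame.
clf = XGBClassifier(
    input_cols=["FEATURE_1_SCALED", "FEATURE_2_SCALED"],
    label_cols=["LABEL"],
    output_cols=["PREDICTION"],
)
clf.fit(df_scaled)
predictions = clf.predict(df_scaled)
```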
### Snowpark Model Management [Public Preview]
[Snowpark Model Management](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index#snowpark-ml-ops) complements
the Snowpark ML Development API, and provides model management capabilities along with integrated deployment into Snowflake.
Currently, the API consists of:
1. Registry: A Python API for managing models within Snowflake, which also supports deployment of ML models into Snowflake
as native MODEL objects running in a Snowflake warehouse.
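For illustration, a minimal sketch of the Registry flow (assuming an existing Snowpark `session`, a fitted in-memory model `clf`, and a small dataframe `sample_df` of training features; all names are illustrative):

```python
# Log a model as a native MODEL object, then run it in the warehouse.
from snowflake.ml.registry import Registry

reg = Registry(session=session)
mv = reg.log_model(
    clf,
    model_name="MY_MODEL",
    version_name="v1",
    sample_input_data=sample_df,  # used to infer the model signature
)
predictions = mv.run(sample_df, function_name="predict")
```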
## Getting started
### Have your Snowflake account ready
If you don't have a Snowflake account yet, you can [sign up for a 30-day free trial account](https://signup.snowflake.com/).
### Installation
Follow the [installation instructions](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index#installing-snowpark-ml)
in the Snowflake documentation.
Python versions 3.9 to 3.11 are supported. You can use [miniconda](https://docs.conda.io/en/latest/miniconda.html) or
[anaconda](https://www.anaconda.com/) to create a Conda environment (recommended),
or [virtualenv](https://docs.python.org/3/tutorial/venv.html) to create a virtual environment.
### Conda channels
The [Snowflake Conda Channel](https://repo.anaconda.com/pkgs/snowflake/) contains the official Snowpark ML package releases.
The recommended approach is to install `snowflake-ml-python` from this conda channel:
```sh
conda install \
-c https://repo.anaconda.com/pkgs/snowflake \
--override-channels \
snowflake-ml-python
```
See [the developer guide](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index) for installation instructions.
The latest version of the `snowflake-ml-python` package is also published in a conda channel in this repository. Package versions
in this channel may not yet be present in the official Snowflake conda channel.
Install `snowflake-ml-python` from this channel with the following (being sure to replace `<version_specifier>` with the
desired version, e.g. `1.0.10`):
```bash
conda install \
-c https://raw.githubusercontent.com/snowflakedb/snowflake-ml-python/conda/releases/ \
-c https://repo.anaconda.com/pkgs/snowflake \
--override-channels \
snowflake-ml-python==<version_specifier>
```
Note that until a `snowflake-ml-python` package version is available in the official Snowflake conda channel, there may
be compatibility issues. Server-side functionality that `snowflake-ml-python` depends on may not yet be released.
# Release History
## 1.7.1 (2024-11-05)
### Bug Fixes
- Registry: Null values are now allowed in the dataframe used in model signature inference. Null values will be ignored,
and the remaining values will be used to infer the signature.
- Registry: Pandas extension dtypes (`pandas.StringDtype()`, `pandas.BooleanDtype()`, etc.) are now supported in model
signature inference.
- Registry: Null values are now allowed in the dataframe used to predict.
- Data: Fix missing `snowflake.ml.data.*` module exports in wheel.
- Dataset: Fix missing `snowflake.ml.dataset.*` module exports in wheel.
- Registry: Fix an issue where `tf_keras.Model` is not recognized as a Keras model when logging.
### New Features
- Registry: The `enable_monitoring` option is set to False by default, gating access to preview features of Model Monitoring.
- Model Monitoring: `show_model_monitors` Registry method. This feature is still in Private Preview.
- Registry: Support `pd.Series` in input and output data.
- Model Monitoring: `add_monitor` Registry method. This feature is still in Private Preview.
- Model Monitoring: `resume` and `suspend` ModelMonitor. This feature is still in Private Preview.
- Model Monitoring: `get_monitor` Registry method. This feature is still in Private Preview.
- Model Monitoring: `delete_monitor` Registry method. This feature is still in Private Preview.
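For orientation, a hypothetical sketch of how these Private Preview monitoring methods fit together; only the method names above come from these notes, and every argument shown is an assumption:

```python
# Hypothetical sketch of the Private Preview monitoring flow. Only the
# method names are documented above; argument names are assumptions.
from snowflake.ml.registry import Registry

reg = Registry(session=session)  # assumes an existing Snowpark `session`

reg.show_model_monitors()                     # list existing monitors
monitor = reg.get_monitor(name="MY_MONITOR")  # `name` is an assumed keyword
monitor.suspend()                             # pause monitoring
monitor.resume()                              # resume monitoring
reg.delete_monitor("MY_MONITOR")              # assumed positional argument
```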
## 1.7.0 (2024-10-22)
### Behavior Change
- Generic: Require python >= 3.9.
- Data Connector: Update `to_torch_dataset` and `to_torch_datapipe` to add a dimension for scalar data.
This allows for more seamless integration with PyTorch `DataLoader`, which creates batches by stacking the inputs of each batch (see the sketch after this list).
Examples:
```python
ds = connector.to_torch_dataset(shuffle=False, batch_size=3)
```
- Input: "col1": [10, 11, 12]
- Previous batch: array([10., 11., 12.]) with shape (3,)
- New batch: array([[10.], [11.], [12.]]) with shape (3, 1)
- Input: "col2": [[0, 100], [1, 110], [2, 200]]
- Previous batch: array([[ 0, 100], [ 1, 110], [ 2, 200]]) with shape (3,2)
- New batch: No change
- Model Registry: External access integrations are optional when creating a model inference service in
Snowflake >= 8.40.0.
- Model Registry: Deprecate `build_external_access_integration` in favor of `build_external_access_integrations` in
`ModelVersion.create_service()`.
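To make the new batch shape concrete, a minimal sketch of feeding the connector output to a PyTorch `DataLoader` (assumes an existing `connector` as in the example above):

```python
# Minimal sketch: the connector already batches, so the DataLoader only
# converts and passes batches through (batch_size=None disables re-batching).
from torch.utils.data import DataLoader

ds = connector.to_torch_dataset(shuffle=False, batch_size=3)
loader = DataLoader(ds, batch_size=None)
for batch in loader:
    # Under the new behavior, a scalar column yields shape (3, 1), not (3,).
    print(batch["col1"].shape)
```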
### Bug Fixes
- Registry: Updated `log_model` API to accept both `signatures` and `sample_input_data` parameters.
- Feature Store: ExampleHelper uses a fully qualified path for the table name. Change weather features aggregation from 1d to 1h.
- Data Connector: Return a numpy array with the appropriate object type, instead of a list, for multi-dimensional
data from `to_torch_dataset` and `to_torch_datapipe`.
- Model explainability: Incompatibility between SHAP 0.42.1 and XGB 2.1.1 resolved by using latest SHAP 0.46.0.
### New Features
- Registry: `ModelContext` now accepts a variable number of keyword arguments. Example usage:
```python
import json

import pandas as pd

from snowflake.ml.model import custom_model

# `model1` is a previously trained model object.
mc = custom_model.ModelContext(
    config='local_model_dir/config.json',
    m1=model1,
)

class ExamplePipelineModel(custom_model.CustomModel):
    def __init__(self, context: custom_model.ModelContext) -> None:
        super().__init__(context)
        v = open(self.context['config']).read()
        self.bias = json.loads(v)['bias']

    @custom_model.inference_api
    def predict(self, input: pd.DataFrame) -> pd.DataFrame:
        model_output = self.context['m1'].predict(input)
        return pd.DataFrame({'output': model_output + self.bias})
```
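A model defined this way can then be instantiated with the context, e.g. `ExamplePipelineModel(mc)`, and logged to the registry like any other custom model.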
- Model Development: Upgrade scikit-learn in UDTF backend for the log_loss metric. As a result, the `eps` argument is now ignored.
- Data Connector: Add the option of passing `None` as the batch size to `to_torch_dataset` for better
interoperability with PyTorch DataLoader.
- Model Registry: Support [pandas.CategoricalDtype](https://pandas.pydata.org/docs/reference/api/pandas.CategoricalDtype.html#pandas-categoricaldtype).
- Registry: It is now possible to pass `signatures` and `sample_input_data` at the same time to capture background
data for explainability and data lineage.
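A hedged sketch of passing both parameters together (assumes an existing `Registry` instance `reg`, a fitted model `clf`, a signature mapping `sig`, and a background dataframe `background_df`; all names are illustrative):

```python
# Sketch: explicit signatures plus sample data kept as background data
# for explainability and data lineage, per the note above.
mv = reg.log_model(
    clf,
    model_name="MY_MODEL",
    version_name="v2",
    signatures=sig,
    sample_input_data=background_df,
)
```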
## 1.6.4 (2024-10-17)
### Bug Fixes
- Registry: Fix an issue that leads to an incident when using `ModelVersion.run` with a service.
## 1.6.3 (2024-10-07)
- Model Registry (PrPr) has been removed.
### Bug Fixes
- Registry: Fix a bug where an unexpected normalization happens when a package whose name does not follow PEP 508 is
provided when logging the model.
- Registry: Fix `not a valid remote uri` error when logging MLflow models.
- Registry: Fix a bug when `ModelVersion.run` is called in a nested way.
- Registry: Fix an issue that leads to `log_model` failure when the local package version contains parts other than
the base version.
- Fix issue where `sample_weights` were not being applied to search estimators.
- Model explainability: Fix a bug that creates explain as a regular function instead of a table function when enabled by default.
- Model explainability: Update lightgbm binary classification to return non-JSON values, based on customer feedback.
### New Features
- Data: Improve `DataConnector.to_pandas()` performance when loading from Snowpark DataFrames.
- Model Registry: Allow users to set a model task while using `log_model`.
- Feature Store: FeatureView supports ON_CREATE or ON_SCHEDULE initialize mode.
## 1.6.2 (2024-09-04)
### Bug Fixes
- Modeling: Support XGBoost versions newer than 2.
- Data: Fix multiple epoch iteration over `DataConnector.to_torch_datapipe()` DataPipes.
- Generic: Fix a bug where an invalid name provided to an argument expecting a fully qualified name was parsed
wrongly. It now raises an exception correctly.
- Model Explainability: Handle explanations for multiclass XGBoost classification models.
- Model Explainability: Workarounds and better error handling for XGB>2.1.0 not working with SHAP==0.42.1.
### New Features
- Data: Add top-level exports for `DataConnector` and `DataSource` to `snowflake.ml.data`.
- Data: Add native batching support via `batch_size` and `drop_last_batch` arguments to `DataConnector.to_torch_dataset()`
(see the sketch after this list).
- Feature Store: `update_feature_view()` supports taking a feature view object as an argument.
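A minimal sketch of the batching arguments mentioned above (assumes an existing `connector`):

```python
# Each yielded batch is a dict of arrays with a leading dimension of 32;
# drop_last_batch=True discards a final partial batch.
ds = connector.to_torch_dataset(batch_size=32, drop_last_batch=True)
for batch in ds:
    ...
```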
## 1.6.1 (2024-08-12)
### Bug Fixes
- Feature Store: Support large metadata blob when generating a dataset.
- Feature Store: Added a hidden knob in FeatureView as kwargs for setting a customized
refresh_mode.
- Registry: Fix an error message in Model Version `run` when `function_name` is not mentioned and the model has multiple
target methods.
- Cortex inference: `snowflake.cortex.Complete` now only uses the REST API for streaming, and the `use_rest_api_experimental`
flag is no longer needed.
- Feature Store: Add a new API, `FeatureView.list_columns()`, which lists all column information.
- Data: Fix `DataFrame` ingestion with `ArrowIngestor`.
### New Features
- Enable `set_params` to set the parameters of the underlying sklearn estimator, if the snowflake-ml model has been fit.
- Data: Add `snowflake.ml.data.ingestor_utils` module with utility functions helpful for `DataIngestor` implementations.
- Data: Add new `to_torch_dataset()` connector to `DataConnector` to replace deprecated DataPipe.
- Registry: Option to `enable_explainability` set to True by default for XGBoost, LightGBM and CatBoost as a PuPr feature.
- Registry: Option to `enable_explainability` when registering SHAP supported sklearn models.
## 1.6.0 (2024-07-29)
### Bug Fixes
- Modeling: `SimpleImputer` can impute integer columns with integer values.
- Registry: Fix an issue when providing a pandas DataFrame whose index does not start from 0 as the input to
`ModelVersion.run`.
### New Features
- Feature Store: Add overloads to APIs to accept both object and name/version. Impacted APIs include read_feature_view(),
refresh_feature_view(), get_refresh_history(), resume_feature_view(), suspend_feature_view(), and delete_feature_view().
- Feature Store: Add docstring inline examples for all public APIs.
- Feature Store: Add new utility class `ExampleHelper` to help load source data, simplifying public notebooks.
- Registry: Option to `enable_explainability` when registering XGBoost models as a pre-PuPr feature.
- Feature Store: add new API `update_entity()`.
- Registry: Option to `enable_explainability` when registering Catboost models as a pre-PuPr feature.
- Feature Store: Add new argument warehouse to FeatureView constructor to overwrite the default warehouse. Also add
a new column 'warehouse' to the output of list_feature_views().
- Registry: Add support for logging model from a model version.
- Modeling: Distributed Hyperparameter Optimization now announces its GA refresh version. The latest memory-efficient version
no longer has the 10 GB training limitation on datasets. To turn it off, run:

```python
from snowflake.ml.modeling._internal.snowpark_implementations import (
    distributed_hpo_trainer,
)
distributed_hpo_trainer.ENABLE_EFFICIENT_MEMORY_USAGE = False
```
- Registry: Option to `enable_explainability` when registering LightGBM models as a pre-PuPr feature.
- Data: Add new `snowflake.ml.data` preview module which contains data reading utilities like `DataConnector`.
  - `DataConnector` provides efficient connectors from Snowpark `DataFrame`
    and Snowpark ML `Dataset` to external frameworks like PyTorch, TensorFlow, and Pandas. Create `DataConnector`
    instances using the classmethod constructors `DataConnector.from_dataset()` and `DataConnector.from_dataframe()`
    (see the sketch after this list).
- Data: Add new `DataConnector.from_sources()` classmethod constructor for constructing from `DataSource` objects.
- Data: Add new `ingestor_class` arg to `DataConnector` classmethod constructors for easier `DataIngestor` injection.
- Dataset: `DatasetReader` now subclasses new `DataConnector` class.
- Add optional `limit` arg to `DatasetReader.to_pandas()`.
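A minimal sketch of the preview `snowflake.ml.data` module described above (assumes an existing Snowpark DataFrame `df`):

```python
# Create a connector from a Snowpark DataFrame and hand the data to
# pandas or PyTorch, per the classmethod constructors named above.
from snowflake.ml.data import DataConnector

dc = DataConnector.from_dataframe(df)
pandas_df = dc.to_pandas()
torch_ds = dc.to_torch_dataset()
```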
### Behavior Changes
- Feature Store: Change some positional parameters to keyword arguments in the following APIs:
- Entity(): desc.
- FeatureView(): timestamp_col, refresh_freq, desc.
- FeatureStore(): creation_mode.
- update_entity(): desc.
- register_feature_view(): block, overwrite.
- list_feature_views(): entity_name, feature_view_name.
- get_refresh_history(): verbose.
- retrieve_feature_values(): spine_timestamp_col, exclude_columns, include_feature_view_timestamp_col.
- generate_training_set(): save_as, spine_timestamp_col, spine_label_cols, exclude_columns,
include_feature_view_timestamp_col.
- generate_dataset(): version, spine_timestamp_col, spine_label_cols, exclude_columns,
include_feature_view_timestamp_col, desc, output_type.
## 1.5.4 (2024-07-11)
### Bug Fixes
- Model Registry (PrPr): Fix 401 Unauthorized issue when deploying model to SPCS.
- Feature Store: Downgrades exceptions to warnings for a few property setters in feature view. Now you can set
desc, refresh_freq and warehouse for draft feature views.
- Modeling: Fix an issue with calling `OrdinalEncoder` with `categories` as a dictionary and a pandas DataFrame.
- Modeling: Fix an issue with calling `OneHotEncoder` with `categories` as a dictionary and a pandas DataFrame.
### New Features
- Registry: Allow overriding `device_map` and `device` when loading huggingface pipeline models.
- Registry: Add `set_alias` method to `ModelVersion` instance to set an alias on a model version.
- Registry: Add `unset_alias` method to `ModelVersion` instance to unset an alias on a model version.
- Registry: Add `partitioned_inference_api` allowing users to create partitioned inference functions in registered
models. Enable model inference methods with table functions with vectorized process methods in registered models.
- Feature Store: Add 3 more columns: refresh_freq, refresh_mode and scheduling_state to the result of
`list_feature_views()`.
- Feature Store: `update_feature_view()` supports updating description.
- Feature Store: add new API `refresh_feature_view()`.
- Feature Store: add new API `get_refresh_history()`.
- Feature Store: Add `generate_training_set()` API for generating table-backed feature snapshots.
- Feature Store: Add `DeprecationWarning` for `generate_dataset(..., output_type="table")`.
- Model Development: OrdinalEncoder supports a list of array-likes for the `categories` argument.
- Model Development: OneHotEncoder supports a list of array-likes for the `categories` argument.
## 1.5.3 (2024-06-17)
### Bug Fixes
- Modeling: Fix an issue causing lineage information to be missing for
`Pipeline`, `GridSearchCV`, `SimpleImputer`, and `RandomizedSearchCV`.
- Registry: Fix an issue that leads to incorrect results when using a pandas DataFrame with over 100,000 rows as the input
of the `ModelVersion.run` method in a stored procedure.
### New Features
- Registry: Add support for TIMESTAMP_NTZ model signature data type, allowing timestamp input and output.
- Dataset: Add `DatasetVersion.label_cols` and `DatasetVersion.exclude_cols` properties.
## 1.5.2 (2024-06-10)
### Bug Fixes
- Registry: Fix an issue that leads to being unable to log a model in a stored procedure.
- Modeling: Quick fix for `import snowflake.ml.modeling.parameters.enable_anonymous_sproc`, which could not be imported due
to a package dependency error.
## 1.5.1 (2024-05-22)
### Bug Fixes
- Dataset: Fix `snowflake.connector.errors.DataError: Query Result did not match expected number of rows` when accessing
DatasetVersion properties when case insensitive `SHOW VERSIONS IN DATASET` check matches multiple version names.
- Dataset: Fix a bug in SnowFS bulk file read when used with DuckDB.
- Registry: Fixed a bug when loading old models.
- Lineage: Fix Dataset source lineage propagation through `snowpark.DataFrame` transformations.
### Behavior Changes
- Feature Store: Convert clear() into a private function. It now deletes feature views and entities only.
- Feature Store: Use NULL as default value for timestamp tag value.
### New Features
- Feature Store: Added new `snowflake.ml.feature_store.setup_feature_store()` API to assist Feature Store RBAC setup.
- Feature Store: Add `output_type` argument to `FeatureStore.generate_dataset()` to allow generating data snapshots
as Datasets or Tables.
- Registry: `log_model`, `get_model`, `delete_model` now support fully qualified names.
- Modeling: Support anonymous stored procedures during fit calls so that modeling does not require
permissions to operate on the schema. Please call
`import snowflake.ml.modeling.parameters.enable_anonymous_sproc  # noqa: F401`
## 1.5.0 (2024-05-01)
### Bug Fixes
- Registry: Fix invalid parameter 'SHOW_MODEL_DETAILS_IN_SHOW_VERSIONS_IN_MODEL' error.
### Behavior Changes
- Model Development: The behavior of `fit_transform` for all estimators is changed.
First, it now covers all estimators that implement this function;
second, the output type is the union of pandas DataFrame and Snowpark DataFrame.
#### Model Registry (PrPr)
`snowflake.ml.registry.artifact` and related `snowflake.ml.model_registry.ModelRegistry` APIs have been removed.
- Removed `snowflake.ml.registry.artifact` module.
- Removed `ModelRegistry.log_artifact()`, `ModelRegistry.list_artifacts()`, and `ModelRegistry.get_artifact()`.
- Removed the `artifacts` argument from `ModelRegistry.log_model()`.
#### Dataset (PrPr)
`snowflake.ml.dataset.Dataset` has been redesigned to be backed by Snowflake Dataset entities.
- New `Dataset`s can be created with `Dataset.create()` and existing `Dataset`s may be loaded
with `Dataset.load()`.
- `Dataset`s now maintain an immutable `selected_version` state. The `Dataset.create_version()` and
`Dataset.load_version()` APIs return new `Dataset` objects with the requested `selected_version` state.
- Added `dataset.create_from_dataframe()` and `dataset.load_dataset()` convenience APIs as a shortcut
to creating and loading `Dataset`s with a pre-selected version.
- `Dataset.materialized_table` and `Dataset.snapshot_table` no longer exist; `Dataset.fully_qualified_name`
is the closest equivalent.
- `Dataset.df` no longer exists. Instead, use `DatasetReader.read.to_snowpark_dataframe()`.
- `Dataset.owner` has been moved to `Dataset.selected_version.owner`.
- `Dataset.desc` has been moved to `DatasetVersion.selected_version.comment`.
- `Dataset.timestamp_col`, `Dataset.label_cols`, `Dataset.feature_store_metadata`, and
`Dataset.schema_version` have been removed.
#### Feature Store (PrPr)
- `FeatureStore.generate_dataset` argument list has been changed to match the new
`snowflake.ml.dataset.Dataset` definition
- `materialized_table` has been removed and replaced with `name` and `version`.
- `name` moved to first positional argument
- `save_mode` has been removed as `merge` behavior is no longer supported. The new behavior is always `errorifexists`.
- Change feature view version type from str to `FeatureViewVersion`. It is a restricted string literal.
- Remove the `as_dataframe` arg from `FeatureStore.list_feature_views()`; it now always returns the result as a DataFrame.
- Combines a few metadata tags into a new tag: SNOWML_FEATURE_VIEW_METADATA. This makes previously created feature views
unreadable by the new SDK.
### New Features
- Registry: Add `export` method to `ModelVersion` instance to export model files.
- Registry: Add `load` method to `ModelVersion` instance to load the underlying object from the model.
- Registry: Add `Model.rename` method to `Model` instance to rename or move a model.
#### Dataset (PrPr)
- Added Snowpark DataFrame integration using `Dataset.read.to_snowpark_dataframe()`
- Added Pandas DataFrame integration using `Dataset.read.to_pandas()`
- Added PyTorch and TensorFlow integrations using `Dataset.read.to_torch_datapipe()`
and `Dataset.read.to_tf_dataset()` respectively.
- Added `fsspec` style file integration using `Dataset.read.files()` and `Dataset.read.filesystem()`
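A hedged sketch of the redesigned Dataset API using the calls named above (assumes an existing Snowpark `session` and DataFrame `df`; the exact `create_from_dataframe` signature is an assumption):

```python
# Create a versioned Dataset from a DataFrame, then read it back through
# the DatasetReader integrations listed above.
from snowflake.ml import dataset

ds = dataset.create_from_dataframe(session, "MY_DATASET", "v1", input_dataframe=df)
snowpark_df = ds.read.to_snowpark_dataframe()
pandas_df = ds.read.to_pandas()
```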
#### Feature Store
- Use the new `tag_reference_internal` tag to speed up metadata lookup.
## 1.4.1 (2024-04-18)
### New Features
- Registry: Add support for `catboost` model (`catboost.CatBoostClassifier`, `catboost.CatBoostRegressor`).
- Registry: Add support for `lightgbm` model (`lightgbm.Booster`, `lightgbm.LGBMClassifier`, `lightgbm.LGBMRegressor`).
### Bug Fixes
- Registry: Fix a bug that caused the `relax_version` option to not work.
### Behavior Changes
- Feature Store: `update_feature_view` takes `refresh_freq` and `warehouse` as arguments.
## 1.4.0 (2024-04-08)
### Bug Fixes
- Registry: Fix a bug where, when multiple models are called from the same query, models other than the first one
return incorrect results. This fix only works for newly logged models.
- Modeling: When registering a model, only the method(s) mentioned in `save_model` are added to the model signature
in SnowML models.
- Modeling: Fix a bug where, when n_jobs is not 1, the model cannot execute methods such as
predict, predict_log_proba, and other batch inference methods. n_jobs is now automatically
set to 1 because the vectorized UDF currently doesn't support the joblib parallel backend.
- Modeling: Fix a bug that batch inference methods cannot infer the datatype when the first row of data contains NULL.
- Modeling: Match Distributed HPO output column names with the Snowflake identifier.
- Modeling: Relax package versions for all Distributed HPO methods if the installed version
is not available in the Snowflake conda channel.
- Modeling: Add sklearn as a required dependency for the LightGBM package.
### Behavior Changes
- Registry: The `apply` method is no longer logged by default when logging an XGBoost model. If required, it can
be specified manually when logging the model via `log_model(..., options={"target_methods": ["apply", ...]})`.
- Feature Store: `register_entity` returns an entity object.
- Feature Store: `register_feature_view` `block=True` is now the default.
### New Features
- Registry: Add support for `sentence-transformers` model (`sentence_transformers.SentenceTransformer`).
- Registry: A version name is no longer required when logging a model. If not provided, a random human-readable ID
will be generated.
## 1.3.1 (2024-03-21)
### New Features
- FileSet: `snowflake.ml.fileset.sfcfs.SFFileSystem` can now be used in UDFs and stored procedures.
## 1.3.0 (2024-03-12)
### Bug Fixes
- Registry: Fix a bug where modules in `code_paths` provided to `log_model` cannot be correctly imported.
- Registry: Fix incorrect error message when validating input Snowpark DataFrame with array feature.
- Model Registry: Fix an issue when deploying a model to SPCS that some files do not have proper permission.
- Model Development: Relax package versions for all inference methods if the installed version
is not available in the Snowflake conda channel.
### Behavior Changes
- Registry: When running a method of a model, the value-range-based input validation that prevents inputs from overflowing
is now optional rather than enforced; this should improve performance and should not cause problems for most
kinds of models. If you want to enable this check as before, specify `strict_input_validation=True` when
calling `run`.
- Registry: By default `relax_version=True` when logging a model instead of using the specific local dependency versions.
This improves dependency versioning by using versions available in Snowflake. To switch back to the previous behavior
and use specific local dependency versions, specify `relax_version=False` when calling `log_model`.
- Model Development: The behavior of `fit_predict` for all estimators is changed.
First, it now covers all estimators that implement this function;
second, the output type is the union of pandas DataFrame and Snowpark DataFrame.
### New Features
- FileSet: `snowflake.ml.fileset.sfcfs.SFFileSystem` can now be serialized with `pickle`.
## 1.2.3 (2024-02-26)
### Bug Fixes
- Registry: Providing a Decimal-type column to a DOUBLE or FLOAT feature no longer errors out; the column is auto-cast
with warnings.
- Registry: Improve the error message when specifying currently unsupported `pip_requirements` argument.
- Model Development: Fix `precision_recall_fscore_support` incorrect results when `average="samples"`.
- Model Registry: Fix an issue that leads to description, metrics, or tags not being correctly returned in a newly created
Model Registry (PrPr) due to Snowflake BCR [2024_01](https://docs.snowflake.com/en/release-notes/bcr-bundles/2024_01/bcr-1483).
### Behavior Changes
- Feature Store: `FeatureStore.suspend_feature_view` and `FeatureStore.resume_feature_view` no longer mutate the input feature
view argument. The updated status is only reflected in the returned feature view object.
### New Features
- Model Development: support `score_samples` method for all the classes, including Pipeline,
GridSearchCV, RandomizedSearchCV, PCA, IsolationForest, ...
- Registry: Support deleting a version of a model.
## 1.2.2 (2024-02-13)
### New Features
- Model Registry: Support providing external access integrations when deploying a model to SPCS. This is required to make
the deployment process work, since SPCS by default denies all network connections. The
following endpoints must be allowed for deployment to work: docker.com:80, docker.com:443, anaconda.com:80,
anaconda.com:443, anaconda.org:80, anaconda.org:443, pypi.org:80, pypi.org:443. If you are using a
`snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` object, the following endpoints must also
be allowed: huggingface.com:80, huggingface.com:443, huggingface.co:80, huggingface.co:443.
## 1.2.1 (2024-01-25)
### New Features
- Model Development: Infers output column data type for transformers when possible.
- Registry: `relax_version` option is available in the `options` argument when logging the model.
## 1.2.0 (2024-01-11)
### Bug Fixes
- Model Registry: Fix "XGBoost version not compiled with GPU support" error when running CPU inference against open-source
XGBoost models deployed to SPCS.
- Model Registry: Fix model deployment to SPCS on Windows machines.
### New Features
- Model Development: Introduced XGBoost external memory training feature. This feature enables training XGBoost models
on large datasets that don't fit into memory.
- Registry: New Registry class named `snowflake.ml.registry.Registry` providing similar APIs as the old one but working
with the new MODEL object in Snowflake SQL. We also provide `snowflake.ml.model.Model` and
`snowflake.ml.model.ModelVersion` to represent a model and a specific version of a model.
- Model Development: Add support for `fit_predict` method in `AgglomerativeClustering`, `DBSCAN`, and `OPTICS` classes.
- Model Development: Add support for `fit_transform` method in `MDS`, `SpectralEmbedding` and `TSNE` classes.
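A minimal sketch of the newly supported `fit_predict` (assumes a Snowpark DataFrame `df` with numeric columns `X1` and `X2`; all names are illustrative):

```python
# DBSCAN mirrors its sklearn counterpart but runs against Snowflake data,
# with column bindings declared up front.
from snowflake.ml.modeling.cluster import DBSCAN

db = DBSCAN(input_cols=["X1", "X2"], output_cols=["CLUSTER"])
labels_df = db.fit_predict(df)
```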
### Additional Notes
- Model Registry: The `snowflake.ml.registry.model_registry.ModelRegistry` has been deprecated starting from version
1.2.0. It will stay in the Private Preview phase. For future implementations, please use
`snowflake.ml.registry.Registry`, except when specifically required. The old model registry will be removed once all
its primary functionalities are fully integrated into the new registry.
## 1.1.2 (2023-12-18)
### Bug Fixes
- Generic: Fix the issue that stack trace is hidden by telemetry unexpectedly.
- Model Development: Execute model signature inference without materializing full dataframe in memory.
- Model Registry: Fix occasional 'snowflake-ml-python library does not exist' error when deploying to SPCS.
### Behavior Changes
- Model Registry: When calling `predict` with a Snowpark DataFrame, both inferred and normalized column names are accepted.
- Model Registry: When logging a Snowpark ML Modeling Model, sample input data or manually provided signature will be
ignored since they are not necessary.
### New Features
- Model Development: SQL implementation of binary `precision_score` metric.
## 1.1.1 (2023-12-05)
### Bug Fixes
- Model Registry: The `predict` target method on registered models is now compatible with unsupervised estimators.
- Model Development: Fix confusion_matrix incorrect results when the row number cannot be divided by the batch size.
### New Features
- Introduced passthrough_col param in Modeling API. This new param is helpful in scenarios
requiring automatic input_cols inference, but need to avoid using specific
columns, like index columns, during training or inference.
## 1.1.0 (2023-12-01)
### Bug Fixes
- Model Registry: Fix pandas DataFrame input not handling the first row properly.
- Model Development: OrdinalEncoder and LabelEncoder output_columns do not need to be valid Snowflake identifiers. They
would previously be excluded if the normalized name did not match the name specified in output_columns.
### New Features
- Model Registry: Add support for invoking a public endpoint on the SPCS service, by providing an "enable_ingress" SPCS
deployment option.
- Model Development: Add support for distributed HPO - GridSearchCV and RandomizedSearchCV execution will be
distributed on multi-node warehouses.
## 1.0.12 (2023-11-13)
### Bug Fixes
- Model Registry: Fix regression issue that container logging is not shown during model deployment to SPCS.
- Model Development: Enhance the column capacity of OrdinalEncoder.
- Model Registry: Fix unbound `batch_size` error when deploying a model other than Hugging Face Pipeline
and LLM with GPU on SPCS.
### Behavior Changes
- Model Registry: Raise early error when deploying to SPCS with db/schema that starts with underscore.
- Model Registry: `conda-forge` channel is now automatically added to channel lists when deploying to SPCS.
- Model Registry: `relax_version` will not strip all version specifiers; instead it will relax the `==x.y.z` specifier to
`>=x.y,<(x+1)`.
- Model Registry: A Python runtime with a different patch level but the same major and minor version will no longer produce
a warning when loading the model via the Model Registry, and will be considered usable when deploying to SPCS.
- Model Registry: When logging a `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` object,
the versions of locally installed libraries won't be picked as model dependencies; instead, some pre-defined
dependencies are picked up to improve the user experience.
### New Features
- Model Registry: Enable best-effort SPCS job/service log streaming when logging level is set to INFO.
## 1.0.11 (2023-10-27)
### New Features
- Model Registry: Add the `log_artifact()` public method.
- Model Development: Add support for `kneighbors`.
### Behavior Changes
- Model Registry: Change the `log_model()` argument from TrainingDataset to List of Artifact.
- Model Registry: Change `get_training_dataset()` to `get_artifact()`.
### Bug Fixes
- Model Development: Fix support for XGBoost and LightGBM models using SKLearn Grid Search and Randomized Search model selectors.
- Model Development: DecimalType is now supported as a DataType.
- Model Development: Fix metrics compatibility with Snowpark DataFrames that use Snowflake identifiers.
- Model Registry: Resolve 'delete_deployment' not deleting the SPCS service in certain cases.
## 1.0.10 (2023-10-13)
### Behavior Changes
- Model Development: precision_score, recall_score, f1_score, fbeta_score, precision_recall_fscore_support,
mean_absolute_error, mean_squared_error, and mean_absolute_percentage_error metric calculations are now distributed.
- Model Registry: `deploy` will now return `Deployment` for deployment information.
### New Features
- Model Registry: When the model signature is auto-inferred, it will be printed to the log for reference.
- Model Registry: For SPCS deployment, `Deployment` details will contain `image_name`, `service_spec` and `service_function_sql`.
### Bug Fixes
- Model Development: Fix an issue leading to UTF-8 decoding errors when using modeling modules on Windows.
- Model Development: Fix an issue where alias definitions cause `SnowparkSQLUnexpectedAliasException` in inference.
- Model Registry: Fix an issue that signature inference could be incorrect when using Snowpark DataFrame as sample input.
- Model Registry: Fix overly strict data type validation when predicting. Now, for example, if the signature has an INT8
feature and you provide an INT64 dataframe whose values are all within range, it will not fail.
## 1.0.9 (2023-09-28)
### Behavior Changes
- Model Development: log_loss metric calculation is now distributed.
### Bug Fixes
- Model Registry: Fix an issue where building images fails with specific docker setups.
- Model Registry: Fix an issue where the local ML library cannot be embedded when it is imported by `zipimport`.
- Model Registry: Fix out-of-date documentation about the `platform` argument in the `deploy` function.
- Model Registry: Fix an issue where a GPU-trained PyTorch model cannot be deployed to a platform without a GPU.
## 1.0.8 (2023-09-15)
### Bug Fixes
- Model Development: Ordinal encoder can be used with mixed input column types.
- Model Development: Fix an issue when the sklearn default value is `np.nan`.
- Model Registry: Fix an issue where an incorrect docker executable is used when building images.
- Model Registry: Fix an issue where specifying the `token` argument when using
`snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` with `transformers < 4.32.0` is not effective.
- Model Registry: Fix an issue where an incorrect system function call is used when deploying to SPCS.
- Model Registry: Fix an issue when using a `transformers.pipeline` that does not have a `tokenizer`.
- Model Registry: Fix incorrectly-inferred image repository name during model deployment to SPCS.
- Model Registry: Fix GPU resource retention issues caused by failed or stuck previous deployments in SPCS.
## 1.0.7 (2023-09-05)
### Bug Fixes
- Model Development & Model Registry: Fix an error related to `pandas.io.json.json_normalize`.
- Allow disabling telemetry.
## 1.0.6 (2023-09-01)
### New Features
- Model Registry: Add `create_if_not_exists` parameter in constructor.
- Model Registry: Added `get_or_create_model_registry` API.
- Model Registry: Added support for using GPU inference when deploying XGBoost (`xgboost.XGBModel` and `xgboost.Booster`
), PyTorch (`torch.nn.Module` and `torch.jit.ScriptModule`) and TensorFlow (`tensorflow.Module` and
`tensorflow.keras.Model`) models to Snowpark Container Services.
- Model Registry: When inferring model signature, `Sequence` of built-in types, `Sequence` of `numpy.ndarray`,
`Sequence` of `torch.Tensor`, and `Sequence` of `tensorflow.Tensor` can be used
instead of only `List` of them.
- Model Registry: Added `get_training_dataset` API.
- Model Development: Size of metrics result can exceed the previous 8 MB limit.
- Model Registry: Added support for saving/loading/deploying HuggingFace pipeline objects (`transformers.Pipeline`) and our wrapper
(`snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel`) for them. Use the wrapper to specify
configurations; the model for the pipeline will be loaded dynamically when deploying. Currently, the following tasks
are supported for logging without manually specifying model signatures:
- "conversational"
- "fill-mask"
- "question-answering"
- "summarization"
- "table-question-answering"
- "text2text-generation"
- "text-classification" (alias "sentiment-analysis" available)
- "text-generation"
- "token-classification" (alias "ner" available)
- "translation"
- "translation_xx_to_yy"
- "zero-shot-classification"
### Bug Fixes
- Model Development: Fixed a bug when using simple imputer with numpy >= 1.25.
- Model Development: Fixed a bug when inferring the type of label columns.
### Behavior Changes
- Model Registry: `log_model()` now returns a `ModelReference` object instead of a model ID.
- Model Registry: When deploying a model with only one target method, the `target_method` argument can be omitted.
- Model Registry: When using a snowflake-ml-python version newer than what is available in the Snowflake Anaconda
Channel, the `embed_local_ml_library` option will be set to `True` automatically if not specified.
- Model Registry: When deploying a model to Snowpark Container Services and using GPU, the default value of num_workers
will be 1.
- Model Registry: `keep_order` and `output_with_input_features` in the deploy options have been removed. Now the
behavior is controlled by the type of the input when calling `model.predict()`. If the input is a `pandas.DataFrame`,
the behavior will be the same as `keep_order=True` and `output_with_input_features=False` before. If the input is a
`snowpark.DataFrame`, the behavior will be the same as `keep_order=False` and `output_with_input_features=True` before.
- Model Registry: When logging and deploying PyTorch (`torch.nn.Module` and `torch.jit.ScriptModule`) and TensorFlow
(`tensorflow.Module` and `tensorflow.keras.Model`) models, we no longer accept models whose input is a list of tensors
and whose output is a list of tensors. Instead, we now accept models whose input is one or more tensors as positional arguments,
and whose output is a tensor or a tuple of tensors. The input and output dataframes when predicting remain the same as before,
i.e. every column is an array feature containing a tensor.
## 1.0.5 (2023-08-17)
### New Features
- Model Registry: Added support for saving/loading/deploying the xgboost Booster model.
- Model Registry: Added support to get the model name and the model version from model references.
### Bug Fixes
- Model Registry: Restore the db/schema back to the session after `create_model_registry()`.
- Model Registry: Fixed an issue where the UDF name created when deploying a model is not identical to what is provided
and cannot be correctly dropped when the deployment is dropped.
- connection_params.SnowflakeLoginOptions(): Added support for `private_key_path`.
## 1.0.4 (2023-07-28)
### New Features
- Model Registry: Added support for saving/loading/deploying TensorFlow models (`tensorflow.Module`).
- Model Registry: Added support for saving/loading/deploying MLflow PyFunc models (`mlflow.pyfunc.PyFuncModel`).
- Model Development: Input dataframes can now be joined against data loaded from staged files.
- Model Development: Added support for non-English languages.
### Bug Fixes
- Model Registry: Fix an issue where model dependencies are incorrectly reported as unresolvable on certain platforms.
## 1.0.3 (2023-07-14)
### Behavior Changes
- Model Registry: When predicting with a model whose output is a list of NumPy ndarrays, the output will not be flattened;
instead, every ndarray will act as a feature (column) in the output.
### New Features
- Model Registry: Added support for saving/loading/deploying PyTorch models (`torch.nn.Module` and `torch.jit.ScriptModule`).
### Bug Fixes
- Model Registry: Fix an issue where, when the database or schema name provided to `create_model_registry` contains special
characters, the model registry cannot be created.
- Model Registry: Fix an issue where `get_model_description` returns with additional quotes.
- Model Registry: Fix incorrect error message when attempting to remove an unset tag of a model.
- Model Registry: Fix a typo in the default deployment table name.
- Model Registry: A Snowpark dataframe used as sample input, or as input to the `predict` method, that contains a column with
Snowflake `NUMBER(precision, scale)` data type where `scale = 0` will no longer lead to an error, and is now correctly
recognized as the `INT64` data type in the model signature.
- Model Registry: Fix an issue that prevents a model logged on a system whose default encoding is not UTF-8 compatible
from being deployed.
- Model Registry: Added an earlier and better error message when any file name in the model, or the file name of the model
itself, contains characters that cannot be encoded using ASCII. Deploying such a model is currently not
supported.
## 1.0.2 (2023-06-22)
### Behavior Changes
- Model Registry: Prohibit non-snowflake-native models from being logged.
- Model Registry: `_use_local_snowml` parameter in options of `deploy()` has been removed.
- Model Registry: A default `False` `embed_local_ml_library` parameter has been added to the options of `log_model()`.
With this set to `False` (default), the version of the local snowflake-ml-python library will be recorded and used when
deploying the model. With this set to `True`, the local snowflake-ml-python library will be embedded into the logged model,
and will be used when you load or deploy the model.
### New Features
- Model Registry: A new optional argument named `code_paths` has been added to the arguments of `log_model()` for users
to specify additional code paths to be imported when loading and deploying the model.
- Model Registry: A new optional argument named `options` has been added to the arguments of `log_model()` to specify
any additional options when saving the model.
- Model Development: Added metrics:
- d2_absolute_error_score
- d2_pinball_score
- explained_variance_score
- mean_absolute_error
- mean_absolute_percentage_error
- mean_squared_error
### Bug Fixes
- Model Development: `accuracy_score()` now works when the given label column names are lists of a single value.
## 1.0.1 (2023-06-16)
### Behavior Changes
- Model Development: Changed Metrics APIs to imitate sklearn metrics modules:
- `accuracy_score()`, `confusion_matrix()`, `precision_recall_fscore_support()`, `precision_score()` methods move from
respective modules to `metrics.classification`.
- Model Registry: The default table/stage created by the Registry now uses "_SYSTEM_" as a prefix.
- Model Registry: The `get_model_history()` method has been enhanced to include the history of model deployment.
### New Features
- Model Registry: A default `False` flag named `replace_udf` has been added to the options of `deploy()`. Setting this
to `True` allows overwriting an existing UDF with the same name when deploying.
- Model Development: Added metrics:
- f1_score
- fbeta_score
- recall_score
- roc_auc_score
- roc_curve
- log_loss
- precision_recall_curve
- Model Registry: A new argument named `permanent` has been added to the arguments of `deploy()`. Setting this to `True`
allows the creation of a permanent deployment without needing to specify the UDF location.
- Model Registry: A new method `list_deployments()` has been added to enumerate all permanent deployments originating
from a specific model.
- Model Registry: A new method `get_deployment()` has been added to fetch a deployment by its deployment name.
- Model Registry: A new method `delete_deployment()` has been added to remove an existing permanent deployment.
## 1.0.0 (2023-06-09)
### Behavior Changes
- Model Registry: `predict()` method moves from Registry to ModelReference.
- Model Registry: The `_snowml_wheel_path` parameter in the options of `deploy()` is replaced with `_use_local_snowml`, with a
default value of `False`. Setting this to `True` will have the same effect as uploading local SnowML code when executing the
model in the warehouse.
- Model Registry: Removed `id` field from `ModelReference` constructor.
- Model Development: Preprocessing and Metrics move to the modeling package: `snowflake.ml.modeling.preprocessing` and
`snowflake.ml.modeling.metrics`.
- Model Development: `get_sklearn_object()` method is renamed to `to_sklearn()`, `to_xgboost()`, and `to_lightgbm()` for
respective native models.
### New Features
- Added PolynomialFeatures transformer to the snowflake.ml.modeling.preprocessing module.
- Added metrics:
- accuracy_score
- confusion_matrix
- precision_recall_fscore_support
- precision_score
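A hedged sketch of the metrics API, which operates on named columns of a Snowpark DataFrame per the `snowflake.ml.modeling.metrics` move noted above (the keyword-only, column-name-based signature is an assumption; `predictions_df` and its column names are illustrative):

```python
# Metrics take a DataFrame plus column names rather than arrays.
from snowflake.ml.modeling.metrics import accuracy_score

acc = accuracy_score(
    df=predictions_df,
    y_true_col_names="LABEL",
    y_pred_col_names="PREDICTION",
)
```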
### Bug Fixes
- Model Registry: Model version can now be any string (not required to be a valid identifier).
- Model Deployment: `deploy()` & `predict()` methods now correctly escape identifiers.
## 0.3.2 (2023-05-23)
### Behavior Changes
- Models are now serialized and deserialized with cloudpickle throughout the codebase, removing the dependency on joblib.
### New Features
- Model Deployment: Added support for snowflake.ml models.
## 0.3.1 (2023-05-18)
### Behavior Changes
- Standardized the registry API with the following:
  - Create & open registry take the same set of arguments.
  - Create & open can choose the schema to use.
  - `set_tag`, `set_metric`, etc. now explicitly call out argument names such as `tag_name` and `metric_name`.
### New Features
- Changes to support Python 3.9 and 3.10.
- Added KBinsDiscretizer.
- Support for deployment of XGBoost models and int8 data types.
## 0.3.0 (2023-05-11)
### Behavior Changes
- Big Model Registry refresh:
  - Fixed API discrepancies between register_model & log_model.
  - Models can be referred to by name + version (no opaque internal ID is required).
### New Features
- Model Registry: Added support for saving, loading, and deploying SKL & XGB models.
## 0.2.3 (2023-04-27)
### Bug Fixes
- Allow using OneHotEncoder along with sklearn style estimators in a pipeline.
### New Features
- Model Registry: Added support for `delete_model`. Use `delete_artifact=False` to unregister the model without
deleting the underlying model data, as sketched below.
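A one-line sketch, assuming a hypothetical opened `registry` object and illustrative model identifiers:

```python
# Unregister "my_model" v1 while keeping the underlying model data in place.
registry.delete_model(
    model_name="my_model", model_version="v1", delete_artifact=False
)
```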
## 0.2.2 (2023-04-11)
### New Features
- Initial version of the snowflake-ml modeling package.
  - Provides support for training most scikit-learn and xgboost estimators and transformers.
### Bug Fixes
- Minor fixes in preprocessing package.
## 0.2.1 (2023-03-23)
### New Features
- New in Preprocessing:
- SimpleImputer
- Covariance Matrix
- Optimization of Ordinal Encoder client computations.
### Bug Fixes
- Minor fixes in OneHotEncoder.
## 0.2.0 (2023-02-27)
### New Features
- Model Registry
- PyTorch & TensorFlow connectors and a generic FileSet API
- New in Preprocessing:
- Binarizer
- Normalizer
- Pearson correlation Matrix
- Optimization in Ordinal Encoder to cache vocabulary in temp tables.
## 0.1.3 (2023-02-02)
### New Features
- Initial version of transformers including:
- Label Encoder
- Max Abs Scaler
- Min Max Scaler
- One Hot Encoder
- Ordinal Encoder
- Robust Scaler
- Standard Scaler
Raw data
{
"_id": null,
"home_page": null,
"name": "snowflake-ml-python",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.9",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "\"Snowflake, Inc\" <support@snowflake.com>",
"download_url": "https://files.pythonhosted.org/packages/7c/28/cce9078ff355fea75b660c16c1054297590ababca437d48f409f1151f702/snowflake_ml_python-1.7.1.tar.gz",
"platform": null,
"description": "# Snowpark ML\n\nSnowpark ML is a set of tools including SDKs and underlying infrastructure to build and deploy machine learning models.\nWith Snowpark ML, you can pre-process data, train, manage and deploy ML models all within Snowflake, using a single SDK,\nand benefit from Snowflake\u2019s proven performance, scalability, stability and governance at every stage of the Machine\nLearning workflow.\n\n## Key Components of Snowpark ML\n\nThe Snowpark ML Python SDK provides a number of APIs to support each stage of an end-to-end Machine Learning development\nand deployment process, and includes two key components.\n\n### Snowpark ML Development\n\n[Snowpark ML Development](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index#snowpark-ml-development)\nprovides a collection of python APIs enabling efficient ML model development directly in Snowflake:\n\n1. Modeling API (`snowflake.ml.modeling`) for data preprocessing, feature engineering and model training in Snowflake.\nThis includes the `snowflake.ml.modeling.preprocessing` module for scalable data transformations on large data sets\nutilizing the compute resources of underlying Snowpark Optimized High Memory Warehouses, and a large collection of ML\nmodel development classes based on sklearn, xgboost, and lightgbm.\n\n1. Framework Connectors: Optimized, secure and performant data provisioning for Pytorch and Tensorflow frameworks in\ntheir native data loader formats.\n\n1. FileSet API: FileSet provides a Python fsspec-compliant API for materializing data into a Snowflake internal stage\nfrom a query or Snowpark Dataframe along with a number of convenience APIs.\n\n### Snowpark Model Management [Public Preview]\n\n[Snowpark Model Management](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index#snowpark-ml-ops) complements\nthe Snowpark ML Development API, and provides model management capabilities along with integrated deployment into Snowflake.\nCurrently, the API consists of:\n\n1. Registry: A python API for managing models within Snowflake which also supports deployment of ML models into Snowflake\nas native MODEL object running with Snowflake Warehouse.\n\n## Getting started\n\n### Have your Snowflake account ready\n\nIf you don't have a Snowflake account yet, you can [sign up for a 30-day free trial account](https://signup.snowflake.com/).\n\n### Installation\n\nFollow the [installation instructions](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index#installing-snowpark-ml)\nin the Snowflake documentation.\n\nPython versions 3.9 to 3.11 are supported. You can use [miniconda](https://docs.conda.io/en/latest/miniconda.html) or\n[anaconda](https://www.anaconda.com/) to create a Conda environment (recommended),\nor [virtualenv](https://docs.python.org/3/tutorial/venv.html) to create a virtual environment.\n\n### Conda channels\n\nThe [Snowflake Conda Channel](https://repo.anaconda.com/pkgs/snowflake/) contains the official snowpark ML package releases.\nThe recommended approach is to install `snowflake-ml-python` this conda channel:\n\n```sh\nconda install \\\n -c https://repo.anaconda.com/pkgs/snowflake \\\n --override-channels \\\n snowflake-ml-python\n```\n\nSee [the developer guide](https://docs.snowflake.com/en/developer-guide/snowpark-ml/index) for installation instructions.\n\nThe latest version of the `snowpark-ml-python` package is also published in a conda channel in this repository. 
Package versions\nin this channel may not yet be present in the official Snowflake conda channel.\n\nInstall `snowflake-ml-python` from this channel with the following (being sure to replace `<version_specifier>` with the\ndesired version, e.g. `1.0.10`):\n\n```bash\nconda install \\\n -c https://raw.githubusercontent.com/snowflakedb/snowflake-ml-python/conda/releases/ \\\n -c https://repo.anaconda.com/pkgs/snowflake \\\n --override-channels \\\n snowflake-ml-python==<version_specifier>\n```\n\nNote that until a `snowflake-ml-python` package version is available in the official Snowflake conda channel, there may\nbe compatibility issues. Server-side functionality that `snowflake-ml-python` depends on may not yet be released.\n\n# Release History\n\n## 1.7.1\n\n### Bug Fixes\n\n- Registry: Null value is now allowed in the dataframe used in model signature inference. Null values will be ignored\n and others will be used to infer the signature.\n- Registry: Pandas Extension DTypes (`pandas.StringDType()`, `pandas.BooleanDType()`, etc.) are now supported in model\nsignature inference.\n- Registry: Null value is now allowed in the dataframe used to predict.\n- Data: Fix missing `snowflake.ml.data.*` module exports in wheel\n- Dataset: Fix missing `snowflake.ml.dataset.*` module exports in wheel.\n- Registry: Fix the issue that `tf_keras.Model` is not recognized as keras model when logging.\n\n### Behavior Changes\n\n### New Features\n\n- Registry: Option to `enable_monitoring` set to False by default. This will gate access to preview features of Model Monitoring.\n- Model Monitoring: `show_model_monitors` Registry method. This feature is still in Private Preview.\n- Registry: Support `pd.Series` in input and output data.\n- Model Monitoring: `add_monitor` Registry method. This feature is still in Private Preview.\n- Model Monitoring: `resume` and `suspend` ModelMonitor. This feature is still in Private Preview.\n- Model Monitoring: `get_monitor` Registry method. This feature is still in Private Preview.\n- Model Monitoring: `delete_monitor` Registry method. This feature is still in Private Preview.\n\n## 1.7.0 (10-22-2024)\n\n### Behavior Change\n\n- Generic: Require python >= 3.9.\n- Data Connector: Update `to_torch_dataset` and `to_torch_datapipe` to add a dimension for scalar data.\nThis allows for more seamless integration with PyTorch `DataLoader`, which creates batches by stacking inputs of each batch.\n\nExamples:\n\n```python\nds = connector.to_torch_dataset(shuffle=False, batch_size=3)\n```\n\n- Input: \"col1\": [10, 11, 12]\n - Previous batch: array([10., 11., 12.]) with shape (3,)\n - New batch: array([[10.], [11.], [12.]]) with shape (3, 1)\n\n- Input: \"col2\": [[0, 100], [1, 110], [2, 200]]\n - Previous batch: array([[ 0, 100], [ 1, 110], [ 2, 200]]) with shape (3,2)\n - New batch: No change\n\n- Model Registry: External access integrations are optional when creating a model inference service in\n Snowflake >= 8.40.0.\n- Model Registry: Deprecate `build_external_access_integration` with `build_external_access_integrations` in\n `ModelVersion.create_service()`.\n\n### Bug Fixes\n\n- Registry: Updated `log_model` API to accept both signature and sample_input_data parameters.\n- Feature Store: ExampleHelper uses fully qualified path for table name. 
change weather features aggregation from 1d to 1h.\n- Data Connector: Return numpy array with appropriate object type instead of list for multi-dimensional\ndata from `to_torch_dataset` and `to_torch_datapipe`\n- Model explainability: Incompatibility between SHAP 0.42.1 and XGB 2.1.1 resolved by using latest SHAP 0.46.0.\n\n### New Features\n\n- Registry: Provide pass keyworded variable length of arguments to class ModelContext. Example usage:\n\n```python\nmc = custom_model.ModelContext(\n config = 'local_model_dir/config.json',\n m1 = model1\n)\n\nclass ExamplePipelineModel(custom_model.CustomModel):\n def __init__(self, context: custom_model.ModelContext) -> None:\n super().__init__(context)\n v = open(self.context['config']).read()\n self.bias = json.loads(v)['bias']\n\n @custom_model.inference_api\n def predict(self, input: pd.DataFrame) -> pd.DataFrame:\n model_output = self.context['m1'].predict(input)\n return pd.DataFrame({'output': model_output + self.bias})\n```\n\n- Model Development: Upgrade scikit-learn in UDTF backend for log_loss metric. As a result, `eps` argument is now ignored.\n- Data Connector: Add the option of passing a `None` sized batch to `to_torch_dataset` for better\ninteroperability with PyTorch DataLoader.\n- Model Registry: Support [pandas.CategoricalDtype](https://pandas.pydata.org/docs/reference/api/pandas.CategoricalDtype.html#pandas-categoricaldtype)\n- Registry: It is now possible to pass `signatures` and `sample_input_data` at the same time to capture background\ndata from explainablity and data lineage.\n\n## 1.6.4 (2024-10-17)\n\n### Bug Fixes\n\n- Registry: Fix an issue that leads to incident when using `ModelVersion.run` with service.\n\n## 1.6.3 (2024-10-07)\n\n- Model Registry (PrPr) has been removed.\n\n### Bug Fixes\n\n- Registry: Fix a bug that when package whose name does not follow PEP-508 is provided when logging the model,\n an unexpected normalization is happening.\n- Registry: Fix `not a valid remote uri` error when logging mlflow models.\n- Registry: Fix a bug that `ModelVersion.run` is called in a nested way.\n- Registry: Fix an issue that leads to `log_model` failure when local package version contains parts other than\n base version.\n- Fix issue where `sample_weights` were not being applied to search estimators.\n- Model explainability: Fix bug which creates explain as a function instead of table function when enabling by default.\n- Model explainability: Update lightgbm binary classification to return non-json values, from customer feedback.\n\n### New Features\n\n- Data: Improve `DataConnector.to_pandas()` performance when loading from Snowpark DataFrames.\n- Model Registry: Allow users to set a model task while using `log_model`.\n- Feature Store: FeatureView supports ON_CREATE or ON_SCHEDULE initialize mode.\n\n## 1.6.2 (2024-09-04)\n\n### Bug Fixes\n\n- Modeling: Support XGBoost version that is larger than 2.\n\n- Data: Fix multiple epoch iteration over `DataConnector.to_torch_datapipe()` DataPipes.\n- Generic: Fix a bug that when an invalid name is provided to argument where fully qualified name is expected, it will\n be parsed wrongly. 
Now it raises an exception correctly.\n- Model Explainability: Handle explanations for multiclass XGBoost classification models\n- Model Explainability: Workarounds and better error handling for XGB>2.1.0 not working with SHAP==0.42.1\n\n### New Features\n\n- Data: Add top-level exports for `DataConnector` and `DataSource` to `snowflake.ml.data`.\n- Data: Add native batching support via `batch_size` and `drop_last_batch` arguments to `DataConnector.to_torch_dataset()`\n- Feature Store: update_feature_view() supports taking feature view object as argument.\n\n## 1.6.1 (2024-08-12)\n\n### Bug Fixes\n\n- Feature Store: Support large metadata blob when generating dataset\n- Feature Store: Added a hidden knob in FeatureView as kargs for setting customized\n refresh_mode\n- Registry: Fix an error message in Model Version `run` when `function_name` is not mentioned and model has multiple\n target methods.\n- Cortex inference: snowflake.cortex.Complete now only uses the REST API for streaming and the use_rest_api_experimental\n is no longer needed.\n- Feature Store: Add a new API: FeatureView.list_columns() which list all column information.\n- Data: Fix `DataFrame` ingestion with `ArrowIngestor`.\n\n### New Features\n\n- Enable `set_params` to set the parameters of the underlying sklearn estimator, if the snowflake-ml model has been fit.\n- Data: Add `snowflake.ml.data.ingestor_utils` module with utility functions helpful for `DataIngestor` implementations.\n- Data: Add new `to_torch_dataset()` connector to `DataConnector` to replace deprecated DataPipe.\n- Registry: Option to `enable_explainability` set to True by default for XGBoost, LightGBM and CatBoost as PuPr feature.\n- Registry: Option to `enable_explainability` when registering SHAP supported sklearn models.\n\n## 1.6.0 (2024-07-29)\n\n### Bug Fixes\n\n- Modeling: `SimpleImputer` can impute integer columns with integer values.\n- Registry: Fix an issue when providing a pandas Dataframe whose index is not starting from 0 as the input to\n the `ModelVersion.run`.\n\n### New Features\n\n- Feature Store: Add overloads to APIs accept both object and name/version. Impacted APIs include read_feature_view(),\n refresh_feature_view(), get_refresh_history(), resume_feature_view(), suspend_feature_view(), delete_feature_view().\n- Feature Store: Add docstring inline examples for all public APIs.\n- Feature Store: Add new utility class `ExampleHelper` to help with load source data to simplify public notebooks.\n- Registry: Option to `enable_explainability` when registering XGBoost models as a pre-PuPr feature.\n- Feature Store: add new API `update_entity()`.\n- Registry: Option to `enable_explainability` when registering Catboost models as a pre-PuPr feature.\n- Feature Store: Add new argument warehouse to FeatureView constructor to overwrite the default warehouse. Also add\n a new column 'warehouse' to the output of list_feature_views().\n- Registry: Add support for logging model from a model version.\n- Modeling: Distributed Hyperparameter Optimization now announce GA refresh version. The latest memory efficient version\n will not have the 10GB training limitation for dataset any more. 
To turn off, please run\n `\n from snowflake.ml.modeling._internal.snowpark_implementations import (\n distributed_hpo_trainer,\n )\n distributed_hpo_trainer.ENABLE_EFFICIENT_MEMORY_USAGE = False\n `\n- Registry: Option to `enable_explainability` when registering LightGBM models as a pre-PuPr feature.\n- Data: Add new `snowflake.ml.data` preview module which contains data reading utilities like `DataConnector`\n - `DataConnector` provides efficient connectors from Snowpark `DataFrame`\n and Snowpark ML `Dataset` to external frameworks like PyTorch, TensorFlow, and Pandas. Create `DataConnector`\n instances using the classmethod constructors `DataConnector.from_dataset()` and `DataConnector.from_dataframe()`.\n- Data: Add new `DataConnector.from_sources()` classmethod constructor for constructing from `DataSource` objects.\n- Data: Add new `ingestor_class` arg to `DataConnector` classmethod constructors for easier `DataIngestor` injection.\n- Dataset: `DatasetReader` now subclasses new `DataConnector` class.\n - Add optional `limit` arg to `DatasetReader.to_pandas()`\n\n### Behavior Changes\n\n- Feature Store: change some positional parameters to keyword arguments in following APIs:\n - Entity(): desc.\n - FeatureView(): timestamp_col, refresh_freq, desc.\n - FeatureStore(): creation_mode.\n - update_entity(): desc.\n - register_feature_view(): block, overwrite.\n - list_feature_views(): entity_name, feature_view_name.\n - get_refresh_history(): verbose.\n - retrieve_feature_values(): spine_timestamp_col, exclude_columns, include_feature_view_timestamp_col.\n - generate_training_set(): save_as, spine_timestamp_col, spine_label_cols, exclude_columns,\n include_feature_view_timestamp_col.\n - generate_dataset(): version, spine_timestamp_col, spine_label_cols, exclude_columns,\n include_feature_view_timestamp_col, desc, output_type.\n\n## 1.5.4 (2024-07-11)\n\n### Bug Fixes\n\n- Model Registry (PrPr): Fix 401 Unauthorized issue when deploying model to SPCS.\n- Feature Store: Downgrades exceptions to warnings for few property setters in feature view. Now you can set\n desc, refresh_freq and warehouse for draft feature views.\n- Modeling: Fix an issue with calling `OrdinalEncoder` with `categories` as a dictionary and a pandas DataFrame\n- Modeling: Fix an issue with calling `OneHotEncoder` with `categories` as a dictionary and a pandas DataFrame\n\n### New Features\n\n- Registry: Allow overriding `device_map` and `device` when loading huggingface pipeline models.\n- Registry: Add `set_alias` method to `ModelVersion` instance to set an alias to model version.\n- Registry: Add `unset_alias` method to `ModelVersion` instance to unset an alias to model version.\n- Registry: Add `partitioned_inference_api` allowing users to create partitioned inference functions in registered\n models. 
Enable model inference methods with table functions with vectorized process methods in registered models.\n- Feature Store: add 3 more columns: refresh_freq, refresh_mode and scheduling_state to the result of\n `list_feature_views()`.\n- Feature Store: `update_feature_view()` supports updating description.\n- Feature Store: add new API `refresh_feature_view()`.\n- Feature Store: add new API `get_refresh_history()`.\n- Feature Store: Add `generate_training_set()` API for generating table-backed feature snapshots.\n- Feature Store: Add `DeprecationWarning` for `generate_dataset(..., output_type=\"table\")`.\n- Feature Store: `update_feature_view()` supports updating description.\n- Feature Store: add new API `refresh_feature_view()`.\n- Feature Store: add new API `get_refresh_history()`.\n- Model Development: OrdinalEncoder supports a list of array-likes for `categories` argument.\n- Model Development: OneHotEncoder supports a list of array-likes for `categories` argument.\n\n## 1.5.3 (06-17-2024)\n\n### Bug Fixes\n\n- Modeling: Fix an issue causing lineage information to be missing for\n `Pipeline`, `GridSearchCV` , `SimpleImputer`, and `RandomizedSearchCV`\n- Registry: Fix an issue that leads to incorrect result when using pandas Dataframe with over 100, 000 rows as the input\n of `ModelVersion.run` method in Stored Procedure.\n\n### New Features\n\n- Registry: Add support for TIMESTAMP_NTZ model signature data type, allowing timestamp input and output.\n- Dataset: Add `DatasetVersion.label_cols` and `DatasetVersion.exclude_cols` properties.\n\n## 1.5.2 (06-10-2024)\n\n### Bug Fixes\n\n- Registry: Fix an issue that leads to unable to log model in store procedure.\n- Modeling: Quick fix `import snowflake.ml.modeling.parameters.enable_anonymous_sproc` cannot be imported due to package\n dependency error.\n\n### Behavior Changes\n\n### New Features\n\n## 1.5.1 (05-22-2024)\n\n### Bug Fixes\n\n- Dataset: Fix `snowflake.connector.errors.DataError: Query Result did not match expected number of rows` when accessing\n DatasetVersion properties when case insensitive `SHOW VERSIONS IN DATASET` check matches multiple version names.\n- Dataset: Fix bug in SnowFS bulk file read when used with DuckDB\n- Registry: Fixed a bug when loading old models.\n- Lineage: Fix Dataset source lineage propagation through `snowpark.DataFrame` transformations\n\n### Behavior Changes\n\n- Feature Store: convert clear() into a private function. Also make it deletes feature views and entities only.\n- Feature Store: Use NULL as default value for timestamp tag value.\n\n### New Features\n\n- Feature Store: Added new `snowflake.ml.feature_store.setup_feature_store()` API to assist Feature Store RBAC setup.\n- Feature Store: Add `output_type` argument to `FeatureStore.generate_dataset()` to allow generating data snapshots\n as Datasets or Tables.\n- Registry: `log_model`, `get_model`, `delete_model` now supports fully qualified name.\n- Modeling: Supports anonymous stored procedure during fit calls so that modeling would not require sufficient\n permissions to operate on schema. 
Please call\n `import snowflake.ml.modeling.parameters.enable_anonymous_sproc # noqa: F401`\n\n## 1.5.0 (05-01-2024)\n\n### Bug Fixes\n\n- Registry: Fix invalid parameter 'SHOW_MODEL_DETAILS_IN_SHOW_VERSIONS_IN_MODEL' error.\n\n### Behavior Changes\n\n- Model Development: The behavior of `fit_transform` for all estimators is changed.\n Firstly, it will cover all the estimator that contains this function,\n secondly, the output would be the union of pandas DataFrame and snowpark DataFrame.\n\n#### Model Registry (PrPr)\n\n`snowflake.ml.registry.artifact` and related `snowflake.ml.model_registry.ModelRegistry` APIs have been removed.\n\n- Removed `snowflake.ml.registry.artifact` module.\n- Removed `ModelRegistry.log_artifact()`, `ModelRegistry.list_artifacts()`, `ModelRegistry.get_artifact()`\n- Removed `artifacts` argument from `ModelRegistry.log_model()`\n\n#### Dataset (PrPr)\n\n`snowflake.ml.dataset.Dataset` has been redesigned to be backed by Snowflake Dataset entities.\n\n- New `Dataset`s can be created with `Dataset.create()` and existing `Dataset`s may be loaded\n with `Dataset.load()`.\n- `Dataset`s now maintain an immutable `selected_version` state. The `Dataset.create_version()` and\n `Dataset.load_version()` APIs return new `Dataset` objects with the requested `selected_version` state.\n- Added `dataset.create_from_dataframe()` and `dataset.load_dataset()` convenience APIs as a shortcut\n to creating and loading `Dataset`s with a pre-selected version.\n- `Dataset.materialized_table` and `Dataset.snapshot_table` no longer exist with `Dataset.fully_qualified_name`\n as the closest equivalent.\n- `Dataset.df` no longer exists. Instead, use `DatasetReader.read.to_snowpark_dataframe()`.\n- `Dataset.owner` has been moved to `Dataset.selected_version.owner`\n- `Dataset.desc` has been moved to `DatasetVersion.selected_version.comment`\n- `Dataset.timestamp_col`, `Dataset.label_cols`, `Dataset.feature_store_metadata`, and\n `Dataset.schema_version` have been removed.\n\n#### Feature Store (PrPr)\n\n- `FeatureStore.generate_dataset` argument list has been changed to match the new\n`snowflake.ml.dataset.Dataset` definition\n\n - `materialized_table` has been removed and replaced with `name` and `version`.\n - `name` moved to first positional argument\n - `save_mode` has been removed as `merge` behavior is no longer supported. The new behavior is always `errorifexists`.\n\n- Change feature view version type from str to `FeatureViewVersion`. It is a restricted string literal.\n\n- Remove as_dataframe arg from FeatureStore.list_feature_views(), now always returns result as DataFrame.\n\n- Combines few metadata tags into a new tag: SNOWML_FEATURE_VIEW_METADATA. 
This will make previously created feature views\nnot readable by new SDK.\n\n### New Features\n\n- Registry: Add `export` method to `ModelVersion` instance to export model files.\n- Registry: Add `load` method to `ModelVersion` instance to load the underlying object from the model.\n- Registry: Add `Model.rename` method to `Model` instance to rename or move a model.\n\n#### Dataset (PrPr)\n\n- Added Snowpark DataFrame integration using `Dataset.read.to_snowpark_dataframe()`\n- Added Pandas DataFrame integration using `Dataset.read.to_pandas()`\n- Added PyTorch and TensorFlow integrations using `Dataset.read.to_torch_datapipe()`\n and `Dataset.read.to_tf_dataset()` respectively.\n- Added `fsspec` style file integration using `Dataset.read.files()` and `Dataset.read.filesystem()`\n\n#### Feature Store\n\n- use new tag_reference_internal to speed up metadata lookup.\n\n## 1.4.1 (2024-04-18)\n\n### New Features\n\n- Registry: Add support for `catboost` model (`catboost.CatBoostClassifier`, `catboost.CatBoostRegressor`).\n- Registry: Add support for `lightgbm` model (`lightgbm.Booster`, `lightgbm.LightGBMClassifier`, `lightgbm.LightGBMRegressor`).\n\n### Bug Fixes\n\n- Registry: Fix a bug that leads to relax_version option is not working.\n\n### Behavior changes\n\n- Feature Store: update_feature_view takes refresh_freq and warehouse as argument.\n\n## 1.4.0 (2024-04-08)\n\n### Bug Fixes\n\n- Registry: Fix a bug when multiple models are being called from the same query, models other than the first one will\n have incorrect result. This fix only works for newly logged model.\n- Modeling: When registering a model, only method(s) that is mentioned in `save_model` would be added to model signature\n in SnowML models.\n- Modeling: Fix a bug that when n_jobs is not 1, model cannot execute methods such as\n predict, predict_log_proba, and other batch inference methods. The n_jobs would automatically\n set to 1 because vectorized udf currently doesn't support joblib parallel backend.\n- Modeling: Fix a bug that batch inference methods cannot infer the datatype when the first row of data contains NULL.\n- Modeling: Matches Distributed HPO output column names with the snowflake identifier.\n- Modeling: Relax package versions for all Distributed HPO methods if the installed version\n is not available in the Snowflake conda channel\n- Modeling: Add sklearn as required dependency for LightGBM package.\n\n### Behavior Changes\n\n- Registry: `apply` method is no longer by default logged when logging a xgboost model. If that is required, it could\n be specified manually when logging the model by `log_model(..., options={\"target_methods\": [\"apply\", ...]})`.\n- Feature Store: register_entity returns an entity object.\n- Feature Store: register_feature_view `block=true` becomes default.\n\n### New Features\n\n- Registry: Add support for `sentence-transformers` model (`sentence_transformers.SentenceTransformer`).\n- Registry: Now version name is no longer required when logging a model. 
If not provided, a random human readable ID\n will be generated.\n\n## 1.3.1 (2024-03-21)\n\n### New Features\n\n- FileSet: `snowflake.ml.fileset.sfcfs.SFFileSystem` can now be used in UDFs and stored procedures.\n\n## 1.3.0 (2024-03-12)\n\n### Bug Fixes\n\n- Registry: Fix a bug that leads to module in `code_paths` when `log_model` cannot be correctly imported.\n- Registry: Fix incorrect error message when validating input Snowpark DataFrame with array feature.\n- Model Registry: Fix an issue when deploying a model to SPCS that some files do not have proper permission.\n- Model Development: Relax package versions for all inference methods if the installed version\n is not available in the Snowflake conda channel\n\n### Behavior Changes\n\n- Registry: When running the method of a model, the value range based input validation to avoid input from overflowing\n is now optional rather than enforced, this should improve the performance and should not lead to problem for most\n kinds of model. If you want to enable this check as previous, specify `strict_input_validation=True` when\n calling `run`.\n- Registry: By default `relax_version=True` when logging a model instead of using the specific local dependency versions.\n This improves dependency versioning by using versions available in Snowflake. To switch back to the previous behavior\n and use specific local dependency versions, specify `relax_version=False` when calling `log_model`.\n- Model Development: The behavior of `fit_predict` for all estimators is changed.\n Firstly, it will cover all the estimator that contains this function,\n secondly, the output would be the union of pandas DataFrame and snowpark DataFrame.\n\n### New Features\n\n- FileSet: `snowflake.ml.fileset.sfcfs.SFFileSystem` can now be serialized with `pickle`.\n\n## 1.2.3 (2024-02-26)\n\n### Bug Fixes\n\n- Registry: Now when providing Decimal Type column to a DOUBLE or FLOAT feature will not error out but auto cast with\n warnings.\n- Registry: Improve the error message when specifying currently unsupported `pip_requirements` argument.\n- Model Development: Fix precision_recall_fscore_support incorrect results when `average=\"samples\"`.\n- Model Registry: Fix an issue that leads to description, metrics or tags are not correctly returned in newly created\n Model Registry (PrPr) due to Snowflake BCR [2024_01](https://docs.snowflake.com/en/release-notes/bcr-bundles/2024_01/bcr-1483)\n\n### Behavior Changes\n\n- Feature Store: `FeatureStore.suspend_feature_view` and `FeatureStore.resume_feature_view` doesn't mutate input feature\n view argument any more. The updated status only reflected in the returned feature view object.\n\n### New Features\n\n- Model Development: support `score_samples` method for all the classes, including Pipeline,\n GridSearchCV, RandomizedSearchCV, PCA, IsolationForest, ...\n- Registry: Support deleting a version of a model.\n\n## 1.2.2 (2024-02-13)\n\n### New Features\n\n- Model Registry: Support providing external access integrations when deploying a model to SPCS. This will help and be\n required to make sure the deploying process work as long as SPCS will by default deny all network connections. The\n following endpoints must be allowed to make deployment work: docker.com:80, docker.com:443, anaconda.com:80,\n anaconda.com:443, anaconda.org:80, anaconda.org:443, pypi.org:80, pypi.org:443. 
If you are using\n `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` object, the following endpoints are required\n to be allowed: huggingface.com:80, huggingface.com:443, huggingface.co:80, huggingface.co:443.\n\n## 1.2.1 (2024-01-25)\n\n### New Features\n\n- Model Development: Infers output column data type for transformers when possible.\n- Registry: `relax_version` option is available in the `options` argument when logging the model.\n\n## 1.2.0 (2024-01-11)\n\n### Bug Fixes\n\n- Model Registry: Fix \"XGBoost version not compiled with GPU support\" error when running CPU inference against open-source\n XGBoost models deployed to SPCS.\n- Model Registry: Fix model deployment to SPCS on Windows machines.\n\n### New Features\n\n- Model Development: Introduced XGBoost external memory training feature. This feature enables training XGBoost models\n on large datasets that don't fit into memory.\n- Registry: New Registry class named `snowflake.ml.registry.Registry` providing similar APIs as the old one but works\n with new MODEL object in Snowflake SQL. Also, we are providing`snowflake.ml.model.Model` and\n `snowflake.ml.model.ModelVersion` to represent a model and a specific version of a model.\n- Model Development: Add support for `fit_predict` method in `AgglomerativeClustering`, `DBSCAN`, and `OPTICS` classes;\n- Model Development: Add support for `fit_transform` method in `MDS`, `SpectralEmbedding` and `TSNE` class.\n\n### Additional Notes\n\n- Model Registry: The `snowflake.ml.registry.model_registry.ModelRegistry` has been deprecated starting from version\n 1.2.0. It will stay in the Private Preview phase. For future implementations, kindly utilize\n `snowflake.ml.registry.Registry`, except when specifically required. The old model registry will be removed once all\n its primary functionalities are fully integrated into the new registry.\n\n## 1.1.2 (2023-12-18)\n\n### Bug Fixes\n\n- Generic: Fix the issue that stack trace is hidden by telemetry unexpectedly.\n- Model Development: Execute model signature inference without materializing full dataframe in memory.\n- Model Registry: Fix occasional 'snowflake-ml-python library does not exist' error when deploying to SPCS.\n\n### Behavior Changes\n\n- Model Registry: When calling `predict` with Snowpark DataFrame, both inferred or normalized column names are accepted.\n- Model Registry: When logging a Snowpark ML Modeling Model, sample input data or manually provided signature will be\n ignored since they are not necessary.\n\n### New Features\n\n- Model Development: SQL implementation of binary `precision_score` metric.\n\n## 1.1.1 (2023-12-05)\n\n### Bug Fixes\n\n- Model Registry: The `predict` target method on registered models is now compatible with unsupervised estimators.\n- Model Development: Fix confusion_matrix incorrect results when the row number cannot be divided by the batch size.\n\n### New Features\n\n- Introduced passthrough_col param in Modeling API. This new param is helpful in scenarios\n requiring automatic input_cols inference, but need to avoid using specific\n columns, like index columns, during training or inference.\n\n## 1.1.0 (2023-12-01)\n\n### Bug Fixes\n\n- Model Registry: Fix panda dataframe input not handling first row properly.\n- Model Development: OrdinalEncoder and LabelEncoder output_columns do not need to be valid snowflake identifiers. 
They\n would previously be excluded if the normalized name did not match the name specified in output_columns.\n\n### New Features\n\n- Model Registry: Add support for invoking public endpoint on SPCS service, by providing a \"enable_ingress\" SPCS\n deployment option.\n- Model Development: Add support for distributed HPO - GridSearchCV and RandomizedSearchCV execution will be\n distributed on multi-node warehouses.\n\n## 1.0.12 (2023-11-13)\n\n### Bug Fixes\n\n- Model Registry: Fix regression issue that container logging is not shown during model deployment to SPCS.\n- Model Development: Enhance the column capacity of OrdinalEncoder.\n- Model Registry: Fix unbound `batch_size` error when deploying a model other than Hugging Face Pipeline\n and LLM with GPU on SPCS.\n\n### Behavior Changes\n\n- Model Registry: Raise early error when deploying to SPCS with db/schema that starts with underscore.\n- Model Registry: `conda-forge` channel is now automatically added to channel lists when deploying to SPCS.\n- Model Registry: `relax_version` will not strip all version specifier, instead it will relax `==x.y.z` specifier to\n `>=x.y,<(x+1)`.\n- Model Registry: Python with different patchlevel but the same major and minor will not result a warning when loading\n the model via Model Registry and would be considered to use when deploying to SPCS.\n- Model Registry: When logging a `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` object,\n versions of local installed libraries won't be picked as dependencies of models, instead it will pick up some pre-\n defined dependencies to improve user experience.\n\n### New Features\n\n- Model Registry: Enable best-effort SPCS job/service log streaming when logging level is set to INFO.\n\n## 1.0.11 (2023-10-27)\n\n### New Features\n\n- Model Registry: Add log_artifact() public method.\n- Model Development: Add support for `kneighbors`.\n\n### Behavior Changes\n\n- Model Registry: Change log_model() argument from TrainingDataset to List of Artifact.\n- Model Registry: Change get_training_dataset() to get_artifact().\n\n### Bug Fixes\n\n- Model Development: Fix support for XGBoost and LightGBM models using SKLearn Grid Search and Randomized Search model selectors.\n- Model Development: DecimalType is now supported as a DataType.\n- Model Development: Fix metrics compatibility with Snowpark Dataframes that use Snowflake identifiers\n- Model Registry: Resolve 'delete_deployment' not deleting the SPCS service in certain cases.\n\n## 1.0.10 (2023-10-13)\n\n### Behavior Changes\n\n- Model Development: precision_score, recall_score, f1_score, fbeta_score, precision_recall_fscore_support,\n mean_absolute_error, mean_squared_error, and mean_absolute_percentage_error metric calculations are now distributed.\n- Model Registry: `deploy` will now return `Deployment` for deployment information.\n\n### New Features\n\n- Model Registry: When the model signature is auto-inferred, it will be printed to the log for reference.\n- Model Registry: For SPCS deployment, `Deployment` details will contains `image_name`, `service_spec` and `service_function_sql`.\n\n### Bug Fixes\n\n- Model Development: Fix an issue that leading to UTF-8 decoding errors when using modeling modules on Windows.\n- Model Development: Fix an issue that alias definitions cause `SnowparkSQLUnexpectedAliasException` in inference.\n- Model Registry: Fix an issue that signature inference could be incorrect when using Snowpark DataFrame as sample input.\n- Model Registry: Fix too strict data 
type validation when predicting. Now, for example, if you have a INT8\n type feature in the signature, if providing a INT64 dataframe but all values are within the range, it would not fail.\n\n## 1.0.9 (2023-09-28)\n\n### Behavior Changes\n\n- Model Development: log_loss metric calculation is now distributed.\n\n### Bug Fixes\n\n- Model Registry: Fix an issue that building images fails with specific docker setup.\n- Model Registry: Fix an issue that unable to embed local ML library when the library is imported by `zipimport`.\n- Model Registry: Fix out-of-date doc about `platform` argument in the `deploy` function.\n- Model Registry: Fix an issue that unable to deploy a GPU-trained PyTorch model to a platform where GPU is not available.\n\n## 1.0.8 (2023-09-15)\n\n### Bug Fixes\n\n- Model Development: Ordinal encoder can be used with mixed input column types.\n- Model Development: Fix an issue when the sklearn default value is `np.nan`.\n- Model Registry: Fix an issue that incorrect docker executable is used when building images.\n- Model Registry: Fix an issue that specifying `token` argument when using\n `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` with `transformers < 4.32.0` is not effective.\n- Model Registry: Fix an issue that incorrect system function call is used when deploying to SPCS.\n- Model Registry: Fix an issue when using a `transformers.pipeline` that does not have a `tokenizer`.\n- Model Registry: Fix incorrectly-inferred image repository name during model deployment to SPCS.\n- Model Registry: Fix GPU resource retention issue caused by failed or stuck previous deployments in SPCS.\n\n## 1.0.7 (2023-09-05)\n\n### Bug Fixes\n\n- Model Development & Model Registry: Fix an error related to `pandas.io.json.json_normalize`.\n- Allow disabling telemetry.\n\n## 1.0.6 (2023-09-01)\n\n### New Features\n\n- Model Registry: add `create_if_not_exists` parameter in constructor.\n- Model Registry: Added get_or_create_model_registry API.\n- Model Registry: Added support for using GPU inference when deploying XGBoost (`xgboost.XGBModel` and `xgboost.Booster`\n ), PyTorch (`torch.nn.Module` and `torch.jit.ScriptModule`) and TensorFlow (`tensorflow.Module` and\n `tensorflow.keras.Model`) models to Snowpark Container Services.\n- Model Registry: When inferring model signature, `Sequence` of built-in types, `Sequence` of `numpy.ndarray`,\n `Sequence` of `torch.Tensor`, `Sequence` of `tensorflow.Tensor` and `Sequence` of `tensorflow.Tensor` can be used\n instead of only `List` of them.\n- Model Registry: Added `get_training_dataset` API.\n- Model Development: Size of metrics result can exceed previous 8MB limit.\n- Model Registry: Added support save/load/deploy HuggingFace pipeline object (`transformers.Pipeline`) and our wrapper\n (`snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel`) to it. Using the wrapper to specify\n configurations and the model for the pipeline will be loaded dynamically when deploying. 
Currently, following tasks\n are supported to log without manually specifying model signatures:\n - \"conversational\"\n - \"fill-mask\"\n - \"question-answering\"\n - \"summarization\"\n - \"table-question-answering\"\n - \"text2text-generation\"\n - \"text-classification\" (alias \"sentiment-analysis\" available)\n - \"text-generation\"\n - \"token-classification\" (alias \"ner\" available)\n - \"translation\"\n - \"translation_xx_to_yy\"\n - \"zero-shot-classification\"\n\n### Bug Fixes\n\n- Model Development: Fixed a bug when using simple imputer with numpy >= 1.25.\n- Model Development: Fixed a bug when inferring the type of label columns.\n\n### Behavior Changes\n\n- Model Registry: `log_model()` now return a `ModelReference` object instead of a model ID.\n- Model Registry: When deploying a model with 1 `target method` only, the `target_method` argument can be omitted.\n- Model Registry: When using the snowflake-ml-python with version newer than what is available in Snowflake Anaconda\n Channel, `embed_local_ml_library` option will be set as `True` automatically if not.\n- Model Registry: When deploying a model to Snowpark Container Services and using GPU, the default value of num_workers\n will be 1.\n- Model Registry: `keep_order` and `output_with_input_features` in the deploy options have been removed. Now the\n behavior is controlled by the type of the input when calling `model.predict()`. If the input is a `pandas.DataFrame`,\n the behavior will be the same as `keep_order=True` and `output_with_input_features=False` before. If the input is a\n `snowpark.DataFrame`, the behavior will be the same as `keep_order=False` and `output_with_input_features=True` before.\n- Model Registry: When logging and deploying PyTorch (`torch.nn.Module` and `torch.jit.ScriptModule`) and TensorFlow\n (`tensorflow.Module` and `tensorflow.keras.Model`) models, we no longer accept models whose input is a list of tensor\n and output is a list of tensors. Instead, now we accept models whose input is 1 or more tensors as positional arguments,\n and output is a tensor or a tuple of tensors. 
The input and output dataframe when predicting keep the same as before,\n that is every column is an array feature and contains a tensor.\n\n## 1.0.5 (2023-08-17)\n\n### New Features\n\n- Model Registry: Added support save/load/deploy xgboost Booster model.\n- Model Registry: Added support to get the model name and the model version from model references.\n\n### Bug Fixes\n\n- Model Registry: Restore the db/schema back to the session after `create_model_registry()`.\n- Model Registry: Fixed an issue that the UDF name created when deploying a model is not identical to what is provided\n and cannot be correctly dropped when deployment getting dropped.\n- connection_params.SnowflakeLoginOptions(): Added support for `private_key_path`.\n\n## 1.0.4 (2023-07-28)\n\n### New Features\n\n- Model Registry: Added support save/load/deploy Tensorflow models (`tensorflow.Module`).\n- Model Registry: Added support save/load/deploy MLFlow PyFunc models (`mlflow.pyfunc.PyFuncModel`).\n- Model Development: Input dataframes can now be joined against data loaded from staged files.\n- Model Development: Added support for non-English languages.\n\n### Bug Fixes\n\n- Model Registry: Fix an issue that model dependencies are incorrectly reported as unresolvable on certain platforms.\n\n## 1.0.3 (2023-07-14)\n\n### Behavior Changes\n\n- Model Registry: When predicting a model whose output is a list of NumPy ndarray, the output would not be flattened,\n instead, every ndarray will act as a feature(column) in the output.\n\n### New Features\n\n- Model Registry: Added support save/load/deploy PyTorch models (`torch.nn.Module` and `torch.jit.ScriptModule`).\n\n### Bug Fixes\n\n- Model Registry: Fix an issue that when database or schema name provided to `create_model_registry` contains special\n characters, the model registry cannot be created.\n- Model Registry: Fix an issue that `get_model_description` returns with additional quotes.\n- Model Registry: Fix incorrect error message when attempting to remove a unset tag of a model.\n- Model Registry: Fix a typo in the default deployment table name.\n- Model Registry: Snowpark dataframe for sample input or input for `predict` method that contains a column with\n Snowflake `NUMBER(precision, scale)` data type where `scale = 0` will not lead to error, and will now correctly\n recognized as `INT64` data type in model signature.\n- Model Registry: Fix an issue that prevent model logged in the system whose default encoding is not UTF-8 compatible\n from deploying.\n- Model Registry: Added earlier and better error message when any file name in the model or the file name of model\n itself contains characters that are unable to be encoded using ASCII. It is currently not supported to deploy such a\n model.\n\n## 1.0.2 (2023-06-22)\n\n### Behavior Changes\n\n- Model Registry: Prohibit non-snowflake-native models from being logged.\n- Model Registry: `_use_local_snowml` parameter in options of `deploy()` has been removed.\n- Model Registry: A default `False` `embed_local_ml_library` parameter has been added to the options of `log_model()`.\n With this set to `False` (default), the version of the local snowflake-ml-python library will be recorded and used when\n deploying the model. 
With this set to `True`, local snowflake-ml-python library will be embedded into the logged model,\n and will be used when you load or deploy the model.\n\n### New Features\n\n- Model Registry: A new optional argument named `code_paths` has been added to the arguments of `log_model()` for users\n to specify additional code paths to be imported when loading and deploying the model.\n- Model Registry: A new optional argument named `options` has been added to the arguments of `log_model()` to specify\n any additional options when saving the model.\n- Model Development: Added metrics:\n - d2_absolute_error_score\n - d2_pinball_score\n - explained_variance_score\n - mean_absolute_error\n - mean_absolute_percentage_error\n - mean_squared_error\n\n### Bug Fixes\n\n- Model Development: `accuracy_score()` now works when given label column names are lists of a single value.\n\n## 1.0.1 (2023-06-16)\n\n### Behavior Changes\n\n- Model Development: Changed Metrics APIs to imitate sklearn metrics modules:\n - `accuracy_score()`, `confusion_matrix()`, `precision_recall_fscore_support()`, `precision_score()` methods move from\n respective modules to `metrics.classification`.\n- Model Registry: The default table/stage created by the Registry now uses \"_SYSTEM_\" as a prefix.\n- Model Registry: `get_model_history()` method as been enhanced to include the history of model deployment.\n\n### New Features\n\n- Model Registry: A default `False` flag named `replace_udf` has been added to the options of `deploy()`. Setting this\n to `True` will allow overwrite existing UDF with the same name when deploying.\n- Model Development: Added metrics:\n - f1_score\n - fbeta_score\n - recall_score\n - roc_auc_score\n - roc_curve\n - log_loss\n - precision_recall_curve\n- Model Registry: A new argument named `permanent` has been added to the argument of `deploy()`. Setting this to `True`\n allows the creation of a permanent deployment without needing to specify the UDF location.\n- Model Registry: A new method `list_deployments()` has been added to enumerate all permanent deployments originating\n from a specific model.\n- Model Registry: A new method `get_deployment()` has been added to fetch a deployment by its deployment name.\n- Model Registry: A new method `delete_deployment()` has been added to remove an existing permanent deployment.\n\n## 1.0.0 (2023-06-09)\n\n### Behavior Changes\n\n- Model Registry: `predict()` method moves from Registry to ModelReference.\n- Model Registry: `_snowml_wheel_path` parameter in options of `deploy()`, is replaced with `_use_local_snowml` with\n default value of `False`. 
Setting this to `True` will have the same effect of uploading local SnowML code when executing\n model in the warehouse.\n- Model Registry: Removed `id` field from `ModelReference` constructor.\n- Model Development: Preprocessing and Metrics move to the modeling package: `snowflake.ml.modeling.preprocessing` and\n `snowflake.ml.modeling.metrics`.\n- Model Development: `get_sklearn_object()` method is renamed to `to_sklearn()`, `to_xgboost()`, and `to_lightgbm()` for\n respective native models.\n\n### New Features\n\n- Added PolynomialFeatures transformer to the snowflake.ml.modeling.preprocessing module.\n- Added metrics:\n - accuracy_score\n - confusion_matrix\n - precision_recall_fscore_support\n - precision_score\n\n### Bug Fixes\n\n- Model Registry: Model version can now be any string (not required to be a valid identifier)\n- Model Deployment: `deploy()` & `predict()` methods now correctly escapes identifiers\n\n## 0.3.2 (2023-05-23)\n\n### Behavior Changes\n\n- Use cloudpickle to serialize and deserialize models throughout the codebase and removed dependency on joblib.\n\n### New Features\n\n- Model Deployment: Added support for snowflake.ml models.\n\n## 0.3.1 (2023-05-18)\n\n### Behavior Changes\n\n- Standardized registry API with following\n - Create & open registry taking same set of arguments\n - Create & Open can choose schema to use\n - Set_tag, set_metric, etc now explicitly calls out arg name as metric_name, tag_name, metric_name, etc.\n\n### New Features\n\n- Changes to support python 3.9, 3.10\n- Added kBinsDiscretizer\n- Support for deployment of XGBoost models & int8 types of data\n\n## 0.3.0 (2023-05-11)\n\n### Behavior Changes\n\n- Big Model Registry Refresh\n - Fixed API discrepancies between register_model & log_model.\n - Model can be referred by Name + Version (no opaque internal id is required)\n\n### New Features\n\n- Model Registry: Added support save/load/deploy SKL & XGB Models\n\n## 0.2.3 (2023-04-27)\n\n### Bug Fixes\n\n- Allow using OneHotEncoder along with sklearn style estimators in a pipeline.\n\n### New Features\n\n- Model Registry: Added support for delete_model. Use delete_artifact = False to not delete the underlying model data\n but just unregister.\n\n## 0.2.2 (2023-04-11)\n\n### New Features\n\n- Initial version of snowflake-ml modeling package.\n - Provide support for training most of scikit-learn and xgboost estimators and transformers.\n\n### Bug Fixes\n\n- Minor fixes in preprocessing package.\n\n## 0.2.1 (2023-03-23)\n\n### New Features\n\n- New in Preprocessing:\n - SimpleImputer\n - Covariance Matrix\n- Optimization of Ordinal Encoder client computations.\n\n### Bug Fixes\n\n- Minor fixes in OneHotEncoder.\n\n## 0.2.0 (2023-02-27)\n\n### New Features\n\n- Model Registry\n- PyTorch & Tensorflow connector file generic FileSet API\n- New to Preprocessing:\n - Binarizer\n - Normalizer\n - Pearson correlation Matrix\n- Optimization in Ordinal Encoder to cache vocabulary in temp tables.\n\n## 0.1.3 (2023-02-02)\n\n### New Features\n\n- Initial version of transformers including:\n - Label Encoder\n - Max Abs Scaler\n - Min Max Scaler\n - One Hot Encoder\n - Ordinal Encoder\n - Robust Scaler\n - Standard Scaler\n",
"bugtrack_url": null,
"license": " Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. \"License\" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. \"Licensor\" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. \"Legal Entity\" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, \"control\" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. \"You\" (or \"Your\") shall mean an individual or Legal Entity exercising permissions granted by this License. \"Source\" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. \"Object\" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. \"Work\" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). \"Derivative Works\" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. \"Contribution\" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, \"submitted\" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as \"Not a Contribution.\" \"Contributor\" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. 
Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a \"NOTICE\" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. 
This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets \"[]\" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same \"printed page\" as the copyright notice for easier identification within third-party archives. Copyright (c) 2012-2023 Snowflake Computing, Inc. Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ",
"summary": "The machine learning client library that is used for interacting with Snowflake to build machine learning solutions.",
"version": "1.7.1",
"project_urls": {
"Changelog": "https://github.com/snowflakedb/snowflake-ml-python/blob/master/CHANGELOG.md",
"Documentation": "https://docs.snowflake.com/developer-guide/snowpark-ml",
"Homepage": "https://github.com/snowflakedb/snowflake-ml-python",
"Issues": "https://github.com/snowflakedb/snowflake-ml-python/issues",
"Repository": "https://github.com/snowflakedb/snowflake-ml-python"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "86481a466b00165316e0d25b66b332ab99cf5d7147a00505723245221aa1ee6d",
"md5": "2effefc6529db134027ef3c698418338",
"sha256": "c469523fb801df3b2ea06b0475b5b9032ae17ccb6e9e2c8db2a304ae7abef64f"
},
"downloads": -1,
"filename": "snowflake_ml_python-1.7.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2effefc6529db134027ef3c698418338",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.9",
"size": 1993826,
"upload_time": "2024-11-05T19:23:43",
"upload_time_iso_8601": "2024-11-05T19:23:43.633772Z",
"url": "https://files.pythonhosted.org/packages/86/48/1a466b00165316e0d25b66b332ab99cf5d7147a00505723245221aa1ee6d/snowflake_ml_python-1.7.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7c28cce9078ff355fea75b660c16c1054297590ababca437d48f409f1151f702",
"md5": "6e08829bcd4fcd05d190791d3e56adcb",
"sha256": "2c6ce4a3f17736c55176367d87f4d69cbcb73873339c45c6d8bb4b066de16cb3"
},
"downloads": -1,
"filename": "snowflake_ml_python-1.7.1.tar.gz",
"has_sig": false,
"md5_digest": "6e08829bcd4fcd05d190791d3e56adcb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.9",
"size": 1685305,
"upload_time": "2024-11-05T19:23:45",
"upload_time_iso_8601": "2024-11-05T19:23:45.909541Z",
"url": "https://files.pythonhosted.org/packages/7c/28/cce9078ff355fea75b660c16c1054297590ababca437d48f409f1151f702/snowflake_ml_python-1.7.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-05 19:23:45",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "snowflakedb",
"github_project": "snowflake-ml-python",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "absl-py",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "accelerate",
"specs": [
[
"==",
"0.22.0"
]
]
},
{
"name": "anyio",
"specs": [
[
"==",
"3.5.0"
]
]
},
{
"name": "boto3",
"specs": [
[
"==",
"1.24.28"
]
]
},
{
"name": "build",
"specs": [
[
"==",
"0.10.0"
]
]
},
{
"name": "cachetools",
"specs": [
[
"==",
"4.2.2"
]
]
},
{
"name": "catboost",
"specs": [
[
"==",
"1.2.0"
]
]
},
{
"name": "cloudpickle",
"specs": [
[
"==",
"2.2.1"
]
]
},
{
"name": "coverage",
"specs": [
[
"==",
"6.3.2"
]
]
},
{
"name": "cryptography",
"specs": [
[
"==",
"39.0.1"
]
]
},
{
"name": "flask-cors",
"specs": [
[
"==",
"3.0.10"
]
]
},
{
"name": "flask",
"specs": [
[
"==",
"2.1.3"
]
]
},
{
"name": "fsspec",
"specs": [
[
"==",
"2023.3.0"
]
]
},
{
"name": "httpx",
"specs": [
[
"==",
"0.23.0"
]
]
},
{
"name": "importlib_resources",
"specs": [
[
"==",
"6.1.1"
]
]
},
{
"name": "inflection",
"specs": [
[
"==",
"0.5.1"
]
]
},
{
"name": "joblib",
"specs": [
[
"==",
"1.4.2"
]
]
},
{
"name": "jsonschema",
"specs": [
[
"==",
"3.2.0"
]
]
},
{
"name": "lightgbm",
"specs": [
[
"==",
"4.1.0"
]
]
},
{
"name": "mlflow",
"specs": [
[
"==",
"2.3.1"
]
]
},
{
"name": "moto",
"specs": [
[
"==",
"4.0.11"
]
]
},
{
"name": "mypy",
"specs": [
[
"==",
"1.10.0"
]
]
},
{
"name": "networkx",
"specs": [
[
"==",
"2.8.4"
]
]
},
{
"name": "numpy",
"specs": [
[
"==",
"1.23.5"
]
]
},
{
"name": "packaging",
"specs": [
[
"==",
"23.0"
]
]
},
{
"name": "pandas",
"specs": [
[
"==",
"1.5.3"
]
]
},
{
"name": "peft",
"specs": [
[
"==",
"0.5.0"
]
]
},
{
"name": "protobuf",
"specs": [
[
"==",
"3.20.3"
]
]
},
{
"name": "psutil",
"specs": [
[
"==",
"5.9.0"
]
]
},
{
"name": "pyarrow",
"specs": [
[
"==",
"10.0.1"
]
]
},
{
"name": "pytest-rerunfailures",
"specs": [
[
"==",
"12.0"
]
]
},
{
"name": "pytest-xdist",
"specs": [
[
"==",
"3.5.0"
]
]
},
{
"name": "pytest",
"specs": [
[
"==",
"7.4.0"
]
]
},
{
"name": "pytimeparse",
"specs": [
[
"==",
"1.1.8"
]
]
},
{
"name": "pyyaml",
"specs": [
[
"==",
"6.0"
]
]
},
{
"name": "retrying",
"specs": [
[
"==",
"1.3.3"
]
]
},
{
"name": "ruamel.yaml",
"specs": [
[
"==",
"0.17.21"
]
]
},
{
"name": "s3fs",
"specs": [
[
"==",
"2023.3.0"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
"==",
"1.5.1"
]
]
},
{
"name": "scipy",
"specs": [
[
"==",
"1.9.3"
]
]
},
{
"name": "sentence-transformers",
"specs": [
[
"==",
"2.2.2"
]
]
},
{
"name": "sentencepiece",
"specs": [
[
"==",
"0.1.99"
]
]
},
{
"name": "shap",
"specs": [
[
"==",
"0.46.0"
]
]
},
{
"name": "snowflake-connector-python",
"specs": [
[
"==",
"3.10.0"
]
]
},
{
"name": "snowflake-snowpark-python",
"specs": [
[
"==",
"1.17.0"
]
]
},
{
"name": "sphinx",
"specs": [
[
"==",
"5.0.2"
]
]
},
{
"name": "sqlparse",
"specs": [
[
"==",
"0.4.4"
]
]
},
{
"name": "starlette",
"specs": [
[
"==",
"0.27.0"
]
]
},
{
"name": "tensorflow",
"specs": [
[
"==",
"2.12.0"
]
]
},
{
"name": "tokenizers",
"specs": [
[
"==",
"0.13.2"
]
]
},
{
"name": "toml",
"specs": [
[
"==",
"0.10.2"
]
]
},
{
"name": "torch",
"specs": [
[
"==",
"2.0.1"
]
]
},
{
"name": "torchdata",
"specs": [
[
"==",
"0.6.1"
]
]
},
{
"name": "transformers",
"specs": [
[
"==",
"4.32.1"
]
]
},
{
"name": "types-PyYAML",
"specs": [
[
"==",
"6.0.12.12"
]
]
},
{
"name": "types-cachetools",
"specs": [
[
"==",
"4.2.2"
]
]
},
{
"name": "types-protobuf",
"specs": [
[
"==",
"4.23.0.1"
]
]
},
{
"name": "types-requests",
"specs": [
[
"==",
"2.30.0.0"
]
]
},
{
"name": "types-toml",
"specs": [
[
"==",
"0.10.8.6"
]
]
},
{
"name": "typing-extensions",
"specs": [
[
"==",
"4.6.3"
]
]
},
{
"name": "werkzeug",
"specs": [
[
"==",
"2.2.2"
]
]
},
{
"name": "xgboost",
"specs": [
[
"==",
"1.7.6"
]
]
}
],
"lcname": "snowflake-ml-python"
}
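
The "urls" entries above record a SHA-256 digest for each release artifact, which lets you check a download's integrity before installing it. The following is a minimal standard-library Python sketch, not an official tool; the URL and digest literals are copied verbatim from the wheel entry in the record.

import hashlib
import urllib.request

# URL and expected digest copied from the wheel entry in "urls" above.
WHEEL_URL = (
    "https://files.pythonhosted.org/packages/86/48/"
    "1a466b00165316e0d25b66b332ab99cf5d7147a00505723245221aa1ee6d/"
    "snowflake_ml_python-1.7.1-py3-none-any.whl"
)
EXPECTED_SHA256 = "c469523fb801df3b2ea06b0475b5b9032ae17ccb6e9e2c8db2a304ae7abef64f"

# Download the wheel and hash its bytes.
with urllib.request.urlopen(WHEEL_URL) as resp:
    actual = hashlib.sha256(resp.read()).hexdigest()

if actual != EXPECTED_SHA256:
    raise SystemExit(f"digest mismatch: {actual}")
print("sha256 verified")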
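
Each "requirements" entry stores its pin as an (operator, version) pair nested under "specs". As a sketch of how that structure maps to pip-style requirement strings, the snippet below flattens the record into pins; it assumes the JSON above has been saved locally (the file path is hypothetical).

import json

# Hypothetical path: assumes this metadata record was saved as a JSON file.
with open("snowflake-ml-python-1.7.1.json") as f:
    record = json.load(f)

# Turn each {"name": ..., "specs": [[op, version], ...]} entry into "name<op>version".
pins = [
    f"{req['name']}{op}{version}"
    for req in record.get("requirements", [])
    for op, version in req["specs"]
]
print(pins[:3])  # ['absl-py==1.3.0', 'accelerate==0.22.0', 'anyio==3.5.0']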