snowflake-snowpark-python3

- Name: snowflake-snowpark-python3
- Version: 1.0.0
- Home page: https://www.snowflake.com/
- Summary: Snowflake Snowpark for Python
- Upload time: 2022-12-06 17:29:50
- Author: Snowflake, Inc (snowflake-python-libraries-dl@snowflake.com)
- Requires Python: >=3.8
- License: Apache License, Version 2.0
- Keywords: snowflake, db, database, cloud, analytics, warehouse
# Snowflake Snowpark Python API

The Snowpark library provides intuitive APIs for querying and processing data in a data pipeline.
Using this library, you can build applications that process data in Snowflake without having to move data to the system where your application code runs.

[Source code][source code] | [Developer guide][developer guide] | [API reference][api references] | [Product documentation][snowpark] | [Samples][samples]

## Getting started

### Have your Snowflake account ready
If you don't have a Snowflake account yet, you can [sign up for a 30-day free trial account][sign up trial].

### Create a Python virtual environment
Python 3.8 is required. You can use [miniconda][miniconda], [anaconda][anaconda], or [virtualenv][virtualenv]
to create a Python 3.8 virtual environment.

To have the best experience when using Snowpark with UDFs, [creating a local conda environment with the Snowflake channel][use snowflake channel] is recommended.

### Install the library to the Python virtual environment
```bash
pip install snowflake-snowpark-python
```
If you want to use pandas-related features, also install pandas in the same environment:
```bash
pip install "snowflake-snowpark-python[pandas]"
```

### Create a session and use the APIs
```python
from snowflake.snowpark import Session

connection_parameters = {
  "account": "<your snowflake account>",
  "user": "<your snowflake user>",
  "password": "<your snowflake password>",
  "role": "<snowflake user role>",
  "warehouse": "<snowflake warehouse>",
  "database": "<snowflake database>",
  "schema": "<snowflake schema>"
}

session = Session.builder.configs(connection_parameters).create()
df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
df = df.filter(df.a > 1)
df.show()
pandas_df = df.to_pandas()  # this requires pandas installed in the Python environment
result = df.collect()
```

## Samples
The [Developer Guide][developer guide] and [API references][api references] have basic sample code.
[Snowflake-Labs][snowflake lab sample code] has more curated demos.

## Logging
To see Snowpark Python API logs, configure the logging level for `snowflake.snowpark`.
Snowpark uses the [Snowflake Python Connector][python connector], so you may also want to configure the logging level for `snowflake.connector` when an error originates in the connector.
For instance:
```python
import logging
for logger_name in ('snowflake.snowpark', 'snowflake.connector'):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
```

## Contributing
Please refer to [CONTRIBUTING.md][contributing].

[add other sample code repo links]: # (Developer advocacy is open-sourcing a repo that has excellent sample code. The link will be added here.)

[developer guide]: https://docs.snowflake.com/en/developer-guide/snowpark/python/index.html
[api references]: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/index.html
[snowpark]: https://www.snowflake.com/snowpark
[sign up trial]: https://signup.snowflake.com
[source code]: https://github.com/snowflakedb/snowpark-python
[miniconda]: https://docs.conda.io/en/latest/miniconda.html
[anaconda]: https://www.anaconda.com/
[virtualenv]: https://docs.python.org/3/tutorial/venv.html
[config pycharm interpreter]: https://www.jetbrains.com/help/pycharm/configuring-python-interpreter.html
[python connector]: https://pypi.org/project/snowflake-connector-python/
[use snowflake channel]: https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-packages.html#local-development-and-testing
[snowflake lab sample code]: https://github.com/Snowflake-Labs/snowpark-python-demos
[samples]: https://github.com/snowflakedb/snowpark-python/blob/main/README.md#samples
[contributing]: https://github.com/snowflakedb/snowpark-python/blob/main/CONTRIBUTING.md


# Release History
## 1.0.0 (2022-11-01)
### New Features
- Added `Session.generator()` to create a new `DataFrame` using the Generator table function.
- Added a parameter `secure` to the functions that create a secure UDF or UDTF.


## 0.12.0 (2022-10-14)
### New Features
- Added new APIs for async job:
  - `Session.create_async_job()` to create an `AsyncJob` instance from a query id.
  - `AsyncJob.result()` now accepts argument `result_type` to return the results in different formats.
  - `AsyncJob.to_df()` returns a `DataFrame` built from the result of this asynchronous job.
  - `AsyncJob.query()` returns the SQL text of the executed query.
- `DataFrame.agg()` and `RelationalGroupedDataFrame.agg()` now accept variable-length arguments.
- Added parameters `lsuffix` and `rsuffix` to `DataFrame.join()` and `DataFrame.cross_join()` to conveniently rename overlapping columns.
- Added `Table.drop_table()` so you can drop the temp table after `DataFrame.cache_result()`. `Table` is also a context manager so you can use the `with` statement to drop the cache temp table after use.
- Added `Session.use_secondary_roles()`.
- Added functions `first_value()` and `last_value()`. (contributed by @chasleslr)
- Added `on` as an alias for `using_columns` and `how` as an alias for `join_type` in `DataFrame.join()`.
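A brief sketch of a few of these 0.12.0 additions. This assumes Snowflake credentials and an active session, so it is not runnable standalone; the placeholder connection values are illustrative:

```python
from snowflake.snowpark import Session

# Placeholder credentials; substitute real values for your account.
session = Session.builder.configs({"account": "<account>", "user": "<user>",
                                   "password": "<password>"}).create()

left = session.create_dataframe([[1, "x"]], schema=["id", "v"])
right = session.create_dataframe([[1, "y"]], schema=["id", "v"])

# `on` aliases `using_columns`; lsuffix/rsuffix rename the overlapping "v" columns.
left.join(right, on="id", lsuffix="_l", rsuffix="_r").show()

# Table is a context manager: the temp table created by cache_result()
# is dropped automatically when the `with` block exits.
with left.cache_result() as cached:
    cached.show()
```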

### Bug Fixes
- Fixed a bug in `Session.create_dataframe()` that raised an error when `schema` names had special characters.
- Fixed a bug in which options set in `Session.read.option()` were not passed to `DataFrame.copy_into_table()` as default values.
- Fixed a bug in which `DataFrame.copy_into_table()` raises an error when a copy option has single quotes in the value.

## 0.11.0 (2022-09-28)

### Behavior Changes
- `Session.add_packages()` now raises `ValueError` when the version of a package cannot be found in Snowflake Anaconda channel. Previously, `Session.add_packages()` succeeded, and a `SnowparkSQLException` exception was raised later in the UDF/SP registration step.

### New Features:
- Added method `FileOperation.get_stream()` to support downloading stage files as stream.
- Added support in `functions.ntiles()` to accept an `int` argument.
- Added the following aliases:
  - `functions.call_function()` for `functions.call_builtin()`.
  - `functions.function()` for `functions.builtin()`.
  - `DataFrame.order_by()` for `DataFrame.sort()`.
  - `DataFrame.orderBy()` for `DataFrame.sort()`.
- Improved `DataFrame.cache_result()` to return a more accurate `Table` class instead of a `DataFrame` class.
- Added support to allow `session` as the first argument when calling `StoredProcedure`.

### Improvements
- Improved nested query generation by flattening queries when applicable.
  - This improvement can be enabled by setting `Session.sql_simplifier_enabled = True`.
  - `DataFrame.select()`, `DataFrame.with_column()`, `DataFrame.drop()` and other select-related APIs have more flattened SQLs.
  - `DataFrame.union()`, `DataFrame.union_all()`, `DataFrame.except_()`, `DataFrame.intersect()`, `DataFrame.union_by_name()` have flattened SQLs generated when multiple set operators are chained.
- Improved type annotations for async job APIs.

### Bug Fixes
- Fixed a bug in which `Table.update()`, `Table.delete()`, `Table.merge()` try to reference a temp table that does not exist.

## 0.10.0 (2022-09-16)

### New Features:
- Added experimental APIs for evaluating Snowpark dataframes with asynchronous queries:
  - Added keyword argument `block` to the following action APIs on Snowpark dataframes (which execute queries) to allow asynchronous evaluations:
    - `DataFrame.collect()`, `DataFrame.to_local_iterator()`, `DataFrame.to_pandas()`, `DataFrame.to_pandas_batches()`, `DataFrame.count()`, `DataFrame.first()`.
    - `DataFrameWriter.save_as_table()`, `DataFrameWriter.copy_into_location()`.
    - `Table.delete()`, `Table.update()`, `Table.merge()`.
  - Added method `DataFrame.collect_nowait()` to allow asynchronous evaluations.
  - Added class `AsyncJob` to retrieve results from asynchronously executed queries and check their status.
- Added support for `table_type` in `Session.write_pandas()`. You can now choose from these `table_type` options: `"temporary"`, `"temp"`, and `"transient"`.
- Added support for using Python structured data (`list`, `tuple` and `dict`) as literal values in Snowpark.
- Added keyword argument `execute_as` to `functions.sproc()` and `session.sproc.register()` to allow registering a stored procedure as a caller or owner.
- Added support for specifying a pre-configured file format when reading files from a stage in Snowflake.
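The asynchronous-evaluation pattern above can be sketched as follows (assumes an active Snowpark `session` and Snowflake credentials, so it is not runnable standalone):

```python
# Assumes `session` is an established snowflake.snowpark.Session.
df = session.create_dataframe([[1], [2], [3]], schema=["a"])

# Kick off the query without blocking; an AsyncJob is returned instead of rows.
job = df.collect_nowait()
# ... do other work while the warehouse executes the query ...
rows = job.result()  # blocks until the asynchronous query finishes

# Equivalent pattern via the `block` keyword on action APIs:
count_job = df.count(block=False)
print(count_job.result())
```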

### Improvements:
- Added support for displaying details of a Snowpark session.

### Bug Fixes:
- Fixed a bug in which `DataFrame.copy_into_table()` and `DataFrameWriter.save_as_table()` mistakenly created a new table if the table name is fully qualified, and the table already exists.

### Deprecations:
- Deprecated keyword argument `create_temp_table` in `Session.write_pandas()`.
- Deprecated invoking UDFs using arguments wrapped in a Python list or tuple. You can use variable-length arguments without a list or tuple.
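In plain Python terms, this migration moves from passing one wrapped sequence to unpacking it as variable-length arguments. The stand-in function below is hypothetical, an ordinary calling-convention illustration rather than the Snowpark UDF machinery:

```python
def my_udf(*cols):
    """Hypothetical stand-in for a UDF invoked with any number of column arguments."""
    return sum(cols)

args = [1, 2, 3]

# Deprecated style: the callee receives a single list object.
# my_udf(args)

# Preferred style: unpack the sequence into separate arguments.
result = my_udf(*args)
print(result)  # 6
```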

### Dependency updates
- Updated ``snowflake-connector-python`` to 2.7.12.

## 0.9.0 (2022-08-30)

### New Features:
- Added support for displaying source code as comments in the generated scripts when registering UDFs.
This feature is turned on by default. To turn it off, pass the new keyword argument `source_code_display` as `False` when calling `register()` or `@udf()`.
- Added support for calling table functions from `DataFrame.select()`, `DataFrame.with_column()` and `DataFrame.with_columns()` which now take parameters of type `table_function.TableFunctionCall` for columns.
- Added keyword argument `overwrite` to `session.write_pandas()` to allow overwriting contents of a Snowflake table with that of a Pandas DataFrame.
- Added keyword argument `column_order` to `df.write.save_as_table()` to specify the matching rules when inserting data into table in append mode.
- Added method `FileOperation.put_stream()` to upload local files to a stage via file stream.
- Added methods `TableFunctionCall.alias()` and `TableFunctionCall.as_()` to allow aliasing the names of columns that come from the output of table function joins.
- Added function `get_active_session()` in module `snowflake.snowpark.context` to get the current active Snowpark session.

### Bug Fixes:
- Fixed a bug in which batch insert raised an error when `statement_params` was not passed to the function.
- Fixed a bug in which column names were not quoted when `session.create_dataframe()` was called with dicts and a given schema.
- Fixed a bug in which table creation was not skipped when the table already exists and `df.write.save_as_table()` is called in append mode.
- Fixed a bug in which third-party packages with underscores could not be added when registering UDFs.

### Improvements:
- Improved function `functions.uniform()` to infer the types of inputs `max_` and `min_` and cast the limits to `IntegerType` or `FloatType` correspondingly.

## 0.8.0 (2022-07-22)

### New Features:
- Added keyword only argument `statement_params` to the following methods to allow for specifying statement level parameters:
  - `collect`, `to_local_iterator`, `to_pandas`, `to_pandas_batches`,
  `count`, `copy_into_table`, `show`, `create_or_replace_view`, `create_or_replace_temp_view`, `first`, `cache_result`
  and `random_split` on class `snowflake.snowpark.DataFrame`.
  - `update`, `delete` and `merge` on class `snowflake.snowpark.Table`.
  - `save_as_table` and `copy_into_location` on class `snowflake.snowpark.DataFrameWriter`.
  - `approx_quantile`, `corr`, `cov` and `crosstab` on class `snowflake.snowpark.DataFrameStatFunctions`.
  - `register` and `register_from_file` on class `snowflake.snowpark.udf.UDFRegistration`.
  - `register` and `register_from_file` on class `snowflake.snowpark.udtf.UDTFRegistration`.
  - `register` and `register_from_file` on class `snowflake.snowpark.stored_procedure.StoredProcedureRegistration`.
  - `udf`, `udtf` and `sproc` in `snowflake.snowpark.functions`.
- Added support for `Column` as an input argument to `session.call()`.
- Added support for `table_type` in `df.write.save_as_table()`. You can now choose from these `table_type` options: `"temporary"`, `"temp"`, and `"transient"`.

### Improvements:
- Added validation of object name in `session.use_*` methods.
- Updated the query tag in SQL to escape it when it has special characters.
- Added a check to see if Anaconda terms are acknowledged when adding missing packages.

### Bug Fixes:
- Fixed a bug that limited the length of string columns in `session.create_dataframe()`.
- Fixed a bug in which `session.create_dataframe()` mistakenly converted 0 and `False` to `None` when the input data was only a list.
- Fixed a bug in which calling `session.create_dataframe()` using a large local dataset sometimes created a temp table twice.
- Aligned the definition of `functions.trim()` with the SQL function definition.
- Fixed an issue where snowpark-python would hang when the Python built-in `sum` was used instead of the Snowpark `functions.sum()`.

### Deprecations:
- Deprecated keyword argument `create_temp_table` in `df.write.save_as_table()`.


## 0.7.0 (2022-05-25)

### New Features:
- Added support for user-defined table functions (UDTFs).
  - Use function `snowflake.snowpark.functions.udtf()` to register a UDTF, or use it as a decorator to register the UDTF.
    - You can also use `Session.udtf.register()` to register a UDTF.
  - Use `Session.udtf.register_from_file()` to register a UDTF from a Python file.
- Updated APIs to query a table function, including both Snowflake built-in table functions and UDTFs.
  - Use function `snowflake.snowpark.functions.table_function()` to create a callable representing a table function and use it to call the table function in a query.
  - Alternatively, use function `snowflake.snowpark.functions.call_table_function()` to call a table function.
  - Added support for `over` clause that specifies `partition by` and `order by` when lateral joining a table function.
  - Updated `Session.table_function()` and `DataFrame.join_table_function()` to accept `TableFunctionCall` instances.
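A sketch of the UDTF and table-function APIs described above (assumes an active `session` and Snowflake credentials; the class and UDTF names are illustrative):

```python
from snowflake.snowpark.functions import udtf, table_function, lit
from snowflake.snowpark.types import IntegerType, StructField, StructType

# Register a UDTF with the decorator form; the name "count_up" is hypothetical.
@udtf(output_schema=StructType([StructField("n", IntegerType())]),
      input_types=[IntegerType()], name="count_up", replace=True)
class CountUp:
    def process(self, limit: int):
        for i in range(limit):
            yield (i,)  # each yielded tuple becomes one output row

# Create a callable for a built-in table function and query it.
split = table_function("split_to_table")
session.table_function(split(lit("a,b,c"), lit(","))).show()
```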

### Breaking Changes:
- When creating a function with `functions.udf()` and `functions.sproc()`, you can now specify an empty list for the `imports` or `packages` argument to indicate that no import or package is used for this UDF or stored procedure. Previously, specifying an empty list meant that the function would use session-level imports or packages.
- Improved the `__repr__` implementation of data types in `types.py`. The unused `type_name` property has been removed.
- Added a Snowpark-specific exception class for SQL errors. This replaces the previous `ProgrammingError` from the Python connector.

### Improvements:
- Added a lock to a UDF or UDTF when it is called for the first time per thread.
- Improved the error message for pickling errors that occurred during UDF creation.
- Included the query ID when logging the failed query.

### Bug Fixes:
- Fixed a bug in which non-integral data (such as timestamps) was occasionally converted to integer when calling `DataFrame.to_pandas()`.
- Fixed a bug in which `DataFrameReader.parquet()` failed to read a parquet file when its column contained spaces.
- Fixed a bug in which `DataFrame.copy_into_table()` failed when the dataframe is created by reading a file with inferred schemas.

### Deprecations
- Deprecated `Session.flatten()` and `DataFrame.flatten()`.

### Dependency Updates:
- Restricted the version of `cloudpickle` <= `2.0.0`.


## 0.6.0 (2022-04-27)

### New Features:
- Added support for vectorized UDFs with the input as a Pandas DataFrame or Pandas Series and the output as a Pandas Series. This improves the performance of UDFs in Snowpark.
- Added support for inferring the schema of a DataFrame by default when it is created by reading a Parquet, Avro, or ORC file in the stage.
- Added functions `current_session()`, `current_statement()`, `current_user()`, `current_version()`, `current_warehouse()`, `date_from_parts()`, `date_trunc()`, `dayname()`, `dayofmonth()`, `dayofweek()`, `dayofyear()`, `grouping()`, `grouping_id()`, `hour()`, `last_day()`, `minute()`, `next_day()`, `previous_day()`, `second()`, `month()`, `monthname()`, `quarter()`, `year()`, `current_database()`, `current_role()`, `current_schema()`, `current_schemas()`, `current_region()`, `current_available_roles()`, `add_months()`, `any_value()`, `bitnot()`, `bitshiftleft()`, `bitshiftright()`, `convert_timezone()`, `uniform()`, `strtok_to_array()`, `sysdate()`, `time_from_parts()`, `timestamp_from_parts()`, `timestamp_ltz_from_parts()`, `timestamp_ntz_from_parts()`, `timestamp_tz_from_parts()`, `weekofyear()`, `percentile_cont()` to `snowflake.snowpark.functions`.

### Breaking Changes:
- Expired deprecations:
  - Removed the following APIs that were deprecated in 0.4.0: `DataFrame.groupByGroupingSets()`, `DataFrame.naturalJoin()`, `DataFrame.joinTableFunction`, `DataFrame.withColumns()`, `Session.getImports()`, `Session.addImport()`, `Session.removeImport()`, `Session.clearImports()`, `Session.getSessionStage()`, `Session.getDefaultDatabase()`, `Session.getDefaultSchema()`, `Session.getCurrentDatabase()`, `Session.getCurrentSchema()`, `Session.getFullyQualifiedCurrentSchema()`.

### Improvements:
- Added support for creating an empty `DataFrame` with a specific schema using the `Session.create_dataframe()` method.
- Changed the logging level from `INFO` to `DEBUG` for several logs (e.g., the executed query) when evaluating a dataframe.
- Improved the error message when failing to create a UDF due to pickle errors.

### Bug Fixes:
- Removed pandas hard dependencies in the `Session.create_dataframe()` method.

### Dependency Updates:
- Added `typing-extension` as a new dependency with the version >= `4.1.0`.


## 0.5.0 (2022-03-22)

### New Features
- Added stored procedures API.
  - Added `Session.sproc` property and `sproc()` to `snowflake.snowpark.functions`, so you can register stored procedures.
  - Added `Session.call` to call stored procedures by name.
- Added `UDFRegistration.register_from_file()` to allow registering UDFs from Python source files or zip files directly.
- Added `UDFRegistration.describe()` to describe a UDF.
- Added `DataFrame.random_split()` to provide a way to randomly split a dataframe.
- Added functions `md5()`, `sha1()`, `sha2()`, `ascii()`, `initcap()`, `length()`, `lower()`, `lpad()`, `ltrim()`, `rpad()`, `rtrim()`, `repeat()`, `soundex()`, `regexp_count()`, `replace()`, `charindex()`, `collate()`, `collation()`, `insert()`, `left()`, `right()`, `endswith()` to `snowflake.snowpark.functions`.
- Allowed `call_udf()` to accept literal values.
- Provided a `distinct` keyword in `array_agg()`.

### Bug Fixes:
- Fixed an issue that caused `DataFrame.to_pandas()` to have a string column if `Column.cast(IntegerType())` was used.
- Fixed a bug in `DataFrame.describe()` when there is more than one string column.

## 0.4.0 (2022-02-15)

### New Features
- You can now specify which Anaconda packages to use when defining UDFs.
  - Added `add_packages()`, `get_packages()`, `clear_packages()`, and `remove_package()`, to class `Session`.
  - Added `add_requirements()` to `Session` so you can use a requirements file to specify which packages this session will use.
  - Added parameter `packages` to function `snowflake.snowpark.functions.udf()` and method `UserDefinedFunction.register()` to indicate UDF-level Anaconda package dependencies when creating a UDF.
  - Added parameter `imports` to `snowflake.snowpark.functions.udf()` and `UserDefinedFunction.register()` to specify UDF-level code imports.
- Added a parameter `session` to function `udf()` and `UserDefinedFunction.register()` so you can specify which session to use to create a UDF if you have multiple sessions.
- Added types `Geography` and `Variant` to `snowflake.snowpark.types` to be used as type hints for Geography and Variant data when defining a UDF.
- Added support for Geography geoJSON data.
- Added `Table`, a subclass of `DataFrame` for table operations:
  - Methods `update` and `delete` update and delete rows of a table in Snowflake.
  - Method `merge` merges data from a `DataFrame` to a `Table`.
  - Override method `DataFrame.sample()` with an additional parameter `seed`, which works on tables but not on views and sub-queries.
- Added `DataFrame.to_local_iterator()` and `DataFrame.to_pandas_batches()` to allow getting results from an iterator when the result set returned from the Snowflake database is too large.
- Added `DataFrame.cache_result()` for caching the operations performed on a `DataFrame` in a temporary table.
  Subsequent operations on the original `DataFrame` have no effect on the cached result `DataFrame`.
- Added property `DataFrame.queries` to get SQL queries that will be executed to evaluate the `DataFrame`.
- Added `Session.query_history()` as a context manager to track SQL queries executed on a session, including all SQL queries to evaluate `DataFrame`s created from a session. Both query ID and query text are recorded.
- You can now create a `Session` instance from an existing established `snowflake.connector.SnowflakeConnection`. Use parameter `connection` in `Session.builder.configs()`.
- Added `use_database()`, `use_schema()`, `use_warehouse()`, and `use_role()` to class `Session` to switch database/schema/warehouse/role after a session is created.
- Added `DataFrameWriter.copy_into_location()` to unload a `DataFrame` to stage files.
- Added `DataFrame.unpivot()`.
- Added `Column.within_group()` for sorting the rows by columns with some aggregation functions.
- Added functions `listagg()`, `mode()`, `div0()`, `acos()`, `asin()`, `atan()`, `atan2()`, `cos()`, `cosh()`, `sin()`, `sinh()`, `tan()`, `tanh()`, `degrees()`, `radians()`, `round()`, `trunc()`, and `factorial()` to `snowflake.snowpark.functions`.
- Added an optional argument `ignore_nulls` in function `lead()` and `lag()`.
- The `condition` parameter of function `when()` and `iff()` now accepts SQL expressions.

### Improvements
- All function and method names have been renamed to use the snake case naming style, which is more Pythonic. For convenience, some camel case names are kept as aliases to the snake case APIs. It is recommended to use the snake case APIs.
  - Deprecated these methods on class `Session` and replaced them with their snake case equivalents: `getImports()`, `addImport()`, `removeImport()`, `clearImports()`, `getSessionStage()`, `getDefaultDatabase()`, `getDefaultSchema()`, `getCurrentDatabase()`, `getCurrentSchema()`, `getFullyQualifiedCurrentSchema()`.
  - Deprecated these methods on class `DataFrame` and replaced them with their snake case equivalents: `groupingByGroupingSets()`, `naturalJoin()`, `withColumns()`, `joinTableFunction()`.
- Property `DataFrame.columns` is now consistent with `DataFrame.schema.names` and the Snowflake database `Identifier Requirements`.
- `Column.__bool__()` now raises a `TypeError`. This will ban the use of logical operators `and`, `or`, `not` on `Column` object, for instance `col("a") > 1 and col("b") > 2` will raise the `TypeError`. Use `(col("a") > 1) & (col("b") > 2)` instead.
- Changed `PutResult` and `GetResult` to subclass `NamedTuple`.
- Fixed a bug which raised an error when the local path or stage location has a space or other special characters.
- Changed `DataFrame.describe()` so that non-numeric and non-string columns are ignored instead of raising an exception.
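The `Column.__bool__()` change above exists because Python's `and`/`or`/`not` cannot be overloaded: they force the operand to a single `True`/`False`, silently discarding the column expression. A minimal pure-Python sketch of the same design choice (a toy class, not Snowpark's actual `Column`):

```python
class Expr:
    """Toy column expression that records operations instead of evaluating them."""
    def __init__(self, sql: str):
        self.sql = sql

    def __gt__(self, other):
        return Expr(f"({self.sql} > {other})")

    def __and__(self, other):  # supports the `&` operator
        return Expr(f"({self.sql} AND {other.sql})")

    def __bool__(self):  # bans `and`/`or`/`not`, which need a real bool
        raise TypeError("Use & | ~ instead of and/or/not on column expressions")

a, b = Expr("a"), Expr("b")
combined = (a > 1) & (b > 2)
print(combined.sql)  # ((a > 1) AND (b > 2))

try:
    (a > 1) and (b > 2)  # `and` calls __bool__ on the left operand
except TypeError as e:
    print("rejected:", e)
```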

### Dependency updates
- Updated ``snowflake-connector-python`` to 2.7.4.

## 0.3.0 (2022-01-09)
### New Features
- Added `Column.isin()`, with an alias `Column.in_()`.
- Added `Column.try_cast()`, which is a special version of `cast()`. It tries to cast a string expression to other types and returns `null` if the cast is not possible.
- Added `Column.startswith()` and `Column.substr()` to process string columns.
- `Column.cast()` now also accepts a `str` value to indicate the cast type in addition to a `DataType` instance.
- Added `DataFrame.describe()` to summarize stats of a `DataFrame`.
- Added `DataFrame.explain()` to print the query plan of a `DataFrame`.
- `DataFrame.filter()` and `DataFrame.select_expr()` now accept a SQL expression.
- Added a new `bool` parameter `create_temp_table` to methods `DataFrame.saveAsTable()` and `Session.write_pandas()` to optionally create a temp table.
- Added `DataFrame.minus()` and `DataFrame.subtract()` as aliases to `DataFrame.except_()`.
- Added `regexp_replace()`, `concat()`, `concat_ws()`, `to_char()`, `current_timestamp()`, `current_date()`, `current_time()`, `months_between()`, `cast()`, `try_cast()`, `greatest()`, `least()`, and `hash()` to module `snowflake.snowpark.functions`.

### Bug Fixes
- Fixed an issue where `Session.createDataFrame(pandas_df)` and `Session.write_pandas(pandas_df)` raised an exception when the pandas DataFrame has spaces in a column name.
- Fixed an issue where `DataFrame.copy_into_table()` sometimes printed an `error`-level log entry even though the operation actually succeeded.
- Fixed an API docs issue where some `DataFrame` APIs were missing from the docs.

### Dependency updates
- Updated ``snowflake-connector-python`` to 2.7.2, which upgrades the ``pyarrow`` dependency to 6.0.x. Refer to the [python connector 2.7.2 release notes](https://pypi.org/project/snowflake-connector-python/2.7.2/) for more details.

## 0.2.0 (2021-12-02)
### New Features
- Updated the `Session.createDataFrame()` method for creating a `DataFrame` from a Pandas DataFrame.
- Added the `Session.write_pandas()` method for writing a `Pandas DataFrame` to a table in Snowflake and getting a `Snowpark DataFrame` object back.
- Added new classes and methods for calling window functions.
- Added the new functions `cume_dist()`, to find the cumulative distribution of a value with regard to other values within a window partition,
  and `row_number()`, which returns a unique row number for each row within a window partition.
- Added functions for computing statistics for DataFrames in the `DataFrameStatFunctions` class.
- Added functions for handling missing values in a DataFrame in the `DataFrameNaFunctions` class.
- Added new methods `rollup()`, `cube()`, and `pivot()` to the `DataFrame` class.
- Added the `GroupingSets` class, which you can use with the `DataFrame.groupByGroupingSets()` method to perform a SQL GROUP BY GROUPING SETS.
- Added the new `FileOperation(session)`
  class that you can use to upload and download files to and from a stage.
- Added the `DataFrame.copy_into_table()`
  method for loading data from files in a stage into a table.
- In CASE expressions, the functions `when()` and `otherwise()`
  now accept Python types in addition to `Column` objects.
- When you register a UDF you can now optionally set the `replace` parameter to `True` to overwrite an existing UDF with the same name.

### Improvements
- UDFs are now compressed before they are uploaded to the server. This makes them about 10 times smaller, which can help
  when you are using large ML model files.
- When the size of a UDF is less than 8196 bytes, it will be uploaded as in-line code instead of being uploaded to a stage.
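The size win from compressing serialized code can be illustrated with the standard library. Pickled source and model data tend to be repetitive, so they compress well; this is a generic `zlib` illustration with made-up payload data, not the actual Snowpark upload path:

```python
import pickle
import zlib

# Stand-in payload: repetitive, text-like data similar to serialized code.
payload = pickle.dumps(["some_function_body_" + str(i % 10) for i in range(5000)])
compressed = zlib.compress(payload, level=9)

ratio = len(payload) / len(compressed)
print(f"{len(payload)} -> {len(compressed)} bytes ({ratio:.1f}x smaller)")
```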

### Bug Fixes
- Fixed an issue where the statement `df.select(when(col("a") == 1, 4).otherwise(col("a")))` raised an exception.
- Fixed an issue where `df.toPandas()` raised an exception when a DataFrame was created from large local data.

## 0.1.0 (2021-10-26)

Start of Private Preview

            

Raw data

            {
    "_id": null,
    "home_page": "https://www.snowflake.com/",
    "name": "snowflake-snowpark-python3",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "Snowflake db database cloud analytics warehouse",
    "author": "Snowflake, Inc",
    "author_email": "snowflake-python-libraries-dl@snowflake.com",
    "download_url": "https://files.pythonhosted.org/packages/cc/50/0f541b81ac70757681e5e465d721b6d6f8a6c6e928243bd30fbfa41cd1ad/snowflake-snowpark-python3-1.0.0.tar.gz",
    "platform": null,
    "description": "# Snowflake Snowpark Python API\n\nThe Snowpark library provides intuitive APIs for querying and processing data in a data pipeline.\nUsing this library, you can build applications that process data in Snowflake without having to move data to the system where your application code runs.\n\n[Source code][source code] | [Developer guide][developer guide] | [API reference][api references] | [Product documentation][snowpark] | [Samples][samples]\n\n## Getting started\n\n### Have your Snowflake account ready\nIf you don't have a Snowflake account yet, you can [sign up for a 30-day free trial account][sign up trial].\n\n### Create a Python virtual environment\nPython 3.8 is required. You can use [miniconda][miniconda], [anaconda][anaconda], or [virtualenv][virtualenv]\nto create a Python 3.8 virtual environment.\n\nTo have the best experience when using it with UDFs, [creating a local conda environment with the Snowflake channel][use snowflake channel] is recommended.\n\n### Install the library to the Python virtual environment\n```bash\npip install snowflake-snowpark-python\n```\nOptionally, you need to install pandas in the same environment if you want to use pandas-related features:\n```bash\npip install \"snowflake-snowpark-python[pandas]\"\n```\n\n### Create a session and use the APIs\n```python\nfrom snowflake.snowpark import Session\n\nconnection_parameters = {\n  \"account\": \"<your snowflake account>\",\n  \"user\": \"<your snowflake user>\",\n  \"password\": \"<your snowflake password>\",\n  \"role\": \"<snowflake user role>\",\n  \"warehouse\": \"<snowflake warehouse>\",\n  \"database\": \"<snowflake database>\",\n  \"schema\": \"<snowflake schema>\"\n}\n\nsession = Session.builder.configs(connection_parameters).create()\ndf = session.create_dataframe([[1, 2], [3, 4]], schema=[\"a\", \"b\"])\ndf = df.filter(df.a > 1)\ndf.show()\npandas_df = df.to_pandas()  # this requires pandas installed in the Python environment\nresult = 
df.collect()\n```\n\n## Samples\nThe [Developer Guide][developer guide] and [API references][api references] have basic sample code.\n[Snowflake-Labs][snowflake lab sample code] has more curated demos.\n\n## Logging\nConfigure logging level for `snowflake.snowpark` for Snowpark Python API logs.\nSnowpark uses the [Snowflake Python Connector][python connector].\nSo you may also want to configure the logging level for `snowflake.connector` when the error is in the Python Connector.\nFor instance,\n```python\nimport logging\nfor logger_name in ('snowflake.snowpark', 'snowflake.connector'):\n    logger = logging.getLogger(logger_name)\n    logger.setLevel(logging.DEBUG)\n    ch = logging.StreamHandler()\n    ch.setLevel(logging.DEBUG)\n    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))\n    logger.addHandler(ch)\n```\n\n## Contributing\nPlease refer to [CONTRIBUTING.md][contributing].\n\n[add other sample code repo links]: # (Developer advocacy is open-sourcing a repo that has excellent sample code. 
The link will be added here.)\n\n[developer guide]: https://docs.snowflake.com/en/developer-guide/snowpark/python/index.html\n[api references]: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/index.html\n[snowpark]: https://www.snowflake.com/snowpark\n[sign up trial]: https://signup.snowflake.com\n[source code]: https://github.com/snowflakedb/snowpark-python\n[miniconda]: https://docs.conda.io/en/latest/miniconda.html\n[anaconda]: https://www.anaconda.com/\n[virtualenv]: https://docs.python.org/3/tutorial/venv.html\n[config pycharm interpreter]: https://www.jetbrains.com/help/pycharm/configuring-python-interpreter.html\n[python connector]: https://pypi.org/project/snowflake-connector-python/\n[use snowflake channel]: https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-packages.html#local-development-and-testing\n[snowflake lab sample code]: https://github.com/Snowflake-Labs/snowpark-python-demos\n[samples]: https://github.com/snowflakedb/snowpark-python/blob/main/README.md#samples\n[contributing]: https://github.com/snowflakedb/snowpark-python/blob/main/CONTRIBUTING.md\n\n\n# Release History\n## 1.0.0 (2012-11-01)\n### New Features\n- Added `Session.generator()` to create a new `DataFrame` using the Generator table function.\n- Added a parameter `secure` to the functions that create a secure UDF or UDTF.\n\n\n## 0.12.0 (2022-10-14)\n### New Features\n- Added new APIs for async job:\n  - `Session.create_async_job()` to create an `AsyncJob` instance from a query id.\n  - `AsyncJob.result()` now accepts argument `result_type` to return the results in different formats.\n  - `AsyncJob.to_df()` returns a `DataFrame` built from the result of this asynchronous job.\n  - `AsyncJob.query()` returns the SQL text of the executed query.\n- `DataFrame.agg()` and `RelationalGroupedDataFrame.agg()` now accept variable-length arguments.\n- Added parameters `lsuffix` and `rsuffix` to `DataFram.join()` and `DataFrame.cross_join()` to 
conveniently rename overlapping columns.\n- Added `Table.drop_table()` so you can drop the temp table after `DataFrame.cache_result()`. `Table` is also a context manager so you can use the `with` statement to drop the cache temp table after use.\n- Added `Session.use_secondary_roles()`.\n- Added functions `first_value()` and `last_value()`. (contributed by @chasleslr)\n- Added `on` as an alias for `using_columns` and `how` as an alias for `join_type` in `DataFrame.join()`.\n\n### Bug Fixes\n- Fixed a bug in `Session.create_dataframe()` that raised an error when `schema` names had special characters.\n- Fixed a bug in which options set in `Session.read.option()` were not passed to `DataFrame.copy_into_table()` as default values.\n- Fixed a bug in which `DataFrame.copy_into_table()` raised an error when a copy option had single quotes in the value.\n\n## 0.11.0 (2022-09-28)\n\n### Behavior Changes\n- `Session.add_packages()` now raises `ValueError` when the version of a package cannot be found in the Snowflake Anaconda channel. 
Previously, `Session.add_packages()` succeeded, and a `SnowparkSQLException` exception was raised later in the UDF/SP registration step.\n\n### New Features:\n- Added method `FileOperation.get_stream()` to support downloading stage files as a stream.\n- Added support in `functions.ntiles()` to accept an int argument.\n- Added the following aliases:\n  - `functions.call_function()` for `functions.call_builtin()`.\n  - `functions.function()` for `functions.builtin()`.\n  - `DataFrame.order_by()` for `DataFrame.sort()`.\n  - `DataFrame.orderBy()` for `DataFrame.sort()`.\n- Improved `DataFrame.cache_result()` to return a more accurate `Table` class instead of a `DataFrame` class.\n- Added support to allow `session` as the first argument when calling `StoredProcedure`.\n\n### Improvements\n- Improved nested query generation by flattening queries when applicable.\n  - This improvement could be enabled by setting `Session.sql_simplifier_enabled = True`.\n  - `DataFrame.select()`, `DataFrame.with_column()`, `DataFrame.drop()` and other select-related APIs have more flattened SQLs.\n  - `DataFrame.union()`, `DataFrame.union_all()`, `DataFrame.except_()`, `DataFrame.intersect()`, `DataFrame.union_by_name()` have flattened SQLs generated when multiple set operators are chained.\n- Improved type annotations for async job APIs.\n\n### Bug Fixes\n- Fixed a bug in which `Table.update()`, `Table.delete()`, `Table.merge()` tried to reference a temp table that did not exist.\n\n## 0.10.0 (2022-09-16)\n\n### New Features:\n- Added experimental APIs for evaluating Snowpark dataframes with asynchronous queries:\n  - Added keyword argument `block` to the following action APIs on Snowpark dataframes (which execute queries) to allow asynchronous evaluations:\n    - `DataFrame.collect()`, `DataFrame.to_local_iterator()`, `DataFrame.to_pandas()`, `DataFrame.to_pandas_batches()`, `DataFrame.count()`, `DataFrame.first()`.\n    - `DataFrameWriter.save_as_table()`, 
`DataFrameWriter.copy_into_location()`.\n    - `Table.delete()`, `Table.update()`, `Table.merge()`.\n  - Added method `DataFrame.collect_nowait()` to allow asynchronous evaluations.\n  - Added class `AsyncJob` to retrieve results from asynchronously executed queries and check their status.\n- Added support for `table_type` in `Session.write_pandas()`. You can now choose from these `table_type` options: `\"temporary\"`, `\"temp\"`, and `\"transient\"`.\n- Added support for using Python structured data (`list`, `tuple` and `dict`) as literal values in Snowpark.\n- Added keyword argument `execute_as` to `functions.sproc()` and `session.sproc.register()` to allow registering a stored procedure as a caller or owner.\n- Added support for specifying a pre-configured file format when reading files from a stage in Snowflake.\n\n### Improvements:\n- Added support for displaying details of a Snowpark session.\n\n### Bug Fixes:\n- Fixed a bug in which `DataFrame.copy_into_table()` and `DataFrameWriter.save_as_table()` mistakenly created a new table if the table name is fully qualified, and the table already exists.\n\n### Deprecations:\n- Deprecated keyword argument `create_temp_table` in `Session.write_pandas()`.\n- Deprecated invoking UDFs using arguments wrapped in a Python list or tuple. You can use variable-length arguments without a list or tuple.\n\n### Dependency updates\n- Updated ``snowflake-connector-python`` to 2.7.12.\n\n## 0.9.0 (2022-08-30)\n\n### New Features:\n- Added support for displaying source code as comments in the generated scripts when registering UDFs.\nThis feature is turned on by default. 
To turn it off, pass the new keyword argument `source_code_display` as `False` when calling `register()` or `@udf()`.\n- Added support for calling table functions from `DataFrame.select()`, `DataFrame.with_column()` and `DataFrame.with_columns()` which now take parameters of type `table_function.TableFunctionCall` for columns.\n- Added keyword argument `overwrite` to `session.write_pandas()` to allow overwriting contents of a Snowflake table with that of a Pandas DataFrame.\n- Added keyword argument `column_order` to `df.write.save_as_table()` to specify the matching rules when inserting data into a table in append mode.\n- Added method `FileOperation.put_stream()` to upload local files to a stage via file stream.\n- Added methods `TableFunctionCall.alias()` and `TableFunctionCall.as_()` to allow aliasing the names of columns that come from the output of table function joins.\n- Added function `get_active_session()` in module `snowflake.snowpark.context` to get the current active Snowpark session.\n\n### Bug Fixes:\n- Fixed a bug in which batch insert raised an error when `statement_params` was not passed to the function.\n- Fixed a bug in which column names were not quoted when `session.create_dataframe()` was called with dicts and a given schema.\n- Fixed a bug in which table creation was not skipped when the table already exists in append mode when calling `df.write.save_as_table()`.\n- Fixed a bug in which third-party packages with underscores could not be added when registering UDFs.\n\n### Improvements:\n- Improved function `functions.uniform()` to infer the types of inputs `max_` and `min_` and cast the limits to `IntegerType` or `FloatType` correspondingly.\n\n## 0.8.0 (2022-07-22)\n\n### New Features:\n- Added keyword-only argument `statement_params` to the following methods to allow for specifying statement level parameters:\n  - `collect`, `to_local_iterator`, `to_pandas`, `to_pandas_batches`,\n  `count`, `copy_into_table`, `show`, 
`create_or_replace_view`, `create_or_replace_temp_view`, `first`, `cache_result`\n  and `random_split` on class `snowflake.snowpark.DataFrame`.\n  - `update`, `delete` and `merge` on class `snowflake.snowpark.Table`.\n  - `save_as_table` and `copy_into_location` on class `snowflake.snowpark.DataFrameWriter`.\n  - `approx_quantile`, `corr`, `cov` and `crosstab` on class `snowflake.snowpark.DataFrameStatFunctions`.\n  - `register` and `register_from_file` on class `snowflake.snowpark.udf.UDFRegistration`.\n  - `register` and `register_from_file` on class `snowflake.snowpark.udtf.UDTFRegistration`.\n  - `register` and `register_from_file` on class `snowflake.snowpark.stored_procedure.StoredProcedureRegistration`.\n  - `udf`, `udtf` and `sproc` in `snowflake.snowpark.functions`.\n- Added support for `Column` as an input argument to `session.call()`.\n- Added support for `table_type` in `df.write.save_as_table()`. You can now choose from these `table_type` options: `\"temporary\"`, `\"temp\"`, and `\"transient\"`.\n\n### Improvements:\n- Added validation of object name in `session.use_*` methods.\n- Updated the query tag in SQL to escape it when it has special characters.\n- Added a check to see if Anaconda terms are acknowledged when adding missing packages.\n\n### Bug Fixes:\n- Fixed the limited length of the string column in `session.create_dataframe()`.\n- Fixed a bug in which `session.create_dataframe()` mistakenly converted 0 and `False` to `None` when the input data was only a list.\n- Fixed a bug in which calling `session.create_dataframe()` using a large local dataset sometimes created a temp table twice.\n- Aligned the definition of `function.trim()` with the SQL function definition.\n- Fixed an issue where snowpark-python would hang when using the Python system-defined (built-in function) `sum` vs. 
the Snowpark `function.sum()`.\n\n### Deprecations:\n- Deprecated keyword argument `create_temp_table` in `df.write.save_as_table()`.\n\n\n## 0.7.0 (2022-05-25)\n\n### New Features:\n- Added support for user-defined table functions (UDTFs).\n  - Use function `snowflake.snowpark.functions.udtf()` to register a UDTF, or use it as a decorator to register the UDTF.\n    - You can also use `Session.udtf.register()` to register a UDTF.\n  - Use `Session.udtf.register_from_file()` to register a UDTF from a Python file.\n- Updated APIs to query a table function, including both Snowflake built-in table functions and UDTFs.\n  - Use function `snowflake.snowpark.functions.table_function()` to create a callable representing a table function and use it to call the table function in a query.\n  - Alternatively, use function `snowflake.snowpark.functions.call_table_function()` to call a table function.\n  - Added support for `over` clause that specifies `partition by` and `order by` when lateral joining a table function.\n  - Updated `Session.table_function()` and `DataFrame.join_table_function()` to accept `TableFunctionCall` instances.\n\n### Breaking Changes:\n- When creating a function with `functions.udf()` and `functions.sproc()`, you can now specify an empty list for the `imports` or `packages` argument to indicate that no import or package is used for this UDF or stored procedure. Previously, specifying an empty list meant that the function would use session-level imports or packages.\n- Improved the `__repr__` implementation of data types in `types.py`. The unused `type_name` property has been removed.\n- Added a Snowpark-specific exception class for SQL errors. 
This replaces the previous `ProgrammingError` from the Python connector.\n\n### Improvements:\n- Added a lock to a UDF or UDTF when it is called for the first time per thread.\n- Improved the error message for pickling errors that occurred during UDF creation.\n- Included the query ID when logging the failed query.\n\n### Bug Fixes:\n- Fixed a bug in which non-integral data (such as timestamps) was occasionally converted to integer when calling `DataFrame.to_pandas()`.\n- Fixed a bug in which `DataFrameReader.parquet()` failed to read a parquet file when its column contained spaces.\n- Fixed a bug in which `DataFrame.copy_into_table()` failed when the dataframe was created by reading a file with inferred schemas.\n\n### Deprecations:\n- Deprecated `Session.flatten()` and `DataFrame.flatten()`.\n\n### Dependency Updates:\n- Restricted the version of `cloudpickle` to <= `2.0.0`.\n\n\n## 0.6.0 (2022-04-27)\n\n### New Features:\n- Added support for vectorized UDFs with the input as a Pandas DataFrame or Pandas Series and the output as a Pandas Series. 
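As a minimal sketch of what a vectorized UDF body looks like: the function is called once per batch with a whole pandas Series rather than once per row. The function name `plus_one` is hypothetical, registration with a `Session` is omitted, and the pandas import is guarded so the sketch can run even where pandas is not installed.

```python
# Sketch only: the shape of a vectorized UDF body.
# Registering it (via snowflake.snowpark UDF registration with a Session)
# is not shown here.
try:
    import pandas as pd
except ImportError:  # pandas is optional for this illustration
    pd = None

def plus_one(series):
    # Receives a pandas Series per batch and returns a pandas Series;
    # operating on whole batches is what improves UDF performance.
    return series + 1

if pd is not None:
    result = plus_one(pd.Series([1, 2, 3])).tolist()
else:
    result = None
print(result)
```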
This improves the performance of UDFs in Snowpark.\n- Added support for inferring the schema of a DataFrame by default when it is created by reading a Parquet, Avro, or ORC file in the stage.\n- Added functions `current_session()`, `current_statement()`, `current_user()`, `current_version()`, `current_warehouse()`, `date_from_parts()`, `date_trunc()`, `dayname()`, `dayofmonth()`, `dayofweek()`, `dayofyear()`, `grouping()`, `grouping_id()`, `hour()`, `last_day()`, `minute()`, `next_day()`, `previous_day()`, `second()`, `month()`, `monthname()`, `quarter()`, `year()`, `current_database()`, `current_role()`, `current_schema()`, `current_schemas()`, `current_region()`, `current_available_roles()`, `add_months()`, `any_value()`, `bitnot()`, `bitshiftleft()`, `bitshiftright()`, `convert_timezone()`, `uniform()`, `strtok_to_array()`, `sysdate()`, `time_from_parts()`, `timestamp_from_parts()`, `timestamp_ltz_from_parts()`, `timestamp_ntz_from_parts()`, `timestamp_tz_from_parts()`, `weekofyear()`, `percentile_cont()` to `snowflake.snowpark.functions`.\n\n### Breaking Changes:\n- Expired deprecations:\n  - Removed the following APIs that were deprecated in 0.4.0: `DataFrame.groupByGroupingSets()`, `DataFrame.naturalJoin()`, `DataFrame.joinTableFunction()`, `DataFrame.withColumns()`, `Session.getImports()`, `Session.addImport()`, `Session.removeImport()`, `Session.clearImports()`, `Session.getSessionStage()`, `Session.getDefaultDatabase()`, `Session.getDefaultSchema()`, `Session.getCurrentDatabase()`, `Session.getCurrentSchema()`, `Session.getFullyQualifiedCurrentSchema()`.\n\n### Improvements:\n- Added support for creating an empty `DataFrame` with a specific schema using the `Session.create_dataframe()` method.\n- Changed the logging level from `INFO` to `DEBUG` for several logs (e.g., the executed query) when evaluating a dataframe.\n- Improved the error message when failing to create a UDF due to pickle errors.\n\n### Bug Fixes:\n- Removed pandas hard dependencies in the 
`Session.create_dataframe()` method.\n\n### Dependency Updates:\n- Added `typing-extensions` as a new dependency with the version >= `4.1.0`.\n\n\n## 0.5.0 (2022-03-22)\n\n### New Features\n- Added stored procedures API.\n  - Added `Session.sproc` property and `sproc()` to `snowflake.snowpark.functions`, so you can register stored procedures.\n  - Added `Session.call` to call stored procedures by name.\n- Added `UDFRegistration.register_from_file()` to allow registering UDFs from Python source files or zip files directly.\n- Added `UDFRegistration.describe()` to describe a UDF.\n- Added `DataFrame.random_split()` to provide a way to randomly split a dataframe.\n- Added functions `md5()`, `sha1()`, `sha2()`, `ascii()`, `initcap()`, `length()`, `lower()`, `lpad()`, `ltrim()`, `rpad()`, `rtrim()`, `repeat()`, `soundex()`, `regexp_count()`, `replace()`, `charindex()`, `collate()`, `collation()`, `insert()`, `left()`, `right()`, `endswith()` to `snowflake.snowpark.functions`.\n- Allowed `call_udf()` to accept literal values.\n- Provided a `distinct` keyword in `array_agg()`.\n\n### Bug Fixes:\n- Fixed an issue that caused `DataFrame.to_pandas()` to have a string column if `Column.cast(IntegerType())` was used.\n- Fixed a bug in `DataFrame.describe()` when there was more than one string column.\n\n## 0.4.0 (2022-02-15)\n\n### New Features\n- You can now specify which Anaconda packages to use when defining UDFs.\n  - Added `add_packages()`, `get_packages()`, `clear_packages()`, and `remove_package()` to class `Session`.\n  - Added `add_requirements()` to `Session` so you can use a requirements file to specify which packages this session will use.\n  - Added parameter `packages` to function `snowflake.snowpark.functions.udf()` and method `UserDefinedFunction.register()` to indicate UDF-level Anaconda package dependencies when creating a UDF.\n  - Added parameter `imports` to `snowflake.snowpark.functions.udf()` and `UserDefinedFunction.register()` to specify UDF-level code 
imports.\n- Added a parameter `session` to function `udf()` and `UserDefinedFunction.register()` so you can specify which session to use to create a UDF if you have multiple sessions.\n- Added types `Geography` and `Variant` to `snowflake.snowpark.types` to be used as type hints for Geography and Variant data when defining a UDF.\n- Added support for Geography geoJSON data.\n- Added `Table`, a subclass of `DataFrame` for table operations:\n  - Methods `update` and `delete` update and delete rows of a table in Snowflake.\n  - Method `merge` merges data from a `DataFrame` to a `Table`.\n  - Overrode method `DataFrame.sample()` with an additional parameter `seed`, which works on tables but not on views and sub-queries.\n- Added `DataFrame.to_local_iterator()` and `DataFrame.to_pandas_batches()` to allow getting results from an iterator when the result set returned from the Snowflake database is too large.\n- Added `DataFrame.cache_result()` for caching the operations performed on a `DataFrame` in a temporary table.\n  Subsequent operations on the original `DataFrame` have no effect on the cached result `DataFrame`.\n- Added property `DataFrame.queries` to get SQL queries that will be executed to evaluate the `DataFrame`.\n- Added `Session.query_history()` as a context manager to track SQL queries executed on a session, including all SQL queries to evaluate `DataFrame`s created from a session. Both query ID and query text are recorded.\n- You can now create a `Session` instance from an existing established `snowflake.connector.SnowflakeConnection`. 
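Reusing an existing connection might look like the following sketch. The helper name `session_from_connection` is hypothetical; the snippet assumes `snowflake-snowpark-python` is installed, and the import is deferred so the function can be defined without it.

```python
def session_from_connection(conn):
    """Sketch: build a Snowpark Session from an existing, already-established
    snowflake.connector.SnowflakeConnection via the `connection` parameter."""
    # Deferred import: requires snowflake-snowpark-python at call time only.
    from snowflake.snowpark import Session
    return Session.builder.configs({"connection": conn}).create()
```

Calling it requires a live `SnowflakeConnection`, so it is shown unexecuted here.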
Use parameter `connection` in `Session.builder.configs()`.\n- Added `use_database()`, `use_schema()`, `use_warehouse()`, and `use_role()` to class `Session` to switch database/schema/warehouse/role after a session is created.\n- Added `DataFrameWriter.copy_into_location()` to unload a `DataFrame` to stage files.\n- Added `DataFrame.unpivot()`.\n- Added `Column.within_group()` for sorting the rows by columns with some aggregation functions.\n- Added functions `listagg()`, `mode()`, `div0()`, `acos()`, `asin()`, `atan()`, `atan2()`, `cos()`, `cosh()`, `sin()`, `sinh()`, `tan()`, `tanh()`, `degrees()`, `radians()`, `round()`, `trunc()`, and `factorial()` to `snowflake.snowpark.functions`.\n- Added an optional argument `ignore_nulls` in function `lead()` and `lag()`.\n- The `condition` parameter of function `when()` and `iff()` now accepts SQL expressions.\n\n### Improvements\n- All function and method names have been renamed to use the snake case naming style, which is more Pythonic. For convenience, some camel case names are kept as aliases to the snake case APIs. It is recommended to use the snake case APIs.\n  - Deprecated these methods on class `Session` and replaced them with their snake case equivalents: `getImports()`, `addImport()`, `removeImport()`, `clearImports()`, `getSessionStage()`, `getDefaultDatabase()`, `getDefaultSchema()`, `getCurrentDatabase()`, `getCurrentSchema()`, `getFullyQualifiedCurrentSchema()`.\n  - Deprecated these methods on class `DataFrame` and replaced them with their snake case equivalents: `groupByGroupingSets()`, `naturalJoin()`, `withColumns()`, `joinTableFunction()`.\n- Property `DataFrame.columns` is now consistent with `DataFrame.schema.names` and the Snowflake database `Identifier Requirements`.\n- `Column.__bool__()` now raises a `TypeError`. This will ban the use of logical operators `and`, `or`, `not` on `Column` objects, for instance `col(\"a\") > 1 and col(\"b\") > 2` will raise the `TypeError`. 
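For example, a combined predicate should be built with the bitwise operators. A small sketch (assuming `snowflake-snowpark-python` is available; the import is guarded so the snippet degrades gracefully when it is not):

```python
# Build a combined Column predicate with bitwise operators (& / | / ~)
# rather than `and`/`or`/`not`, since Column.__bool__() raises TypeError.
try:
    from snowflake.snowpark.functions import col
    both = (col("a") > 1) & (col("b") > 2)   # OK: a Column expression
    outcome = type(both).__name__
except ImportError:
    outcome = "snowpark not installed"
print(outcome)
```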
Use `(col(\"a\") > 1) & (col(\"b\") > 2)` instead.\n- Changed `PutResult` and `GetResult` to subclass `NamedTuple`.\n- Fixed a bug which raised an error when the local path or stage location had a space or other special characters.\n- Changed `DataFrame.describe()` so that non-numeric and non-string columns are ignored instead of raising an exception.\n\n### Dependency updates\n- Updated ``snowflake-connector-python`` to 2.7.4.\n\n## 0.3.0 (2022-01-09)\n### New Features\n- Added `Column.isin()`, with an alias `Column.in_()`.\n- Added `Column.try_cast()`, which is a special version of `cast()`. It tries to cast a string expression to other types and returns `null` if the cast is not possible.\n- Added `Column.startswith()` and `Column.substr()` to process string columns.\n- `Column.cast()` now also accepts a `str` value to indicate the cast type in addition to a `DataType` instance.\n- Added `DataFrame.describe()` to summarize stats of a `DataFrame`.\n- Added `DataFrame.explain()` to print the query plan of a `DataFrame`.\n- `DataFrame.filter()` and `DataFrame.select_expr()` now accept a SQL expression.\n- Added a new `bool` parameter `create_temp_table` to methods `DataFrame.saveAsTable()` and `Session.write_pandas()` to optionally create a temp table.\n- Added `DataFrame.minus()` and `DataFrame.subtract()` as aliases to `DataFrame.except_()`.\n- Added `regexp_replace()`, `concat()`, `concat_ws()`, `to_char()`, `current_timestamp()`, `current_date()`, `current_time()`, `months_between()`, `cast()`, `try_cast()`, `greatest()`, `least()`, and `hash()` to module `snowflake.snowpark.functions`.\n\n### Bug Fixes\n- Fixed an issue where `Session.createDataFrame(pandas_df)` and `Session.write_pandas(pandas_df)` raised an exception when the `Pandas DataFrame` had spaces in the column name.\n- `DataFrame.copy_into_table()` sometimes printed an `error`-level log entry even though it actually worked. 
It's fixed now.\n- Fixed an API docs issue where some `DataFrame` APIs were missing from the docs.\n\n### Dependency updates\n- Updated ``snowflake-connector-python`` to 2.7.2, which upgrades the ``pyarrow`` dependency to 6.0.x. Refer to the [python connector 2.7.2 release notes](https://pypi.org/project/snowflake-connector-python/2.7.2/) for more details.\n\n## 0.2.0 (2021-12-02)\n### New Features\n- Updated the `Session.createDataFrame()` method for creating a `DataFrame` from a Pandas DataFrame.\n- Added the `Session.write_pandas()` method for writing a `Pandas DataFrame` to a table in Snowflake and getting a `Snowpark DataFrame` object back.\n- Added new classes and methods for calling window functions.\n- Added the new functions `cume_dist()`, to find the cumulative distribution of a value with regard to other values within a window partition,\n  and `row_number()`, which returns a unique row number for each row within a window partition.\n- Added functions for computing statistics for DataFrames in the `DataFrameStatFunctions` class.\n- Added functions for handling missing values in a DataFrame in the `DataFrameNaFunctions` class.\n- Added new methods `rollup()`, `cube()`, and `pivot()` to the `DataFrame` class.\n- Added the `GroupingSets` class, which you can use with the `DataFrame.groupByGroupingSets()` method to perform a SQL GROUP BY GROUPING SETS.\n- Added the new `FileOperation(session)`\n  class that you can use to upload and download files to and from a stage.\n- Added the `DataFrame.copy_into_table()`\n  method for loading data from files in a stage into a table.\n- In CASE expressions, the functions `when()` and `otherwise()`\n  now accept Python types in addition to `Column` objects.\n- When you register a UDF, you can now optionally set the `replace` parameter to `True` to overwrite an existing UDF with the same name.\n\n### Improvements\n- UDFs are now compressed before they are uploaded to the server. 
This makes them about 10 times smaller, which can help\n  when you are using large ML model files.\n- When the size of a UDF is less than 8196 bytes, it will be uploaded as in-line code instead of being uploaded to a stage.\n\n### Bug Fixes\n- Fixed an issue where the statement `df.select(when(col(\"a\") == 1, 4).otherwise(col(\"a\")))` raised an exception.\n- Fixed an issue where `df.toPandas()` raised an exception when a DataFrame was created from large local data.\n\n## 0.1.0 (2021-10-26)\n\nStart of Private Preview\n",
    "bugtrack_url": null,
    "license": "Apache License, Version 2.0",
    "summary": "Snowflake Snowpark for Python",
    "version": "1.0.0",
    "split_keywords": [
        "snowflake",
        "db",
        "database",
        "cloud",
        "analytics",
        "warehouse"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "6321a0064df97c2175f28d714636448e",
                "sha256": "411610edc28336aae4ec82c8092525b0d16469d98b1a6b33d8d3ebace3bf7565"
            },
            "downloads": -1,
            "filename": "snowflake_snowpark_python3-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6321a0064df97c2175f28d714636448e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 253619,
            "upload_time": "2022-12-06T17:29:49",
            "upload_time_iso_8601": "2022-12-06T17:29:49.058271Z",
            "url": "https://files.pythonhosted.org/packages/5b/75/e82ea5b9aadc8e45a902a7fd7b1f1094cd412920f2e63460d4cb19ae2dde/snowflake_snowpark_python3-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "cf8719f7364af1c4b7120ea08cedea74",
                "sha256": "84ae463463b43610184b8a2cfbcf0002285a15094fb491401605c99b36431cf7"
            },
            "downloads": -1,
            "filename": "snowflake-snowpark-python3-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "cf8719f7364af1c4b7120ea08cedea74",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 240971,
            "upload_time": "2022-12-06T17:29:50",
            "upload_time_iso_8601": "2022-12-06T17:29:50.947161Z",
            "url": "https://files.pythonhosted.org/packages/cc/50/0f541b81ac70757681e5e465d721b6d6f8a6c6e928243bd30fbfa41cd1ad/snowflake-snowpark-python3-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-06 17:29:50",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "snowflake-snowpark-python3"
}
        