-   Name: nzpyida
-   Version: 1.2
-   Home page: https://github.com/ibm/nzpyida
-   Summary: Supports Custom ML/Analytics Execution Inside Netezza
-   Upload time: 2023-11-10 07:59:23
-   Author: IBM Corp.
-   License: BSD
-   Keywords: data analytics database development ibm netezza pandas scikitlearn scalability machine-learning knowledge discovery

# nzpyida


# Accelerating Python Analytics by In-Database Processing

The nzpyida project provides a Python interface to the in-database
data-manipulation algorithms provided by IBM Netezza:

-   It accelerates Python analytics by seamlessly pushing operations
    written in Python into the underlying database for execution,
    thereby benefiting from in-database performance-enhancing features,
    such as columnar storage and parallel processing.
-   It can be used by Python developers with very little additional
    knowledge, because it copies the well-known interface of the Pandas
    library for data manipulation and the Scikit-learn library for the
    use of machine learning algorithms.
-   It is compatible with Python 3.6.
-   It can connect to Netezza databases via nzpy, ODBC or JDBC.

**nzpyida = NeteZza PYthon In-Database Analytics**

The latest version of nzpyida is available on the [Python Package
Index](https://pypi.python.org/pypi/nzpyida) and
[GitHub](https://github.com/IBM/nzpyida).

## How nzpyida works

The nzpyida project translates Pandas-like syntax into SQL and uses a
middleware API (such as pypyodbc or nzpy) to send it to a database
connected via ODBC, JDBC, or nzpy for execution. The results are fetched
and formatted into the corresponding data structure, for example, a
pandas.DataFrame or a pandas.Series.
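
To make the translation step concrete, here is a minimal sketch of how a Pandas-like chain of calls could be recorded and rendered as SQL before anything is sent to the database. The `LazyFrame` class and its `select`, `where`, and `to_sql` methods are hypothetical names invented for this illustration; they are not part of nzpyida, whose internals are more elaborate.

```python
# Illustrative only: LazyFrame is a hypothetical class, not nzpyida's API.
class LazyFrame:
    """Records column selections and filters, then renders them as SQL."""

    def __init__(self, table, columns=None, conditions=None):
        self.table = table
        self.columns = list(columns) if columns else []
        self.conditions = list(conditions) if conditions else []

    def select(self, *columns):
        # Return a new object; nothing is executed at this point.
        return LazyFrame(self.table, columns, self.conditions)

    def where(self, condition):
        # Conditions accumulate and are combined with AND when rendered.
        return LazyFrame(self.table, self.columns, self.conditions + [condition])

    def to_sql(self):
        cols = ", ".join(self.columns) or "*"
        sql = f"SELECT {cols} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.conditions)
        return sql


query = LazyFrame("IRIS").select("sepal_length", "species").where("sepal_length > 5")
print(query.to_sql())
```

The point of the design is that the Python side only builds a description of the work; the database does the computation once the SQL text is executed.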

The following scenario illustrates how nzpyida works.

Issue the following statements to connect via nzpy to the Netezza
database server NETEZZA_HOSTNAME on port 5480, logging in as
DATABASE_USER with password PASSWORD. The database to use on that server
is DATABASE.

```
from nzpyida import IdaDataBase, IdaDataFrame
nzpy_cfg = {
  'user': 'DATABASE_USER', 
  'password': 'PASSWORD', 
  'host': 'NETEZZA_HOSTNAME', 
  'port': 5480, 
  'database': 'DATABASE',
  'logLevel': 0, 
  'securityLevel': 0
} 
idadb = IdaDataBase(nzpy_cfg)
```
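
In practice you may prefer not to hard-code credentials. The following sketch builds the same kind of dictionary from environment variables. The `NZ_*` variable names and the `nzpy_config_from_env` helper are assumptions made for this example; only the dictionary keys are taken from the configuration shown above.

```python
import os


def nzpy_config_from_env(env=os.environ):
    """Build an nzpy-style connection dict from environment variables.

    The NZ_* variable names are hypothetical; use whatever fits your setup.
    """
    return {
        "user": env["NZ_USER"],
        "password": env["NZ_PASSWORD"],
        "host": env["NZ_HOST"],
        # Fall back to the port used in the example above.
        "port": int(env.get("NZ_PORT", "5480")),
        "database": env["NZ_DATABASE"],
        "logLevel": 0,
        "securityLevel": 0,
    }


# Demonstration with an explicit mapping instead of the real environment.
example_env = {
    "NZ_USER": "DATABASE_USER",
    "NZ_PASSWORD": "PASSWORD",
    "NZ_HOST": "NETEZZA_HOSTNAME",
    "NZ_DATABASE": "DATABASE",
}
nzpy_cfg = nzpy_config_from_env(example_env)
print(nzpy_cfg["host"], nzpy_cfg["port"])
```

The resulting `nzpy_cfg` can then be passed to `IdaDataBase` in the same way as the literal dictionary.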

A few sample data sets are included in nzpyida for you to experiment
with. First, we can load the IRIS table into this database instance.

```
from nzpyida.sampledata import iris
idadb.as_idadataframe(iris, "IRIS")
```

Next, we can create an IDA data frame that points to the table we just
uploaded:

```
idadf = IdaDataFrame(idadb, 'IRIS')
```

Note that to create an IDA data frame, we need to pass our previously
opened IdaDataBase object to the IdaDataFrame constructor, because it
holds the connection.

Next, we compute the correlation matrix:

```
idadf.corr()
```

In the background, nzpyida looks for numerical columns in the table and
builds an SQL query that returns the correlation between each pair of
columns.

The result fetched by nzpyida is a tuple containing all values of the
matrix. This tuple is formatted back into a pandas.DataFrame and then
returned:

                   sepal_length  sepal_width   petal_length  petal_width
    sepal_length      1.000000    -0.117570      0.871754     0.817941
    sepal_width      -0.117570     1.000000     -0.428440    -0.366126
    petal_length      0.871754    -0.428440      1.000000     0.962865
    petal_width       0.817941    -0.366126      0.962865     1.000000
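
The two steps described above, building one query over all column pairs and reshaping the flat result into a matrix, can be sketched roughly as follows. This is not nzpyida's actual implementation: the helper names are hypothetical, the sketch assumes the database offers a `CORR` aggregate, and the result row is hard-coded from the values in the table above instead of being fetched from a server.

```python
from itertools import combinations


def build_corr_query(table, columns):
    """Return one SELECT with a CORR() term per unordered column pair."""
    pairs = list(combinations(columns, 2))
    terms = ", ".join(f"CORR({a}, {b})" for a, b in pairs)
    return f"SELECT {terms} FROM {table}", pairs


def tuple_to_matrix(columns, pairs, values):
    """Expand the flat result row into a symmetric nested dict."""
    # The diagonal is 1.0 by definition; other cells are filled in below.
    matrix = {c: {d: (1.0 if c == d else None) for d in columns} for c in columns}
    for (a, b), value in zip(pairs, values):
        matrix[a][b] = value
        matrix[b][a] = value
    return matrix


columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
sql, pairs = build_corr_query("IRIS", columns)

# Stand-in for the single row the database would return.
row = (-0.117570, 0.871754, 0.817941, -0.428440, -0.366126, 0.962865)
matrix = tuple_to_matrix(columns, pairs, row)
print(sql)
print(matrix["petal_length"]["petal_width"])
```

Because the correlation matrix is symmetric, only the six unordered pairs need to be computed in the database; the remaining cells are mirrored on the client side.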

# Contributors

nzpyida is based on the ibmdbpy project, which was developed for IBM Db2
Warehouse. See https://github.com/ibmdbanalytics/ibmdbpy for details.

# How to contribute
Want to contribute? That's great! There are many ways to do so.

If you are a member of the ibmdbanalytics group, you can create branches and merge them into master. Otherwise, you can fork the project and open a pull request. Contributions to both the code and the documentation are very welcome.

If you find bugs, have ideas for improvements, or need specific new features, please open a ticket. We do care about it.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/ibm/nzpyida",
    "name": "nzpyida",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "data analytics database development ibm netezza pandas scikitlearn scalability machine-learning knowledge discovery",
    "author": "IBM Corp.",
    "author_email": "mlabenski@ibm.com,pawel.mroz1@ibm.com",
    "download_url": "https://files.pythonhosted.org/packages/1d/55/fc8e1051e40ffbdb9076d7891d945de3a1cac1eb786232fa863751ff5fcd/nzpyida-1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD",
    "summary": "Supports Custom ML/Analytics Execution Inside Netezza",
    "version": "1.2",
    "project_urls": {
        "Documentation": "https://nzpyida.readthedocs.io/en/latest/",
        "Homepage": "https://github.com/ibm/nzpyida",
        "Source": "https://github.com/IBM/nzpyida",
        "Tracker": "https://github.com/IBM/nzpyida/issues"
    },
    "split_keywords": [
        "data",
        "analytics",
        "database",
        "development",
        "ibm",
        "netezza",
        "pandas",
        "scikitlearn",
        "scalability",
        "machine-learning",
        "knowledge",
        "discovery"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5af5da6a369ccbe9dd3944cf223b69a7e711e6d5c9cc8702d52502dc9d0a9a6d",
                "md5": "74095dec94e6385dff2d02a82eee9922",
                "sha256": "0c4c04d5d039f0d228a23b2c29fc5796da44d20b3dfac7ee671333019e71d30a"
            },
            "downloads": -1,
            "filename": "nzpyida-1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "74095dec94e6385dff2d02a82eee9922",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 208699,
            "upload_time": "2023-11-10T07:59:21",
            "upload_time_iso_8601": "2023-11-10T07:59:21.074546Z",
            "url": "https://files.pythonhosted.org/packages/5a/f5/da6a369ccbe9dd3944cf223b69a7e711e6d5c9cc8702d52502dc9d0a9a6d/nzpyida-1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1d55fc8e1051e40ffbdb9076d7891d945de3a1cac1eb786232fa863751ff5fcd",
                "md5": "ceae4834f1827a71e6230c389d93e32c",
                "sha256": "5910e5bbaade52eb743ccfebe5902ff805bb7c9ba22b709439a17b7579817487"
            },
            "downloads": -1,
            "filename": "nzpyida-1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "ceae4834f1827a71e6230c389d93e32c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 230303,
            "upload_time": "2023-11-10T07:59:23",
            "upload_time_iso_8601": "2023-11-10T07:59:23.296607Z",
            "url": "https://files.pythonhosted.org/packages/1d/55/fc8e1051e40ffbdb9076d7891d945de3a1cac1eb786232fa863751ff5fcd/nzpyida-1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-10 07:59:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ibm",
    "github_project": "nzpyida",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "nzpyida"
}