pycarol

:Name: pycarol
:Version: 2.55.0
:Home page: https://github.com/totvslabs/pyCarol
:Summary: Carol Python API and Tools
:Upload time: 2023-09-27 12:41:36
:Maintainer: TOTVS Labs
:Author: TotvsLabs
:License: TOTVS
:Keywords: totvs, carol.ai, ai

.. note::
   For the latest source, discussion, etc, please visit the
   `GitHub repository <https://github.com/totvslabs/pyCarol>`_


=======
PyCarol
=======

.. image:: https://badge.buildkite.com/b92ca1611add8d61063f61c92b9798fe81e859d468aae36463.svg
    :target: https://buildkite.com/totvslabs/pycarol

.. contents::

Getting Started
---------------
Run ``pip install pycarol`` to install the latest stable version from `PyPI
<https://pypi.python.org/pypi/pycarol>`_. `Documentation for the latest release
<http://pycarol.readthedocs.io/>`__ is hosted on readthedocs.

This installs only the minimal dependencies. To install pyCarol with the ``dataframe`` dependencies, use
``pip install pycarol[dataframe]``; to install with the dask and pipeline dependencies, use ``pip install pycarol[pipeline,dask]``.

The available extras are ``complete``, ``dataframe``, ``onlineapp``, ``dask``, and ``pipeline``.

To install from source:

1. ``pip install -r requirements.txt`` to install the minimal requirements;
2. ``pip install -e ".[dev]"`` to install the minimal requirements plus the dev libraries;
3. ``pip install -e ".[pipeline]"`` to install the minimal requirements plus the pipeline dependencies;
4. ``pip install -e ".[complete]"`` to install all dependencies.


Initializing pyCarol
--------------------

``Carol`` is the main object used to access pyCarol and all of Carol's APIs.

.. code:: python

    from pycarol import PwdAuth, Carol
    carol = Carol(domain=TENANT_NAME, app_name=APP_NAME,
                  auth=PwdAuth(USERNAME, PASSWORD), organization=ORGANIZATION)


where ``domain`` is the tenant name, ``app_name`` is the name of the Carol app (if any), ``auth``
is the authentication method (user/password in this case), and ``organization`` is the organization
to connect to. Carol's URL is built as ``www.ORGANIZATION.carol.ai/TENANT_NAME``.
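
As a minimal sketch of that URL pattern, the helper below simply joins the two values. ``carol_url`` is a
hypothetical name used for illustration and is not part of pyCarol; the organization and tenant are placeholders.

.. code:: python

    def carol_url(organization: str, tenant: str) -> str:
        """Build the Carol URL following the pattern described above."""
        return f"www.{organization}.carol.ai/{tenant}"


    print(carol_url("myorganization", "mytenant"))
    # www.myorganization.carol.ai/mytenant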

It is also possible to initialize the object with a token generated via user/password. This is useful when creating an
online app that interacts with Carol:

.. code:: python

    from pycarol import PwdKeyAuth, Carol
    carol = Carol(domain=TENANT_NAME, app_name=APP_NAME,
                  auth=PwdKeyAuth(pwd_auth_token), organization=ORGANIZATION)


Using API Key
--------------
To use API keys instead of username and password:

.. code:: python

    from pycarol import ApiKeyAuth, Carol

    carol = Carol(domain=DOMAIN,
                  app_name=APP_NAME,
                  auth=ApiKeyAuth(api_key=X_AUTH_KEY),
                  connector_id=CONNECTOR, organization=ORGANIZATION)

In this case the authentication method changes to ``ApiKeyAuth``. Note that ``connector_id`` must be passed
as well, because an API key is always associated with a connector ID.

It is possible to use pyCarol to generate an API key:

.. code:: python

    from pycarol import PwdAuth, ApiKeyAuth, Carol

    carol = Carol(domain=TENANT_NAME, app_name=APP_NAME, organization=ORGANIZATION,
                  auth=PwdAuth(USERNAME, PASSWORD), connector_id=CONNECTOR)
    api_key = carol.issue_api_key()

    print(f"This is an API key {api_key['X-Auth-Key']}")
    print(f"This is the connector Id {api_key['X-Auth-ConnectorId']}")

To get the details of the API key you can do:

.. code:: python

    details = carol.api_key_details(APIKEY, CONNECTORID)


Finally, to revoke an API key:

.. code:: python

    carol.api_key_revoke(CONNECTORID)



Good practice using token
-------------------------

Never write your password or API token in plain text in your application. Use environment variables instead. pyCarol can
read environment variables automatically. When no parameter is passed to the ``Carol`` constructor, pyCarol will look for:

1. ``CAROLTENANT`` for domain
2. ``CAROLAPPNAME`` for app_name
3. ``CAROL_DOMAIN`` for environment
4. ``CAROLORGANIZATION`` for organization
5. ``CAROLAPPOAUTH`` for auth
6. ``CAROLCONNECTORID`` for connector_id
7. ``CAROLUSER`` for the Carol user email
8. ``CAROLPWD`` for the user password

For example, one can create a ``.env`` file like this:

.. code:: bash

    CAROLAPPNAME=myApp
    CAROLTENANT=myTenant
    CAROLORGANIZATION=myOrganization
    CAROLAPPOAUTH=myAPIKey
    CAROLCONNECTORID=myConnector

and then

.. code:: python

    from pycarol import Carol
    from dotenv import load_dotenv
    load_dotenv(".env") #this will import these env variables to your execution.
    carol = Carol()
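
Before calling ``Carol()`` with no arguments, it can help to confirm that the expected variables are actually set.
The following is a small sketch using only the standard library. ``missing_carol_env`` is a hypothetical helper,
not part of pyCarol, and the list of names assumes the API-key setup from the ``.env`` example above; password-based
authentication would rely on ``CAROLUSER`` and ``CAROLPWD`` instead.

.. code:: python

    import os

    # Names taken from the .env example above (an assumption about what
    # an API-key based setup needs).
    API_KEY_VARS = (
        "CAROLTENANT",
        "CAROLAPPNAME",
        "CAROLORGANIZATION",
        "CAROLAPPOAUTH",
        "CAROLCONNECTORID",
    )


    def missing_carol_env(names=API_KEY_VARS, env=None):
        """Return the names that are unset or empty in the environment."""
        env = os.environ if env is None else env
        return [name for name in names if not env.get(name)]


    # A plain dict stands in for os.environ so the example is reproducible.
    example_env = {"CAROLTENANT": "myTenant", "CAROLAPPNAME": "myApp"}
    print(missing_carol_env(env=example_env))
    # ['CAROLORGANIZATION', 'CAROLAPPOAUTH', 'CAROLCONNECTORID']

Calling ``missing_carol_env()`` without arguments checks the real process environment.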


Ingesting data
--------------

From both Staging Tables and Data Models (CDS Layer)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use this method when you need to read most of the records and columns from the source.

.. code:: python

    from pycarol import Carol, Staging

    staging = Staging(Carol())
    df = staging.fetch_parquet(
        staging_name="execution_history", 
        connector_name="model"
    )

From both Staging Tables and Data Models (BQ Layer)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use this method when you need to read only a subset of records and columns or when 
data transformation is needed.

.. code:: python

    from pycarol import BQ, Carol

    bq = BQ(Carol())
    query_str = "SELECT * FROM stg_connectorname_table_name"
    results = bq.query(query_str)
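
When only a few columns or rows are needed, the query string can be assembled before it is passed to ``bq.query``.
The sketch below is one possible way to do that. ``staging_query`` is a hypothetical helper, and the view name
pattern ``stg_<connector>_<staging>`` is inferred from the examples in this README rather than from a documented rule.
It uses plain string formatting, so it should only be fed trusted values.

.. code:: python

    def staging_query(connector_name, staging_name, columns=("*",), where=None):
        """Build a SELECT over the view ``stg_<connector>_<staging>``."""
        table = f"stg_{connector_name}_{staging_name}"
        sql = f"SELECT {', '.join(columns)} FROM {table}"
        if where:
            # Optional row filter, written as a raw SQL condition.
            sql += f" WHERE {where}"
        return sql


    query_str = staging_query(
        "model",
        "deep_audit",
        columns=["request_id", "version"],
        where="branch = '01'",
    )
    print(query_str)
    # SELECT request_id, version FROM stg_model_deep_audit WHERE branch = '01'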


In case one needs a service account with access to BigQuery, the following code can be
used:

.. code:: python

    from pycarol import Carol
    from pycarol.bigquery import TokenManager

    tm = TokenManager(Carol())
    service_account = tm.get_token().service_account


After each execution of ``BQ.query``, the ``BQ`` object will have an attribute called
``job``. This attribute is of type ``bigquery.job.query.QueryJob`` and may be useful for
monitoring and debugging jobs.

pyCarol also provides access to the BigQuery Storage API. It allows for much faster reads,
but with limited querying capabilities. For instance, only tables are readable,
so ``ingestion_stg_model_deep_audit`` is fine, but ``stg_model_deep_audit`` is not (it is a
view).

.. code:: python

    from pycarol import BQStorage, Carol

    bq = BQStorage(Carol())
    table_name = "ingestion_stg_model_deep_audit"
    col_names = ["request_id", "version"]
    restriction = "branch = '01'"
    sample_percentage = 10  # a percentage of the table, not a row count
    df = bq.query(
        table_name,
        col_names,
        row_restriction=restriction,
        sample_percentage=sample_percentage,
        return_dataframe=True
    )


From Data Models (RT Layer): Filter queries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use this when you need low latency (only if RT layer is enabled).

.. code:: python

    from pycarol.filter import TYPE_FILTER, TERM_FILTER, Filter
    from pycarol import Query
    json_query = Filter.Builder() \
        .must(TYPE_FILTER(value='ratings' + "Golden")) \
        .must(TERM_FILTER(key='mdmGoldenFieldAndValues.userid.raw',value='123'))\
        .build().to_json()

    FIELDS_ITEMS = ['mdmGoldenFieldAndValues.mdmaddress.coordinates']
    query = Query(carol, page_size=10, print_status=True, only_hits=True,
                  fields=FIELDS_ITEMS, max_hits=200).query(json_query).go()
    query.results



The result will contain up to ``200`` hits for ``json_query`` above. The page size is 10, which means each response
carries 10 records. The query returns only the fields listed in ``FIELDS_ITEMS``.
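
The relationship between ``max_hits`` and ``page_size`` can be sketched with a little arithmetic. This assumes every
response is a full page until the limit is reached; ``pages_needed`` is a hypothetical helper, not a pyCarol function.

.. code:: python

    import math


    def pages_needed(max_hits: int, page_size: int) -> int:
        """Number of responses needed to collect ``max_hits`` records."""
        return math.ceil(max_hits / page_size)


    print(pages_needed(200, 10))  # 20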

The parameter ``only_hits=True`` makes sure that only the records under the path ``$hits.mdmGoldenFieldAndValues`` are returned.
To get the full response, use ``only_hits=False``. If the filter has an aggregation, use
``only_hits=False`` and ``get_aggs=True``, e.g.,


.. code:: python

    from pycarol import Query
    from pycarol.filter import TYPE_FILTER, Filter, CARDINALITY

    json_query = Filter.Builder() \
        .must(TYPE_FILTER(value='datamodelname' + "Golden")) \
        .aggregation(CARDINALITY(name='cardinality', params = ["mdmGoldenFieldAndValues.taxid.raw"], size=40))\
        .build().to_json()

    query = Query(carol, get_aggs=True, only_hits=False)
    query.query(json_query).go()
    query.results


From Data Models (RT Layer): Named queries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: python

    from pycarol import Query

    named_query = 'revenueHist'  # named query name
    params = {"bin":"1d","cnpj":"24386434000130"}  #query parameters to send.
    results = Query(carol).named(named_query, params=params).go().results

All the parameters available for filter queries can be used here too, e.g., ``only_hits``,
``save_results``, etc. For more information about the possible input parameters, check the docstring.

What if one does not remember the parameters for a given named query?


.. code:: python

    named_query = 'revenueHist'  # named query name
    Query(carol).named_query_params(named_query)
    # > {'revenueHist': ['*cnpj', 'dateFrom', 'dateTo', '*bin']}  # Parameters starting with * are mandatory.
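
The ``*`` convention can be used to check a parameter dictionary before running the named query. The following is a
rough sketch that works on the dictionary shape shown above; ``missing_mandatory`` is a hypothetical helper and is
not part of pyCarol.

.. code:: python

    def missing_mandatory(param_names, params):
        """Return mandatory parameter names (marked with '*') absent from params."""
        mandatory = [name[1:] for name in param_names if name.startswith("*")]
        return [name for name in mandatory if name not in params]


    # Same shape as the output of named_query_params shown above.
    spec = {"revenueHist": ["*cnpj", "dateFrom", "dateTo", "*bin"]}
    params = {"bin": "1d"}
    print(missing_mandatory(spec["revenueHist"], params))  # ['cnpj']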



Sending data
------------

The first step to send data to Carol is to create a connector.

.. code:: python

    from pycarol import Connectors
    connector_id = Connectors(carol).create(name='my_connector', label="connector_label", group_name="GroupName")
    print(f"This is the connector id: {connector_id}")


With the connector ID in hand, we can create the staging schema and then the staging table, assuming we have
a sample of the data we want to send.

.. code:: python

    from pycarol import Staging

    json_ex = {"name":'Rafael',"email": {"type": "email", "email": 'rafael@totvs.com.br'} }

    staging = Staging(carol)
    staging.create_schema(staging_name='my_stag', data = json_ex,
                          crosswalk_name= 'my_crosswalk' ,crosswalk_list=['name'],
                            connector_name='my_connector')


The json schema will be in the variable ``schema.schema``. The code above will create the following schema:

.. code:: python

    {
      'mdmCrosswalkTemplate': {
        'mdmCrossreference': {
          'my_crosswalk': [
            'name'
          ]
        }
      },
      'mdmFlexible': 'false',
      'mdmStagingMapping': {
        'properties': {
          'email': {
            'properties': {
              'email': {
                'type': 'string'
              },
              'type': {
                'type': 'string'
              }
            },
            'type': 'nested'
          },
          'name': {
            'type': 'string'
          }
        }
      },
      'mdmStagingType': 'my_stag'
    }
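
To see how the sample record relates to the ``mdmStagingMapping`` section above, here is a rough sketch that maps a
sample dictionary to the same shape. It is not pyCarol's implementation; ``infer_mapping`` is a hypothetical helper
that only handles strings and nested dictionaries.

.. code:: python

    def infer_mapping(sample: dict) -> dict:
        """Map a sample record to a ``properties`` tree like the one above."""
        properties = {}
        for key, value in sample.items():
            if isinstance(value, dict):
                # Nested objects get their own properties plus type 'nested'.
                properties[key] = {**infer_mapping(value), "type": "nested"}
            elif isinstance(value, str):
                properties[key] = {"type": "string"}
            else:
                raise TypeError(f"unsupported sample value for {key!r}: {value!r}")
        return {"properties": properties}


    json_ex = {"name": "Rafael", "email": {"type": "email", "email": "rafael@totvs.com.br"}}
    mapping = infer_mapping(json_ex)
    print(mapping["properties"]["email"]["type"])  # nested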


To send the data (assuming we have a JSON with the data we want to send):

.. code:: python

    from pycarol import Staging

    json_ex = [{"name":'Rafael',"email": {"type": "email", "email": 'rafael@totvs.com.br'}   },
               {"name":'Leandro',"email": {"type": "email", "email": 'Leandro@totvs.com.br'}   },
               {"name":'Joao',"email": {"type": "email", "email": 'joao@rolima.com.br'}   },
               {"name":'Marcelo',"email": {"type": "email", "email": 'marcelo@totvs.com.br'}   }]


    staging = Staging(carol)
    staging.send_data(staging_name = 'my_stag', data = json_ex, step_size = 2,
                     connector_id=connector_id, print_stats = True)

The parameter ``step_size`` sets how many records are sent in each request. Remember that the maximum size per payload is
5MB. The parameter ``data`` can also be a pandas DataFrame.
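
To pick a ``step_size`` that stays under the payload limit, one can estimate the size of each batch beforehand. The
sketch below only illustrates the idea and does not reflect how ``send_data`` is implemented. ``batches`` and
``payload_size`` are hypothetical helpers, the records are made up, and 5MB is assumed to mean 5 * 1024 * 1024 bytes
of UTF-8 encoded JSON.

.. code:: python

    import json

    MAX_PAYLOAD_BYTES = 5 * 1024 * 1024  # assumption: "5MB" read as 5 MiB


    def batches(records, step_size):
        """Yield consecutive slices of at most ``step_size`` records."""
        for start in range(0, len(records), step_size):
            yield records[start:start + step_size]


    def payload_size(batch):
        """Size in bytes of the batch when serialized as UTF-8 JSON."""
        return len(json.dumps(batch).encode("utf-8"))


    records = [
        {"name": f"user{i}", "email": {"type": "email", "email": f"user{i}@example.com"}}
        for i in range(5)
    ]
    for batch in batches(records, step_size=2):
        assert payload_size(batch) <= MAX_PAYLOAD_BYTES
        print(len(batch), "records,", payload_size(batch), "bytes")

With five records and ``step_size=2`` this produces batches of 2, 2, and 1 records.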

Note: It is not possible to create a mapping using pyCarol. The mapping has to be done via the UI.



Logging
--------


To log messages to Carol:

.. code:: python

    from pycarol import Carol, CarolHandler
    import logging

    logger = logging.getLogger(__name__)
    logger.setLevel(logging.DEBUG)
    handler = CarolHandler(Carol())
    handler.setLevel(logging.INFO)
    logger.addHandler(handler)

    logger.debug('This is a debug message') #This will not be logged in Carol. Level is set to INFO
    logger.info('This is an info message')
    logger.warning('This is a warning message')
    logger.error('This is an error message')
    logger.critical('This is a critical message')


These methods use the current long task ID provided by Carol when running your application.
In local environments, you need to set it manually at the beginning of your code:

.. code:: python

    import os
    os.environ['LONGTASKID'] = task_id

We recommend logging only INFO and above to Carol. If no task ID is set, the handler works as a console handler.
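
The interplay between the logger level and the handler level can be tried locally without a Carol connection. In
this sketch, ``ListHandler`` is a hypothetical stand-in for ``CarolHandler`` that just collects messages in a list;
everything else is the standard ``logging`` module.

.. code:: python

    import logging


    class ListHandler(logging.Handler):
        """Collects messages in a list (stand-in for CarolHandler)."""

        def __init__(self):
            super().__init__()
            self.messages = []

        def emit(self, record):
            self.messages.append(f"{record.levelname}: {record.getMessage()}")


    logger = logging.getLogger("carol_logging_sketch")
    logger.setLevel(logging.DEBUG)   # the logger lets everything through
    logger.propagate = False         # keep the example output self-contained
    handler = ListHandler()
    handler.setLevel(logging.INFO)   # the handler drops anything below INFO
    logger.addHandler(handler)

    logger.debug("dropped by the handler")
    logger.info("kept")
    logger.error("kept too")
    print(handler.messages)
    # ['INFO: kept', 'ERROR: kept too']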

Settings
--------
We can use pyCarol to access the settings of your Carol App.

.. code:: python

    from pycarol.apps import Apps
    app = Apps(carol)
    settings = app.get_settings(app_name='my_app')
    print(settings)


The settings are returned as a dictionary where the keys are the parameter names and the values are
the corresponding parameter values. Please note that your app must already be created in Carol.


Useful Functions
--------------------

1. ``track_tasks``: Track a list of tasks.

.. code:: python

    from pycarol import Carol
    from pycarol.functions import track_tasks
    carol = Carol()
    def callback(task_list):
      print(task_list)
    track_tasks(carol=carol, task_list=['task_id_1', 'task_id_2'], callback=callback)
  

Release process
----------------
1. Open a PR with your change against the ``master`` branch;
2. Once approved, merge into ``master``;
3. If there are any changes to the default release notes, please update them.


            
