# Pramen-py
CLI application for defining data transformations for Pramen.
See:
```bash
pramen-py --help
```
for more information.
## Installation
### App settings
The application is configured through environment variables
(see `.env.example`).
### Add pramen-py as a dependency to your project
In case of poetry:
```bash
# ensure we have valid poetry environment
ls pyproject.toml || poetry init
poetry add pramen-py
```
In case of pip:
```bash
pip install pramen-py
```
## Usage
### Application configuration
To configure pramen-py options, set the corresponding environment
variables. To see the list of available options, run:
```bash
pramen-py list-configuration-options
```
### Developing transformations
pramen-py uses Python's
[namespace packages](https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages)
to discover transformations.
This means that a new transformer must be located inside a Python package
that contains a `transformations` directory.
This directory should be declared as a package:
- for poetry
```toml
[tool.poetry]
# ...
packages = [
    { include = "transformations" },
]
```
- for setup.py
```python
from setuptools import setup, find_namespace_packages

setup(
    name='mynamespace-subpackage-a',
    # ...
    packages=find_namespace_packages(include=['transformations.*'])
)
```
Example file structure:
```
❯ tree .
.
├── README.md
├── poetry.lock
├── pyproject.toml
├── tests
│   └── test_identity_transformer.py
└── transformations
    └── identity_transformer
        ├── __init__.py
        └── example_config.yaml
```
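
To illustrate how discovery through a native namespace package can work in general, here is a minimal standard-library sketch. It is not pramen-py's actual discovery code: the `make_distribution` and `discover` helpers and the `project_a`/`project_b` names are hypothetical. It builds two throwaway projects that each contribute a sub-package to `transformations` and then lists everything found under that namespace.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def make_distribution(root: Path, dist_name: str, transformer: str) -> Path:
    """Create <root>/<dist_name>/transformations/<transformer>/__init__.py."""
    dist_dir = root / dist_name
    package_dir = dist_dir / "transformations" / transformer
    package_dir.mkdir(parents=True)
    # `transformations` itself gets no __init__.py: that is what makes it
    # a native namespace package that several projects can share.
    (package_dir / "__init__.py").write_text(f"NAME = {transformer!r}\n")
    return dist_dir


def discover(namespace: str = "transformations") -> list:
    """Return the sorted names of all modules found under the namespace."""
    package = importlib.import_module(namespace)
    return sorted(
        module.name
        for module in pkgutil.iter_modules(package.__path__, prefix=f"{namespace}.")
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Two independent (hypothetical) projects contributing to one namespace.
    for dist, transformer in [
        ("project_a", "identity_transformer"),
        ("project_b", "other_transformer"),
    ]:
        sys.path.insert(0, str(make_distribution(root, dist, transformer)))
    importlib.invalidate_caches()
    found = discover()

print(found)
```

Because the `transformations` directory has no `__init__.py` of its own, Python merges the portions from both projects into a single importable namespace.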
For a transformer to be picked up by pramen-py, the following
conditions must be satisfied:
- the Python package containing the transformers is installed in the
same Python environment as pramen-py
- the Python package defines the namespace package `transformations`
- transformers extend the `pramen_py.Transformation` base class
Each subclass of the `Transformation` base class is registered as
a CLI command (`pramen-py transformations run TransformationSubclassName`)
with default options. Run:
```bash
pramen-py transformations run ExampleTransformation1 --help
```
for more details.
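
The general idea of turning subclasses into named commands can be sketched as follows. This is an illustration only: the `Transformation` class below is a stand-in, not `pramen_py.Transformation`, and the `registry` attribute, the `run` signature, and the `dispatch` helper are assumptions made for the example. How pramen-py registers subclasses internally may differ.

```python
from typing import Dict, Type


class Transformation:
    """Stand-in base class for illustration; NOT pramen_py.Transformation."""

    registry: Dict[str, Type["Transformation"]] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register every subclass under its class name, which then acts
        # as the command name.
        Transformation.registry[cls.__name__] = cls

    def run(self, info_date: str) -> str:
        raise NotImplementedError


class ExampleTransformation1(Transformation):
    def run(self, info_date: str) -> str:
        return f"ExampleTransformation1 ran for {info_date}"


def dispatch(name: str, info_date: str) -> str:
    """Look up a transformation by class name and run it."""
    try:
        transformation_cls = Transformation.registry[name]
    except KeyError:
        raise SystemExit(f"Unknown transformation: {name}")
    return transformation_cls().run(info_date)


print(sorted(Transformation.registry))
print(dispatch("ExampleTransformation1", "2022-04-01"))
```

Defining the subclass is enough to make it available by name; no separate registration call is needed in this sketch.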
You can add your own CLI options to your transformations. See the example at
[ExampleTransformation2](transformations/example_trasformation_two/some_transformation.py)
### pramen-py pytest plugin
pramen-py also provides a pytest plugin with helpful
fixtures for testing your transformers.
List of available fixtures:
```bash
# install pramen-py into the environment and activate it
pytest --fixtures
# check under --- fixtures defined from pramen_py.test_utils.fixtures ---
```
The pramen-py pytest plugin also loads environment variables from a `.env`
file if one is present in the root of the repo.
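
For readers unfamiliar with `.env` files, the following sketch shows roughly what loading one involves. It is not the plugin's implementation: the `load_env_file` helper and the `EXAMPLE_*` variable names are hypothetical, and the rule that already-set variables take precedence is an assumption of this sketch.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path, environ=None):
    """Copy KEY=VALUE pairs from a .env file into `environ`.

    In this sketch, variables that are already set win over the file.
    """
    environ = os.environ if environ is None else environ
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        environ.setdefault(key.strip(), value.strip().strip("'\""))
    return environ


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    # The EXAMPLE_* names are made up for this demo.
    env_file.write_text(
        "# local settings\n"
        "EXAMPLE_DEBUG=true\n"
        "EXAMPLE_PATH='/tmp/data'\n"
    )
    # A plain dict stands in for os.environ to keep the demo side-effect free.
    settings = load_env_file(env_file, environ={"EXAMPLE_DEBUG": "false"})

print(settings)
```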
### Running and configuring transformations
Transformations can be run with the following command:
```bash
pramen-py transformations run \
  ExampleTransformation1 \
  --config config.yml \
  --info-date 2022-04-01
```
`--config` is a required option for any transformation. See
[config_example.yaml](tests/resources/real_config.yaml) for more information.
To check available options and documentation for a particular transformation,
run:
```bash
pramen-py transformations run TransformationClassName --help
```
where TransformationClassName is the name of the transformation.
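
If you need to run a transformation for several information dates, one option is to drive the CLI from a small script. The sketch below uses only the `--config` and `--info-date` options shown above; the `build_command` and `run_for_dates` helpers are hypothetical, and `run_for_dates` assumes `pramen-py` is available on `PATH`, so the demo only prints the commands instead of executing them.

```python
import datetime
import subprocess
from typing import List


def build_command(transformation: str, config: str, info_date: datetime.date) -> List[str]:
    """Build the argument list for one `pramen-py transformations run` call."""
    return [
        "pramen-py", "transformations", "run",
        transformation,
        "--config", config,
        "--info-date", info_date.isoformat(),
    ]


def run_for_dates(transformation: str, config: str, start: datetime.date, days: int) -> None:
    """Run the transformation once per day; stops at the first failure."""
    for offset in range(days):
        info_date = start + datetime.timedelta(days=offset)
        subprocess.run(build_command(transformation, config, info_date), check=True)


# Only build and print the commands here; nothing is executed.
commands = [
    build_command(
        "ExampleTransformation1",
        "config.yml",
        datetime.date(2022, 4, 1) + datetime.timedelta(days=offset),
    )
    for offset in range(3)
]
for command in commands:
    print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting issues when the config path contains spaces.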
## Using as a Library
Read metastore tables using the Pramen-Py API:
```python
import datetime

from pyspark.sql import SparkSession

from pramen_py import MetastoreReader
from pramen_py.utils.file_system import FileSystemUtils

# getOrCreate() is a method of the session builder, not of SparkSession itself
spark = SparkSession.builder.getOrCreate()

# Load the HOCON configuration file
hocon_config = FileSystemUtils(spark) \
    .load_hocon_config_from_hadoop("uri_or_path_to_file")

metastore = MetastoreReader(spark) \
    .from_config(hocon_config)

# Read a table for a range of information dates
df_txn = metastore.get_table(
    "transactions",
    info_date_from=datetime.date(2022, 1, 1),
    info_date_to=datetime.date(2022, 6, 1)
)

df_customer = metastore.get_latest("customer")

df_txn.show(truncate=False)
df_customer.show(truncate=False)
```
## Development
Prerequisites:
- <https://python-poetry.org/docs/#installation>
- Python 3.6.8 or newer (the package requires `>=3.6.8,<4.0`)
Setup steps:
```bash
git clone https://github.com/AbsaOSS/pramen
cd pramen/pramen-py
make install # create virtualenv and install dependencies
make test
make pre-commit
# enable completions
# source <(pramen-py completions zsh)
# source <(pramen-py completions bash)
pramen-py --help
```
### Load environment configuration
Before any development step, set your development
environment variables:
```bash
make install
```
## Completions
```bash
# enable completions
source <(pramen-py completions zsh)
# or for bash
# source <(pramen-py completions bash)
```
## Deployment
### From the local development environment
```bash
# bump the version
vim pyproject.toml
# deploy to the dev environment (includes building and publishing the
# artefacts)
cat .env.ci
make publish
```