# Pramen-py
CLI application for defining data transformations for Pramen.
See:
```bash
pramen-py --help
```
for more information.
## Installation
### App settings
Application configuration is handled through environment variables
(see `.env.example`).
### Add pramen-py as a dependency to your project
In case of poetry:
```bash
# ensure we have a valid poetry environment
ls pyproject.toml || poetry init
poetry add pramen-py
```
In case of pip:
```bash
pip install pramen-py
```
## Usage
### Application configuration
To configure pramen-py, set the corresponding environment variables.
To see the list of available options, run:
```bash
pramen-py list-configuration-options
```
### Developing transformations
pramen-py uses Python's
[namespace packages](https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages)
for discovery of the transformations.
This means that a new transformer must be located inside a Python package
that contains a `transformations` directory.
This directory should be declared as a package:
- for poetry
```toml
[tool.poetry]
# ...
packages = [
    { include = "transformations" },
]
```
- for setup.py
```python
from setuptools import setup, find_namespace_packages

setup(
    name='mynamespace-subpackage-a',
    # ...
    packages=find_namespace_packages(include=['transformations.*'])
)
```
Example file structure:
```
❯ tree .
.
├── README.md
├── poetry.lock
├── pyproject.toml
├── tests
│   └── test_identity_transformer.py
└── transformations
    └── identity_transformer
        ├── __init__.py
        └── example_config.yaml
```
For a transformer to be picked up by pramen-py, the following
conditions must be satisfied:
- the Python package containing the transformers is installed into the
same Python environment as pramen-py
- the Python package defines the namespace package `transformations`
- the transformers extend the `pramen_py.Transformation` base class
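
The discovery mechanism described above can be illustrated with the standard library alone. The following is a minimal sketch, not pramen-py's actual implementation: it assumes two hypothetical distributions (`dist_a` and `dist_b`) that each ship a `transformations` directory without an `__init__.py`, and shows that Python merges them into a single namespace package whose subpackages can be listed with `pkgutil.iter_modules`.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def make_distribution(root: Path, dist_name: str, transformer_name: str) -> Path:
    """Create <root>/<dist_name>/transformations/<transformer_name>/__init__.py."""
    dist_dir = root / dist_name
    package_dir = dist_dir / "transformations" / transformer_name
    package_dir.mkdir(parents=True)
    # Only the subpackage gets an __init__.py; `transformations` itself has
    # none, which is what makes it a native namespace package.
    (package_dir / "__init__.py").write_text(f"NAME = {transformer_name!r}\n")
    return dist_dir


def discover_transformer_packages() -> list:
    """Return sorted names of all subpackages under `transformations`."""
    importlib.invalidate_caches()
    namespace = importlib.import_module("transformations")
    return sorted(
        module.name
        for module in pkgutil.iter_modules(
            namespace.__path__, prefix="transformations."
        )
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Two hypothetical, independently installed projects.
    for dist, transformer in [
        ("dist_a", "identity_transformer"),
        ("dist_b", "other_transformer"),
    ]:
        sys.path.insert(0, str(make_distribution(root, dist, transformer)))
    discovered = discover_transformer_packages()
    print(discovered)
```

Both subpackages are found even though they live in different directories, which is why a transformer package only needs to be installed into the same environment to become visible.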
Subclasses of the `Transformation` base class are registered as
CLI commands (`pramen-py transformations run TransformationSubclassName`)
with default options. Check:
```bash
pramen-py transformations run ExampleTransformation1 --help
```
for more details.
You can add your own CLI options to your transformations. See the example at
[ExampleTransformation2](transformations/example_trasformation_two/some_transformation.py)
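
The idea of registering subclasses as commands, described earlier in this section, can be sketched with `__init_subclass__`. The `Transformation` class below is a hypothetical stand-in written for this illustration, not the real `pramen_py.Transformation`, and its `run` signature and `registry` attribute are assumptions. It only mimics the idea that each subclass becomes addressable by its class name.

```python
from typing import Dict, Type


class Transformation:
    """Hypothetical stand-in for a base class that tracks its subclasses."""

    registry: Dict[str, Type["Transformation"]] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register every subclass under its class name, the way a CLI could
        # expose it as `run <ClassName>`.
        Transformation.registry[cls.__name__] = cls

    def run(self, info_date: str) -> str:
        raise NotImplementedError


class ExampleTransformation1(Transformation):
    def run(self, info_date: str) -> str:
        return f"ExampleTransformation1 ran for {info_date}"


def run_by_name(name: str, info_date: str) -> str:
    """Look up a registered subclass by name and run it."""
    try:
        transformation_cls = Transformation.registry[name]
    except KeyError:
        raise ValueError(f"Unknown transformation: {name}") from None
    return transformation_cls().run(info_date)


result = run_by_name("ExampleTransformation1", "2022-04-01")
print(result)
```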
### pramen-py pytest plugin
pramen-py also provides a pytest plugin with helpful
fixtures for testing the transformers you create.
List of available fixtures:
```bash
# install pramen-py into the environment and activate it
pytest --fixtures
# check under --- fixtures defined from pramen_py.test_utils.fixtures ---
```
The pramen-py pytest plugin also loads environment variables from a `.env`
file if one is present in the root of the repo.
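
As an illustration of what loading a `.env` file involves, here is a minimal sketch using only the standard library. It is not the plugin's implementation: the `load_env_file` helper, the simple `KEY=VALUE` format it accepts, and the `EXAMPLE_OPTION` variable are all assumptions made for this example.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def load_env_file(path: Path, override: bool = False) -> Dict[str, str]:
    """Parse KEY=VALUE lines from `path` and export them to os.environ."""
    loaded: Dict[str, str] = {}
    if not path.is_file():
        return loaded  # a missing .env file is not an error
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        if override or key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    # Hypothetical variable name, used only for this demonstration.
    env_file.write_text("# hypothetical option\nEXAMPLE_OPTION=some-value\n")
    loaded = load_env_file(env_file, override=True)
    print(loaded)
```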
### Running and configuring transformations
Transformations can be run with the following command:
```bash
pramen-py transformations run \
  ExampleTransformation1 \
  --config config.yml \
  --info-date 2022-04-01
```
`--config` is a required option for any transformation. See
[config_example.yaml](tests/resources/real_config.yaml) for more information.
To check available options and documentation for a particular transformation,
run:
```bash
pramen-py transformations run TransformationClassName --help
```
where TransformationClassName is the name of the transformation.
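
When a transformation has to be launched from Python (for example from a scheduler), the command line shown in this section can be assembled programmatically. The sketch below only builds the argument list from the options documented here. The `build_run_command` helper is hypothetical, and actually executing the command assumes `pramen-py` is installed and on the `PATH`.

```python
import datetime
import shlex
from typing import List


def build_run_command(
    transformation: str,
    config_path: str,
    info_date: datetime.date,
) -> List[str]:
    """Assemble the `pramen-py transformations run` call as an argv list."""
    return [
        "pramen-py",
        "transformations",
        "run",
        transformation,
        "--config",
        config_path,
        "--info-date",
        info_date.isoformat(),  # formatted as YYYY-MM-DD
    ]


command = build_run_command(
    "ExampleTransformation1", "config.yml", datetime.date(2022, 4, 1)
)
print(" ".join(shlex.quote(part) for part in command))

# To execute it (requires pramen-py on the PATH):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the config path contains spaces.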
## Using as a Library
Read metastore tables with the Pramen-Py API:
```python
import datetime
from pyspark.sql import SparkSession
from pramen_py import MetastoreReader
from pramen_py.utils.file_system import FileSystemUtils
# SparkSession is obtained through its builder
spark = SparkSession.builder.getOrCreate()

hocon_config = FileSystemUtils(spark) \
    .load_hocon_config_from_hadoop("uri_or_path_to_file")

metastore = MetastoreReader(spark) \
    .from_config(hocon_config)

df_txn = metastore.get_table(
    "transactions",
    info_date_from=datetime.date(2022, 1, 1),
    info_date_to=datetime.date(2022, 6, 1)
)
df_customer = metastore.get_latest("customer")
df_txn.show(truncate=False)
df_customer.show(truncate=False)
```
## Development
Prerequisites:
- <https://python-poetry.org/docs/#installation>
- Python 3.6.8 or newer
Setup steps:
```bash
git clone https://github.com/AbsaOSS/pramen
cd pramen/pramen-py
make install # create virtualenv and install dependencies
make test
make pre-commit
# enable completions
# source <(pramen-py completions zsh)
# source <(pramen-py completions bash)
pramen-py --help
```
### Load environment configuration
Before any development step, set up your development
environment variables:
```bash
make install
```
## Completions
```bash
# enable completions
source <(pramen-py completions zsh)
# or for bash
# source <(pramen-py completions bash)
```
## Deployment
### From the local development environment
```bash
# bump the version
vim pyproject.toml
# deploy to the dev environment (includes building and publishing the
# artefacts)
cat .env.ci
make publish
```
Raw data
{
"_id": null,
"home_page": "https://github.com/AbsaOSS/pramen",
"name": "pramen-py",
"maintainer": "Artem Zhukov",
"docs_url": null,
"requires_python": "<4.0,>=3.6.8",
"maintainer_email": "iam@zhukovgreen.pro",
"keywords": "paramen, pyspark, transformations, metastore",
"author": "Artem Zhukov",
"author_email": "iam@zhukovgreen.pro",
"download_url": "https://files.pythonhosted.org/packages/f6/2b/0a78838625fce9511caf1e6c6a58839dd8dec5d597e7288b721e98cc3784/pramen_py-1.10.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Pramen transformations written in python",
"version": "1.10.4",
"project_urls": {
"Homepage": "https://github.com/AbsaOSS/pramen",
"Repository": "https://github.com/AbsaOSS/pramen"
},
"split_keywords": [
"paramen",
" pyspark",
" transformations",
" metastore"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d9acce61fb929df369c2950959aa87f47960b9b6368c8ae35769986e0e73219c",
"md5": "6e699d1d25b22478cf956ddd0559d0d9",
"sha256": "8b8f94b686f46e39caf50a106e67b93b31c75555a0fb664403c7f637ca1026dd"
},
"downloads": -1,
"filename": "pramen_py-1.10.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6e699d1d25b22478cf956ddd0559d0d9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.6.8",
"size": 45703,
"upload_time": "2025-01-13T09:24:57",
"upload_time_iso_8601": "2025-01-13T09:24:57.585992Z",
"url": "https://files.pythonhosted.org/packages/d9/ac/ce61fb929df369c2950959aa87f47960b9b6368c8ae35769986e0e73219c/pramen_py-1.10.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f62b0a78838625fce9511caf1e6c6a58839dd8dec5d597e7288b721e98cc3784",
"md5": "07844a458b5ba679e9bf8a697ad4a6ee",
"sha256": "36d0a59ddea6039e4e4b642e52569d159e597e7a114fa5bdcb68c87d654e35a1"
},
"downloads": -1,
"filename": "pramen_py-1.10.4.tar.gz",
"has_sig": false,
"md5_digest": "07844a458b5ba679e9bf8a697ad4a6ee",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.6.8",
"size": 26786,
"upload_time": "2025-01-13T09:25:00",
"upload_time_iso_8601": "2025-01-13T09:25:00.021285Z",
"url": "https://files.pythonhosted.org/packages/f6/2b/0a78838625fce9511caf1e6c6a58839dd8dec5d597e7288b721e98cc3784/pramen_py-1.10.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-13 09:25:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "AbsaOSS",
"github_project": "pramen",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "pramen-py"
}