[![Python Test](https://github.com/Planet-AI-GmbH/tfaip/actions/workflows/python-test.yml/badge.svg)](https://github.com/Planet-AI-GmbH/tfaip/actions/workflows/python-test.yml)
[![Python Publish](https://github.com/Planet-AI-GmbH/tfaip/actions/workflows/python-publish.yml/badge.svg)](https://github.com/Planet-AI-GmbH/tfaip/actions/workflows/python-publish.yml)
# _tfaip_ - A Generic and Powerful Research Framework for Deep Learning based on Tensorflow
*tfaip* is a Python-based research framework for developing, organizing, and deploying Deep Learning models powered by [Tensorflow](https://www.tensorflow.org/).
It allows implementing both simple and complex scenarios that are structured and highly configurable via parameters that can be modified directly from the command line (read the [docs](https://tfaip.readthedocs.io)).
For example, the [tutorial.full](examples/tutorial/full) scenario for learning MNIST allows modifying not only the graph during training but also other hyperparameters such as the optimizer:
```bash
export PYTHONPATH=$PWD # set the PYTHONPATH so that the examples dir is found
# Change the graph
tfaip-train examples.tutorial.full --model.graph MLP --model.graph.nodes 200 100 50 --model.graph.activation relu
tfaip-train examples.tutorial.full --model.graph MLP --model.graph.nodes 200 100 50 --model.graph.activation tanh
tfaip-train examples.tutorial.full --model.graph CNN --model.graph.filters 40 20 --model.graph.dense 100
# Change the optimizer
tfaip-train examples.tutorial.full --trainer.optimizer RMSprop --trainer.optimizer.beta1 0.01 --trainer.optimizer.clip_global_norm 1
# ...
```
A trained model can then easily be integrated into a workflow to predict on provided `data`:
```python
predictor = TutorialScenario.create_predictor("PATH_TO_TRAINED_MODEL", PredictorParams())
for sample in predictor.predict(data):
    print(sample.outputs)
```
In practice, _tfaip_ follows the rules of object orientation, i.e., the code for a scenario (e.g., image-classification (MNIST), text recognition, NLP, etc.) is organized by implementing classes.
By default, each [`Scenario`](https://tfaip.readthedocs.io/en/latest/doc.scenario.html) must implement [`Model`](https://tfaip.readthedocs.io/en/latest/doc.model.html), and [`Data`](https://tfaip.readthedocs.io/en/latest/doc.data.html).
See [here](examples/tutorial/full) for the complete code to run the example above for MNIST, and [here](examples/tutorial/min) for the minimal setup.
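Conceptually, a scenario wires these classes together. The following skeleton is a simplified, illustrative sketch of that structure only; the class and method names are stand-ins, not _tfaip_'s actual API (see the tutorials for the real interfaces):

```python
from abc import ABC, abstractmethod


class Data(ABC):
    """Defines how raw samples are read and transformed (illustrative)."""

    @abstractmethod
    def create_pipeline(self):
        ...


class Model(ABC):
    """Defines the graph, losses, and metrics (illustrative)."""

    @abstractmethod
    def build_graph(self):
        ...


class Scenario(ABC):
    """Ties a Model and a Data implementation together (illustrative)."""

    @abstractmethod
    def create_model(self) -> Model:
        ...

    @abstractmethod
    def create_data(self) -> Data:
        ...
```

A concrete scenario then subclasses these and fills in the abstract methods; everything else (training loop, checkpointing, CLI wiring) is inherited from the framework.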
## Setup
To set up _tfaip_, create a virtual Python environment (Python 3.7 or later) and install the `tfaip` pip package:
```bash
virtualenv -p python3 venv
source venv/bin/activate
pip install tfaip
pip install tfaip[devel] # to install additional development/test requirements
```
Have a look at the [wiki](https://tfaip.readthedocs.io/en/latest/doc.installation.html) for further setup instructions.
## Run the Tutorial
After the setup has succeeded, launch a training of the tutorial, which is an implementation of the common MNIST scenario:
```bash
export PYTHONPATH=$PWD # set the PYTHONPATH so that the examples dir is found
tfaip-train examples.tutorial.full
# If you have a GPU, select it by specifying its ID
tfaip-train examples.tutorial.full --device.gpus 0
```
## Next Steps
Start by reading the [Minimum Tutorial](examples/tutorial/min); optionally, have a look at the [Full Tutorial](examples/tutorial/full) to see more features.
The [docs](https://tfaip.readthedocs.io/en/latest) provide a full description of `tfaip`.
To set up a _new custom scenario_, copy the [general template](examples/template/general) and implement the abstract methods.
Consider renaming the classes!
Launch the training by providing the path or package name of the new scenario, which _must_ be located in the `PYTHONPATH`!
## Features of _tfaip_
_tfaip_ provides various features that allow designing generic scenarios with maximum flexibility and high performance.
### Code design
* _Fully Object-Oriented_: Implement classes and abstract functions or overwrite any function to extend, adapt, or modify its default functionality.
* _Typing support_: _tfaip_ is fully typed, which simplifies working with an IDE (e.g., use PyCharm!).
* _Parameter handling_: Python's `dataclasses` module is used to set up parameters, which are automatically converted to command-line parameters by our [`paiargparse`](https://github.com/Planet-AI-GmbH/paiargparse) package.
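The idea behind the parameter handling can be illustrated with stdlib tools alone: a `dataclass` declares the parameters, and each field is exposed as a dotted command-line flag. This sketch uses plain `argparse` to mimic what `paiargparse` automates; the parameter class and flag names here are illustrative, not _tfaip_'s:

```python
import argparse
from dataclasses import dataclass, fields


# Hypothetical parameter class -- field names are illustrative.
@dataclass
class OptimizerParams:
    learning_rate: float = 0.001
    clip_global_norm: float = 0.0


def add_dataclass_args(parser: argparse.ArgumentParser, params_cls, prefix: str):
    # Expose every dataclass field as a dotted flag such as
    # --optimizer.learning_rate, mimicking what paiargparse automates.
    for f in fields(params_cls):
        parser.add_argument(f"--{prefix}.{f.name}", type=f.type, default=f.default)
    return parser


parser = add_dataclass_args(argparse.ArgumentParser(), OptimizerParams, "optimizer")
args = parser.parse_args(["--optimizer.learning_rate", "0.01"])
params = OptimizerParams(
    **{f.name: getattr(args, f"optimizer.{f.name}") for f in fields(OptimizerParams)}
)
print(params)  # OptimizerParams(learning_rate=0.01, clip_global_norm=0.0)
```

Because the parser is generated from the dataclass, adding a new hyperparameter is a one-line change to the parameter class.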
### Data-Pipeline
Every scenario requires the setup of a data pipeline to read and transform data.
*tfaip* makes it easy to implement and modify even complex pipelines by defining multiple `DataProcessors`, each of which usually implements a small operation mapping an input sample to an output sample.
For example, one `DataProcessor` loads the data (`input=filename`, `output=image`), another applies normalization rules, yet another applies data augmentation, etc.
The **great advantage** of this setup is that the data processors run in Python and can automatically be parallelized by *tfaip* for a speed-up by setting `run_parallel=True`.
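The processor chain can be pictured with a minimal sketch. The `DataProcessor` interface and the two processors below are simplified stand-ins for illustration, not _tfaip_'s real signatures:

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Sample:
    inputs: Any
    targets: Any = None


class DataProcessor:
    """One small, composable sample-to-sample mapping (illustrative)."""

    def apply(self, sample: Sample) -> Sample:
        raise NotImplementedError


class LoadProcessor(DataProcessor):
    def apply(self, sample: Sample) -> Sample:
        # Pretend the filename string is "loaded" into raw pixel values.
        return Sample(inputs=[ord(c) for c in sample.inputs], targets=sample.targets)


class NormalizeProcessor(DataProcessor):
    def apply(self, sample: Sample) -> Sample:
        m = max(sample.inputs) or 1
        return Sample(inputs=[v / m for v in sample.inputs], targets=sample.targets)


def run_pipeline(processors: List[DataProcessor], sample: Sample) -> Sample:
    # Apply each processor in order, feeding its output to the next one.
    for p in processors:
        sample = p.apply(sample)
    return sample


out = run_pipeline([LoadProcessor(), NormalizeProcessor()], Sample(inputs="img.png"))
print(out.inputs)
```

Since each processor is a plain Python callable on one sample, a pool of workers can run the same chain on many samples independently, which is what the `run_parallel=True` parallelization exploits.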
### Deep-Learning-Features
Since _tfaip_ is based on Tensorflow, the full Tensorflow API is available for designing models, graphs, and even data pipelines.
Furthermore, *tfaip* supports additional common techniques for improving the performance of a Deep-Learning model out of the box:
* Warm-starting (i.e., loading a pretrained model)
* EMA-weights
* Early-Stopping
* Weight-Decay
* Various optimizers and learning-rate schedules
## Contributing
We highly encourage users to contribute their own scenarios and improvements to _tfaip_.
Please read the [contribution guidelines](https://tfaip.readthedocs.io/en/latest/doc.development.html).
## Benchmarks
All timings were obtained on an Intel Core i7 (10th Gen) CPU.
### MNIST
The following table compares the Keras MNIST tutorial to the [Minimum Tutorial](examples/tutorial/min).
The Keras code was adapted to use the same network architecture and hyperparameter settings (batch size of 16, 10 epochs of training).
| Code | Time per Epoch | Train Acc | Val Acc | Best Val Acc |
|:-----|---------------:|----------:|--------:|-------------:|
| Keras | 16 s | 99.65% | 98.24% | 98.60% |
| _tfaip_ | 18 s | 99.76% | 98.66% | 98.66% |
_tfaip_ and Keras yield comparable accuracies, as is to be expected since the actual code for training the graph is fundamentally identical.
_tfaip_ is, however, a bit slower due to some overhead in the input pipeline and additional functionality (e.g., benchmarks, or automatic tracking of the best model).
This overhead is negligible for almost any real-world scenario because, with a clearly larger network architecture, the computation times for inference and backpropagation become the bottleneck.
### Data Pipeline
Integrating pure-Python operations (e.g., numpy) into a `tf.data.Dataset` to apply high-level preprocessing is slow by default, since [tf.data.Dataset.map](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#map) in combination with [tf.py_function](https://www.tensorflow.org/api_docs/python/tf/py_function) does not run in parallel and is therefore blocked by Python's GIL.
_tfaip_ circumvents this issue by providing an (optional) parallelizable input pipeline.
The following table shows the time in seconds for two different tasks:
* PYTHON: applying some pure-Python functions to the data
* NUMPY: applying several numpy operations to the data
| Mode | Task | Threads 1 | Threads 2 | Threads 4 | Threads 6 |
|:---------------------|:--------------|--------------------:|--------------------:|--------------------:|--------------------:|
| tf.py_function | PYTHON | 23.47| 22.78 | 24.38 | 25.76 |
| _tfaip_ | PYTHON | 26.68| 14.48 | 8.11 | 8.13 |
| tf.py_function | NUMPY | 104.10 | 82.78 | 76.33 | 77.56 |
| _tfaip_ | NUMPY | 97.07 | 56.93 | 43.78 | 42.73 |
The PYTHON task clearly shows that `tf.data.Dataset.map` is not able to utilize multiple threads.
The partial speed-up in the NUMPY task presumably occurs because numpy releases the GIL when it drops into its C routines, allowing some parallelism even under `tf.py_function`.
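The underlying effect can be reproduced with the stdlib alone: on CPython, thread-based mapping of a pure-Python function typically gains nothing because of the GIL, while process-based mapping (conceptually what a parallelizable input pipeline relies on) scales with the number of workers. A rough, illustrative sketch:

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def slow_python_op(x: int) -> int:
    # Pure-Python busy work: holds the GIL for its entire duration.
    total = 0
    for i in range(200_000):
        total += (x + i) % 7
    return total


if __name__ == "__main__":
    data = list(range(32))
    # On CPython, the thread pool typically shows no speed-up for this
    # workload, while the process pool does.
    for name, pool_cls in [("threads", ThreadPool), ("processes", Pool)]:
        with pool_cls(4) as pool:
            start = time.perf_counter()
            pool.map(slow_python_op, data)
            print(f"{name}: {time.perf_counter() - start:.2f}s")
```

Exact timings depend on the machine and workload; numpy-heavy functions narrow the gap because numpy releases the GIL internally.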