# Validating config
Simple, extendable configuration objects
## Quick start
### Installation
As a *package*:
- ```bash
  pip install valconfig
  ```
Or as *inlined source*:
- Copy the `valconfig.py` file into your project and install [Pydantic](https://pydantic-docs.helpmanual.io/).
### Example usage
```python
# config.py
from valconfig import ValConfig
from pydantic import HttpUrl
from scityping.numpy import Array

class Config(ValConfig):
    n_units: int=5
    connectivites: Array[float, 2]  # 2D array of floats
    url: HttpUrl="example.com"

config = Config()
```
For more detailed usage, see the [documentation](https://validating-config.readthedocs.io/).
## Motivation
Many types of projects make use of some system to set configuration options,
and while the type of project may influence that system, its basic requirements
are pretty much always the same:
- It should be easy to read.
- It should be easy to extend.
*Validating config* came about to fulfill these needs for **scientific programming**,
which is a bit of an extreme case: On the one hand, scientists are not professional programmers,
so the system needs to be so simple that it needs no documentation at all.
On the other hand, they are also constantly developing new methods, and so
also need to be able to add new configuration options and new variable types.
Moreover, code is often shared (at least within a group) between users with different machines and different needs. A key use of config objects in a scientific context is specifying machine-local settings, like enabling the GPU, defining paths for data and figures, or defining MPI flags.
Thus we needed a way to
- Set version-controlled defaults that are packaged with the repository.
- Make it easy to know what the configurable options are.
- Make it easy to change those options, without touching the defaults.
And we want to do that with nothing more complicated than a `git push`.
In addition, projects often depend on other projects, so we need a way to set
*their* options as well. Ideally all within the same configuration file.
Finally, not all parameters are strings or simple numerical values. Some might need to
be cast to NumPy arrays; others might need to be pre-processed, or checked for
consistency with other parameters, or simply inferred from other parameters.
These are all roles that conceptually belong in a configuration parser, not the core parts of our program.
As scientist programmers, we already have the bad habit of mixing up our programming logic into a big ball of spaghetti – let’s give ourselves one less reason to do so.
This adds two additional desiderata:
- Support for composition: have one configuration for multiple packages.
- Validation of values based on provided type information.
## What a *ValidatingConfig* provides
- A mechanism to define simple lists of configuration options with *zero boilerplate*:
just list the parameter names, their types, and optionally their default.
- *Built-in validation* provided by [Pydantic](https://pydantic-docs.helpmanual.io/).
- Additional validation for *scientific types*, when used in conjunction with [Scityping](https://scityping.readthedocs.io/).
- The ability to *compose* configuration objects from multiple packages into a
single central configuration object – **even when those packages don’t use _ValidatingConfig_**. This ensures all packages have access to a single source of truth.
- An optional mechanism to autogenerate a file for users’ local configuration,
  with usage instructions embedded in the file.
  + This is done by defining a class variable [`__local_config_filename__`](https://validating-config.readthedocs.io/en/latest/Usage.html#config-class-options) and is mostly useful for project code. For library code it is usually best to keep this undefined.
- The ability to *extend* the functionality of a config object with all the
  functionality Python has to offer: read environment variables with `os.getenv`,
  get command line parameters with `sys.argv`, etc.
  + Other packages typically need to add features like these because their config objects are
    more limited. We avoid this by defining `config` with a standard class.
    This keeps this package lean, and your code simple.
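
To picture the last point: because a config is an ordinary class, a default can come from any Python expression. The sketch below uses a plain `dataclass` as a stand-in for a `ValConfig` subclass so that it runs without extra dependencies; the `env_bool` helper and the `MYPROJECT_*` variable names are made up for illustration and are not part of *valconfig*.

```python
import os
from dataclasses import dataclass, field


def env_bool(name: str, default: bool) -> bool:
    """Read a boolean flag from the environment, falling back to `default`."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass
class Config:
    # Hypothetical options: each default is computed when the object is created,
    # so machine-local environment variables take precedence over the literals.
    use_gpu: bool = field(
        default_factory=lambda: env_bool("MYPROJECT_USE_GPU", False))
    n_units: int = field(
        default_factory=lambda: int(os.getenv("MYPROJECT_N_UNITS", "5")))


config = Config()
```

The same idea should carry over to a real config class, where the value read from the environment would additionally go through type validation.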
Other features
- Zero onboarding:
  + *Validating config* uses only standard Python: No custom function calls, no boilerplate.
    If you know modern Python, you already know how to use this module.
  + Just define a `Config` class with the parameter names and their types.
  + Use the `Config` class as any other Python class.
- Define your own custom types with either *Pydantic* or *Scityping*.[^new-types]
- *Pydantic* provides [validators](https://docs.pydantic.dev/usage/validators/) which can be used to add per-field
validation and pre-processing. These are defined with plain Python,
not a restricted minilanguage or a sanitized `eval`.[^pydantic-v1]
- Use standard [properties](https://docs.python.org/3/library/functions.html?highlight=property#property) to define computed (inferred) values.
- Use any file format.
  + The default uses stdlib’s `configparser`, but you can override the method
    used to read files.
- Relative paths in config files are correctly resolved.
- Automatic discovery of the local configuration file.
  + Just set the class variable `__local_config_filename__`, and `ValConfig`
    will recurse up the directory tree until it finds a matching file.
  + This is especially convenient to accommodate multiple users who organize
    their work differently.
- Hierarchical: Organize your parameters into categories by defining nested classes.

  ```python
  from valconfig import ValConfig

  class Config(ValConfig):

      class figures:
          width: float
          format: str
          class curves:
              colormap: str | list[str]
          class heatmaps:
              colormap: str | list[str]

      class run:
          cache: bool

      class model:
          n_units: int
  ```

- Composable
  + Want a separate config object for each of your project’s subpackages?
    Just import them.
  + Want to combine them into a single root config file?
    Just import them.

    ```python
    from valconfig import ValConfig
    from .subpkg1 import config as config1
    from .subpkg2 import config as config2

    class Config(ValConfig):
        pkg1: config1
        pkg2: config2
    ```

  + Does your project depend on another?
    Define the parameters you need to modify and add a function which updates
    the config with those parameters.

    ```python
    # Pydantic v1-style validator; with Pydantic v2 installed,
    # the v1 API is available under `pydantic.v1`.
    from pydantic import validator
    from valconfig import ValConfig
    from other_package import config as OtherConfig

    class Config(ValConfig):
        # List the parameters you want to modify
        other_package_option1: int
        other_package_option2: float
        # Include the 3rd party config
        other_package: OtherConfig

        @validator("other_package")
        def update_other_package_options(cls, other_config, values):
            # The exact details here will depend on the 3rd party package
            other_config.opt1 = values["other_package_option1"]
            other_config.opt2 = values["other_package_option2"]
            return other_config  # v1 validators must return the field value
    ```

[^new-types]: *Scityping* was developed as an extension of *Pydantic* to allow
the use of (abstract) base classes in type definitions, for example defining
a field of type `Model` which accepts any subclass of `Model`. (In plain
Pydantic values are always *coerced* to the target type.) Whether it is best
to define new types with either *Scityping* or *Pydantic* largely depends on
whether this use as abstract classes is needed.
[^pydantic-v1]: Currently validators need to be configured using the standards of Pydantic v1. Supporting Pydantic v2 validators is certainly something I would like to do eventually.
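
The automatic discovery mentioned in the feature list amounts to walking up the directory tree until a matching file turns up. The following is a minimal sketch of that idea with `pathlib`, not *valconfig*’s actual code: the function name `find_local_config` and the file name `local.cfg` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_local_config(filename: str, start: Optional[Path] = None) -> Optional[Path]:
    """Return the first `filename` found in `start` or one of its parents.

    Returns None when the filesystem root is reached without a match.
    """
    current = (start or Path.cwd()).resolve()
    # Check the starting directory first, then each parent in turn.
    for directory in (current, *current.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


# Hypothetical usage: look for `local.cfg` starting from the working directory.
local_config_path = find_local_config("local.cfg")
```

Searching upwards lets each user keep the local file wherever it suits their layout (next to a notebook, at the project root, or in a shared parent directory) without changing any code.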
## In relation to other projects
What’s wrong with the gazillion other config file projects out there?
Why not use one of those?
In short, because we can now do in a few hundred lines of code what used to take thousands.
Most existing packages for config files were developed many years ago, before
the standardization of type annotations. They can all perform the basic task of
converting a file to a Python dictionary of strings, but fulfilling all of our
aforementioned desiderata was difficult without creating a bloated package.
Understandably therefore, they focused on the features required by their own use
cases, which means that I found them all unsatisfactory in some respect:
- Some introduce a new API with custom functions or decorators to define parameters. This makes it more difficult to learn and extend. ([vyper-config](https://github.com/alexferl/vyper), [hydra](https://hydra.cc/docs/intro/))
- Many provide no validation ([one-config](https://pypi.org/project/one-config/), [config2](https://github.com/grimen/python-config2)).
- When validation functions are provided, the target type is often not specified
  by the config object, but in the calling code – if your configuration library only allows you to define
  the *name* but not the *type* of the parameters, that’s only 50% of the information and 20% of the work.
  ([configparser](https://docs.python.org/3/library/configparser.html), [vyper-config](https://github.com/alexferl/vyper))
- Of the examples I know that provide validation at the config level,
  the set of supported types is very limited and basically hard-coded. ([OmegaConf](https://omegaconf.readthedocs.io/en/latest/structured_config.html), [CFG API](https://docs.red-dove.com/cfg/python.html))
- Some approaches even define their own file formats, substantially raising the barrier to adoption. ([CFG API](https://docs.red-dove.com/cfg/python.html))
- The package [configobj](https://configobj.readthedocs.io/en/latest/index.html) is probably the most closely aligned with our goals: it provides a simple, declarative format for both parameter and type specification, as well as key features like hierarchical parameters. (But not, to my knowledge, the configuration of subpackages.) It is also mature, which unfortunately means that it pre-dates widespread use of validation libraries and therefore must package its own custom validation library.
  + On the other hand, *configobj* might make a great substitute to get around some of the limitations of the builtin `configparser` with regard to reading files.
## A simple implementation
With the standardization of type annotations in Python 3.6–3.8, and the availability of classes like [Pydantic](https://pydantic-docs.helpmanual.io/)’s `BaseModel`, defining classes with validation
logic has become a breeze, and converting a `BaseModel` into a full-featured
config parser basically only needs two things:
- functionality for reading values from files;
- ability to compose configuration classes.
In effect therefore, `ValConfig` is just a subclass of `BaseModel` with some
simple methods for reading config files. The biggest difference is that each
subclass of `ValConfig` is made a *singleton*. We use this pattern for exactly
the same reasons as [one-config](https://pypi.org/project/one-config/): it
solves a host of corner cases, makes it trivial to support composing configs,
and ensures that config objects are a single source of truth.
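
For readers unfamiliar with the pattern, here is one common way to make every subclass a singleton in plain Python. It is a sketch of the general technique, not necessarily how `ValConfig` implements it; `SingletonMeta`, `Config` and `SubConfig` are illustrative names.

```python
class SingletonMeta(type):
    """Metaclass that keeps at most one instance per class."""

    _instances: dict = {}

    def __call__(cls, *args, **kwargs):
        # Create the instance on first use; afterwards always return the
        # cached one (arguments passed to later calls are ignored).
        if cls not in SingletonMeta._instances:
            SingletonMeta._instances[cls] = super().__call__(*args, **kwargs)
        return SingletonMeta._instances[cls]


class Config(metaclass=SingletonMeta):
    def __init__(self, n_units: int = 5):
        self.n_units = n_units


class SubConfig(Config):
    """Each subclass gets its own, separate single instance."""


a = Config()
b = Config()
a.n_units = 10  # visible through `b` as well, since both names refer to one object
```

Because every module that instantiates `Config` gets the same object, a change made in one place is seen everywhere, which is what makes the config a single source of truth.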
By relying on Pydantic, `ValConfig` can be extremely lean, while still providing
functionality that is on par with the much heavier packages listed above.
Pydantic itself is highly mature and actively maintained.
> The `ValConfig` class and all its utility functions clock in at less than
> 500 lines of code and are contained in a single, easily maintained module. If
> desired, the module can even be included as part of your project’s source
> code, thus removing *valconfig* as a package dependency and giving you
> full control over the config object.
Finally, *using ValConfig does not preclude using another config file parser*.
Indeed, the provided implementation uses `configparser` to parse files mostly
because it is already installed as part of the standard library. The execution flow
is basically
1. Read config file(s) with `configparser`.
2. Validate with `Pydantic`.
To use a different config file parser, just subclass `ValConfig` and override
the method `read_cfg_file()`.
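
The two-step flow above can be sketched with the standard library alone. This is an illustration under stated assumptions, not *valconfig*’s implementation: the `ModelOptions` class, the `load_options` helper and the `[model]` section are hypothetical, and a plain cast to the annotated type stands in for Pydantic’s validation. The actual signature of `read_cfg_file()` is not reproduced here; check the package documentation before overriding it.

```python
import configparser
from typing import get_type_hints


class ModelOptions:
    """Hypothetical option set: names, types and defaults, nothing else."""
    n_units: int = 5
    learning_rate: float = 0.01
    label: str = "baseline"


def load_options(text: str, section: str, spec: type) -> dict:
    """Parse `text` with configparser, then cast values to the annotated types."""
    # Step 1: read the file contents; every value comes back as a string.
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = dict(parser[section]) if parser.has_section(section) else {}

    # Step 2: "validate" by casting each value to the type declared on `spec`.
    hints = get_type_hints(spec)
    unknown = set(raw) - set(hints)
    if unknown:
        raise ValueError(f"Unknown options: {sorted(unknown)}")
    options = {}
    for name, target in hints.items():
        if name in raw:
            options[name] = target(raw[name])  # raises ValueError on bad input
        else:
            options[name] = getattr(spec, name)  # fall back to the class default
    return options


options = load_options("[model]\nn_units = 12\n", "model", ModelOptions)
```

Swapping the file format only touches step 1: any parser that yields a mapping of names to raw values could feed the same validation step.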
Raw data
{
"_id": null,
"home_page": null,
"name": "valconfig",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "config, configuration, validation, hierarchical, scientific computing, reproducibility",
"author": null,
"author_email": "Alexandre Ren\u00e9 <alex.c.rene@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/08/75/e82e93fda83ee23d4fb7e5a643b9b7b3576881a547d692ae43dbb69e74d3/valconfig-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MPL 2.0",
"summary": "Simple, extendable configuration files, using Pydantic for validation.",
"version": "0.1.2",
"project_urls": {
"Bug Tracker": "https://github.com/alcrene/valconfig/issues",
"Documentation": "https://validating-config.readthedocs.io/",
"Source": "https://github.com/alcrene/valconfig"
},
"split_keywords": [
"config",
" configuration",
" validation",
" hierarchical",
" scientific computing",
" reproducibility"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "13d9157d1eb7357cbcee1df9982e2e4f74caf8efe26a2fda113af5c74c28d0a0",
"md5": "51c6c4a75d5180ea8076240de1a0ad9e",
"sha256": "b6fcd03363b012e98924f02deb673924b2b51ba06d3a5d7af4897b33a0486f16"
},
"downloads": -1,
"filename": "valconfig-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "51c6c4a75d5180ea8076240de1a0ad9e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 32709,
"upload_time": "2024-08-09T20:54:14",
"upload_time_iso_8601": "2024-08-09T20:54:14.498482Z",
"url": "https://files.pythonhosted.org/packages/13/d9/157d1eb7357cbcee1df9982e2e4f74caf8efe26a2fda113af5c74c28d0a0/valconfig-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0875e82e93fda83ee23d4fb7e5a643b9b7b3576881a547d692ae43dbb69e74d3",
"md5": "56a57d509dfd8da47a85eff1f24995f0",
"sha256": "b9a74369cc43aab05e28d767a8d486f1bf84cf6808216443964b442c6959e9bb"
},
"downloads": -1,
"filename": "valconfig-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "56a57d509dfd8da47a85eff1f24995f0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 50987,
"upload_time": "2024-08-09T20:54:15",
"upload_time_iso_8601": "2024-08-09T20:54:15.925249Z",
"url": "https://files.pythonhosted.org/packages/08/75/e82e93fda83ee23d4fb7e5a643b9b7b3576881a547d692ae43dbb69e74d3/valconfig-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-09 20:54:15",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "alcrene",
"github_project": "valconfig",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "valconfig"
}