django-data-schema


- Name: django-data-schema
- Version: 2.1.0
- Home page: https://github.com/ambitioninc/django-data-schema
- Summary: Schemas over dictionaries and models in Django
- Upload time: 2023-06-29 14:52:54
- Author: Wes Kendall
- License: MIT
- Keywords: django, data, schema
- Requirements: No requirements were recorded.

[![Build Status](https://travis-ci.org/ambitioninc/django-data-schema.png)](https://travis-ci.org/ambitioninc/django-data-schema)

# Django Data Schema
Django data schema is a lightweight Django app for defining the schema for a model, dictionary, or list.
Describing a schema for a piece of data allows other applications to easily reference
fields of models or fields in dictionaries (or their related JSON fields).

Django data schema also takes care of all conversions under the hood, such as parsing datetime strings, converting strings to numeric values, using default values when values don't exist, and so on.

1. [Installation](#installation)
2. [Model Overview](#model-overview)
3. [Examples](#examples)

## Installation

```bash
pip install django-data-schema
```

## Model Overview
Django data schema defines three models for building schemas on data. These models are ``DataSchema``,
``FieldSchema``, and ``FieldOptional``.

The ``DataSchema`` model provides a ``model_content_type`` field that points to a Django ``ContentType`` model.
This field represents which object this schema is modeling. If the field is None, it is assumed that
this schema models an object such as a dictionary or list.

After the enclosing ``DataSchema`` has been defined, various ``FieldSchema`` models can reference the main
data schema. ``FieldSchema`` models provide the following attributes:

- ``field_key``: The name of the field. Used to identify a field in a dictionary or model.
- ``field_position``: The position of the field. Used to identify a field in a list.
- ``uniqueness_order``: The order of this field in the uniqueness constraint of the schema. Defaults to None.
- ``field_type``: The type of field. More on the field types below.
- ``field_format``: An optional formatting string for the field. Used differently depending on the field type and documented more below.
- ``default_value``: If the field returns None, this default value will be returned instead.

A ``FieldSchema`` object must specify its data type. While data of a given type can be stored in different formats,
django-data-schema normalizes the data when accessing it through ``get_value``, described below. The available
types are listed in the ``FieldSchemaType`` class. These types are listed here, with the type they normalize to:

- ``FieldSchemaType.DATE``: A Python ``date`` object from the ``datetime`` module. Currently returned as a ``datetime`` object.
- ``FieldSchemaType.DATETIME``: A Python ``datetime`` object from the ``datetime`` module.
- ``FieldSchemaType.INT``: A Python ``int``.
- ``FieldSchemaType.FLOAT``: A Python ``float``.
- ``FieldSchemaType.STRING``: A Python ``str``.
- ``FieldSchemaType.BOOLEAN``: A Python ``bool``.

These fields provide the necessary conversion mechanisms when accessing data via ``FieldSchema.get_value``. Differences in how the ``get_value`` function operates are detailed below.

### Using get_value on DATE or DATETIME fields
The ``get_value`` function has the following behavior on DATE and DATETIME fields:

- If called on a Python ``int`` or ``float`` value, the numeric value will be passed to the ``datetime.utcfromtimestamp`` function.
- If called on a ``string`` or ``unicode`` value, the string will be stripped of all leading and trailing whitespace. If the string is empty, the default value (or None) will be used. If the string is not empty, it will be passed to dateutil's ``parse`` function, unless the ``field_format`` field is specified on the ``FieldSchema`` object, in which case the string is parsed with the ``strptime`` function using that format instead.
- If called on an aware datetime object (or a string with a timezone), it will be converted to naive UTC time.
- If called on None, the default value (or None) is returned.
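
The rules above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: ``normalize_datetime`` is a hypothetical helper, and ``datetime.fromisoformat`` stands in for dateutil's ``parse`` so that the sketch runs on the standard library alone (it therefore accepts only ISO 8601 strings).

```python
from datetime import datetime, timezone


def normalize_datetime(value, field_format=None, default=None):
    """Hypothetical helper that mirrors the DATE/DATETIME rules listed above."""
    if value is None:
        return default
    if isinstance(value, (int, float)):
        # Numeric values are treated as Unix timestamps and read as UTC.
        return datetime.fromtimestamp(value, tz=timezone.utc).replace(tzinfo=None)
    if isinstance(value, str):
        value = value.strip()
        if not value:
            return default
        if field_format:
            value = datetime.strptime(value, field_format)
        else:
            # Assumption: stand-in for dateutil's parse(); ISO 8601 only.
            value = datetime.fromisoformat(value)
    if isinstance(value, datetime) and value.tzinfo is not None:
        # Aware datetimes are converted to naive UTC.
        value = value.astimezone(timezone.utc).replace(tzinfo=None)
    return value


print(normalize_datetime(1396396800))
print(normalize_datetime('2013-04-12 12:12:12', '%Y-%m-%d %H:%M:%S'))
print(normalize_datetime('2014-04-02T05:00:00+05:00'))
```

The last call shows a string carrying a UTC offset being reduced to naive UTC time.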

### Using get_value on INT or FLOAT fields
The ``get_value`` function has the following behavior on INT and FLOAT fields:

- If called on a ``string`` or ``unicode`` value, the string will be stripped of all non-numeric characters except for periods. If the resulting string is blank, the default value (or None) will be returned. If not, the string will then be passed to ``int()`` or ``float()``.
- If called on an ``int`` or ``float``, the value will be passed to the ``int()`` or ``float()`` function.
- No other values can be converted. The ``field_format`` parameter is ignored.
- If called on None, the default value (or None) is returned.
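
As a rough illustration of the numeric rules above, the following sketch strips everything except digits and periods before converting. ``normalize_number`` and its parameters are hypothetical names, and the exact character set the library keeps is an assumption (for example, this sketch also drops a leading minus sign, which the library may treat differently).

```python
import re


def normalize_number(value, to_type=float, default=None):
    """Hypothetical helper that mirrors the INT/FLOAT rules listed above."""
    if value is None:
        return default
    if isinstance(value, str):
        # Assumption: keep only digits and periods, drop everything else.
        value = re.sub(r'[^0-9.]', '', value)
        if not value:
            return default
    # Both cleaned strings and numeric inputs go through int() or float().
    return to_type(value)


print(normalize_number('$15,000,456.23'))
print(normalize_number('1,234 units', to_type=int))
print(normalize_number('n/a', default=0))
```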

### Using get_value on a STRING field
The ``get_value`` function has the following behavior on a STRING field:

- If called on a ``string`` or ``unicode`` value, the string will be stripped of all leading and trailing whitespace. If a ``field_format`` is specified, the string is then matched against that regex. If it matches, the string is returned. If not, the default value (or None) is returned.
- All other types are passed to the ``str()`` function.
- If called on None, the default value (or None) is returned.
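
A minimal sketch of the STRING rules above follows. ``normalize_string`` is a hypothetical helper, and the use of ``re.match`` (which anchors only at the start of the string) is an assumption about how the regex is applied.

```python
import re


def normalize_string(value, field_format=None, default=None):
    """Hypothetical helper that mirrors the STRING rules listed above."""
    if value is None:
        return default
    if not isinstance(value, str):
        # Non-string values are converted with str().
        return str(value)
    value = value.strip()
    # Assumption: field_format is a regex applied with re.match().
    if field_format and not re.match(field_format, value):
        return default
    return value


print(normalize_string('  hello  '))
print(normalize_string('abc', field_format=r'^\d+$', default='invalid'))
print(normalize_string(42))
```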

### Using get_value on a BOOLEAN field
The ``get_value`` function has the following behavior on a BOOLEAN field:

- Bool data types will return True or False.
- Truthy-looking values return True ('t', 'T', 'true', 'True', 'TRUE', 1, '1').
- Falsy-looking values return False ('f', 'F', 'false', 'False', 'FALSE', 0, '0').
- If called on None, the default value (or None) is returned.
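
The BOOLEAN rules above amount to a lookup against two fixed sets of values. The sketch below is illustrative only: ``normalize_boolean`` is a hypothetical helper, and returning the default for unrecognized values is an assumption, since the list above does not say what happens in that case.

```python
TRUE_VALUES = ('t', 'T', 'true', 'True', 'TRUE', 1, '1')
FALSE_VALUES = ('f', 'F', 'false', 'False', 'FALSE', 0, '0')


def normalize_boolean(value, default=None):
    """Hypothetical helper that mirrors the BOOLEAN rules listed above."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    # Assumption: values outside both lists fall back to the default.
    return default


print(normalize_boolean('TRUE'))
print(normalize_boolean(0))
print(normalize_boolean(None, default=False))
```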

## Examples

A data schema can be created like the following:

```python
from data_schema import DataSchema, FieldSchema, FieldSchemaType

user_login_schema = DataSchema.objects.create()
user_id_field = FieldSchema.objects.create(
    data_schema=user_login_schema, field_key='user_id', uniqueness_order=1, field_type=FieldSchemaType.STRING)
login_time_field = FieldSchema.objects.create(
    data_schema=user_login_schema, field_key='login_time', field_type=FieldSchemaType.DATETIME)
```

The above example represents the schema of a user login. In this schema, the user id field provides the uniqueness
constraint of the data. The fields that make up the uniqueness constraint can be retrieved as follows:

```python
unique_fields = user_login_schema.get_unique_fields()
```

The above function returns the unique fields in the order in which they were specified, allowing the user to
generate a unique ID for the data.
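
One way to turn ordered unique fields into an identifier is to collect their values into a tuple. The sketch below is a standalone illustration under stated assumptions: ``build_unique_id`` is a hypothetical helper, and plain field-key strings stand in for the ``FieldSchema`` objects that ``get_unique_fields`` returns.

```python
def build_unique_id(unique_field_keys, data):
    """Hypothetical helper: combine the unique fields' values, in order, into a tuple."""
    return tuple(data[key] for key in unique_field_keys)


# Field keys listed in uniqueness order (stand-ins for FieldSchema objects).
unique_field_keys = ['account', 'user_id']
record = {'user_id': 'my_user_id', 'account': 'acme', 'login_time': 1396396800}

print(build_unique_id(unique_field_keys, record))
```

Because the tuple is hashable, it can be used directly as a dictionary key or set member when de-duplicating records.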

To obtain values of data using the schema, one can use the ``get_value`` function as follows:

```python
data = {
    'user_id': 'my_user_id',
    'login_time': 1396396800,
}

print(login_time_field.get_value(data))
# 2014-04-02 00:00:00
```

Note that the ``get_value`` function looks at the type of data object and uses the proper access method. If the
data object is a ``dict``, it accesses it using ``data[field_key]``. If it is an object, it accesses it with
``getattr(data, field_key)``. A list is accessed as ``data[field_position]``.
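
The access rules just described can be sketched as a small dispatch on the container type. ``access_raw_value`` is a hypothetical helper written for illustration; treating tuples like lists is an assumption.

```python
from types import SimpleNamespace


def access_raw_value(data, field_key, field_position=None):
    """Hypothetical helper: pick the access method based on the type of data."""
    if isinstance(data, dict):
        return data[field_key]
    if isinstance(data, (list, tuple)):
        # Assumption: tuples are accessed by position, like lists.
        return data[field_position]
    # Anything else is treated as an object with attributes.
    return getattr(data, field_key)


print(access_raw_value({'time': 'from dict'}, 'time'))
print(access_raw_value(['value', 'from list'], 'time', field_position=1))
print(access_raw_value(SimpleNamespace(time='from object'), 'time'))
```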

Here's another example, which parses a datetime string stored in a list using a format string.

```python
# data_schema is assumed to be an existing DataSchema instance.
string_time_field_schema = FieldSchema.objects.create(
    data_schema=data_schema, field_key='time', field_position=1,
    field_type=FieldSchemaType.DATETIME, field_format='%Y-%m-%d %H:%M:%S')

print(string_time_field_schema.get_value(['value', '2013-04-12 12:12:12']))
# 2013-04-12 12:12:12
```

When parsing numerical fields, Django data schema strips out any non-numeric characters, which makes it possible to read currency amounts and other formatted numbers.

```python
# data_schema is assumed to be an existing DataSchema instance.
revenue_field_schema = FieldSchema.objects.create(
    data_schema=data_schema, field_key='revenue', field_type=FieldSchemaType.FLOAT)

print(revenue_field_schema.get_value({'revenue': '$15,000,456.23'}))
# 15000456.23
```

Note that ``FieldSchema`` objects have an analogous ``set_value`` method for setting the value of a field.
The ``set_value`` method does not do any data conversions, so when calling this method, be sure to use a value
that is in the correct format.

            
