Name | dbml-to-fides |
Version | 1.0.0b1 |
home_page | |
Summary | Interoperability for DBML and Fides dataset manifests |
upload_time | 2023-06-27 13:11:33 |
maintainer | |
docs_url | None |
author | |
requires_python | <4,>=3.8 |
license | Copyright 2023 Ee Durbin Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | fides, dbml |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# dbml-to-fides
This tool converts [DBML](https://dbml.dbdiagram.io/docs/#project-definition)
to [Fides dataset manifests](https://ethyca.github.io/fideslang/resources/dataset/).
It can optionally merge the result from DBML into an existing
Fides dataset manifest.
Used together in continuous integration, these features help keep datasets
up to date with the latest schema changes.
## Usage
### Basic
Given a sample DBML in `sample.dbml`:
```dbml
Table users {
  id integer [primary key]
  username varchar
  role varchar
  created_at timestamp
}

Table posts {
  id integer [primary key]
  title varchar
  body text [note: 'Content of the post']
  user_id integer
  status post_status
  created_at timestamp
}

Enum post_status {
  draft
  published
  private [note: 'visible via URL only']
}

Ref: posts.user_id > users.id // many-to-one
```
`dbml-to-fides` will output what it can infer from the DBML file as a Fides
dataset:
```sh
$ dbml-to-fides sample.dbml
dataset:
- name: public
  collections:
  - name: users
    description: Users
    fields:
    - name: id
      fides_meta:
        primary_key: true
    - name: username
    - name: role
    - name: created_at
  - name: posts
    description: All the content you crave
    fields:
    - name: id
      fides_meta:
        primary_key: true
    - name: title
    - name: body
      description: Content of the post
    - name: user_id
      fides_meta:
        references:
        - dataset: public
          field: users.id
          direction: to
    - name: status
    - name: created_at
```
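To make the shape of that output concrete, here is a minimal Python sketch of how table and column information could be mapped onto the Fides dataset structure. It is not the tool's actual implementation: the simplified `TABLES` input and the helper names `column_to_field` and `tables_to_dataset` are hypothetical, and only the output keys (`fides_meta`, `primary_key`, `references`, and so on) are taken from the example above.

```python
from typing import Any, Dict, List

# Hypothetical, simplified stand-in for parsed DBML: one dict per table.
TABLES: List[Dict[str, Any]] = [
    {
        "name": "users",
        "columns": [
            {"name": "id", "primary_key": True},
            {"name": "username"},
        ],
    },
    {
        "name": "posts",
        "columns": [
            {"name": "id", "primary_key": True},
            {"name": "body", "note": "Content of the post"},
            {"name": "user_id", "ref": "users.id"},
        ],
    },
]


def column_to_field(column: Dict[str, Any], dataset_name: str) -> Dict[str, Any]:
    """Build one Fides field entry from a simplified column description."""
    field: Dict[str, Any] = {"name": column["name"]}
    if "note" in column:
        # Column notes become field descriptions.
        field["description"] = column["note"]
    meta: Dict[str, Any] = {}
    if column.get("primary_key"):
        meta["primary_key"] = True
    if "ref" in column:
        # A many-to-one Ref becomes a reference pointing "to" the target field.
        meta["references"] = [
            {"dataset": dataset_name, "field": column["ref"], "direction": "to"}
        ]
    if meta:
        field["fides_meta"] = meta
    return field


def tables_to_dataset(
    tables: List[Dict[str, Any]], dataset_name: str = "public"
) -> Dict[str, Any]:
    """Wrap all tables as collections of a single Fides dataset."""
    collections = [
        {
            "name": table["name"],
            "fields": [column_to_field(c, dataset_name) for c in table["columns"]],
        }
        for table in tables
    ]
    return {"dataset": [{"name": dataset_name, "collections": collections}]}
```

Calling `tables_to_dataset(TABLES)` returns a nested dictionary with the same layout as the YAML above; fields without a note, primary key, or reference carry only a `name`.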
### Merging with existing Fides dataset
If you have an existing Fides dataset in `.fides/sample_dataset.yml`:
```yaml
dataset:
- fides_key: sample_dataset
  organization_fides_key: default_organization
  name: public
  description: Sample dataset for my system
  meta: null
  data_categories: []
  data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
  retention: 30 days after account deletion
  collections:
  - name: users
    description: User information
    fields:
    - name: id
      fides_meta:
        primary_key: true
      description: User's unique ID
      data_categories:
      - user.unique_id
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: username
      description: User's username
      data_categories:
      - user.name
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
      retention: Account termination
    - name: role
      description: User's system level role/privilege
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: created_at
      description: User's creation timestamp
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
  - name: posts
    description: Post information
    fields:
    - name: id
      fides_meta:
        primary_key: true
      description: Post's unique ID
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: title
      description: Post's title
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: body
      description: Post's body
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: user_id
      fides_meta:
        references:
        - dataset: public
          field: users.id
          direction: to
      description: Post creator's unique User ID
      data_categories:
      - user.unique_id
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: status
      description: Post's status
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: created_at
      description: Post's creation timestamp
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
```
`dbml-to-fides` can be used with the
`--base-dataset` option to merge the results together.
In this case, the existing dataset already covers everything in the DBML, so there are no differences:
```sh
$ diff -u .fides/sample_dataset.yml <(dbml-to-fides sample.dbml --base-dataset .fides/sample_dataset.yml)
$
```
If we introduce a change to the DBML:
```diff
@@ -3,6 +3,7 @@ Table users {
   username varchar
   role varchar
   created_at timestamp
+  social_security_number varchar
 }
 
 Table posts {
```
Running the diff again then shows the new field added to the Fides dataset:
```shell
$ diff -u .fides/sample_dataset.yml <(dbml-to-fides sample.dbml --base-dataset .fides/sample_dataset.yml)
--- .fides/sample_dataset.yml 2023-05-22 15:39:24
+++ /dev/fd/63 2023-05-22 15:40:07
@@ -34,6 +34,7 @@
       data_categories:
       - system.operations
       data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
+    - name: social_security_number
   - name: posts
     description: Post information
     fields:
```
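The behaviour shown in this diff (existing annotations are kept, and fields found only in the DBML are appended) can be illustrated with a small Python sketch. This is an assumption about the observable result, not the tool's real merge algorithm; the function name `merge_dataset` is hypothetical, and the sketch deliberately does nothing about fields that exist in the base dataset but not in the DBML.

```python
import copy
from typing import Any, Dict


def merge_dataset(base: Dict[str, Any], generated: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` extended with collections and fields that
    appear only in `generated`. Existing entries are left untouched."""
    merged = copy.deepcopy(base)
    collections_by_name = {c["name"]: c for c in merged["collections"]}

    for gen_collection in generated["collections"]:
        collection = collections_by_name.get(gen_collection["name"])
        if collection is None:
            # Whole table is new: take it as generated.
            merged["collections"].append(copy.deepcopy(gen_collection))
            continue
        known_fields = {f["name"] for f in collection["fields"]}
        for gen_field in gen_collection["fields"]:
            if gen_field["name"] not in known_fields:
                # New column: append it without any annotations.
                collection["fields"].append(copy.deepcopy(gen_field))
    return merged


# Illustrative input: an annotated base and a bare dataset inferred from DBML.
BASE = {
    "name": "public",
    "collections": [
        {
            "name": "users",
            "description": "User information",
            "fields": [
                {"name": "id", "data_categories": ["user.unique_id"]},
            ],
        }
    ],
}
GENERATED = {
    "name": "public",
    "collections": [
        {
            "name": "users",
            "fields": [{"name": "id"}, {"name": "social_security_number"}],
        }
    ],
}

MERGED = merge_dataset(BASE, GENERATED)
```

Here `MERGED` keeps the `data_categories` already recorded for `id` and gains a bare `social_security_number` field, which then needs to be annotated by hand.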
### File output
To write the output to a file,
add the `--output-file` flag:
```shell
$ dbml-to-fides sample.dbml --base-dataset .fides/sample_dataset.yml --output-file .fides/sample_dataset.yml
$ git diff
diff --git a/.fides/sample_dataset.yml b/.fides/sample_dataset.yml
index 594cee4..edc3141 100644
--- a/.fides/sample_dataset.yml
+++ b/.fides/sample_dataset.yml
@@ -34,6 +34,7 @@ dataset:
       data_categories:
       - system.operations
       data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
+    - name: social_security_number
   - name: posts
     description: Post information
     fields:
```
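For the continuous integration use case mentioned in the introduction, a check could fail the build whenever the committed dataset no longer matches what the tool generates. The following Python sketch is one possible way to do that, under the assumption that `dbml-to-fides` is on the `PATH` and prints the merged dataset to standard output as in the examples above. The function names `dataset_diff` and `check_dataset` are hypothetical; only the `--base-dataset` option comes from this README.

```python
import difflib
import subprocess
import sys
from pathlib import Path
from typing import List


def dataset_diff(committed: str, generated: str, path: str) -> List[str]:
    """Return unified-diff lines between the committed and generated manifests."""
    return list(
        difflib.unified_diff(
            committed.splitlines(keepends=True),
            generated.splitlines(keepends=True),
            fromfile=path,
            tofile=f"{path} (generated)",
        )
    )


def check_dataset(dbml_path: str, dataset_path: str) -> int:
    """Run dbml-to-fides and return 1 if the committed dataset is stale, else 0."""
    # Same invocation as in the merge example; the result is read from stdout.
    result = subprocess.run(
        ["dbml-to-fides", dbml_path, "--base-dataset", dataset_path],
        check=True,
        capture_output=True,
        text=True,
    )
    diff = dataset_diff(Path(dataset_path).read_text(), result.stdout, dataset_path)
    sys.stdout.writelines(diff)
    return 1 if diff else 0
```

A CI job could end with `sys.exit(check_dataset("sample.dbml", ".fides/sample_dataset.yml"))`, so that any schema change without a matching dataset update produces a non-zero exit code and a readable diff in the job log.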
### Initial generation
If you do not have an existing Fides dataset, the `--include-fides-keys` flag
creates a more "fleshed out" version of a
[Fides dataset](https://ethyca.github.io/fideslang/resources/dataset/)
that includes all keys. See the [docs](https://ethyca.github.io/fideslang/resources/dataset/)
for what each field can or should be populated with.
```shell
$ dbml-to-fides sample.dbml --include-fides-keys
dataset:
- fides_key: null
  name: public
  description: null
  organization_fides_key: null
  meta: {}
  third_country_transfers: []
  joint_controller: []
  retention: null
  data_categories: []
  data_qualifiers: []
  collections:
  - name: users
    description: Users
    data_categories: []
    data_qualifiers: []
    retention: null
    fields:
    - name: id
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
      fides_meta:
        primary_key: true
    - name: username
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: role
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: created_at
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: social_security_number
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
  - name: posts
    description: All the content you crave
    data_categories: []
    data_qualifiers: []
    retention: null
    fields:
    - name: id
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
      fides_meta:
        primary_key: true
    - name: title
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: body
      description: Content of the post
      data_categories: []
      data_qualifier: null
      retention: null
    - name: user_id
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
      fides_meta:
        references:
        - dataset: public
          field: users.id
          direction: to
    - name: status
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: created_at
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
```
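After an initial generation like this, most fields still have an empty `data_categories` list. As a rough aid for working through them, the Python sketch below walks a dataset manifest that has already been loaded into a dictionary (for example with a YAML library, which is an assumption and not shown here) and lists the fields that still lack categories. The function name `unannotated_fields` is hypothetical and not part of `dbml-to-fides`.

```python
from typing import Any, Dict, Iterator, Tuple


def unannotated_fields(manifest: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (dataset, collection, field) names for fields whose
    `data_categories` key is missing, null, or an empty list."""
    for dataset in manifest.get("dataset", []):
        for collection in dataset.get("collections", []):
            for field in collection.get("fields", []):
                if not field.get("data_categories"):
                    yield dataset["name"], collection["name"], field["name"]


# Small illustrative manifest in the same layout as the output above.
MANIFEST = {
    "dataset": [
        {
            "name": "public",
            "collections": [
                {
                    "name": "users",
                    "fields": [
                        {"name": "id", "data_categories": ["user.unique_id"]},
                        {"name": "social_security_number", "data_categories": []},
                    ],
                }
            ],
        }
    ]
}

TODO = list(unannotated_fields(MANIFEST))
```

With this input, `TODO` contains a single entry for `users.social_security_number`, the one field that has no data categories yet.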
Raw data
{
"_id": null,
"home_page": "",
"name": "dbml-to-fides",
"maintainer": "",
"docs_url": null,
"requires_python": "<4,>=3.8",
"maintainer_email": "",
"keywords": "fides,dbml",
"author": "",
"author_email": "Ee Durbin <ee.opensource@pyfound.org>",
"download_url": "https://files.pythonhosted.org/packages/3e/eb/314007dcd2fb85e5e86a79e11de46d78cb141135e4d43869d02963872d4d/dbml-to-fides-1.0.0b1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Copyright 2023 Ee Durbin Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \u201cSoftware\u201d), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \u201cAS IS\u201d, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
"summary": "Interoperatbility for DBML and Fides dataset manifests",
"version": "1.0.0b1",
"project_urls": {
"Homepage": "https://github.com/ewdurbin/dbml-to-fides",
"Source": "https://github.com/ewdurbin/dbml-to-fides"
},
"split_keywords": [
"fides",
"dbml"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1f1f2d6e198117cb0480a6dc5aff28b177b6ff6dc09c20935506b310fb807816",
"md5": "317cb5e9793da976928d6e440a32b139",
"sha256": "33b32b02d683610f0c3bdd736a9a93a7b12542d01119886fef9ec66641d61ede"
},
"downloads": -1,
"filename": "dbml_to_fides-1.0.0b1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "317cb5e9793da976928d6e440a32b139",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4,>=3.8",
"size": 7079,
"upload_time": "2023-06-27T13:11:32",
"upload_time_iso_8601": "2023-06-27T13:11:32.247251Z",
"url": "https://files.pythonhosted.org/packages/1f/1f/2d6e198117cb0480a6dc5aff28b177b6ff6dc09c20935506b310fb807816/dbml_to_fides-1.0.0b1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3eeb314007dcd2fb85e5e86a79e11de46d78cb141135e4d43869d02963872d4d",
"md5": "727fff5b80ccbcdd4632a3a74653ff26",
"sha256": "96763925e957606299a8c0493017e27531524cda28e590d69972467af1210bca"
},
"downloads": -1,
"filename": "dbml-to-fides-1.0.0b1.tar.gz",
"has_sig": false,
"md5_digest": "727fff5b80ccbcdd4632a3a74653ff26",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4,>=3.8",
"size": 17271,
"upload_time": "2023-06-27T13:11:33",
"upload_time_iso_8601": "2023-06-27T13:11:33.605841Z",
"url": "https://files.pythonhosted.org/packages/3e/eb/314007dcd2fb85e5e86a79e11de46d78cb141135e4d43869d02963872d4d/dbml-to-fides-1.0.0b1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-06-27 13:11:33",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ewdurbin",
"github_project": "dbml-to-fides",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"tox": true,
"lcname": "dbml-to-fides"
}