audite


Nameaudite JSON
Version 0.4.5 PyPI version JSON
download
home_pagehttps://github.com/chrisfrank/audite
SummaryInstant data auditing for SQLite
upload_time2022-12-10 14:28:35
maintainer
docs_urlNone
authorChris Frank
requires_python>=3.7, <4
license
keywords sqlite audit history change-data-capture change feed
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # Audite: instant change feeds for SQLite

Audite uses SQL triggers to add a transactional change feed to any SQLite
database. It gives you a totally-ordered audit history of all changes to your
data, without touching your application code or running an extra process.

## Use cases
- Track who changed what, when
- Restore previous versions of changed rows
- Replicate data to external systems by streaming the change feed

## Quick start

Let's add a changefeed to `todo.db`, a SQLite database with the following schema:

```sh
sqlite3 todo.db "CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT)"
sqlite3 todo.db "CREATE TABLE task (
    name TEXT PRIMARY KEY,
    project_id INTEGER REFERENCES project (project_id),
    done BOOLEAN NOT NULL DEFAULT FALSE)"
```

1. **Install audite on your sytem**
    ```sh
    python3 -m pip install audite
    ```
2. **Enable audite on your database**
    ```sh
    python3 -m audite todo.db
    ```

Done! From now on, any process can `INSERT`, `UPDATE`, and `DELETE` from your
database as usual, and audite's triggers will store the results as [change
events](#event-schema) in the `audite_changefeed` table. All (and only)
committed transactions will appear in the change feed.

## Modfying data and querying the change feed

We'll add a project and two tasks...

```sh
sqlite3 todo.db "INSERT INTO project (id, name) VALUES (1, 'goals')"
sqlite3 todo.db "INSERT INTO task (project_id, name) VALUES (1, 'try audite'), (1, 'profit')"
```

cross one task off the list...
```sh
sqlite3 todo.db "UPDATE task SET done = TRUE WHERE name = 'try audite'"
```

and cancel the other:
```sh
sqlite3 todo.db "DELETE FROM task WHERE name = 'profit'"
```

### Now let's see what changed:
```sh
sqlite3 todo.db "SELECT * FROM audite_changefeed ORDER BY id"
```

You should get back something like this:
```
id  source   subject     type             time        specversion  data                                                                                                     
--  -------  ----------  ---------------  ----------  -----------  ---------------------------------------------------------------------------------------------------------
1   project  1           project.created  1669730365  1.0          {"new":{"name":"goals","id":1}}                                                                          
2   task     try audite  task.created     1669730374  1.0          {"new":{"project_id":1,"done":0,"name":"try audite"}}                                                    
3   task     profit      task.created     1669730374  1.0          {"new":{"project_id":1,"done":0,"name":"profit"}}                                                        
4   task     try audite  task.updated     1669730381  1.0          {"new":{"project_id":1,"done":1,"name":"try audite"},"old":{"project_id":1,"done":0,"name":"try audite"}}
5   task     profit      task.deleted     1669730386  1.0          {"old":{"project_id":1,"done":0,"name":"profit"}}                                                        
```

## Event Schema
The event schema follows the [CloudEvents
spec](https://github.com/cloudevents/spec) so that other systems can easily
handle events from yours.

- `id` uniquely identifies the event with a monotonically increasing integer.
- `source` is name of the database table that changed.
- `subject` is the primary key of the database row that changed.
- `type` describes the type of change: `*.created`, `*.updated`, or `*.deleted`.
- `time` is the Unix time when the change was committed.
- `specversion` is the verion of the [CloudEvents spec](https://github.com/cloudevents/spec) in use, currently `1.0`.
- `data` is a JSON snapshot of the row that changed. The `data.new` object holds the post-change values and is present for `*.created` and `*.updated` events. The `data.old` object holds pre-change values and is present for `*.updated` and `*.deleted` events.

**Note:** Audite stores `id` and `time` as integers so that SQLite can store
and sort them efficiently, but the CloudEvents spec mandates strings. To query
events that conform exactly to the [CloudEvents JSON
spec](https://github.com/cloudevents/spec/blob/v1.0.2/cloudevents/formats/json-format.md),
select from the `audite_cloudevent` view instead of the underlying
`audite_changefeed` table:

```sh
sqlite3 todo.db "SELECT cloudevent FROM audite_cloudevent ORDER BY id"
```
```
cloudevent                                                                                                                                                                                                                                                                                                            
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{"id":"1","sequence":"00000000000000000001","source":"project","subject":"1","type":"project.created","time":"2022-11-29T13:59:25+00:00","specversion":"1.0","datacontenttype":"application/json","data":{"new":{"name":"goals","id":1}}}                                                                             
{"id":"2","sequence":"00000000000000000002","source":"task","subject":"try audite","type":"task.created","time":"2022-11-29T13:59:34+00:00","specversion":"1.0","datacontenttype":"application/json","data":{"new":{"project_id":1,"done":0,"name":"try audite"}}}                                                    
{"id":"3","sequence":"00000000000000000003","source":"task","subject":"profit","type":"task.created","time":"2022-11-29T13:59:34+00:00","specversion":"1.0","datacontenttype":"application/json","data":{"new":{"project_id":1,"done":0,"name":"profit"}}}                                                            
{"id":"4","sequence":"00000000000000000004","source":"task","subject":"try audite","type":"task.updated","time":"2022-11-29T13:59:41+00:00","specversion":"1.0","datacontenttype":"application/json","data":{"new":{"project_id":1,"done":1,"name":"try audite"},"old":{"project_id":1,"done":0,"name":"try audite"}}}
{"id":"5","sequence":"00000000000000000005","source":"task","subject":"profit","type":"task.deleted","time":"2022-11-29T13:59:46+00:00","specversion":"1.0","datacontenttype":"application/json","data":{"old":{"project_id":1,"done":0,"name":"profit"}}}                                                            
```

## Handling database schema changes
When your database schema changes, you need to re-run audite for the triggers to
pick up the latest fields. It's safe to re-run audite multiple times, including
as part of your schema migration scripts or even on app startup.

When your database schema hasn't changed, then re-running audite does nothing.
When your schema _has_ changed, then re-running audite rebuilds the triggers to
write to the change feed with the latest schema.

## Auditing only particular tables
By default, audite tracks all tables in the target database. But you can specify
tables to track via the `--table` argument:

```sh
python3 -m audite blog.db --table post --table comment
```

## Dependencies
Audite is a python package with no dependencies. You need Python >= 3.7 to
enable audite on a database, but because "enable audite on a database" just
means "add some SQL triggers," you don't need Python after the triggers are
installed.


## Prior Art
- [litestream](https://litestream.io/blog/why-i-built-litestream/) by
  @benbjohnson makes a convincing case for using SQLite in production.
- [supa_audit](https://github.com/supabase/supa_audit) by @supabase
  demonstrates how easy change feeds can be in Postgres.
- [marmot](https://github.com/maxpert/marmot) by @maxpert uses schema
  introspection and triggers that directly inspired the approach here.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/chrisfrank/audite",
    "name": "audite",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7, <4",
    "maintainer_email": "",
    "keywords": "sqlite,audit,history,change-data-capture,change feed",
    "author": "Chris Frank",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/59/d3/b51b0b63f0dfb89b688218a7011ce85862721acb0ca87763d0b55fc3b2d6/audite-0.4.5.tar.gz",
    "platform": null,
    "description": "# Audite: instant change feeds for SQLite\n\nAudite uses SQL triggers to add a transactional change feed to any SQLite\ndatabase. It gives you a totally-ordered audit history of all changes to your\ndata, without touching your application code or running an extra process.\n\n## Use cases\n- Track who changed what, when\n- Restore previous versions of changed rows\n- Replicate data to external systems by streaming the change feed\n\n## Quick start\n\nLet's add a changefeed to `todo.db`, a SQLite database with the following schema:\n\n```sh\nsqlite3 todo.db \"CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT)\"\nsqlite3 todo.db \"CREATE TABLE task (\n    name TEXT PRIMARY KEY,\n    project_id INTEGER REFERENCES project (project_id),\n    done BOOLEAN NOT NULL DEFAULT FALSE)\"\n```\n\n1. **Install audite on your sytem**\n    ```sh\n    python3 -m pip install audite\n    ```\n2. **Enable audite on your database**\n    ```sh\n    python3 -m audite todo.db\n    ```\n\nDone! From now on, any process can `INSERT`, `UPDATE`, and `DELETE` from your\ndatabase as usual, and audite's triggers will store the results as [change\nevents](#event-schema) in the `audite_changefeed` table. All (and only)\ncommitted transactions will appear in the change feed.\n\n## Modfying data and querying the change feed\n\nWe'll add a project and two tasks...\n\n```sh\nsqlite3 todo.db \"INSERT INTO project (id, name) VALUES (1, 'goals')\"\nsqlite3 todo.db \"INSERT INTO task (project_id, name) VALUES (1, 'try audite'), (1, 'profit')\"\n```\n\ncross one task off the list...\n```sh\nsqlite3 todo.db \"UPDATE task SET done = TRUE WHERE name = 'try audite'\"\n```\n\nand cancel the other:\n```sh\nsqlite3 todo.db \"DELETE FROM task WHERE name = 'profit'\"\n```\n\n### Now let's see what changed:\n```sh\nsqlite3 todo.db \"SELECT * FROM audite_changefeed ORDER BY id\"\n```\n\nYou should get back something like this:\n```\nid  source   subject     type             time        specversion  data                                                                                                     \n--  -------  ----------  ---------------  ----------  -----------  ---------------------------------------------------------------------------------------------------------\n1   project  1           project.created  1669730365  1.0          {\"new\":{\"name\":\"goals\",\"id\":1}}                                                                          \n2   task     try audite  task.created     1669730374  1.0          {\"new\":{\"project_id\":1,\"done\":0,\"name\":\"try audite\"}}                                                    \n3   task     profit      task.created     1669730374  1.0          {\"new\":{\"project_id\":1,\"done\":0,\"name\":\"profit\"}}                                                        \n4   task     try audite  task.updated     1669730381  1.0          {\"new\":{\"project_id\":1,\"done\":1,\"name\":\"try audite\"},\"old\":{\"project_id\":1,\"done\":0,\"name\":\"try audite\"}}\n5   task     profit      task.deleted     1669730386  1.0          {\"old\":{\"project_id\":1,\"done\":0,\"name\":\"profit\"}}                                                        \n```\n\n## Event Schema\nThe event schema follows the [CloudEvents\nspec](https://github.com/cloudevents/spec) so that other systems can easily\nhandle events from yours.\n\n- `id` uniquely identifies the event with a monotonically increasing integer.\n- `source` is name of the database table that changed.\n- `subject` is the primary key of the database row that changed.\n- `type` describes the type of change: `*.created`, `*.updated`, or `*.deleted`.\n- `time` is the Unix time when the change was committed.\n- `specversion` is the verion of the [CloudEvents spec](https://github.com/cloudevents/spec) in use, currently `1.0`.\n- `data` is a JSON snapshot of the row that changed. The `data.new` object holds the post-change values and is present for `*.created` and `*.updated` events. The `data.old` object holds pre-change values and is present for `*.updated` and `*.deleted` events.\n\n**Note:** Audite stores `id` and `time` as integers so that SQLite can store\nand sort them efficiently, but the CloudEvents spec mandates strings. To query\nevents that conform exactly to the [CloudEvents JSON\nspec](https://github.com/cloudevents/spec/blob/v1.0.2/cloudevents/formats/json-format.md),\nselect from the `audite_cloudevent` view instead of the underlying\n`audite_changefeed` table:\n\n```sh\nsqlite3 todo.db \"SELECT cloudevent FROM audite_cloudevent ORDER BY id\"\n```\n```\ncloudevent                                                                                                                                                                                                                                                                                                            \n----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n{\"id\":\"1\",\"sequence\":\"00000000000000000001\",\"source\":\"project\",\"subject\":\"1\",\"type\":\"project.created\",\"time\":\"2022-11-29T13:59:25+00:00\",\"specversion\":\"1.0\",\"datacontenttype\":\"application/json\",\"data\":{\"new\":{\"name\":\"goals\",\"id\":1}}}                                                                             \n{\"id\":\"2\",\"sequence\":\"00000000000000000002\",\"source\":\"task\",\"subject\":\"try audite\",\"type\":\"task.created\",\"time\":\"2022-11-29T13:59:34+00:00\",\"specversion\":\"1.0\",\"datacontenttype\":\"application/json\",\"data\":{\"new\":{\"project_id\":1,\"done\":0,\"name\":\"try audite\"}}}                                                    \n{\"id\":\"3\",\"sequence\":\"00000000000000000003\",\"source\":\"task\",\"subject\":\"profit\",\"type\":\"task.created\",\"time\":\"2022-11-29T13:59:34+00:00\",\"specversion\":\"1.0\",\"datacontenttype\":\"application/json\",\"data\":{\"new\":{\"project_id\":1,\"done\":0,\"name\":\"profit\"}}}                                                            \n{\"id\":\"4\",\"sequence\":\"00000000000000000004\",\"source\":\"task\",\"subject\":\"try audite\",\"type\":\"task.updated\",\"time\":\"2022-11-29T13:59:41+00:00\",\"specversion\":\"1.0\",\"datacontenttype\":\"application/json\",\"data\":{\"new\":{\"project_id\":1,\"done\":1,\"name\":\"try audite\"},\"old\":{\"project_id\":1,\"done\":0,\"name\":\"try audite\"}}}\n{\"id\":\"5\",\"sequence\":\"00000000000000000005\",\"source\":\"task\",\"subject\":\"profit\",\"type\":\"task.deleted\",\"time\":\"2022-11-29T13:59:46+00:00\",\"specversion\":\"1.0\",\"datacontenttype\":\"application/json\",\"data\":{\"old\":{\"project_id\":1,\"done\":0,\"name\":\"profit\"}}}                                                            \n```\n\n## Handling database schema changes\nWhen your database schema changes, you need to re-run audite for the triggers to\npick up the latest fields. It's safe to re-run audite multiple times, including\nas part of your schema migration scripts or even on app startup.\n\nWhen your database schema hasn't changed, then re-running audite does nothing.\nWhen your schema _has_ changed, then re-running audite rebuilds the triggers to\nwrite to the change feed with the latest schema.\n\n## Auditing only particular tables\nBy default, audite tracks all tables in the target database. But you can specify\ntables to track via the `--table` argument:\n\n```sh\npython3 -m audite blog.db --table post --table comment\n```\n\n## Dependencies\nAudite is a python package with no dependencies. You need Python >= 3.7 to\nenable audite on a database, but because \"enable audite on a database\" just\nmeans \"add some SQL triggers,\" you don't need Python after the triggers are\ninstalled.\n\n\n## Prior Art\n- [litestream](https://litestream.io/blog/why-i-built-litestream/) by\n  @benbjohnson makes a convincing case for using SQLite in production.\n- [supa_audit](https://github.com/supabase/supa_audit) by @supabase\n  demonstrates how easy change feeds can be in Postgres.\n- [marmot](https://github.com/maxpert/marmot) by @maxpert uses schema\n  introspection and triggers that directly inspired the approach here.\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "Instant data auditing for SQLite",
    "version": "0.4.5",
    "split_keywords": [
        "sqlite",
        "audit",
        "history",
        "change-data-capture",
        "change feed"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "f54988af2e56be73d8a937fe1bb21ec5",
                "sha256": "09860b19b3a5629d82767300b16401ad3318b7a4724243ea1763916939cebfae"
            },
            "downloads": -1,
            "filename": "audite-0.4.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f54988af2e56be73d8a937fe1bb21ec5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7, <4",
            "size": 7828,
            "upload_time": "2022-12-10T14:28:33",
            "upload_time_iso_8601": "2022-12-10T14:28:33.535806Z",
            "url": "https://files.pythonhosted.org/packages/94/46/4f3c47fe89c0da7693d0ea7d1f616b189fb4764d6d3d4cfe32ddf0ed3463/audite-0.4.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "85f17ae71911351fe4245d6459099d6a",
                "sha256": "f0315132438363109c8303d0021ecfb39f13c2ec9cd75418c470e2600f0827c4"
            },
            "downloads": -1,
            "filename": "audite-0.4.5.tar.gz",
            "has_sig": false,
            "md5_digest": "85f17ae71911351fe4245d6459099d6a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7, <4",
            "size": 7679,
            "upload_time": "2022-12-10T14:28:35",
            "upload_time_iso_8601": "2022-12-10T14:28:35.022542Z",
            "url": "https://files.pythonhosted.org/packages/59/d3/b51b0b63f0dfb89b688218a7011ce85862721acb0ca87763d0b55fc3b2d6/audite-0.4.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-10 14:28:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "chrisfrank",
    "github_project": "audite",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "audite"
}
        
Elapsed time: 0.01754s