| Field           | Value                                                             |
| --------------- | ----------------------------------------------------------------- |
| Name            | upid                                                              |
| Version         | 0.2.0                                                             |
| Home page       | https://github.com/carderne/upid                                  |
| Summary         | Universally Unique Prefixed Lexicographically Sortable Identifier |
| Upload time     | 2024-08-20 09:36:58                                               |
| Author          | Chris Arderne                                                     |
| Requires Python | >=3.9                                                             |
| License         | MIT License                                                       |
| Keywords        | uuid, id, database                                                |

# UPID

pronounced YOO-pid

**aka Universally Unique Prefixed Lexicographically Sortable Identifier**

This is the spec and Python implementation for UPID.

UPID is based on [ULID](https://github.com/ulid/spec) but with some modifications, inspired by [this article](https://brandur.org/nanoglyphs/026-ids) and [Stripe IDs](https://dev.to/stripe/designing-apis-for-humans-object-ids-3o5a).

The core idea is that a **meaningful prefix** is stored, together with the timestamp and randomness, in a 128-bit UUID-shaped slot.
Thus a UPID is **human-readable** (like a Stripe ID), but still efficient to store, sort and index.

UPID allows a prefix of up to **4 characters** (shorter prefixes are right-padded), and includes a non-wrapping timestamp with about 250 millisecond precision and 64 bits of entropy.

This is a UPID in Python:
```python
upid("user")            # user_2accvpp5guht4dts56je5a
```

And in Rust:
```rust
Upid::new("user")      // user_2accvpp5guht4dts56je5a
```

And in Postgres too:
```sql
CREATE TABLE users (id upid NOT NULL DEFAULT gen_upid('user') PRIMARY KEY);
INSERT INTO users DEFAULT VALUES;
SELECT id FROM users;  -- user_2accvpp5guht4dts56je5a

-- this also works
SELECT id FROM users WHERE id = 'user_2accvpp5guht4dts56je5a';
```

Plays nice with your server code, no extra work needed:
```python
import psycopg

with psycopg.connect("postgresql://...") as conn:
    res = conn.execute("SELECT id FROM users").fetchone()
    print(res[0])       # user_2accvpp5guht4dts56je5a
```

## Demo
You can give it a spin at [upid.rdrn.me](https://upid.rdrn.me/).

## Implementations

If you don't have time for ASCII art, you can skip to the good stuff:
| Language   | Link                                                    |
| --------   | ------------------------------------------------------- |
| Python     | [in this repo (scroll down)](#python-implementation)    |
| Postgres   | [in this repo (scroll down)](#postgres-extension)  |
| Rust       | [in this repo (scroll down)](#rust-implementation)      |
| TypeScript | [carderne/upid-ts](https://github.com/carderne/upid-ts) |

## Specification
Key changes relative to ULID:
1. Uses a modified form of [Crockford's base32](https://www.crockford.com/base32.html) that uses lower-case and includes the full alphabet (for prefix flexibility).
2. Does not permit upper-case/lower-case to be decoded interchangeably.
3. The text encoding is still 5 bits per base32 character.
4. 20 bits assigned to the prefix
5. 40 bits (down from 48) assigned to the timestamp, placed first in binary for sorting
6. 64 bits (down from 80) for randomness
7. 4 bits as a version specifier

```elm
    user       2accvpp5      guht4dts56je5       a
   |----|     |--------|    |-------------|   |-----|
   prefix       time            random        version     total
   4 chars      8 chars         13 chars      1 char      26 chars
       \________/________________|___________    |
               /                 |           \   |
              /                  |            \  |
           40 bits            64 bits         24 bits    128 bits
           5 bytes            8 bytes         3 bytes     16 bytes
           time               random      prefix+version
```
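
The text encoding of the time and prefix fields can be sketched in a few lines. This is a minimal illustration, not the library's code: the alphabet `234567abcdefghijklmnopqrstuvwxyz` and the helper names `encode_bits` and `decode_chars` are assumptions, chosen because they reproduce the example ID and the hex value shown in the Python usage section. The random and version characters are left out, since their exact bit alignment is not spelled out here.

```python
# Assumed alphabet: digits 2-7 then a-z, in ASCII order so text sorts like the bits.
ALPHABET = "234567abcdefghijklmnopqrstuvwxyz"


def encode_bits(value: int, n_chars: int) -> str:
    """Encode an integer as n_chars base32 characters, most significant first."""
    chars = []
    for i in range(n_chars):
        shift = 5 * (n_chars - 1 - i)
        chars.append(ALPHABET[(value >> shift) & 0b11111])
    return "".join(chars)


def decode_chars(text: str) -> int:
    """Inverse of encode_bits: 5 bits per character."""
    value = 0
    for ch in text:
        value = (value << 5) | ALPHABET.index(ch)
    return value


time_40_bits = 0x01908DD6A3          # first 5 bytes of the example hex value
print(encode_bits(time_40_bits, 8))  # 2accvpp5
print(hex(decode_chars("user")))     # 0xd6157
```

Because the assumed alphabet is in ASCII order, two encoded timestamps compare the same way as strings as they do as integers.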

### Binary layout
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            time_high                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    time_low   |                     random                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             random                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     random    |                  prefix_and_version           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
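
As a rough illustration of the layout above, the sketch below packs the four fields into one 128-bit value and pulls them back out. The field values are read off the example ID's hex form used in the Python usage section; the function names `pack` and `unpack` are hypothetical and not part of the `upid` package.

```python
def pack(time_40: int, random_64: int, prefix_20: int, version_4: int) -> bytes:
    """Lay out the fields as time | random | prefix | version, big-endian."""
    value = (time_40 << 88) | (random_64 << 24) | (prefix_20 << 4) | version_4
    return value.to_bytes(16, "big")


def unpack(raw: bytes) -> tuple[int, int, int, int]:
    """Split 16 bytes back into (time, random, prefix, version)."""
    value = int.from_bytes(raw, "big")
    return (
        value >> 88,                  # 40-bit time
        (value >> 24) & (2**64 - 1),  # 64-bit random
        (value >> 4) & (2**20 - 1),   # 20-bit prefix
        value & 0b1111,               # 4-bit version
    )


raw = pack(0x01908DD6A3, 0x669B912738191EA3, 0xD6157, 0x6)
print(raw.hex())  # 01908dd6a3669b912738191ea3d61576

# Time sits in the leading bytes, so byte-wise ordering follows creation time
# even when the random and prefix bits differ.
earlier = pack(0x01908DD6A3, 2**64 - 1, 0xFFFFF, 0x6)
later = pack(0x01908DD6A4, 0, 0, 0x6)
print(earlier < later)  # True
```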

### Collision
Relative to ULID, the time precision is reduced from 48 to 40 bits (keeping the most significant bits, so overflow still won't occur until 10889 AD), and the randomness reduced from 80 to 64 bits.
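
The truncation can be sketched with plain integer arithmetic. This assumes, as in ULID, that the 48-bit value counts milliseconds since the Unix epoch, and that dropping to 40 bits means discarding the 8 least significant bits; the variable names are illustrative only.

```python
import datetime

ms = 1720366572288 + 123   # some instant, in milliseconds since the epoch
stored = ms >> 8           # keep the 40 most significant of 48 bits
recovered = stored << 8    # what a reader gets back

print(2**8)                # 256, the step size of the stored timestamp in ms
print(ms - recovered)      # 123, always less than 256

# The 48-bit millisecond range is unchanged, hence the same overflow year as ULID.
years = 2**48 / 1000 / (365.2425 * 24 * 3600)
print(1970 + int(years))   # 10889

utc = datetime.timezone.utc
print(datetime.datetime.fromtimestamp(recovered / 1000, tz=utc))  # 2024-07-07 ...
```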

At 40 bits, the timestamp precision is around 250 milliseconds (2^8 = 256 ms). To reach a 50% probability of collision with 64 bits of randomness, you would need to generate roughly **5 billion items within a single 256 millisecond window**.
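
The collision figure can be checked with the usual birthday approximation, `p ≈ 1 - exp(-n² / 2N)` with `N = 2^64`. The sketch below is standalone arithmetic, not part of the package, and the function name is hypothetical; it suggests the 50% point lies at a few billion IDs within a single timestamp window.

```python
import math

N = 2**64  # possible random values within one timestamp window


def collision_probability(n: int) -> float:
    """Birthday approximation for n IDs drawn uniformly from N values."""
    return 1 - math.exp(-(n * n) / (2 * N))


n_half = math.sqrt(2 * N * math.log(2))       # n at which p reaches 0.5
print(f"{n_half:.2e}")                        # 5.06e+09
print(f"{collision_probability(2**32):.2f}")  # 0.39
print(f"{collision_probability(1_000_000):.1e}")
```

The last line shows how small the probability is for a million IDs generated in the same window.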

## Python implementation
This implementation aims to be as simple as possible, to convey the core workings of the spec.
It is based entirely on [mdomke/python-ulid](https://github.com/mdomke/python-ulid).

#### Installation
```bash
pip install upid
```

#### Usage
Run from the CLI:
```bash
python -m upid user
```

Use in a program:
```python
from upid import upid
upid("user")
```

Or more explicitly:
```python
from upid import UPID
UPID.from_prefix("user")
```

Or specifying your own timestamp or datetime:
```python
import time, datetime
milliseconds = int(time.time() * 1000)  # current time in ms since the epoch
UPID.from_prefix_and_milliseconds("user", milliseconds)
UPID.from_prefix_and_datetime("user", datetime.datetime.now())
```

From and to a string:
```python
u = UPID.from_str("user_2accvpp5guht4dts56je5a")
u.to_str()        # user_2a...
```

Get stuff out:
```python
u.prefix     # user
u.datetime   # 2024-07-07 ...
```

Convert to other formats:
```python
int(u)       # 2079795568564925668398930358940603766
u.hex        # 01908dd6a3669b912738191ea3d61576
u.to_uuid()  # UUID('01908dd6-a366-9b91-2738-191ea3d61576')
```
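
These conversions are all views of the same 16 bytes. As a quick cross-check that needs only the standard library (so it does not exercise the `upid` package itself), the example hex value above round-trips through `int`, `bytes` and `uuid.UUID`:

```python
import uuid

hex_value = "01908dd6a3669b912738191ea3d61576"

as_int = int(hex_value, 16)
as_bytes = bytes.fromhex(hex_value)
as_uuid = uuid.UUID(hex=hex_value)

print(len(as_bytes))              # 16
print(as_uuid)                    # 01908dd6-a366-9b91-2738-191ea3d61576
print(as_uuid.int == as_int)      # True
print(as_uuid.bytes == as_bytes)  # True
```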

#### Development
Code and tests are in the [py/](./py/) directory.
Development uses [Rye](https://rye.astral.sh/) (installation instructions at the link).

```bash
# can be run from the repo root
rye sync
rye run all  # or fmt/lint/check/test
```

If you just want to have a look around, pip should also work:
```bash
pip install -e .
```

Please open a PR if you spot a bug or improvement!

## Rust implementation
The current Rust implementation is based on [dylanhart/ulid-rs](https://github.com/dylanhart/ulid-rs), but uses the same base32 lookup method as the Python implementation.

#### Installation
```bash
cargo add upid
```

#### Usage
```rust
use upid::Upid;
Upid::new("user");
```

Or specifying your own timestamp or datetime:
```rust
use std::time::SystemTime;
Upid::from_prefix_and_milliseconds("user", 1720366572288);
Upid::from_prefix_and_datetime("user", SystemTime::now());
```

From and to a string:
```rust
let u = Upid::from_string("user_2accvpp5guht4dts56je5a");
u.to_string();
```

Get stuff out:
```rust
u.prefix();       // user
u.datetime();     // 2024-07-07 ...
u.milliseconds(); // 17203...
```

Convert to other formats:
```rust
u.to_bytes();
```

#### Development
Code and tests are in the [upid_rs/](./upid_rs/) directory.

```bash
cd upid_rs
cargo check  # or fmt/clippy/build/test/run
```

Please open a PR if you spot a bug or improvement!

## Postgres extension
There is also a Postgres extension built on the Rust implementation, using [pgrx](https://github.com/pgcentralfoundation/pgrx) and based on the very similar extension [pksunkara/pgx_ulid](https://github.com/pksunkara/pgx_ulid).

#### Installation
The easiest way to try it out is the Docker image [carderne/postgres-upid:16](https://hub.docker.com/r/carderne/postgres-upid), currently built for arm64 and amd64 but only for Postgres 16:
```bash
docker run -e POSTGRES_HOST_AUTH_METHOD=trust -p 5432:5432 carderne/postgres-upid:16
```

You can also grab a Linux `.deb` from the [Releases](https://github.com/carderne/upid/releases) page. This is built for Postgres 16 and amd64 only.

More architectures and versions will follow once it is out of alpha.

#### Usage
```sql
CREATE EXTENSION upid_pg;

CREATE TABLE users (
    id   upid NOT NULL DEFAULT gen_upid('user') PRIMARY KEY,
    name text NOT NULL
);

INSERT INTO users (name) VALUES('Bob');

SELECT * FROM users;
--              id              | name
-- -----------------------------+------
--  user_2accvpp5guht4dts56je5a | Bob
```

You can get the raw `bytea` data, or the prefix or timestamp:
```sql
SELECT upid_to_bytea(id) FROM users;
-- \x019...

SELECT upid_to_prefix(id) FROM users;
-- 'user'

SELECT upid_to_timestamp(id) FROM users;
-- 2024-07-07 ...
```

You can convert a `UPID` to a regular Postgres `UUID`:
```sql
SELECT upid_to_uuid(gen_upid('user'));
```

Or the reverse (although the prefix and timestamp will no longer make sense):
```sql
SELECT upid_from_uuid(gen_random_uuid());
```

#### Development
If you want to install it into another Postgres, you'll need to install pgrx and follow its [installation instructions](https://github.com/pgcentralfoundation/pgrx/blob/develop/cargo-pgrx/README.md).
Something like this:
```bash
cd upid_pg
cargo install --locked cargo-pgrx
cargo pgrx init
cargo pgrx install
```

Some `cargo` commands work as normal:
```bash
cargo check  # or fmt/clippy
```

But building, testing and running must be done via pgrx.
This will compile it into a Postgres installation, and allow an interactive session and tests there.

```bash
cargo pgrx test pg16
# or       run
# or       install
```

            
