zero-downtime-migrations


Namezero-downtime-migrations JSON
Version 0.12 PyPI version JSON
download
home_pagehttps://github.com/yandex/zero-downtime-migrations
Summarydjango migrations without long locks
upload_time2023-12-25 15:00:52
maintainerNone
docs_urlNone
authorVladimir Koljasinskij
requires_pythonNone
licenseBSD-3-Clause
keywords django postgresql migrations
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI
coveralls test coverage No coveralls.
            .. image:: https://img.shields.io/pypi/v/zero-downtime-migrations.svg?style=flat
    :alt: PyPI Version
    :target: https://pypi.python.org/pypi/zero-downtime-migrations

.. image:: https://img.shields.io/pypi/pyversions/zero-downtime-migrations.svg
    :alt: Supported Python versions
    :target: https://pypi.python.org/pypi/zero-downtime-migrations

.. image:: https://travis-ci.org/yandex/zero-downtime-migrations.svg?branch=master
    :alt: Build Status
    :target: https://travis-ci.org/yandex/zero-downtime-migrations


Zero-Downtime-Migrations
========================

Description
-----------
*Zero-Downtime-Migrations (ZDM)* -- this is application which allow you to avoid long locks (and rewriting the whole table)
while applying Django migrations using PostgreSql as database.

Current possibilities
---------------------
* add field with default value (nullable or not)
* create index concurrently (and check index status after creation in case it was created with INVALID status)
* add unique property to existing field through creating unique index concurrently and creating constraint using this index

Why use it
----------
We face such a problem - performing some django migrations (such as add column with default value) lock the table on
read/write, so its impossible for our services to work properly during this periods. It can be acceptable on rather small
tables (less than million rows), but even on them it can be painful if service is high loaded.
But we have a lot of tables with more than 50 millions rows, and applying migrations on such a table lock it for
more than an hour, which is totally unacceptable. Also, during this time consuming operations, migration rather often fail
because of different errors (such as TimeoutError) and we have to start it from scratch or run sql manually thought
psql and when fake migration.

So in the end we have an idea of writing this package so it can prevent long locks on table and also
provide more stable migration process which can be continued if operation fall for some reason.

Installation
------------
To install :code:`ZDM`, simply run:

.. code:: bash

    pip install zero-downtime-migrations

Usage
-----
If you are currently using default postresql backend change it to:

.. code:: python

    DATABASES = {
         'default': {
             'ENGINE': 'zero_downtime_migrations.backend',
             ...
         }
         ...
     }


If you are using your own custom backend you can:

* Set :code:`SchemaEditorClass` if you are currently using default one:

.. code:: python

    from zero_downtime_migrations.backend.schema import DatabaseSchemaEditor

    class DatabaseWrapper(BaseWrapper):
        SchemaEditorClass = DatabaseSchemaEditor


* Add :code:`ZeroDownTimeMixin` to base classes of your :code:`DatabaseSchemaEditor` if you are using custom one:

.. code:: python

    from zero_downtime_migrations.backend.schema import ZeroDownTimeMixin

    class YourCustomSchemaEditor(ZeroDownTimeMixin, ...):
        ...

Note about indexes
------------------
Library will always force CONCURRENTLY index creation and after that check index status - if index was
created with INVALID status it will be deleted and error will be raised.
In this case you should fix problem if needed and restart migration.
For example if creating unique index was failed you should make sure that there are only unique values
in column on which index is creating.
Usually index creating with invalid status due to deadlock so you need just restart migration.

Example
-------
When adding not null column with default django will perform such sql query:

.. code:: sql

    ALTER TABLE "test" ADD COLUMN "field" boolean DEFAULT True NOT NULL;

Which cause postgres to rewrite the whole table and when swap it with existing one (`note from django documentation <https://docs.djangoproject.com/en/dev/topics/migrations/#postgresql>`_)
and during this period it will hold exclusive lock on write/read on this table.

This package will break sql above in separate commands not only to prevent the rewriting of whole
table but also to add column with as small lock times as possible.

First of all we will add nullable column without default and add default value to it in separate command in one transaction:

.. code:: sql

    ALTER TABLE "test" ADD COLUMN "field" boolean NULL;
    ALTER TABLE "test" ALTER COLUMN "field" SET DEFAULT true;

This will add default for all new rows in table but all existing ones will be with null value in this column for now,
this operation will be quick because postgres doesn't have to fill all existing rows with default.

Next we will count objects in table and if result if more than zero - calculate the
size of batch in witch we will update existing rows. After that while where are still objects with null in this
column - we will update them.

While result of following statement is more than zero:

.. code:: sql

    WITH cte AS (
    SELECT <table_pk_column> as pk
    FROM "test"
    WHERE  "field" is null
    LIMIT  <size_calculated_on_previous_step>
    )
    UPDATE "test" table_
    SET "field" = true
    FROM   cte
    WHERE  table_.<table_pk_column> = cte.pk

When we have no more rows with null in this column we can set not null and drop default (which is django default
behavior):

.. code:: sql

    ALTER TABLE "test" ALTER COLUMN "field" SET NOT NULL;
    ALTER TABLE "test" ALTER COLUMN "field" DROP DEFAULT;

So we finish add field process.
It will be definitely more time consuming than basic variant with one sql statement, but in this approach
there are no long locks on table so service can work normally during this migrations process.

Run tests
---------

.. code:: bash

    ./run_tests.sh

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/yandex/zero-downtime-migrations",
    "name": "zero-downtime-migrations",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "django postgresql migrations",
    "author": "Vladimir Koljasinskij",
    "author_email": "smosker@yandex-team.ru",
    "download_url": "https://files.pythonhosted.org/packages/42/87/84e561006f5c2882e96b02e5e465dbd1ec5084e7136834875ebaba613380/zero-downtime-migrations-0.12.tar.gz",
    "platform": null,
    "description": ".. image:: https://img.shields.io/pypi/v/zero-downtime-migrations.svg?style=flat\n    :alt: PyPI Version\n    :target: https://pypi.python.org/pypi/zero-downtime-migrations\n\n.. image:: https://img.shields.io/pypi/pyversions/zero-downtime-migrations.svg\n    :alt: Supported Python versions\n    :target: https://pypi.python.org/pypi/zero-downtime-migrations\n\n.. image:: https://travis-ci.org/yandex/zero-downtime-migrations.svg?branch=master\n    :alt: Build Status\n    :target: https://travis-ci.org/yandex/zero-downtime-migrations\n\n\nZero-Downtime-Migrations\n========================\n\nDescription\n-----------\n*Zero-Downtime-Migrations (ZDM)* -- this is application which allow you to avoid long locks (and rewriting the whole table)\nwhile applying Django migrations using PostgreSql as database.\n\nCurrent possibilities\n---------------------\n* add field with default value (nullable or not)\n* create index concurrently (and check index status after creation in case it was created with INVALID status)\n* add unique property to existing field through creating unique index concurrently and creating constraint using this index\n\nWhy use it\n----------\nWe face such a problem - performing some django migrations (such as add column with default value) lock the table on\nread/write, so its impossible for our services to work properly during this periods. It can be acceptable on rather small\ntables (less than million rows), but even on them it can be painful if service is high loaded.\nBut we have a lot of tables with more than 50 millions rows, and applying migrations on such a table lock it for\nmore than an hour, which is totally unacceptable. Also, during this time consuming operations, migration rather often fail\nbecause of different errors (such as TimeoutError) and we have to start it from scratch or run sql manually thought\npsql and when fake migration.\n\nSo in the end we have an idea of writing this package so it can prevent long locks on table and also\nprovide more stable migration process which can be continued if operation fall for some reason.\n\nInstallation\n------------\nTo install :code:`ZDM`, simply run:\n\n.. code:: bash\n\n    pip install zero-downtime-migrations\n\nUsage\n-----\nIf you are currently using default postresql backend change it to:\n\n.. code:: python\n\n    DATABASES = {\n         'default': {\n             'ENGINE': 'zero_downtime_migrations.backend',\n             ...\n         }\n         ...\n     }\n\n\nIf you are using your own custom backend you can:\n\n* Set :code:`SchemaEditorClass` if you are currently using default one:\n\n.. code:: python\n\n    from zero_downtime_migrations.backend.schema import DatabaseSchemaEditor\n\n    class DatabaseWrapper(BaseWrapper):\n        SchemaEditorClass = DatabaseSchemaEditor\n\n\n* Add :code:`ZeroDownTimeMixin` to base classes of your :code:`DatabaseSchemaEditor` if you are using custom one:\n\n.. code:: python\n\n    from zero_downtime_migrations.backend.schema import ZeroDownTimeMixin\n\n    class YourCustomSchemaEditor(ZeroDownTimeMixin, ...):\n        ...\n\nNote about indexes\n------------------\nLibrary will always force CONCURRENTLY index creation and after that check index status - if index was\ncreated with INVALID status it will be deleted and error will be raised.\nIn this case you should fix problem if needed and restart migration.\nFor example if creating unique index was failed you should make sure that there are only unique values\nin column on which index is creating.\nUsually index creating with invalid status due to deadlock so you need just restart migration.\n\nExample\n-------\nWhen adding not null column with default django will perform such sql query:\n\n.. code:: sql\n\n    ALTER TABLE \"test\" ADD COLUMN \"field\" boolean DEFAULT True NOT NULL;\n\nWhich cause postgres to rewrite the whole table and when swap it with existing one (`note from django documentation <https://docs.djangoproject.com/en/dev/topics/migrations/#postgresql>`_)\nand during this period it will hold exclusive lock on write/read on this table.\n\nThis package will break sql above in separate commands not only to prevent the rewriting of whole\ntable but also to add column with as small lock times as possible.\n\nFirst of all we will add nullable column without default and add default value to it in separate command in one transaction:\n\n.. code:: sql\n\n    ALTER TABLE \"test\" ADD COLUMN \"field\" boolean NULL;\n    ALTER TABLE \"test\" ALTER COLUMN \"field\" SET DEFAULT true;\n\nThis will add default for all new rows in table but all existing ones will be with null value in this column for now,\nthis operation will be quick because postgres doesn't have to fill all existing rows with default.\n\nNext we will count objects in table and if result if more than zero - calculate the\nsize of batch in witch we will update existing rows. After that while where are still objects with null in this\ncolumn - we will update them.\n\nWhile result of following statement is more than zero:\n\n.. code:: sql\n\n    WITH cte AS (\n    SELECT <table_pk_column> as pk\n    FROM \"test\"\n    WHERE  \"field\" is null\n    LIMIT  <size_calculated_on_previous_step>\n    )\n    UPDATE \"test\" table_\n    SET \"field\" = true\n    FROM   cte\n    WHERE  table_.<table_pk_column> = cte.pk\n\nWhen we have no more rows with null in this column we can set not null and drop default (which is django default\nbehavior):\n\n.. code:: sql\n\n    ALTER TABLE \"test\" ALTER COLUMN \"field\" SET NOT NULL;\n    ALTER TABLE \"test\" ALTER COLUMN \"field\" DROP DEFAULT;\n\nSo we finish add field process.\nIt will be definitely more time consuming than basic variant with one sql statement, but in this approach\nthere are no long locks on table so service can work normally during this migrations process.\n\nRun tests\n---------\n\n.. code:: bash\n\n    ./run_tests.sh\n",
    "bugtrack_url": null,
    "license": "BSD-3-Clause",
    "summary": "django migrations without long locks",
    "version": "0.12",
    "project_urls": {
        "Homepage": "https://github.com/yandex/zero-downtime-migrations"
    },
    "split_keywords": [
        "django",
        "postgresql",
        "migrations"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "428784e561006f5c2882e96b02e5e465dbd1ec5084e7136834875ebaba613380",
                "md5": "a9cc946319f426cb1165bd9f8916ec58",
                "sha256": "c37a16e4f3b43a406580504c916f4ab49a563327a277ba80894e46e7a3844b85"
            },
            "downloads": -1,
            "filename": "zero-downtime-migrations-0.12.tar.gz",
            "has_sig": false,
            "md5_digest": "a9cc946319f426cb1165bd9f8916ec58",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 16586,
            "upload_time": "2023-12-25T15:00:52",
            "upload_time_iso_8601": "2023-12-25T15:00:52.247630Z",
            "url": "https://files.pythonhosted.org/packages/42/87/84e561006f5c2882e96b02e5e465dbd1ec5084e7136834875ebaba613380/zero-downtime-migrations-0.12.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-25 15:00:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "yandex",
    "github_project": "zero-downtime-migrations",
    "travis_ci": true,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "zero-downtime-migrations"
}
        
Elapsed time: 2.99423s