pg-chameleon


Namepg-chameleon JSON
Version 2.0.19 PyPI version JSON
download
home_pagehttps://github.com/the4thdoctor/pg_chameleon/
SummaryMySQL to PostgreSQL replica and migration
upload_time2023-03-26 14:15:00
maintainerFederico Campoli
docs_urlNone
authorFederico Campoli
requires_python>=3.5
licenseBSD License
keywords postgresql mysql replica migration database
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            .. image:: https://img.shields.io/github/issues/the4thdoctor/pg_chameleon.svg   
        :target: https://github.com/the4thdoctor/pg_chameleon/issues

.. image:: https://img.shields.io/github/forks/the4thdoctor/pg_chameleon.svg   
        :target: https://github.com/the4thdoctor/pg_chameleon/network

.. image:: https://img.shields.io/github/stars/the4thdoctor/pg_chameleon.svg   
        :target: https://github.com/the4thdoctor/pg_chameleon/stargazers
	
.. image:: https://img.shields.io/badge/license-BSD-blue.svg   
        :target: https://raw.githubusercontent.com/the4thdoctor/pg_chameleon/main/LICENSE.txt

.. image:: https://img.shields.io/github/release/the4thdoctor/pg_chameleon
		:target: https://github.com/the4thdoctor/pg_chameleon/releases
	
.. image:: https://img.shields.io/pypi/dm/pg_chameleon.svg
    :target: https://pypi.org/project/pg_chameleon

    
pg_chameleon is a MySQL to PostgreSQL replica system written in Python 3. 
The system use the library mysql-replication to pull the row images from MySQL which are stored into PostgreSQL as JSONB. 
A pl/pgsql function decodes the jsonb values and replays the changes against the PostgreSQL database.
    
pg_chameleon  2.0 `is available on pypi <https://pypi.org/project/pg_chameleon/>`_  

The documentation `is available on pgchameleon.org <http://www.pgchameleon.org/documents/index.html>`_

Please submit your `bug reports on GitHub <https://github.com/the4thdoctor/pg_chameleon>`_.


Requirements
******************

Replica host
..............................

Operating system: Linux, FreeBSD, OpenBSD
Python: CPython 3.5+ 

* `PyMySQL <https://pypi.python.org/pypi/PyMySQL>`_ 
* `mysql-replication <https://pypi.python.org/pypi/mysql-replication>`_
* `psycopg2 <https://pypi.python.org/pypi/psycopg2>`_
* `PyYAML <https://pypi.python.org/pypi/PyYAML>`_
* `tabulate <https://pypi.python.org/pypi/tabulate>`_
* `rollbar <https://pypi.python.org/pypi/rollbar>`_
* `daemonize <https://pypi.python.org/pypi/daemonize>`_

Optionals for building documentation

* `sphinx <http://www.sphinx-doc.org/en/stable/>`_
* `sphinx-autobuild <https://github.com/GaretJax/sphinx-autobuild>`_


Origin database
.................................

MySQL 5.5+

Aurora MySQL 5.7+

Destination database
..............................

PostgreSQL 9.5+

Example scenarios 
..............................

* Analytics 
* Migrations
* Data aggregation from multiple MySQL databases
  
Features
..............................

* Read from multiple MySQL schemas and  restore them it into a target PostgreSQL  database. The source and target schema names can be different.
* Setup PostgreSQL to act as a MySQL replica.
* Support for enumerated and binary data types.
* Basic DDL Support (CREATE/DROP/ALTER TABLE, DROP PRIMARY KEY/TRUNCATE, RENAME).
* Discard of rubbish data coming from the replica. 
* Conservative approach to the replica. Tables which generate errors are automatically excluded from the replica.
* Possibilty to refresh single tables or single schemas.
* Basic replica monitoring.
* Detach replica from MySQL for migration support.
* Data type override (e.g. tinyint(1) to boolean)
* Daemonised init_replica process.
* Daemonised replica process with two separated subprocess, one for the read and one for the replay.
* Rollbar integration
* Support for geometrical data. **Requires PostGIS on the target database.**
* Minimal locking during init_replica for transactional engines (e.g. innodb)





Caveats
..............................
The replica requires the tables to have a primary or unique key. Tables without primary/unique key are initialised during the init_replica process but not replicated.

The copy_max_memory is just an estimate. The average rows size is extracted from mysql's informations schema and can be outdated.
If the copy process fails for memory error check the failing table's row length and the number of rows for each slice. 

Python 3 is supported only from version 3.5 as required by mysql-replication .

The lag is determined using the last received event timestamp and the postgresql timestamp. If the mysql is read only the lag will increase because
no replica event is coming in. 

The detach replica process resets the sequences in postgres to let the database work standalone. The foreign keys from the source MySQL schema are extracted and created initially as NOT VALID.  The foreign keys are created without the ON DELETE or ON UPDATE clauses.
A second run tries to validate the foreign keys. If an error occurs it gets logged out according to the source configuration. 



Setup 
*****************

RPM PGDG
..............................

pg_chameleon is included in the PGDG RMP repository thanks to Devrim.

Please follow the instructions on  `https://www.postgresql.org/download/linux/redhat/ <https://www.postgresql.org/download/linux/redhat/>`_ 

openSUSE Build Service
..............................

pg_chameleon is available on the  `openSUSE build Service <https://build.opensuse.org/package/show/server:database:postgresql/pg_chameleon>`_ 

Currently all releases are supported except SLE_12_SP5 because of unresolved dependencies.

Virtual env setup 
..............................

* Create a virtual environment (e.g. python3 -m venv venv)
* Activate the virtual environment (e.g. source venv/bin/activate)
* Upgrade pip with **pip install pip --upgrade**
* Install pg_chameleon with **pip install pg_chameleon**. 
* Create a user on mysql for the replica (e.g. usr_replica)
* Grant access to usr on the replicated database (e.g. GRANT ALL ON sakila.* TO 'usr_replica';)
* Grant RELOAD privilege to the user (e.g. GRANT RELOAD ON \*.\* to 'usr_replica';)
* Grant REPLICATION CLIENT privilege to the user (e.g. GRANT REPLICATION CLIENT ON \*.\* to 'usr_replica';)
* Grant REPLICATION SLAVE privilege to the user (e.g. GRANT REPLICATION SLAVE ON \*.\* to 'usr_replica';)



Configuration directory
********************************
The system wide install is now supported correctly. 

The configuration is set with the command ``chameleon set_configuration_files`` in $HOME/.pg_chameleon .
Inside the directory there are three subdirectories. 


* configuration is where the configuration files are stored. 
* pid is where the replica pid file is created. it can be changed in the configuration file
* logs is where the replica logs are saved if log_dest is file. It can be changed in the configuration file

You should  use config-example.yaml as template for the other configuration files. 
Check the `configuration file reference <http://www.pgchameleon.org/documents/configuration_file.html>`_   for an overview.


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/the4thdoctor/pg_chameleon/",
    "name": "pg-chameleon",
    "maintainer": "Federico Campoli",
    "docs_url": null,
    "requires_python": ">=3.5",
    "maintainer_email": "the4thdoctor.gallifrey@gmail.com",
    "keywords": "postgresql mysql replica migration database",
    "author": "Federico Campoli",
    "author_email": "the4thdoctor.gallifrey@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/0e/e7/f3e11677227991908f8bc0a2ca3ce1ef5e2aefac778cc063236ac85e8a9d/pg_chameleon-2.0.19.tar.gz",
    "platform": "linux",
    "description": ".. image:: https://img.shields.io/github/issues/the4thdoctor/pg_chameleon.svg   \n        :target: https://github.com/the4thdoctor/pg_chameleon/issues\n\n.. image:: https://img.shields.io/github/forks/the4thdoctor/pg_chameleon.svg   \n        :target: https://github.com/the4thdoctor/pg_chameleon/network\n\n.. image:: https://img.shields.io/github/stars/the4thdoctor/pg_chameleon.svg   \n        :target: https://github.com/the4thdoctor/pg_chameleon/stargazers\n\t\n.. image:: https://img.shields.io/badge/license-BSD-blue.svg   \n        :target: https://raw.githubusercontent.com/the4thdoctor/pg_chameleon/main/LICENSE.txt\n\n.. image:: https://img.shields.io/github/release/the4thdoctor/pg_chameleon\n\t\t:target: https://github.com/the4thdoctor/pg_chameleon/releases\n\t\n.. image:: https://img.shields.io/pypi/dm/pg_chameleon.svg\n    :target: https://pypi.org/project/pg_chameleon\n\n    \npg_chameleon is a MySQL to PostgreSQL replica system written in Python 3. \nThe system use the library mysql-replication to pull the row images from MySQL which are stored into PostgreSQL as JSONB. \nA pl/pgsql function decodes the jsonb values and replays the changes against the PostgreSQL database.\n    \npg_chameleon  2.0 `is available on pypi <https://pypi.org/project/pg_chameleon/>`_  \n\nThe documentation `is available on pgchameleon.org <http://www.pgchameleon.org/documents/index.html>`_\n\nPlease submit your `bug reports on GitHub <https://github.com/the4thdoctor/pg_chameleon>`_.\n\n\nRequirements\n******************\n\nReplica host\n..............................\n\nOperating system: Linux, FreeBSD, OpenBSD\nPython: CPython 3.5+ \n\n* `PyMySQL <https://pypi.python.org/pypi/PyMySQL>`_ \n* `mysql-replication <https://pypi.python.org/pypi/mysql-replication>`_\n* `psycopg2 <https://pypi.python.org/pypi/psycopg2>`_\n* `PyYAML <https://pypi.python.org/pypi/PyYAML>`_\n* `tabulate <https://pypi.python.org/pypi/tabulate>`_\n* `rollbar <https://pypi.python.org/pypi/rollbar>`_\n* `daemonize <https://pypi.python.org/pypi/daemonize>`_\n\nOptionals for building documentation\n\n* `sphinx <http://www.sphinx-doc.org/en/stable/>`_\n* `sphinx-autobuild <https://github.com/GaretJax/sphinx-autobuild>`_\n\n\nOrigin database\n.................................\n\nMySQL 5.5+\n\nAurora MySQL 5.7+\n\nDestination database\n..............................\n\nPostgreSQL 9.5+\n\nExample scenarios \n..............................\n\n* Analytics \n* Migrations\n* Data aggregation from multiple MySQL databases\n  \nFeatures\n..............................\n\n* Read from multiple MySQL schemas and  restore them it into a target PostgreSQL  database. The source and target schema names can be different.\n* Setup PostgreSQL to act as a MySQL replica.\n* Support for enumerated and binary data types.\n* Basic DDL Support (CREATE/DROP/ALTER TABLE, DROP PRIMARY KEY/TRUNCATE, RENAME).\n* Discard of rubbish data coming from the replica. \n* Conservative approach to the replica. Tables which generate errors are automatically excluded from the replica.\n* Possibilty to refresh single tables or single schemas.\n* Basic replica monitoring.\n* Detach replica from MySQL for migration support.\n* Data type override (e.g. tinyint(1) to boolean)\n* Daemonised init_replica process.\n* Daemonised replica process with two separated subprocess, one for the read and one for the replay.\n* Rollbar integration\n* Support for geometrical data. **Requires PostGIS on the target database.**\n* Minimal locking during init_replica for transactional engines (e.g. innodb)\n\n\n\n\n\nCaveats\n..............................\nThe replica requires the tables to have a primary or unique key. Tables without primary/unique key are initialised during the init_replica process but not replicated.\n\nThe copy_max_memory is just an estimate. The average rows size is extracted from mysql's informations schema and can be outdated.\nIf the copy process fails for memory error check the failing table's row length and the number of rows for each slice. \n\nPython 3 is supported only from version 3.5 as required by mysql-replication .\n\nThe lag is determined using the last received event timestamp and the postgresql timestamp. If the mysql is read only the lag will increase because\nno replica event is coming in. \n\nThe detach replica process resets the sequences in postgres to let the database work standalone. The foreign keys from the source MySQL schema are extracted and created initially as NOT VALID.  The foreign keys are created without the ON DELETE or ON UPDATE clauses.\nA second run tries to validate the foreign keys. If an error occurs it gets logged out according to the source configuration. \n\n\n\nSetup \n*****************\n\nRPM PGDG\n..............................\n\npg_chameleon is included in the PGDG RMP repository thanks to Devrim.\n\nPlease follow the instructions on  `https://www.postgresql.org/download/linux/redhat/ <https://www.postgresql.org/download/linux/redhat/>`_ \n\nopenSUSE Build Service\n..............................\n\npg_chameleon is available on the  `openSUSE build Service <https://build.opensuse.org/package/show/server:database:postgresql/pg_chameleon>`_ \n\nCurrently all releases are supported except SLE_12_SP5 because of unresolved dependencies.\n\nVirtual env setup \n..............................\n\n* Create a virtual environment (e.g. python3 -m venv venv)\n* Activate the virtual environment (e.g. source venv/bin/activate)\n* Upgrade pip with **pip install pip --upgrade**\n* Install pg_chameleon with **pip install pg_chameleon**. \n* Create a user on mysql for the replica (e.g. usr_replica)\n* Grant access to usr on the replicated database (e.g. GRANT ALL ON sakila.* TO 'usr_replica';)\n* Grant RELOAD privilege to the user (e.g. GRANT RELOAD ON \\*.\\* to 'usr_replica';)\n* Grant REPLICATION CLIENT privilege to the user (e.g. GRANT REPLICATION CLIENT ON \\*.\\* to 'usr_replica';)\n* Grant REPLICATION SLAVE privilege to the user (e.g. GRANT REPLICATION SLAVE ON \\*.\\* to 'usr_replica';)\n\n\n\nConfiguration directory\n********************************\nThe system wide install is now supported correctly. \n\nThe configuration is set with the command ``chameleon set_configuration_files`` in $HOME/.pg_chameleon .\nInside the directory there are three subdirectories. \n\n\n* configuration is where the configuration files are stored. \n* pid is where the replica pid file is created. it can be changed in the configuration file\n* logs is where the replica logs are saved if log_dest is file. It can be changed in the configuration file\n\nYou should  use config-example.yaml as template for the other configuration files. \nCheck the `configuration file reference <http://www.pgchameleon.org/documents/configuration_file.html>`_   for an overview.\n\n",
    "bugtrack_url": null,
    "license": "BSD License",
    "summary": "MySQL to PostgreSQL replica and migration",
    "version": "2.0.19",
    "split_keywords": [
        "postgresql",
        "mysql",
        "replica",
        "migration",
        "database"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fa99e1abe0df4f547bf25ee023632934f6909bcc638f85ac7903019285a2d01c",
                "md5": "e86f716c1efe69d8ed3f115beee18d75",
                "sha256": "d7809c4d7a9c1385df235deeca8f5d107649dde31c5180adb19b08697e4d7a78"
            },
            "downloads": -1,
            "filename": "pg_chameleon-2.0.19-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e86f716c1efe69d8ed3f115beee18d75",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.5",
            "size": 107136,
            "upload_time": "2023-03-26T14:14:55",
            "upload_time_iso_8601": "2023-03-26T14:14:55.154751Z",
            "url": "https://files.pythonhosted.org/packages/fa/99/e1abe0df4f547bf25ee023632934f6909bcc638f85ac7903019285a2d01c/pg_chameleon-2.0.19-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0ee7f3e11677227991908f8bc0a2ca3ce1ef5e2aefac778cc063236ac85e8a9d",
                "md5": "5a8b25407319a26bdcc61f1338a482b5",
                "sha256": "f9cd77cfb7c9af6b1b92193d14986b420cdbaec414c928d3be497f8dc7c80bce"
            },
            "downloads": -1,
            "filename": "pg_chameleon-2.0.19.tar.gz",
            "has_sig": false,
            "md5_digest": "5a8b25407319a26bdcc61f1338a482b5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.5",
            "size": 345748,
            "upload_time": "2023-03-26T14:15:00",
            "upload_time_iso_8601": "2023-03-26T14:15:00.067193Z",
            "url": "https://files.pythonhosted.org/packages/0e/e7/f3e11677227991908f8bc0a2ca3ce1ef5e2aefac778cc063236ac85e8a9d/pg_chameleon-2.0.19.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-26 14:15:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "the4thdoctor",
    "github_project": "pg_chameleon",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pg-chameleon"
}
        
Elapsed time: 0.05510s