S3Backup
========

:Version: 0.12
:Author: Mike Goodfellow
:License: MIT
:Homepage: https://github.com/mgoodfellow/s3-backup
:Summary: Perform scripted backups to Amazon S3
:Uploaded: 2023-07-20 08:47:00
:Keywords: backup, aws, s3
:Requirements: boto, glob2

A flexible Python-based backup utility that stores backups in Amazon S3.

About
-----

This is a small Python script for performing backups on any platform.

Features
--------

It supports the following features:

-  Plan-based backups
-  A custom command run before each backup (useful for complex preparation tasks)
-  Storing to S3
-  Calculating MD5 hashes of the backup set to avoid uploading duplicate backup sets
-  Emailing the result of the backup plans
-  Python standard logging framework

Installation
------------

Install using ``pip``:

::

    pip install s3backup

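To work in an isolated environment, create a virtual environment (the commands
below use the standard ``venv`` module) and install the dependencies listed in
``requirements.txt``: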

::

    $ mkdir s3backup
    $ cd s3backup
    $ python3 -m venv .
    $ . bin/activate
    $ pip install -r requirements.txt

Dependencies
------------

S3Backup depends on:

- boto (AWS SDK)
- `glob2 <http://github.com/miracle2k/python-glob2/>`_ (Better file globbing)

Both can be installed via ``pip``. If S3Backup itself is installed via ``pip``, these dependencies are installed automatically.

Configuration
-------------

The backup utility is configured through a JSON configuration file:

.. code:: json

    {
      "AWS_KEY": "this is a key",
      "AWS_SECRET": "this is a secret",
      "AWS_BUCKET": "this is a bucket",
      "AWS_REGION": "this is a region",
      "EMAIL_FROM": "source@address.com",
      "EMAIL_TO": "recipient@address.com",
      "HASH_CHECK_FILE": "plan_hashes.txt",
      "Plans": [
        {
          "Name": "MySQL Backup",
          "Command": "/home/bob/backups/backup-prep-script.sh",
          "Src": "/home/bob/backups/database/mysql_backup.sql",
          "OutputPrefix": "main_db",
          "PreviousBackupsCount": 2,
          "Zip64": false
        },
        {
          "Name": "Websites Backup",
          "Src": ["/var/www/html/website/**/*", "/var/www/html/website2/**/*"],
          "OutputPrefix": "websites_backup"
        }
      ]
    }

If emails are not required, then omit the ``EMAIL_FROM`` and
``EMAIL_TO`` fields of the configuration file.

If ``PreviousBackupsCount`` is not set, the plan keeps 1 previous backup by
default. Setting it to 0 keeps only the current backup.
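
The retention rule above can be pictured with a short sketch. This is not
S3Backup's actual code: the function name ``select_backups_to_delete`` and the
assumption that backup names sort from oldest to newest (for example because
they embed a timestamp) are hypothetical, chosen only for illustration.

.. code:: python

    def select_backups_to_delete(backup_names, previous_backups_count=1):
        """Return the backup names that fall outside the retention window.

        Assumption (hypothetical): names sort oldest-to-newest, so the last
        name after sorting is the current backup.
        """
        if previous_backups_count < 0:
            raise ValueError("previous_backups_count must be >= 0")
        ordered = sorted(backup_names)
        # Keep the current backup plus the requested number of previous ones.
        keep = previous_backups_count + 1
        return ordered[:-keep]

With four stored backups and the default of 1, the two oldest names are
returned for deletion; with 0, everything except the newest is returned.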

If ``Zip64`` is not set, it defaults to ``true``, which allows Zip files larger
than 2GB to be created. On an old environment it might need to be forced to
``false``.
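
For orientation, Python's standard ``zipfile`` module exposes this switch as
the ``allowZip64`` argument. The sketch below shows how a plan's ``Zip64``
value could be passed through. Whether S3Backup builds its archives exactly
this way is an assumption, and ``create_archive`` is a hypothetical helper
name.

.. code:: python

    import zipfile


    def create_archive(archive_path, files, plan):
        """Zip ``files`` into ``archive_path`` honouring the plan's Zip64 flag."""
        # Documented default: Zip64 is enabled unless the plan sets it to false.
        use_zip64 = plan.get("Zip64", True)
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED,
                             allowZip64=use_zip64) as archive:
            for path in files:
                archive.write(path)
        return use_zip64

With ``allowZip64=False``, ``zipfile`` raises an exception if the archive would
need the Zip64 extensions, so the flag is worth disabling only when the
consumer of the archive cannot read Zip64 files.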

*Note*: On Windows, it is better to write paths with forward slashes (/),
because they do not need escaping the way backslashes do. The script
normalizes such paths. However, any paths that appear inside ``Command``
must still be double escaped.
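
To see why backslashes need care, the following sketch uses Python's standard
``json`` module to show how a Windows path looks once it is encoded as JSON.
The paths are made-up examples, and the sketch covers only the JSON layer of
escaping; any extra escaping the command itself needs is not shown.

.. code:: python

    import json

    # A hypothetical Windows command path, written as a raw string.
    windows_path = r"C:\backups\prep.bat"

    # json.dumps writes every backslash as two backslashes.
    encoded = json.dumps({"Command": windows_path})
    print(encoded)  # prints {"Command": "C:\\backups\\prep.bat"}

    # Forward slashes pass through unchanged, so no escaping is needed.
    forward = json.dumps({"Src": "C:/backups/data/**/*"})
    print(forward)  # prints {"Src": "C:/backups/data/**/*"}

    # Decoding restores the original single-backslash path.
    assert json.loads(encoded)["Command"] == windows_path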

More examples (including Windows examples) and further discussion can be found
in `this blog post <https://mikegoodfellow.co.uk/s3-backup-utility/>`_.

Usage
-----

You will need an AWS account. Once you have one, obtain AWS keys for an IAM
user with the following permissions:

-  S3 full access (for writing to the storage bucket)
-  SES full access (for sending emails)

Run the backup tool using the following method:

.. code:: python

    import logging
    import os
    import sys
    from S3Backup import S3BackupTool

    script_path = os.path.dirname(os.path.realpath(__file__)) + '/'

    # Log to file
    #logging.basicConfig(format='%(asctime)s - %(levelname)s - %(name)s - %(message)s',
    #                    filename=script_path + "s3backup.log", level=logging.INFO)

    # Log to stdout
    logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

    s3backup = S3BackupTool("config.json")

    s3backup.run_plans()

See ``test.py`` for an example.

File Hashing
------------

After a backup set is created, an MD5 hash is calculated for it and compared against the hash previously stored
for the same plan name. If the two match, no changes are detected and the backup set is not uploaded again.

**NOTE:** Do not edit the generated ``HASH_CHECK_FILE``!

Finally, be aware of a "gotcha": the hashes are keyed on the *plan name*, so renaming a plan makes the backup
script treat the next backup set as new and upload it.
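
The check described in this section can be sketched as follows. This is an
illustration rather than S3Backup's implementation: the helper names
``md5_of_file`` and ``needs_upload`` are hypothetical, and the stored hashes
are modelled as a plain dictionary because the on-disk format of
``HASH_CHECK_FILE`` is not documented here.

.. code:: python

    import hashlib


    def md5_of_file(path, chunk_size=1024 * 1024):
        """Return the hex MD5 digest of a file, read in chunks."""
        digest = hashlib.md5()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()


    def needs_upload(plan_name, archive_path, known_hashes):
        """Return True if the archive differs from the plan's last known hash.

        ``known_hashes`` maps plan name -> hex digest and is updated in place.
        """
        new_hash = md5_of_file(archive_path)
        if known_hashes.get(plan_name) == new_hash:
            return False  # identical backup set, skip the upload
        known_hashes[plan_name] = new_hash
        return True

Because the lookup key is the plan name, calling ``needs_upload`` with a
renamed plan returns ``True`` even when the archive is unchanged, which is the
"gotcha" described above.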

Emails
------

An email is sent after each plan runs, reporting either success or failure. On success, the email states
whether a new backup set was uploaded (and gives its file name) or whether no changes were detected and
no upload was made.

If the backup fails, the exception message is emailed, and the logs can be consulted for further
information.

Future Improvements
-------------------

These are some of the planned future improvements:

-  Allow custom format strings for the output files (instead of the default date/time format)
-  Modification of the glob2 library to allow hidden files to be included
-  Allow exclude globs to be added when providing source directory

            
