# synkler
Message queue based rsync wrangler for copying files across multiple servers.
## Overview
Synkler exists to solve the (probably ridiculous) problem of needing to copy from server A (**upload**) to server C (**download**) when neither can connect directly to the other but both can connect to server B (**central**) -- with the additional complication that the files *will not live at either the source or the destination after the copy is complete*.
The basic workflow is as follows:
- *file* arrives on the **upload** server, in the directory synkler is configured to monitor (_file_dir_).
- **upload** notifies **central** via **synkler** (i.e. [rabbitmq](https://www.rabbitmq.com/)) that it has a new file or directory to transfer
- once **central** is ready to receive, it signals **upload** to begin the rsync
- when the transfer is complete, **central** will verify its local copy of *file* by comparing the md5 hash against what's reported by **upload**
- **central** will then signal **download** to begin an rsync of *file* from **central** to its own local file system
- once completed, **download** verifies its copy of *file* before signalling to both **central** and **upload** that it has successfully received it
- **upload** and **download** then have the option to run a _cleanup_script_ on *file*, which is free to move it from its original location to wherever
- after a configurable number of minutes (_keep_minutes_), **central** will delete its version of *file*
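The md5 verification steps in the workflow above can be sketched roughly as follows. This is a minimal illustration, not synkler's actual code: the function names `md5_of_file` and `matches_reported_md5` are hypothetical, and the sketch only handles a single regular file (how synkler hashes a directory is not described here).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a single file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported_md5(path, reported_md5):
    """True when the local copy hashes to the value the sender reported."""
    # hexdigest() is lowercase, so normalise the reported value first.
    return md5_of_file(path) == reported_md5.lower()
```

In this sketch the receiving side would call `matches_reported_md5(local_path, md5_from_sender)` after the rsync finishes and only signal success when it returns `True`.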
## Installation
On all three servers (**upload**, **central** and **download**):
```
$ pip3 install synkler
```
On **synkler** (the message queue server), install [rabbitmq](https://www.rabbitmq.com/).
**upload** and **download** should both be able to connect to **central** via ssh and to **synkler** on port 5672.
NOTE: **synkler** and **central** are most likely the same server, since both **upload** and **download** can connect to it. But they don't have to be.
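Before starting anything, it may help to confirm from **upload** and **download** that the required ports are reachable. The sketch below is not part of synkler; `can_connect` is a hypothetical helper, and it assumes ssh is listening on its default port 22 on **central**.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage (host names are placeholders, substitute your own):
#   can_connect("central.example.com", 22)     # ssh, assuming the default port
#   can_connect("synkler.example.com", 5672)   # rabbitmq
```

A `True` result only shows that the port accepts TCP connections; it does not check ssh keys or rabbitmq credentials.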
## Configuration
Modify [sample-config](https://github.com/pgillan145/synkler/blob/main/sample-config) and either copy it to one of these locations:
```
$HOME/synkler.conf
$HOME/.config/synkler/synkler.conf
/etc/synkler.conf
```
... or call synkler with the configuration file as a command line argument:
```
$ synkler --config /location/of/synkler/config.file
```
... or set the $SYNKLER\_CONF environment variable:
```
$ export SYNKLER_CONF=/place/i/put/config.files
$ synkler
```
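To illustrate how the three options above could fit together, here is a rough sketch of a configuration lookup. The precedence order is an assumption (command line argument first, then `$SYNKLER_CONF`, then the default locations in the order listed); synkler's real lookup may differ, and `find_config` is a hypothetical name.

```python
import os
from pathlib import Path


def find_config(cli_path=None, environ=None, home=None):
    """Return the first configuration file that exists, or None.

    Assumed order (not confirmed by synkler itself): --config argument,
    then $SYNKLER_CONF, then the default locations in the order listed.
    """
    environ = os.environ if environ is None else environ
    home = Path(home) if home is not None else Path.home()

    candidates = []
    if cli_path:
        candidates.append(Path(cli_path))
    if environ.get("SYNKLER_CONF"):
        candidates.append(Path(environ["SYNKLER_CONF"]))
    candidates += [
        home / "synkler.conf",
        home / ".config" / "synkler" / "synkler.conf",
        Path("/etc/synkler.conf"),
    ]
    # First existing regular file wins.
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

The `environ` and `home` parameters exist only so the lookup can be exercised without touching the real environment.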
## Starting
As long as you set _pidfile_ in *synkler.conf*, you can call synkler from a cron without worrying about spawning multiple processes:
```
* * * * * /usr/bin/env synkler --verbose >> /tmp/synkler.log 2>&1
```
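For readers wondering why a pidfile makes the every-minute cron safe, the general technique looks something like the sketch below. This is a generic illustration under stated assumptions, not synkler's own implementation: `pid_is_running` and `claim_pidfile` are hypothetical names, the check is POSIX only, and a production version would also need to guard against two processes racing to write the file.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 tests for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def claim_pidfile(path):
    """Write our pid to path unless a live process already holds it.

    Returns True if this process now owns the pidfile, False otherwise.
    """
    try:
        with open(path) as handle:
            existing = int(handle.read().strip())
        if existing != os.getpid() and pid_is_running(existing):
            return False
    except (FileNotFoundError, ValueError):
        pass  # no pidfile yet, or unreadable contents: treat as stale
    with open(path, "w") as handle:
        handle.write(str(os.getpid()))
    return True
```

A process started by cron would call `claim_pidfile(...)` first and exit immediately when it returns `False`, so only one long-running instance survives.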
## Stopping
To stop synkler, just kill the process. Assuming _pidfile_ is defined in *synkler.conf*:
```
$ cat <pidfile> | xargs kill
```
Also remember to disable the cron, of course, if that's how you were starting it:
```
#* * * * * /usr/bin/env synkler --verbose >> /tmp/synkler.log 2>&1
```
## To Do
Major pieces that still need to be added, fixed or investigated:
- probably need to be able to specify a port number for rabbitmq.
- needs the option of running it as a service rather than a jenky-ass cron.
- documentashun shud be gooder
- no way to see the overall status of files in the system.
- I heard there might be more than two types of computers, some additional testing could be required.
- while daisy-chaining and having an arbitrary number of **upload** servers is theoretically possible, I haven't tried it. I should.
- unit testing!
- need to be able to specify an arbitrary ID value so multiple instances can run on the same servers without clobbering each other's queues.
Raw data
{
"_id": null,
"home_page": "https://github.com/pgillan145/synkler",
"name": "synkler",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Patrick Gillan",
"author_email": "pgillan@minorimpact.com",
"download_url": "https://files.pythonhosted.org/packages/6d/8e/10331fd89b9d32d5d46b864ccf2a059bdb1c5dee6a8f41dc55a4ef70e833/synkler-0.0.9.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPLv3",
"summary": "A three-body rsync solution.",
"version": "0.0.9",
"project_urls": {
"Homepage": "https://github.com/pgillan145/synkler"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6d8e10331fd89b9d32d5d46b864ccf2a059bdb1c5dee6a8f41dc55a4ef70e833",
"md5": "25f05ed8497e31e1ee049f0591452e44",
"sha256": "101571b01866c11da45c8c1c00d8d6ea3b990d8eb5c7df27af78e0db4f1d892f"
},
"downloads": -1,
"filename": "synkler-0.0.9.tar.gz",
"has_sig": false,
"md5_digest": "25f05ed8497e31e1ee049f0591452e44",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 21131,
"upload_time": "2024-09-11T00:02:03",
"upload_time_iso_8601": "2024-09-11T00:02:03.972706Z",
"url": "https://files.pythonhosted.org/packages/6d/8e/10331fd89b9d32d5d46b864ccf2a059bdb1c5dee6a8f41dc55a4ef70e833/synkler-0.0.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-11 00:02:03",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pgillan145",
"github_project": "synkler",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "synkler"
}