oeleo

Name: oeleo
Version: 0.5.3
Home page: https://github.com/ife-bat/oeleo
Summary: A one-eyed tool to copy files with.
Upload time: 2024-08-28 19:53:20
Author: jepegit
Requires Python: <3.13,>=3.8
License: MIT
Keywords: ssh, db

# oeleo
Python package / app that can be used for transferring files from an instrument-PC to a data server.


## Features (or limitations)
- Transfers over an ssh connection should preferably use key-pairs. This might require some
  setup on your server (ACL) to prevent security issues (the `oeleo` user should only have access to
  the data folder on your server).
- You can also access ssh with a password if you are not able to set up proper ownership
  on your server.
- `oeleo` is one-eyed, meaning that tracking of the "state of the duplicates" is only performed on the local side (where `oeleo` is running).
- However, `oeleo` contains a `check` method that can help you figure out whether it is a good idea
  to start copying, and that can populate the database if you want.
- The db that stores information about the "state of the duplicates" is located relative to the folder
  `oeleo` is running from. If you delete it (by accident?), `oeleo` will make a new empty one from scratch the next time you run it.
- Configuration is done using environment variables.

## Usage

### Install

```bash
$ pip install oeleo
```
### Run

1. Create an `oeleo` worker instance.
2. Connect the worker's `bookkeeper` to a `sqlite3` database.
3. Filter local files.
4. Run to copy files.
5. Repeat from step 3.

### Examples and descriptions

#### Simple script for copying between local folders

```python
import os
from pathlib import Path
import time

import dotenv

from oeleo.checkers import ChecksumChecker
from oeleo.models import SimpleDbHandler
from oeleo.connectors import LocalConnector
from oeleo.workers import Worker
from oeleo.utils import start_logger


def main():
  log = start_logger()
  # assuming you have made a .env file:
  dotenv.load_dotenv()

  db_name = os.environ["OELEO_DB_NAME"]
  base_directory_from = Path(os.environ["OELEO_BASE_DIR_FROM"])
  base_directory_to = Path(os.environ["OELEO_BASE_DIR_TO"])
  filter_extension = os.environ["OELEO_FILTER_EXTENSION"]

  # Making a worker using the Worker class.
  # You can also use the factory functions in `oeleo.workers`
  # (e.g. `ssh_worker` and `simple_worker`)
  bookkeeper = SimpleDbHandler(db_name)
  checker = ChecksumChecker()
  local_connector = LocalConnector(directory=base_directory_from)
  external_connector = LocalConnector(directory=base_directory_to)

  worker = Worker(
    checker=checker,
    local_connector=local_connector,
    external_connector=external_connector,
    bookkeeper=bookkeeper,
    extension=filter_extension
  )

  # Running the worker at 5-minute intervals.
  # You can also use an oeleo scheduler for this.
  worker.connect_to_db()
  while True:
    worker.filter_local()
    worker.run()
    time.sleep(300)


if __name__ == "__main__":
  main()
```

#### Environment `.env` file
```.env
OELEO_BASE_DIR_FROM=C:\data\local
OELEO_BASE_DIR_TO=C:\data\pub
OELEO_FILTER_EXTENSION=.csv
OELEO_DB_NAME=local2pub.db
OELEO_LOG_DIR=C:\oeleo\logs

## only needed for advanced connectors:
# OELEO_DB_HOST=<db host>
# OELEO_DB_PORT=<db port>
# OELEO_DB_USER=<db user>
# OELEO_DB_PASSWORD=<db password>
# OELEO_EXTERNAL_HOST=<ssh hostname>
# OELEO_USERNAME=<ssh username>
# OELEO_PASSWORD=<ssh password>
# OELEO_KEY_FILENAME=<ssh key-pair filename>

## only needed for SharePointConnector:
# OELEO_SHAREPOINT_USERNAME=<sharepoint username (falls back to ssh username if missing)>
# OELEO_SHAREPOINT_URL=<url to sharepoint>
# OELEO_SHAREPOINT_SITENAME=<name of sharepoint site>
# OELEO_SHAREPOINT_DOC_LIBRARY=<name of sharepoint library>
```
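
Because all configuration comes from environment variables, a missing entry only shows up as a `KeyError` when the script reads it. Below is a minimal sketch of an up-front check, assuming the four variables read by the simple script above are the required ones. The helper name `missing_settings` is hypothetical and not part of `oeleo`:

```python
import os

# Variables read by the simple local-to-local script
# (assumed here to be the required set).
REQUIRED = (
    "OELEO_DB_NAME",
    "OELEO_BASE_DIR_FROM",
    "OELEO_BASE_DIR_TO",
    "OELEO_FILTER_EXTENSION",
)


def missing_settings(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED if not env.get(name)]
```

Call `missing_settings()` right after `dotenv.load_dotenv()` and stop early if the returned list is not empty.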

#### Database

The database contains one table called `filelist`:

| id  | processed_date             | local_name         | external_name                         | checksum                         | code |
|-----|:---------------------------|:-------------------|:--------------------------------------|:---------------------------------|-----:|
| 1   | 2022-07-05 15:55:02.521154 | file_number_1.xyz  | C:\oeleo\check\to\file_number_1.xyz   | c976e564825667d7c11ba200457af263 |    1 |
| 2   | 2022-07-05 15:55:02.536152 | file_number_10.xyz | C:\oeleo\check\to\file_number_10.xyz  | d502512c0d32d7503feb3fd3dd287376 |    1 |
| 3   | 2022-07-05 15:55:02.553157 | file_number_2.xyz  | C:\oeleo\check\to\file_number_2.xyz   | cb89d576f5bd57566c78247892baffa3 |    1 |

The `processed_date` is when the file was last updated (meaning the last time `oeleo` found a new checksum for it).

The table below shows what the different values of `code` mean:

| code | meaning                       |
|:-----|:------------------------------|
| 0    | `should-be-copied`            |
| 1    | `should-be-copied-if-changed` |
| 2    | `should-not-be-copied`        |
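
One way to read these codes is as a small decision rule. The sketch below is an interpretation of the code names, not `oeleo`'s actual implementation; the function `should_copy` and its arguments are hypothetical:

```python
SHOULD_BE_COPIED = 0
SHOULD_BE_COPIED_IF_CHANGED = 1
SHOULD_NOT_BE_COPIED = 2


def should_copy(code, stored_checksum, current_checksum):
    """Decide whether a file needs copying, given its bookkeeping state."""
    if code == SHOULD_NOT_BE_COPIED:
        return False  # locked: never copy
    if code == SHOULD_BE_COPIED:
        return True  # flagged for copying regardless of checksum
    # code 1: copy only when the local checksum differs from the stored one
    return stored_checksum != current_checksum
```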

Hint! You can **lock** (choose to never copy) a file by manually changing its `code` to 2.
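
A minimal sketch of locking a file with Python's built-in `sqlite3` module. It builds a throwaway in-memory table with the columns shown above so that it can run on its own. The column types and the helper name `lock_file` are assumptions; against a real database you would connect to your own `.db` file instead, preferably while `oeleo` is not running:

```python
import sqlite3


def lock_file(conn, local_name):
    """Set code to 2 (should-not-be-copied) for one file; return rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE filelist SET code = 2 WHERE local_name = ?",
            (local_name,),
        )
    return cur.rowcount


# Stand-in for the real database (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE filelist ("
    "id INTEGER PRIMARY KEY, processed_date TEXT, local_name TEXT, "
    "external_name TEXT, checksum TEXT, code INTEGER)"
)
conn.execute(
    "INSERT INTO filelist "
    "(processed_date, local_name, external_name, checksum, code) "
    "VALUES (?, ?, ?, ?, ?)",
    (
        "2022-07-05 15:55:02.521154",
        "file_number_1.xyz",
        r"C:\oeleo\check\to\file_number_1.xyz",
        "c976e564825667d7c11ba200457af263",
        1,
    ),
)
conn.commit()

changed = lock_file(conn, "file_number_1.xyz")
rows = conn.execute("SELECT local_name, code FROM filelist").fetchall()
print(changed, rows)
```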


#### Using an `oeleo` scheduler

Instead of using, for example, a `while` loop to keep `oeleo` running continuously or at selected intervals,
you can use a scheduler (e.g. `rocketry`, `watchdog`, `schedule`, or more advanced options such as `Airflow`).

`oeleo` also includes its own schedulers. This is an example of how to use the `SimpleScheduler` from `oeleo.schedulers`:


```python
import dotenv

from oeleo.schedulers import SimpleScheduler
from oeleo.workers import simple_worker

# assuming you have created an appropriate .env file
dotenv.load_dotenv()
worker = simple_worker()
s = SimpleScheduler(
        worker,
        run_interval_time=4,  # seconds
        max_run_intervals=4,
    )
s.start()
```
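
For intuition, an interval scheduler of this kind boils down to a bounded loop around the worker steps. The following is a plain-Python sketch, not `oeleo`'s `SimpleScheduler`. The function `run_at_intervals` is hypothetical, and its parameters only mirror what the names `run_interval_time` and `max_run_intervals` suggest:

```python
import time


def run_at_intervals(task, run_interval_time, max_run_intervals, sleep=time.sleep):
    """Call task() max_run_intervals times, pausing between consecutive calls."""
    for i in range(max_run_intervals):
        task()
        if i < max_run_intervals - 1:  # no need to wait after the last run
            sleep(run_interval_time)
```

With a worker, `task` could be a small function that calls `worker.filter_local()` followed by `worker.run()`.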


#### Copy files from a Windows PC to a Linux server through ssh

```python
import logging
import os
from pathlib import Path

import dotenv

from oeleo.connectors import register_password
from oeleo.utils import start_logger
from oeleo.workers import ssh_worker

log = start_logger()

print(" ssh ".center(80, "-"))
log.setLevel(logging.DEBUG)
log.info("Starting oeleo!")
dotenv.load_dotenv()

external_dir = "/srv/data"
filter_extension = ".res"

register_password(os.environ["OELEO_PASSWORD"])

worker = ssh_worker(
  db_name="ssh_to_server.db",
  base_directory_from=Path(r"data\raw"),
  base_directory_to=external_dir,
  extension=filter_extension,
)
worker.connect_to_db()
try:
  worker.check(update_db=True)
  worker.filter_local()
  worker.run()
finally:
  worker.close()
```

## Future planned improvements

Just plans, no promises given.

- make even nicer printing and logging.
- create CLI.
- create an executable.
- create a web-app.
- create a GUI (not likely).

## Status

- [x] Works on my PC &rarr; PC
- [x] Works on my PC &rarr; my server
- [x] Works on my server &rarr; my server
- [x] Works on my instrument PC &rarr; my instrument PC
- [x] Works on my instrument PC &rarr; my server
- [x] Works OK
- [x] Deployable
- [x] On testpypi
- [x] On pypi
- [x] Code understandable for others
- [x] Looking good
- [x] Fairly easy to use
- [ ] Easy to use
- [ ] Easy to debug runs (e.g. editing sql)

## Licence
MIT

## Development

- Developed using `poetry` on `python 3.11`.
- Must also run on `python 3.8` for Windows 7 support.

### Some useful commands

#### Update version

```bash
# update version e.g. from 0.3.1 to 0.3.2:
poetry version patch
```
Then edit `__init__.py`:
```python
__version__ = "0.3.2"
```
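
Since the version lives in two places, it is easy to bump one and forget the other. Below is a small sketch of a consistency check, assuming `pyproject.toml` contains a `version = "..."` line and `__init__.py` defines `__version__` as a plain string literal. The helper names are hypothetical, and regular expressions are used instead of `tomllib` so that it also runs on `python 3.8`:

```python
import re


def pyproject_version(text):
    """Extract the first `version = "..."` entry from pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    return match.group(1) if match else None


def init_version(text):
    """Extract `__version__ = "..."` from __init__.py text."""
    match = re.search(r'^__version__\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    return match.group(1) if match else None


def versions_match(pyproject_text, init_text):
    """True when both files declare the same, non-missing version."""
    version = pyproject_version(pyproject_text)
    return version is not None and version == init_version(init_text)
```

Read the two files (for example with `pathlib.Path.read_text()`) and pass their contents to `versions_match` before building.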
#### Build

```bash
poetry build
```

#### Publish

If you are using 2-factor authentication, you need to create a token on pypi.org and run:

```bash
poetry config pypi-token.pypi <token>
```
Then run:

```bash
poetry publish
```

### Next
- Improve logging

### Development lead
- Jan Petter Maehlen, IFE

            
