mailbagit


* Name: mailbagit
* Version: 0.7.3
* Home page: https://github.com/UAlbanyArchives/mailbag
* Summary: A tool for preserving email in multiple preservation formats.
* Upload time: 2024-05-03 17:25:44
* Author: Gregory Wiedeman
* Requires Python: >=3.8

# Mailbagit

A tool for creating and managing Mailbags, a package for preserving email in multiple formats. It contains an open [specification for mailbags](https://archives.albany.edu/mailbag/spec/), as well as the `mailbagit` and `mailbagit-gui` tools for packaging email exports into mailbags.

`mailbagit` converts native email formats, such as PST, MSG, EML, and MBOX, into PDF, HTML, WARC, and other formats, and combines them into stable packages for preservation.

## Installation

```
pip install mailbagit
```

* To install PST dependencies: `pip install mailbagit[pst]`
* To install `mailbagit-gui`: `pip install mailbagit[gui]`
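
To confirm from Python that the package was installed, you can query the installed distribution metadata. This is a minimal sketch using only the standard library (`importlib.metadata`, available in Python 3.8 and later); the helper name `installed_version` is hypothetical and not part of `mailbagit`.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical usage: report whether mailbagit is available in this environment.
version = installed_version("mailbagit")
print(version if version else "mailbagit is not installed")
```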

### Docker setup

You can also run `mailbagit` using a [Docker image](https://archives.albany.edu/mailbag/docker).

```
docker pull ualbanyarchives/mailbagit
wget https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/main/docker-compose.yml
docker compose run mailbagit
mailbagit -v
```

## Quick start

### Examples:

MSG files to PDF, EML, and WARC

```
mailbagit path/to/messages -i msg --derivatives eml pdf warc --mailbag_name my_mailbag
```

MBOX to PDF and plain text (dry run)

```
mailbagit path/to/mbox_dir -i mbox -d txt pdf-chrome -m my_mailbag -r
```

PST to PDF, MBOX, EML, and WARC

```
mailbagit path/to/export.pst -i pst -d mbox eml pdf warc -m my_mailbag
```

EML to PDF and WARC in another directory

```
mailbagit path/to/messages -i eml -d pdf warc -m /path/to/my_mailbag
```

See the [documentation](https://archives.albany.edu/mailbag/use/) for more details on:

* [mailbagit](https://archives.albany.edu/mailbag/mailbagit/)
* [mailbagit-gui](https://archives.albany.edu/mailbag/mailbagit-gui/)
* [logging](https://archives.albany.edu/mailbag/logging/)
* [plugins](https://archives.albany.edu/mailbag/plugins/)

## Arguments

The arguments listed below can be entered on the command line when using `mailbagit` or in the corresponding `mailbagit-gui` fields.

### Mandatory Arguments

* **path**:
> A path to email to be packaged into a mailbag. This can be a single file or a directory containing a number of email exports.

* **-m --mailbag**: 
> A new directory for the mailbag, such as `/path/to/my_mailbag`, or just `my_mailbag` to use the same location as the source email. Must be a valid directory or file name and must not already exist.

* **-i --input**:
> File format to use as input for a mailbag.
> Argument takes a single input.
> e.g. `-i imap` or `-i pst`

* **-d --derivatives**:
> Specifies one or more derivative formats that `mailbagit` will create and package into the mailbag.
> Argument takes multiple inputs.
> e.g. `-d eml pdf warc`


### Mailbagit Optional Arguments

* **-v --version**
> Reports the version number and exits.

* **-r --dry-run**
> Performs a test run that will not alter any files other than writing an error report. When this flag is used, `mailbagit` parses all the email it is provided and formats derivatives as far as it can without writing them to disk. If there are any errors or warnings, it creates an error report with an `errors.csv` listing all issues, as well as a full stack trace in a `.txt` file.

* **-k --keep**
> Keeps the source files as they are, copying rather than moving them into the mailbag.

* **--css**
> Path to a CSS file to override the included CSS when creating PDF or HTML derivatives.
> Argument takes a single file path as input.

* **-c --compress**
> Compresses the mailbag as a ZIP, TAR, or TAR.GZ.
> e.g. `-c zip` or `-c tar.gz`

* **-f --companion_files**
> Allows for companion metadata files to be packaged alongside email export files.
> When this option is used, `mailbagit` will recursively include all the files in the directory provided into a mailbag.
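
When `mailbagit` is driven from a script, the arguments above can be assembled into a command line programmatically. The following is a minimal sketch, not part of `mailbagit` itself: the helper name `build_mailbagit_command` and its parameter names are hypothetical, and it uses only the short flags documented in this section. It builds the argument list without running anything.

```python
from typing import List, Optional, Sequence


def build_mailbagit_command(
    source: str,
    input_format: str,
    derivatives: Sequence[str],
    mailbag: str,
    dry_run: bool = False,
    keep: bool = False,
    compress: Optional[str] = None,
) -> List[str]:
    """Assemble a mailbagit argument list (hypothetical helper)."""
    # -d is listed under the mandatory arguments above, so require a value.
    if not derivatives:
        raise ValueError("at least one derivative format is required")
    cmd = ["mailbagit", source, "-i", input_format, "-d", *derivatives, "-m", mailbag]
    if dry_run:
        cmd.append("-r")  # test run, only an error report is written
    if keep:
        cmd.append("-k")  # copy the source files instead of moving them
    if compress is not None:
        cmd += ["-c", compress]  # e.g. "zip" or "tar.gz"
    return cmd


command = build_mailbagit_command(
    "path/to/export.pst", "pst", ["mbox", "eml", "pdf"], "my_mailbag",
    keep=True, compress="zip",
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.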

### Bagit-python arguments

Mailbagit also accepts most [bagit-python](https://github.com/LibraryOfCongress/bagit-python) arguments. Thus, you can provide arguments like `--processes 2` or arguments to add metadata, such as `--source-organization "University at Albany, SUNY"`.

The only bagit-python arguments that `mailbagit` does not support are `--log`, `--quiet`, `--validate`, `--fast`, and `--completeness-only`.

If you would like to validate your mailbag, `mailbagit` comes with [bagit-python](https://github.com/LibraryOfCongress/bagit-python) installed. Thus, you can run:

```
bagit.py --validate /path/to/mailbag
```
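
For a rough idea of what validation involves, the sketch below recomputes the SHA-256 checksums recorded in a bag's payload manifest using only the Python standard library. It is an illustration, not a substitute for `bagit.py --validate`: it assumes the bag contains a `manifest-sha256.txt` (the BagIt manifest naming convention), and it ignores tag manifests, `bagit.txt`, percent-encoded paths, and files missing from the manifest. The function names `sha256_of` and `check_payload_manifest` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large email exports are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_payload_manifest(bag_dir: str) -> List[str]:
    """Return payload paths that are missing or do not match manifest-sha256.txt."""
    bag = Path(bag_dir)
    problems = []
    manifest = bag / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <path relative to the bag root>".
        expected, rel_path = line.split(None, 1)
        rel_path = rel_path.strip()
        target = bag / rel_path
        if not target.is_file() or sha256_of(target) != expected.lower():
            problems.append(rel_path)
    return problems
```

An empty list means every payload file listed in the manifest was found with a matching checksum. For a real preservation workflow, use `bagit.py --validate` as shown above.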

## Development setup

```
git clone git@github.com:UAlbanyArchives/mailbagit.git
cd mailbagit
git switch develop
pip install -e .
```

### Development with Docker

* This runs the dev Docker image with the code installed in editable mode. You can then make code changes and run them directly with `mailbagit`.

* Assumes you have a directory with email data in ./sampleData. You can change this directory name in [line 7 of docker-compose-dev.yml](https://github.com/UAlbanyArchives/mailbagit/blob/main/docker-compose-dev.yml#L7).

```
docker pull ualbanyarchives/mailbagit:dev
git clone git@github.com:UAlbanyArchives/mailbagit.git
cd mailbagit
git switch develop
docker-compose -f docker-compose-dev.yml run mailbagit
mailbagit -v
```

## License
[MIT](LICENSE)

## Kudos

This project was made possible by funding from the University of Illinois's [Email Archives: Building Capacity and Community Project](https://emailarchivesgrant.library.illinois.edu/).

We owe a lot to the hard work that goes towards developing and maintaining the libraries `mailbagit` uses to parse email formats and make bags. We'd like to thank these awesome projects, without which `mailbagit` wouldn't be possible:  

* [extractMsg](https://github.com/TeamMsgExtractor/msg-extractor)
* [libpff](https://github.com/libyal/libpff)
* [bagit-python](https://github.com/LibraryOfCongress/bagit-python)

We'd also like to thank the [RATOM project](https://ratom.web.unc.edu/), whose documentation was super helpful in guiding us through some roadblocks.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/UAlbanyArchives/mailbag",
    "name": "mailbagit",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "Gregory Wiedeman",
    "author_email": "gwiedeman@albany.edu",
    "download_url": "https://files.pythonhosted.org/packages/dd/0c/edf72765a58ee79cfca4165e9d39cce581c79f23748330bb47d0409a993f/mailbagit-0.7.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A tool for preserving email in multiple preservation formats.",
    "version": "0.7.3",
    "project_urls": {
        "Homepage": "https://github.com/UAlbanyArchives/mailbag"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0ba5f5728f57322e2b6916bc573608ddc556a8a18277cf406cdef1de3091dbb4",
                "md5": "4a81522088932d8684672bae2eb5682c",
                "sha256": "9b2b6d9e8bd431024f4ff939738b78d0f5808eaac84e4c2c9f7794bd506f0a25"
            },
            "downloads": -1,
            "filename": "mailbagit-0.7.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4a81522088932d8684672bae2eb5682c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 60800,
            "upload_time": "2024-05-03T17:25:42",
            "upload_time_iso_8601": "2024-05-03T17:25:42.597784Z",
            "url": "https://files.pythonhosted.org/packages/0b/a5/f5728f57322e2b6916bc573608ddc556a8a18277cf406cdef1de3091dbb4/mailbagit-0.7.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dd0cedf72765a58ee79cfca4165e9d39cce581c79f23748330bb47d0409a993f",
                "md5": "ce8af2ab62a134e93dde71b467d6fdcb",
                "sha256": "340669f0e306974e9c340dce73a3115b05bb95c65843d516da97e379a3e4e740"
            },
            "downloads": -1,
            "filename": "mailbagit-0.7.3.tar.gz",
            "has_sig": false,
            "md5_digest": "ce8af2ab62a134e93dde71b467d6fdcb",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 47902,
            "upload_time": "2024-05-03T17:25:44",
            "upload_time_iso_8601": "2024-05-03T17:25:44.439987Z",
            "url": "https://files.pythonhosted.org/packages/dd/0c/edf72765a58ee79cfca4165e9d39cce581c79f23748330bb47d0409a993f/mailbagit-0.7.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-03 17:25:44",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "UAlbanyArchives",
    "github_project": "mailbag",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "mailbagit"
}
        