:Name: mail-parser
:Version: 1.1.10
:Home page: https://github.com/SpamScope/mail-parser
:Summary: Wrapper for email standard library
:Upload time: 2017-04-05 20:20:01
:Author: Fedele Mantuano
:License: Apache License, Version 2.0
:Keywords: mail, email, parser, wrapper
:Requirements: ipaddress, simplejson, six

|PyPI version| |Build Status| |Coverage Status|

mail-parser
===========

Overview
--------

mail-parser is a wrapper for the `email`_ package of the Python Standard
Library. It is the key module of `SpamScope`_.

From version 1.0.0rc1 mail-parser supports Python 3.

Description
-----------

mail-parser takes a raw mail as input and generates a parsed object.
This object is a tokenized mail with all the parts of the mail and some
indicators:

- body
- headers
- subject
- from
- to
- attachments
- message id
- date
- mail charset
- sender IP address

It also provides two further indicators:

- anomalies: mail without message id or date
- `defects`_: mail with some parts that do not comply with the RFC
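
The anomaly check described above can be sketched with the standard library
alone. This is a minimal illustration, not mail-parser's own implementation:
the function name ``find_anomalies`` and the anomaly labels are hypothetical.

.. code:: python

    import email


    def find_anomalies(raw_mail):
        """Return labels for a mail that lacks a message id or a date."""
        message = email.message_from_string(raw_mail)
        anomalies = []
        # Message.get() returns None when the header is absent
        if not message.get("Message-ID"):
            anomalies.append("mail_without_message-id")
        if not message.get("Date"):
            anomalies.append("mail_without_date")
        return anomalies


    raw_mail = (
        "From: alice@example.com\n"
        "To: bob@example.com\n"
        "Date: Wed, 05 Apr 2017 20:20:01 +0000\n"
        "Subject: hello\n"
        "\n"
        "body\n"
    )
    # This sample mail has a Date header but no Message-ID header
    anomalies = find_anomalies(raw_mail)
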

Defects
~~~~~~~

These defects can be used to evade antispam filters. One example is a
mail with a malformed boundary, which can hide an illegitimate epilogue
(often malware). This library can extract these epilogues.
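
To see what such a defect looks like, the sketch below feeds a hand-made
mail with a malformed boundary to the standard library ``email`` package and
lists the defects it reports. It only illustrates the underlying mechanism
that mail-parser wraps; the sample mail is invented, and the way mail-parser
itself recovers the hidden epilogue is not shown here.

.. code:: python

    import email

    # The header declares the boundary "XYZ", but the body opens with the
    # closing delimiter, so the parser never sees a start boundary.
    raw_mail = (
        "From: alice@example.com\n"
        "To: bob@example.com\n"
        "Subject: malformed boundary\n"
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/mixed; boundary="XYZ"\n'
        "\n"
        "--XYZ--\n"
        "content placed after the closing delimiter\n"
    )

    message = email.message_from_string(raw_mail)
    # Each defect is an instance of a class defined in email.errors
    defect_names = [type(defect).__name__ for defect in message.defects]
    has_defects = bool(defect_names)

For this input the list of names should include
``StartBoundaryNotFoundDefect``.
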

Apache 2 Open Source License
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mail-parser can be downloaded, used, and modified free of charge. It is
available under the Apache 2 license.

Authors
-------

Main Author
~~~~~~~~~~~

Fedele Mantuano (**Twitter**:
`@fedelemantuano <https://twitter.com/fedelemantuano>`_)

Installation
------------

Clone the repository:

::

    git clone https://github.com/SpamScope/mail-parser.git

and install mail-parser with ``setup.py``:

::

    cd mail-parser

    python setup.py install

or use ``pip``:

::

    pip install mail-parser

Usage in a project
-------------------

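Import the ``MailParser`` class and parse a raw mail, either from a file or
from a string:

::

    from mailparser import MailParser

    parser = MailParser()
    # Parse a raw mail stored in a file (f) ...
    parser.parse_from_file(f)
    # ... or a raw mail held in a string (raw_mail)
    parser.parse_from_string(raw_mail)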

Then you can access all the parts:

::

    parser.body
    parser.headers
    parser.message_id
    parser.to_
    parser.from_
    parser.subject
    parser.text_plain_list  # only the plain-text parts of the mail, in a list
    parser.attachments_list  # list of all attachments
    parser.date_mail
    parser.parsed_mail_obj  # tokenized mail as an object
    parser.parsed_mail_json  # tokenized mail as JSON
    parser.defects  # parts of the mail that do not comply with the RFC
    parser.defects_category  # only the categories of the defects
    parser.has_defects
    parser.anomalies
    parser.has_anomalies
    parser.get_server_ipaddress(trust="my_server_mail_trust")
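
As a rough sketch of how these attributes might be combined, the helper
below builds a small report from a parsed mail. The function ``summarize``,
the report keys and the "suspicious" rule are hypothetical and only serve as
an illustration; the attribute names are the ones listed above, and
``attachments_list`` is assumed to be a plain list. A stand-in object
replaces the real parser so that the sketch runs without a raw mail.

.. code:: python

    from types import SimpleNamespace


    def summarize(parser):
        """Build a small report from an already-parsed MailParser object."""
        return {
            "subject": parser.subject,
            "from": parser.from_,
            "to": parser.to_,
            "attachments": len(parser.attachments_list),
            # Hypothetical rule: flag the mail if either indicator is set
            "suspicious": bool(parser.has_defects or parser.has_anomalies),
        }


    # Stand-in for a MailParser instance; in real use, pass the parser
    # after calling parse_from_file() or parse_from_string().
    fake_parser = SimpleNamespace(
        subject="hello",
        from_="alice@example.com",
        to_="bob@example.com",
        attachments_list=[],
        has_defects=False,
        has_anomalies=True,
    )
    report = summarize(fake_parser)
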

.. _email: https://docs.python.org/2/library/email.message.html
.. _SpamScope: https://github.com/SpamScope/spamscope
.. _defects: https://docs.python.org/2/library/email.message.html#email.message.Message.defects

Usage from command-line
-----------------------

If you installed mail-parser with ``pip`` or ``setup.py``, you can use it
from the command line.

These are all the switches:

::

    usage: mailparser [-h] (-f FILE | -s STRING) [-j] [-b] [-a] [-r] [-t] [-m]
                      [-u] [-d] [-n] [-i Trust mail server string] [-p] [-z] [-v]

    Wrapper for email Python Standard Library

    optional arguments:
      -h, --help            show this help message and exit
      -f FILE_, --file FILE_
                            Raw email file (default: None)
      -s STRING_, --string STRING_
                            Raw email string (default: None)
      -j, --json            Show the JSON of parsed mail (default: False)
      -b, --body            Print the body of mail (default: False)
      -a, --attachments     Print the attachments of mail (default: False)
      -r, --headers         Print the headers of mail (default: False)
      -t, --to              Print the to of mail (default: False)
      -m, --from            Print the from of mail (default: False)
      -u, --subject         Print the subject of mail (default: False)
      -d, --defects         Print the defects of mail (default: False)
      -n, --anomalies       Print the anomalies of mail (default: False)
      -i Trust mail server string, --senderip Trust mail server string
                            Extract a reliable sender IP address heuristically
                            (default: None)
      -p, --mail-hash       Print mail fingerprints without headers (default:
                            False)
      -z, --attachments-hash
                            Print attachments with fingerprints (default: False)
      -v, --version         show program's version number and exit

    It takes as input a raw mail and generates a parsed object.

Example:

.. code:: shell

    $ mailparser -f example_mail -j

This example prints the tokenized mail as pretty-printed JSON.
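
The same command can also be driven from a Python script. The sketch below
is only an illustration: the helper names are hypothetical, and it assumes
that ``mailparser`` is on the ``PATH`` and that ``-j`` on its own prints
nothing but the JSON document.

.. code:: python

    import json
    import subprocess


    def mailparser_command(mail_path):
        """Return the argument list for ``mailparser -f <mail_path> -j``."""
        return ["mailparser", "-f", mail_path, "-j"]


    def parse_with_cli(mail_path):
        """Run the command-line tool and decode its JSON output.

        Assumption: the output of ``-j`` is a single JSON document.
        """
        output = subprocess.check_output(mailparser_command(mail_path))
        return json.loads(output.decode("utf-8"))


    command = mailparser_command("example_mail")
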


.. |PyPI version| image:: https://badge.fury.io/py/mail-parser.svg
   :target: https://badge.fury.io/py/mail-parser
.. |Build Status| image:: https://travis-ci.org/SpamScope/mail-parser.svg?branch=develop
   :target: https://travis-ci.org/SpamScope/mail-parser
.. |Coverage Status| image:: https://coveralls.io/repos/github/SpamScope/mail-parser/badge.svg?branch=develop
   :target: https://coveralls.io/github/SpamScope/mail-parser?branch=develop
            

Raw data
--------

::

    {
    "maintainer": null, 
    "docs_url": null, 
    "requires_python": null, 
    "maintainer_email": null, 
    "cheesecake_code_kwalitee_id": null, 
    "coveralis": false, 
    "keywords": "mail,email,parser,wrapper", 
    "upload_time": "2017-04-05 20:20:01", 
    "requirements": [
        {
            "name": "ipaddress", 
            "specs": [
                [
                    "==", 
                    "1.0.17"
                ]
            ]
        }, 
        {
            "name": "simplejson", 
            "specs": [
                [
                    "==", 
                    "3.10.0"
                ]
            ]
        }, 
        {
            "name": "six", 
            "specs": [
                [
                    "==", 
                    "1.10.0"
                ]
            ]
        }
    ], 
    "author": "Fedele Mantuano", 
    "home_page": "https://github.com/SpamScope/mail-parser", 
    "github_user": "SpamScope", 
    "download_url": "https://pypi.python.org/packages/b2/ec/5027e62a528e6dc797123e420db826432cd815328e18a4a828c9da0ffdca/mail-parser-1.1.10.tar.gz", 
    "platform": "Linux", 
    "version": "1.1.10", 
    "cheesecake_documentation_id": null, 
    "lcname": "mail-parser", 
    "bugtrack_url": null, 
    "github": true, 
    "name": "mail-parser", 
    "license": "Apache License, Version 2.0", 
    "travis_ci": true, 
    "github_project": "mail-parser", 
    "summary": "Wrapper for email standard library", 
    "split_keywords": [
        "mail", 
        "email", 
        "parser", 
        "wrapper"
    ], 
    "author_email": "mantuano.fedele@gmail.com", 
    "urls": [
        {
            "has_sig": false, 
            "upload_time": "2017-04-05T20:20:01", 
            "comment_text": "", 
            "python_version": "source", 
            "url": "https://pypi.python.org/packages/b2/ec/5027e62a528e6dc797123e420db826432cd815328e18a4a828c9da0ffdca/mail-parser-1.1.10.tar.gz", 
            "md5_digest": "220dbb108cda0748eddd5baf08acfe85", 
            "downloads": 0, 
            "filename": "mail-parser-1.1.10.tar.gz", 
            "packagetype": "sdist", 
            "path": "b2/ec/5027e62a528e6dc797123e420db826432cd815328e18a4a828c9da0ffdca/mail-parser-1.1.10.tar.gz", 
            "size": 8488
        }
    ], 
    "_id": null, 
    "cheesecake_installability_id": null
    }