smart_importer
==============

:Version: 1.2
:Author: Johannes Harms
:License: MIT
:Keywords: fava beancount accounting machinelearning
:Uploaded: 2025-10-17 20:46:15

https://github.com/beancount/smart_importer

.. image:: https://github.com/beancount/smart_importer/actions/workflows/ci.yml/badge.svg?branch=main
   :target: https://github.com/beancount/smart_importer/actions?query=branch%3Amain

Augments
`Beancount <http://furius.ca/beancount/>`__ importers
with machine learning functionality.


Status
------

Working prototype; development status: beta.


Installation
------------

``smart_importer`` can be installed from PyPI:

.. code:: bash

    pip install smart_importer


Quick Start
-----------

This package provides import hooks that can modify the imported entries. When
running the importer, the existing entries will be used as training data for a
machine learning model, which will then predict entry attributes.

The following example shows how to apply the ``PredictPostings`` hook to
an existing CSV importer:

.. code:: python

    from beangulp.importers import csv
    from beangulp.importers.csv import Col

    from smart_importer import PredictPostings


    class MyBankImporter(csv.Importer):
        '''Conventional importer for MyBank'''

        def __init__(self, *, account):
            super().__init__(
                {Col.DATE: 'Date',
                 Col.PAYEE: 'Transaction Details',
                 Col.AMOUNT_DEBIT: 'Funds Out',
                 Col.AMOUNT_CREDIT: 'Funds In'},
                account,
                'EUR',
                (
                    'Date, Transaction Details, Funds Out, Funds In'
                )
            )


    CONFIG = [
        MyBankImporter(account='Assets:MyBank:MyAccount'),
    ]

    HOOKS = [
        PredictPostings().hook
    ]
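
To make the idea of "existing entries as training data" more concrete, here is a
deliberately simplified sketch that uses only the standard library. It is not
``smart_importer``'s actual model; the function names ``train`` and ``predict``
and the sample data are hypothetical. It only illustrates the principle:
accounts seen with similar narrations in the past are suggested for new
transactions.

.. code:: python

    from collections import Counter, defaultdict


    def train(existing):
        """Count how often each narration token occurs with each account.

        ``existing`` is a list of (narration, account) pairs, a stand-in
        for the existing ledger entries (hypothetical format).
        """
        counts = defaultdict(Counter)
        for narration, account in existing:
            for token in narration.lower().split():
                counts[token][account] += 1
        return counts


    def predict(counts, narration):
        """Return the account with the most token votes, or None if no token is known."""
        votes = Counter()
        for token in narration.lower().split():
            votes.update(counts.get(token, {}))
        return votes.most_common(1)[0][0] if votes else None


    existing = [
        ("REWE Supermarket Berlin", "Expenses:Groceries"),
        ("REWE City", "Expenses:Groceries"),
        ("Shell Fuel Station", "Expenses:Car:Fuel"),
    ]
    model = train(existing)
    print(predict(model, "REWE Markt"))    # Expenses:Groceries
    print(predict(model, "Unknown Shop"))  # None
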


Documentation
-------------

This section explains in detail the relevant concepts and artifacts
needed for enhancing Beancount importers with machine learning.


Beancount Importers
~~~~~~~~~~~~~~~~~~~~

Let's assume you have created an importer for "MyBank" called
``MyBankImporter``:

.. code:: python

    from beangulp import importer


    class MyBankImporter(importer.Importer):
        """My existing importer"""
        # the actual importer logic would be here...

Note:
This documentation assumes you already know how to create Beancount/beangulp importers.
Relevant documentation can be found in the `Beancount import documentation
<https://beancount.github.io/docs/importing_external_data.html>`__.
With beangulp, users can write their own importers and use them to convert
downloaded bank statements into lists of Beancount entries.
Examples are provided as part of beangulp's source code under
`examples/importers
<https://github.com/beancount/beangulp/tree/master/examples/importers>`__.

``smart_importer`` works only by appending postings to incomplete, single-legged
transactions (i.e., it will not modify postings with placeholder accounts like
"Expenses:TODO").
The ``extract`` method in the importer should follow the
`latest interface <https://github.com/beancount/beangulp/blob/master/beangulp/importer.py>`__
and include an ``existing_entries`` argument.

Using `smart_importer` as a beangulp hook
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Beangulp has the notion of hooks; for a detailed example, see the `beangulp hook example <https://github.com/beancount/beangulp/blob/ead8a2517d4f34c7ac7d48e4ef6d21a88be7363c/examples/import.py#L50>`__.
Hooks can be used to apply ``smart_importer`` to all importers. The following hooks are available:

* ``PredictPostings`` - predict the list of postings.
* ``PredictPayees`` - predict the payee of the transaction.

For example, to convert an existing ``MyBankImporter`` into a smart importer:

.. code:: python

    from your_custom_importer import MyBankImporter
    from smart_importer import PredictPayees, PredictPostings

    CONFIG = [
        MyBankImporter('whatever', 'config', 'is', 'needed'),
    ]

    HOOKS = [
        PredictPostings().hook,
        PredictPayees().hook
    ]
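
Because hooks are plain callables, your own hooks can be listed alongside the
``smart_importer`` ones. The sketch below assumes the hook signature used in the
beangulp hook example linked above: the hook receives a list of
``(filename, entries, account, importer)`` tuples plus the existing entries, and
returns a list of the same shape. The function ``drop_empty_results`` and the
stand-in data are hypothetical, and the signature should be checked against the
beangulp version you use.

.. code:: python

    def drop_empty_results(extracted_entries_list, existing_entries=None):
        """Hypothetical hook: skip files for which no entries were extracted."""
        return [
            (filename, entries, account, importer)
            for filename, entries, account, importer in extracted_entries_list
            if entries
        ]


    # In an import script this could be combined with the predictors, e.g.:
    # HOOKS = [PredictPostings().hook, drop_empty_results]

    # Stand-in data (plain strings instead of real Beancount entries):
    extracted = [
        ("bank.csv", ["txn-1", "txn-2"], "Assets:MyBank:MyAccount", None),
        ("empty.csv", [], "Assets:MyBank:MyAccount", None),
    ]
    result = drop_empty_results(extracted, existing_entries=[])
    print([filename for filename, *_ in result])  # ['bank.csv']
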

Wrapping an importer to become a `smart_importer`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Instead of using a beangulp hook, you can wrap an individual importer to turn it into a smart importer. Only the wrapped importer is affected.

* ``PredictPostings`` - predict the list of postings.
* ``PredictPayees`` - predict the payee of the transaction.

For example, to convert an existing ``MyBankImporter`` into a smart importer:

.. code:: python

    from your_custom_importer import MyBankImporter
    from smart_importer import PredictPayees, PredictPostings

    CONFIG = [
        PredictPostings().wrap(
            PredictPayees().wrap(
                MyBankImporter('whatever', 'config', 'is', 'needed')
            )
        ),
    ]

    HOOKS = []


Specifying Training Data
~~~~~~~~~~~~~~~~~~~~~~~~

The ``smart_importer`` hooks need training data, i.e., an existing list of
transactions, in order to be effective. Training data can be specified by
calling the ``extract`` command of your import script with an argument that
references existing Beancount transactions, e.g.,
``import.py extract -e existing_transactions.beancount``. When
using the importer in Fava, the existing entries are used as training data
automatically.


Usage with Fava
~~~~~~~~~~~~~~~

Smart importers work well with `Fava <https://github.com/beancount/fava>`__.
You can use smart importers with Fava in exactly the same way
as you would a conventional importer. See `Fava's help on importers
<https://github.com/beancount/fava/blob/main/src/fava/help/import.md>`__ for more
information.


Development
-----------

Pull requests welcome!


Executing the Unit Tests
~~~~~~~~~~~~~~~~~~~~~~~~

Simply run (requires tox):

.. code:: bash

    make test


Configuring Logging
~~~~~~~~~~~~~~~~~~~

``smart_importer`` uses Python's ``logging`` module.
The log level can be changed as follows:


.. code:: python

    import logging
    logging.getLogger('smart_importer').setLevel(logging.DEBUG)
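
If no log handler is configured yet, debug messages may not be shown anywhere.
The following sketch is one possible setup, not a required one: it configures a
basic handler for the whole program and raises verbosity for the
``smart_importer`` logger only. The format string is an arbitrary choice, and the
child logger name ``smart_importer.example`` is hypothetical.

.. code:: python

    import logging

    # Attach a handler to the root logger (stderr); other libraries stay at WARNING.
    logging.basicConfig(
        level=logging.WARNING,
        format="%(asctime)s %(name)s %(levelname)s: %(message)s",
    )

    # Enable debug output for smart_importer only.
    logger = logging.getLogger("smart_importer")
    logger.setLevel(logging.DEBUG)

    # Loggers below "smart_importer" in the hierarchy inherit this level.
    child = logging.getLogger("smart_importer.example")  # hypothetical name
    print(child.getEffectiveLevel() == logging.DEBUG)
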


Using a Tokenizer
~~~~~~~~~~~~~~~~~~

Custom tokenizers let ``smart_importer`` support more languages, e.g., Chinese.

If you are looking for a Chinese tokenizer, you can follow this example.

First, make sure that ``jieba`` is installed in your Python environment:

.. code:: bash

    pip install jieba


In your importer code, you can then pass a ``jieba``-based function to be used as the tokenizer:

.. code:: python

    from smart_importer import PredictPostings
    import jieba

    jieba.initialize()
    tokenizer = lambda s: list(jieba.cut(s))

    predictor = PredictPostings(string_tokenizer=tokenizer)
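
Judging from the example above, a tokenizer is simply a callable that takes a
string and returns a list of tokens. Under that assumption, a tokenizer without
any third-party dependency could look like the following sketch. The function
name ``simple_tokenizer`` is hypothetical, and the regular expression is just
one reasonable choice.

.. code:: python

    import re


    def simple_tokenizer(text):
        """Lowercase the text and split it into runs of word characters
        (letters, digits, and underscores)."""
        return re.findall(r"\w+", text.lower())


    print(simple_tokenizer("REWE-Markt, Berlin #42"))
    # ['rewe', 'markt', 'berlin', '42']

    # It would then be passed in the same way as the jieba tokenizer:
    # predictor = PredictPostings(string_tokenizer=simple_tokenizer)
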
