souper
======

- Name: souper
- Version: 1.1.2 (uploaded 2022-12-05)
- Summary: Souper - Generic Indexed Storage based on ZODB
- Home page: https://pypi.org/project/souper
- Author: BlueDynamics Alliance
- License: BSD
- Keywords: zodb zope pyramid node plone
            
.. image:: https://travis-ci.org/bluedynamics/souper.svg?branch=master
    :target: https://travis-ci.org/bluedynamics/souper

ZODB Storage for lots of (lightweight) data.

Utilizes:

- `ZODB <http://www.zodb.org/>`_ and its `BTrees <http://www.zodb.org/documentation/guide/modules.html#btrees-package>`_,
- `node <http://pypi.python.org/pypi/node>`_ (and `node.ext.zodb <http://pypi.python.org/pypi/node.ext.zodb>`_),
- `repoze.catalog <http://pypi.python.org/pypi/repoze.catalog>`_.

.. image:: https://raw.githubusercontent.com/bluedynamics/souper/master/docs/Souper-64.png

Souper is a tool for programmers. It offers an integrated storage tied together with indexes in a catalog.
The records in the storage are generic.
It is possible to store any data on a record as long as it is persistable (pickleable) in the ZODB.

Souper can be used in any Python application, either standalone using the pure ZODB or with `Pyramid <http://docs.pylonsproject.org/en/latest/docs/pyramid.html>`_, `Zope <https://www.zope.org/>`_ or `Plone <http://plone.org>`_.
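
For the standalone case, a persistent dict-like ``context`` (as assumed in the examples below) can be obtained from a plain ZODB connection. The following is a minimal sketch using an in-memory database; the variable names are illustrative only:

.. code-block:: python

    # Standalone setup sketch: an in-memory ZODB whose root mapping
    # serves as the persistent dict-like context used in the examples.
    import transaction
    import ZODB

    db = ZODB.DB(None)            # None means an in-memory MappingStorage
    connection = db.open()
    context = connection.root()   # persistent and dict-like

    # ... work with souper here ...

    transaction.commit()          # persist changes when running standalone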


Using Souper
============

Providing a Locator
-------------------

Soups are looked up by adapting ``souper.interfaces.IStorageLocator`` to some context.
Souper does not provide any default locator, so one needs to be provided first.
Let's assume the context is some persistent dict-like instance:

.. code-block:: pycon

    >>> from zope.interface import implementer
    >>> from zope.interface import Interface
    >>> from zope.component import provideAdapter
    >>> from souper.interfaces import IStorageLocator
    >>> from souper.soup import SoupData
    >>> @implementer(IStorageLocator)
    ... class StorageLocator(object):
    ...
    ...     def __init__(self, context):
    ...        self.context = context
    ...
    ...     def storage(self, soup_name):
    ...        if soup_name not in self.context:
    ...            self.context[soup_name] = SoupData()
    ...        return self.context[soup_name]

    >>> provideAdapter(StorageLocator, adapts=[Interface])

So we have a locator creating soups by name on the fly. Now it's easy to get a soup by name:

.. code-block:: pycon

    >>> from souper.soup import get_soup
    >>> soup = get_soup('mysoup', context)
    >>> soup
    <souper.soup.Soup object at 0x...>


Providing a Catalog Factory
---------------------------

Depending on your needs the catalog and its indexes may look different from use case to use case.
The catalog factory is responsible for creating the catalog for a soup. The factory is a named utility implementing ``souper.interfaces.ICatalogFactory``.
The name of the utility has to be the same as the name of the soup.

Here ``repoze.catalog`` is used, and to let the indexes access the data on the records by key, the ``NodeAttributeIndexer`` is used.
For special cases one may write custom indexers, but the default one is fine most of the time:

.. code-block:: pycon

    >>> from souper.interfaces import ICatalogFactory
    >>> from souper.soup import NodeAttributeIndexer
    >>> from souper.soup import NodeTextIndexer
    >>> from zope.component import provideUtility
    >>> from repoze.catalog.catalog import Catalog
    >>> from repoze.catalog.indexes.field import CatalogFieldIndex
    >>> from repoze.catalog.indexes.text import CatalogTextIndex
    >>> from repoze.catalog.indexes.keyword import CatalogKeywordIndex

    >>> @implementer(ICatalogFactory)
    ... class MySoupCatalogFactory(object):
    ...
    ...     def __call__(self, context=None):
    ...         catalog = Catalog()
    ...         userindexer = NodeAttributeIndexer('user')
    ...         catalog[u'user'] = CatalogFieldIndex(userindexer)
    ...         textindexer = NodeTextIndexer(['text', 'user'])
    ...         catalog[u'text'] = CatalogTextIndex(textindexer)
    ...         keywordindexer = NodeAttributeIndexer('keywords')
    ...         catalog[u'keywords'] = CatalogKeywordIndex(keywordindexer)
    ...         return catalog

    >>> provideUtility(MySoupCatalogFactory(), name="mysoup")

The catalog factory is used only internally by the soup, but one may want to check that it works fine:

.. code-block:: pycon

    >>> from zope.component import getUtility
    >>> catalogfactory = getUtility(ICatalogFactory, name='mysoup')
    >>> catalogfactory
    <MySoupCatalogFactory object at 0x...>

    >>> catalog = catalogfactory()
    >>> sorted(catalog.items())
    [(u'keywords', <repoze.catalog.indexes.keyword.CatalogKeywordIndex object at 0x...>),
    (u'text', <repoze.catalog.indexes.text.CatalogTextIndex object at 0x...>),
    (u'user', <repoze.catalog.indexes.field.CatalogFieldIndex object at 0x...>)]


Adding records
--------------

As mentioned above the ``souper.soup.Record`` is the one and only kind of data added to the soup.
A record has attributes containing the data:

.. code-block:: pycon

    >>> from souper.soup import get_soup
    >>> from souper.soup import Record
    >>> soup = get_soup('mysoup', context)
    >>> record = Record()
    >>> record.attrs['user'] = 'user1'
    >>> record.attrs['text'] = u'foo bar baz'
    >>> record.attrs['keywords'] = [u'1', u'2', u'ü']
    >>> record_id = soup.add(record)
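
When running standalone on plain ZODB (as sketched in the introduction), the addition is only persisted once the transaction is committed; with Pyramid, Zope or Plone the transaction machinery of the framework usually takes care of this:

.. code-block:: pycon

    >>> import transaction
    >>> transaction.commit()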

A record may contain other records. But to index them one would need a custom indexer (a sketch follows after the example below).
So, usually contained records are valuable for later display, not for searching:

.. code-block:: pycon

    >>> record['homeaddress'] = Record()
    >>> record['homeaddress'].attrs['zip'] = '6020'
    >>> record['homeaddress'].attrs['town'] = 'Innsbruck'
    >>> record['homeaddress'].attrs['country'] = 'Austria'
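
For completeness, here is a rough sketch of what a custom indexer for such a contained record could look like. This is an illustration only, assuming the indexer is called with the record and a default value, like souper's built-in ``NodeAttributeIndexer``; the class name and keys are hypothetical:

.. code-block:: pycon

    >>> import persistent

    >>> class SubrecordAttributeIndexer(persistent.Persistent):
    ...     """Hypothetical indexer reading an attribute of a contained record."""
    ...
    ...     def __init__(self, subrecord_key, attr):
    ...         self.subrecord_key = subrecord_key
    ...         self.attr = attr
    ...
    ...     def __call__(self, context, default):
    ...         # ``context`` is the record being indexed
    ...         if self.subrecord_key not in context:
    ...             return default
    ...         subrecord = context[self.subrecord_key]
    ...         if self.attr in subrecord.attrs:
    ...             return subrecord.attrs[self.attr]
    ...         return default

Such an indexer could then be registered in the catalog factory, e.g. ``catalog[u'town'] = CatalogFieldIndex(SubrecordAttributeIndexer('homeaddress', 'town'))``.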


Access data
-----------

Even without any query a record can be fetched by id:

.. code-block:: pycon

    >>> from souper.soup import get_soup
    >>> soup = get_soup('mysoup', context)
    >>> record = soup.get(record_id)

All records can be accessed utilizing the container BTree:

.. code-block:: pycon

    >>> soup.data.keys()[0] == record_id
    True
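
So iterating over all records or over all record ids is a plain BTree operation; a small sketch (the variable names are illustrative):

.. code-block:: pycon

    >>> all_ids = list(soup.data.keys())
    >>> all_records = [record for record in soup.data.values()]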


Query data
----------

`How to query a repoze catalog is well documented <http://docs.repoze.org/catalog/usage.html#searching>`_.
Sorting works the same way too; a sketch follows after the basic queries below.
Queries are passed to the soup's ``query`` method (which then uses the repoze catalog).
It returns a generator:

.. code-block:: pycon

    >>> from repoze.catalog.query import Eq
    >>> [r for r in soup.query(Eq('user', 'user1'))]
    [<Record object 'None' at ...>]

    >>> [r for r in soup.query(Eq('user', 'nonexists'))]
    []
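
Sort parameters are passed along to the underlying repoze catalog. The following sketch assumes the soup's ``query`` method forwards keyword arguments such as ``sort_index`` and ``reverse`` unchanged; check the souper API if in doubt:

.. code-block:: pycon

    >>> [r for r in soup.query(Eq('user', 'user1'), sort_index='user')]
    [<Record object 'None' at ...>]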

To also get the size of the result set, pass ``with_size=True`` to the query.
The first item returned by the generator is the size:

.. code-block:: pycon

    >>> [r for r in soup.query(Eq('user', 'user1'), with_size=True)]
    [1, <Record object 'None' at ...>]


To optimize handling of large result sets, one may fetch not the records themselves but a generator returning lightweight objects. The records are fetched on call:

.. code-block:: pycon

    >>> lazy = [l for l in soup.lazy(Eq('user', 'user1'))]
    >>> lazy
    [<souper.soup.LazyRecord object at ...>]

    >>> lazy[0]()
    <Record object 'None' at ...>

Here too, the size is returned as the first value of the generator if ``with_size=True`` is passed.
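
For a large result set one might thus iterate lazily and only resolve the records that are actually needed; a small sketch:

.. code-block:: pycon

    >>> for lazy_record in soup.lazy(Eq('user', 'user1')):
    ...     record = lazy_record()
    ...     # work with the record here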


Delete a record
---------------

To remove a record from the soup, Python's ``del`` is used like one would do on
any dict:

.. code-block:: pycon

    >>> del soup[record]


Reindex
-------

After a record's data changed, it needs a reindex:

.. code-block:: pycon

    >>> record.attrs['user'] = 'user1'
    >>> soup.reindex(records=[record])

Sometimes one may want to reindex all data. Then ``reindex`` has to be called without parameters.
It may take a while:

.. code-block:: pycon

    >>> soup.reindex()


Rebuild catalog
---------------

Usually after a change to the catalog factory - e.g. some index was added - a rebuild of the catalog is needed.
It replaces the current catalog with a new one created by the catalog factory and reindexes all data.
It may take a while:

.. code-block:: pycon

    >>> soup.rebuild()


Reset (or clear) the soup
-------------------------

To remove all data from the soup and to empty and rebuild the catalog, call ``clear``.

**Attention**: *All data is lost!*

.. code-block:: pycon

    >>> soup.clear()


Source Code
===========

The sources are in a Git DVCS with its main branches at `GitHub <http://github.com/bluedynamics/souper>`_.

We'd be happy to see many forks and pull-requests to make souper even better.


Contributors
============

- Robert Niederreiter <rnix [at] squarewave [dot] at>

- Jens W. Klein <jk [at] kleinundpartner [dot] at>


Changelog
=========

1.1.2 (2022-12-05)
------------------

- Release wheel.
  [rnix]


1.1.1 (2019-09-16)
------------------

- Cleanup NodeTextIndexer (one loop is enough).
  [jensens]


1.1.0 (2019-03-08)
------------------

- Code style (black, isort, utf8headers).
  [jensens]

- Switched to tox for testing, buildout gone.
  [jensens]

- Python 2/3 compatibility
  [agitator]


1.0.2 (2015-02-25)
------------------

- fix: unicode with special chars in text indexer failed.
  [jensens, 2014-02-25]

1.0.1
-----

- PEP-8.
  [rnix, 2012-10-16]

- Python 2.7 Support.
  [rnix, 2012-10-16]

- Fix documentation.

1.0
---

- make it work
  [rnix, jensens, et al]



            
