.. image:: https://travis-ci.org/bluedynamics/souper.svg?branch=master
:target: https://travis-ci.org/bluedynamics/souper
ZODB Storage for lots of (light weight) data.
Utilizes:
- `ZODB <http://www.zodb.org/>`_ and its `BTrees <http://www.zodb.org/documentation/guide/modules.html#btrees-package>`_,
- `node <http://pypi.python.org/pypi/node>`_ (and `node.ext.zodb <http://pypi.python.org/pypi/node.ext.zodb>`_),
- `repoze.catalog <http://pypi.python.org/pypi/repoze.catalog>`_.
.. image:: https://raw.githubusercontent.com/bluedynamics/souper/master/docs/Souper-64.png
Souper is a tool for programmers. It offers an integrated storage tied together with indexes in a catalog.
The records in the storage are generic.
Any data can be stored on a record as long as it can be pickled and persisted in ZODB.
Souper can be used in any Python application, either standalone using the pure ZODB or with `Pyramid <http://docs.pylonsproject.org/en/latest/docs/pyramid.html>`_, `Zope <https://www.zope.org/>`_ or `Plone <http://plone.org>`_.
Using Souper
============
Providing a Locator
-------------------
Soups are looked up by adapting ``souper.interfaces.IStorageLocator`` to some context.
Souper does not provide a default locator, so one has to be provided first.
Let's assume ``context`` is some persistent dict-like instance:
.. code-block:: pycon
>>> from zope.interface import implementer
>>> from zope.interface import Interface
>>> from zope.component import provideAdapter
>>> from souper.interfaces import IStorageLocator
>>> from souper.soup import SoupData
>>> @implementer(IStorageLocator)
... class StorageLocator(object):
...
...     def __init__(self, context):
...         self.context = context
...
...     def storage(self, soup_name):
...         if soup_name not in self.context:
...             self.context[soup_name] = SoupData()
...         return self.context[soup_name]
>>> provideAdapter(StorageLocator, adapts=[Interface])
So we have a locator creating soups by name on the fly. Now it's easy to get a soup by name:
.. code-block:: pycon
>>> from souper.soup import get_soup
>>> soup = get_soup('mysoup', context)
>>> soup
<souper.soup.Soup object at 0x...>
Providing a Catalog Factory
---------------------------
Depending on your needs, the catalog and its indexes may look different from use case to use case.
The catalog factory is responsible for creating the catalog for a soup. The factory is a named utility implementing ``souper.interfaces.ICatalogFactory``.
The name of the utility has to be the same as the name of the soup.
Here ``repoze.catalog`` is used, and the ``NodeAttributeIndexer`` lets the indexes access the data on the records by key.
For special cases one may write custom indexers, but the default one is fine most of the time:
.. code-block:: pycon
>>> from souper.interfaces import ICatalogFactory
>>> from souper.soup import NodeAttributeIndexer
>>> from souper.soup import NodeTextIndexer
>>> from zope.component import provideUtility
>>> from repoze.catalog.catalog import Catalog
>>> from repoze.catalog.indexes.field import CatalogFieldIndex
>>> from repoze.catalog.indexes.text import CatalogTextIndex
>>> from repoze.catalog.indexes.keyword import CatalogKeywordIndex
>>> @implementer(ICatalogFactory)
... class MySoupCatalogFactory(object):
...
...     def __call__(self, context=None):
...         catalog = Catalog()
...         userindexer = NodeAttributeIndexer('user')
...         catalog[u'user'] = CatalogFieldIndex(userindexer)
...         textindexer = NodeTextIndexer(['text', 'user'])
...         catalog[u'text'] = CatalogTextIndex(textindexer)
...         keywordindexer = NodeAttributeIndexer('keywords')
...         catalog[u'keywords'] = CatalogKeywordIndex(keywordindexer)
...         return catalog
>>> provideUtility(MySoupCatalogFactory(), name="mysoup")
The catalog factory is only used internally by the soup, but one may want to check that it works:
.. code-block:: pycon
>>> from zope.component import getUtility
>>> catalogfactory = getUtility(ICatalogFactory, name='mysoup')
>>> catalogfactory
<MySoupCatalogFactory object at 0x...>
>>> catalog = catalogfactory()
>>> sorted(catalog.items())
[(u'keywords', <repoze.catalog.indexes.keyword.CatalogKeywordIndex object at 0x...>),
(u'text', <repoze.catalog.indexes.text.CatalogTextIndex object at 0x...>),
(u'user', <repoze.catalog.indexes.field.CatalogFieldIndex object at 0x...>)]
Adding records
--------------
As mentioned above, ``souper.soup.Record`` is the one and only kind of data added to the soup.
A record has attributes containing the data:
.. code-block:: pycon
>>> from souper.soup import get_soup
>>> from souper.soup import Record
>>> soup = get_soup('mysoup', context)
>>> record = Record()
>>> record.attrs['user'] = 'user1'
>>> record.attrs['text'] = u'foo bar baz'
>>> record.attrs['keywords'] = [u'1', u'2', u'ü']
>>> record_id = soup.add(record)
A record may contain other records, but indexing them requires a custom indexer.
So contained records are usually valuable for later display, not for searching:
.. code-block:: pycon
>>> record['homeaddress'] = Record()
>>> record['homeaddress'].attrs['zip'] = '6020'
>>> record['homeaddress'].attrs['town'] = 'Innsbruck'
>>> record['homeaddress'].attrs['country'] = 'Austria'
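
As a sketch of what such a custom indexer could look like: ``repoze.catalog`` accepts a callable taking ``(object, default)`` as index discriminator, so an indexer for contained records only needs to look one level deeper. ``SubrecordAttributeIndexer`` and ``FakeRecord`` are hypothetical names and not part of souper. ``FakeRecord`` is a plain stand-in for ``Record`` so the example runs without a ZODB, and it is assumed that real records support ``in`` and key access in the same way:

.. code-block:: python

    class SubrecordAttributeIndexer(object):
        """Hypothetical indexer reading one attribute of a contained record."""

        def __init__(self, child, attr):
            self.child = child
            self.attr = attr

        def __call__(self, record, default):
            # The catalog calls the indexer with (object, default); returning
            # ``default`` signals that there is no value for this record.
            if self.child not in record:
                return default
            attrs = record[self.child].attrs
            if self.attr not in attrs:
                return default
            return attrs[self.attr]


    class FakeRecord(dict):
        # Stand-in for souper.soup.Record: children by key, data in ``attrs``.
        def __init__(self):
            super(FakeRecord, self).__init__()
            self.attrs = {}


    record = FakeRecord()
    record['homeaddress'] = FakeRecord()
    record['homeaddress'].attrs['zip'] = '6020'

    indexer = SubrecordAttributeIndexer('homeaddress', 'zip')
    missing = object()
    found = indexer(record, missing)
    not_found = indexer(FakeRecord(), missing)

Such an indexer would then be passed to an index in the catalog factory in place of a ``NodeAttributeIndexer``.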
Access data
-----------
Even without any query a record can be fetched by id:
.. code-block:: pycon
>>> from souper.soup import get_soup
>>> soup = get_soup('mysoup', context)
>>> record = soup.get(record_id)
All records can be accessed using the container BTree:
.. code-block:: pycon
>>> soup.data.keys()[0] == record_id
True
Query data
----------
`How to query a repoze catalog is documented well. <http://docs.repoze.org/catalog/usage.html#searching>`_
Sorting works the same way.
Queries are passed to the soup's ``query`` method (which then uses the repoze catalog).
It returns a generator:
.. code-block:: pycon
>>> from repoze.catalog.query import Eq
>>> [r for r in soup.query(Eq('user', 'user1'))]
[<Record object 'None' at ...>]
>>> [r for r in soup.query(Eq('user', 'nonexists'))]
[]
To also get the size of the result set, pass ``with_size=True`` to the query.
The first item returned by the generator is the size:
.. code-block:: pycon
>>> [r for r in soup.query(Eq('user', 'user1'), with_size=True)]
[1, <Record object 'None' at ...>]
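
Because the size arrives as the first item of the generator, callers usually want to separate it from the records. A minimal sketch, assuming only the behaviour described above: ``split_size`` is a hypothetical helper that is not part of souper, and ``fake_query`` stands in for ``soup.query(..., with_size=True)`` so the example runs on its own:

.. code-block:: python

    def split_size(result):
        """Split a ``with_size=True`` result into ``(size, records)``."""
        iterator = iter(result)
        # The first yielded item is the size of the result set.
        size = next(iterator)
        # The remaining items are the records, still produced on demand.
        return size, iterator


    def fake_query():
        # Stand-in for soup.query(Eq('user', 'user1'), with_size=True).
        yield 2
        yield 'record-a'
        yield 'record-b'


    size, records = split_size(fake_query())
    remaining = list(records)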
To optimize handling of large result sets, one may choose not to fetch the records themselves but a generator returning lightweight objects. Records are fetched on call:
.. code-block:: pycon
>>> lazy = [l for l in soup.lazy(Eq('user', 'user1'))]
>>> lazy
[<souper.soup.LazyRecord object at ...>]
>>> lazy[0]()
<Record object 'None' at ...>
Here, too, the size is passed as the first value of the generator if ``with_size=True`` is passed.
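
One place where lazy records pay off is batching: only the entries inside the requested window are called, so only those records are loaded. The following is a sketch under assumptions, not souper API: ``fetch_batch`` and ``make_lazy`` are hypothetical names, and ``make_lazy`` produces plain callables standing in for ``LazyRecord`` objects:

.. code-block:: python

    from itertools import islice


    def fetch_batch(lazy_records, start, size):
        """Resolve only the lazy records in ``[start, start + size)``."""
        window = islice(lazy_records, start, start + size)
        # Calling a lazy record is what actually loads the record.
        return [lazy() for lazy in window]


    calls = []


    def make_lazy(name):
        # Stand-in for a lazy record: a callable that loads on demand.
        def load():
            calls.append(name)
            return name
        return load


    lazy_records = (make_lazy(n) for n in ['a', 'b', 'c', 'd', 'e'])
    batch = fetch_batch(lazy_records, 1, 2)

After this runs, ``calls`` lists only the entries inside the window, so the skipped records were never loaded.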
Delete a record
---------------
To remove a record from the soup, Python's ``del`` is used like one would do on
any dict:
.. code-block:: pycon
>>> del soup[record]
Reindex
-------
After a record's data has changed, the record needs a reindex:
.. code-block:: pycon
>>> record.attrs['user'] = 'user1'
>>> soup.reindex(records=[record])
Sometimes one may want to reindex all data. Then ``reindex`` has to be called without parameters.
It may take a while:
.. code-block:: pycon
>>> soup.reindex()
Rebuild catalog
---------------
Usually, after the catalog factory has been changed (e.g. an index was added), a rebuild of the catalog is needed.
It replaces the current catalog with a new one created by the catalog factory and reindexes all data.
It may take a while:
.. code-block:: pycon
>>> soup.rebuild()
Reset (or clear) the soup
-------------------------
To remove all data from the soup, and to empty and rebuild the catalog, call ``clear``.
**Attention**: *All data is lost!*
.. code-block:: pycon
>>> soup.clear()
Source Code
===========
The sources are in a Git repository with its main branches at `GitHub <http://github.com/bluedynamics/souper>`_.
We'd be happy to see many forks and pull-requests to make souper even better.
Contributors
============
- Robert Niederreiter <rnix [at] squarewave [dot] at>
- Jens W. Klein <jk [at] kleinundpartner [dot] at>
Changelog
=========
1.1.2 (2022-12-05)
------------------
- Release wheel.
[rnix]
1.1.1 (2019-09-16)
------------------
- Cleanup NodeTextIndexer (one loop is enough).
[jensens]
1.1.0 (2019-03-08)
------------------
- Code style (black, isort, utf8headers).
[jensens]
- Switched to tox for testing, buildout gone.
[jensens]
- Python 2/3 compatibility
[agitator]
1.0.2 (2015-02-25)
------------------
- fix: unicode with special chars in text indexer failed.
[jensens, 2014-02-25]
1.0.1
-----
- PEP-8.
[rnix, 2012-10-16]
- Python 2.7 Support.
[rnix, 2012-10-16]
- Fix documentation.
1.0
---
- make it work
[rnix, jensens, et al]
Raw data
{
"_id": null,
"home_page": "https://pypi.org/project/souper",
"name": "souper",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "zodb zope pyramid node plone",
"author": "BlueDynamics Alliance",
"author_email": "dev@bluedynamics.com",
"download_url": "https://files.pythonhosted.org/packages/fe/14/8d08137567531fd283569079d31f38d0e3d9880f3c3855ef71d7a5f4b152/souper-1.1.2.tar.gz",
"platform": null,
"description": "\n.. image:: https://travis-ci.org/bluedynamics/souper.svg?branch=master\n :target: https://travis-ci.org/bluedynamics/souper\n\nZODB Storage for lots of (light weight) data.\n\nUtilizes:\n\n- `ZODB <http://www.zodb.org/>`_ and its `BTrees <http://www.zodb.org/documentation/guide/modules.html#btrees-package>`_,\n- `node <http://pypi.python.org/pypi/node>`_ (and `node.ext.zodb <http://pypi.python.org/pypi/node.ext.zodb>`_).\n- `repoze.catalog <http://pypi.python.org/pypi/repoze.catalog>`_,\n\n.. image:: https://raw.githubusercontent.com/bluedynamics/souper/master/docs/Souper-64.png\n\nSouper is a tool for programmers. It offers an integrated storage tied together with indexes in a catalog.\nThe records in the storage are generic.\nIt is possible to store any data on a record if it is persistent pickable in ZODB.\n\nSouper can be used used in any Python application, either standalone using the pure ZODB or with `Pyramid <http://docs.pylonsproject.org/en/latest/docs/pyramid.html>`_, `Zope <https://www.zope.org/>`_ or `Plone <http://plone.org>`_.\n\n\nUsing Souper\n============\n\nProviding a Locator\n-------------------\n\nSoups are looked up by adapting ``souper.interfaces.IStorageLocator`` to some context.\nSouper does not provide any default locator.\nSo first one need to be provided. Let's assume context is some persistent dict-like instance\n\n.. code-block:: pycon\n\n >>> from zope.interface import implementer\n >>> from zope.interface import Interface\n >>> from zope.component import provideAdapter\n >>> from souper.interfaces import IStorageLocator\n >>> from souper.soup import SoupData\n >>> @implementer(IStorageLocator)\n ... class StorageLocator(object):\n ...\n ... def __init__(self, context):\n ... self.context = context\n ...\n ... def storage(self, soup_name):\n ... if soup_name not in self.context:\n ... self.context[soup_name] = SoupData()\n ... 
return self.context[soup_name]\n\n >>> provideAdapter(StorageLocator, adapts=[Interface])\n\nSo we have locator creating soups by name on the fly. Now its easy to get a soup by name:\n\n.. code-block:: pycon\n\n >>> from souper.soup import get_soup\n >>> soup = get_soup('mysoup', context)\n >>> soup\n <souper.soup.Soup object at 0x...>\n\n\nProviding a Catalog Factory\n---------------------------\n\nDepending on your needs the catalog and its indexes may look different from use-case to use-case.\nThe catalog factory is responsible to create a catalog for a soup. The factory is a named utility implementing ``souper.interfaces.ICatalogFactory``.\nThe name of the utility has to the the same as the soup have.\n\nHere ``repoze.catalog`` is used and to let the indexes access the data on the records by key the ``NodeAttributeIndexer`` is used.\nFor special cases one may write its custom indexers, but the default one is fine most of the time:\n\n.. code-block:: pycon\n\n >>> from souper.interfaces import ICatalogFactory\n >>> from souper.soup import NodeAttributeIndexer\n >>> from souper.soup import NodeTextIndexer\n >>> from zope.component import provideUtility\n >>> from repoze.catalog.catalog import Catalog\n >>> from repoze.catalog.indexes.field import CatalogFieldIndex\n >>> from repoze.catalog.indexes.text import CatalogTextIndex\n >>> from repoze.catalog.indexes.keyword import CatalogKeywordIndex\n\n >>> @implementer(ICatalogFactory)\n ... class MySoupCatalogFactory(object):\n ...\n ... def __call__(self, context=None):\n ... catalog = Catalog()\n ... userindexer = NodeAttributeIndexer('user')\n ... catalog[u'user'] = CatalogFieldIndex(userindexer)\n ... textindexer = NodeTextIndexer(['text', 'user')\n ... catalog[u'text'] = CatalogTextIndex(textindexer)\n ... keywordindexer = NodeAttributeIndexer('keywords')\n ... catalog[u'keywords'] = CatalogKeywordIndex(keywordindexer)\n ... 
return catalog\n\n >>> provideUtility(MySoupCatalogFactory(), name=\"mysoup\")\n\nThe catalog factory is used soup-internal only but one may want to check if it works fine:\n\n.. code-block:: pycon\n\n >>> catalogfactory = getUtility(ICatalogFactory, name='mysoup')\n >>> catalogfactory\n <MySoupCatalogFactory object at 0x...>\n\n >>> catalog = catalogfactory()\n >>> sorted(catalog.items())\n [(u'keywords', <repoze.catalog.indexes.keyword.CatalogKeywordIndex object at 0x...>),\n (u'text', <repoze.catalog.indexes.text.CatalogTextIndex object at 0x...>),\n (u'user', <repoze.catalog.indexes.field.CatalogFieldIndex object at 0x...>)]\n\n\nAdding records\n--------------\n\nAs mentioned above the ``souper.soup.Record`` is the one and only kind of data added to the soup.\nA record has attributes containing the data:\n\n.. code-block:: pycon\n\n >>> from souper.soup import get_soup\n >>> from souper.soup import Record\n >>> soup = get_soup('mysoup', context)\n >>> record = Record()\n >>> record.attrs['user'] = 'user1'\n >>> record.attrs['text'] = u'foo bar baz'\n >>> record.attrs['keywords'] = [u'1', u'2', u'\u00fc']\n >>> record_id = soup.add(record)\n\nA record may contains other records. But to index them one would need a custom indexer.\nSo, usually contained records are valuable for later display, not for searching:\n\n.. code-block:: pycon\n\n >>> record['subrecord'] = Record()\n >>> record['homeaddress'].attrs['zip'] = '6020'\n >>> record['homeaddress'].attrs['town'] = 'Innsbruck'\n >>> record['homeaddress'].attrs['country'] = 'Austria'\n\n\nAccess data\n-----------\n\nEven without any query a record can be fetched by id:\n\n.. code-block:: pycon\n\n >>> from souper.soup import get_soup\n >>> soup = get_soup('mysoup', context)\n >>> record = soup.get(record_id)\n\nAll records can be accessed using utilizing the container BTree:\n\n.. 
code-block:: pycon\n\n >>> soup.data.keys()[0] == record_id\n True\n\n\nQuery data\n----------\n\n`How to query a repoze catalog is documented well. <http://docs.repoze.org/catalog/usage.html#searching>`_\nSorting works the same too.\nQueries are passed to soups ``query`` method (which uses then repoze catalog).\nIt returns a generator:\n\n.. code-block:: pycon\n\n >>> from repoze.catalog.query import Eq\n >>> [r for r in soup.query(Eq('user', 'user1'))]\n [<Record object 'None' at ...>]\n\n >>> [r for r in soup.query(Eq('user', 'nonexists'))]\n []\n\nTo also get the size of the result set pass a ``with_size=True`` to the query.\nThe first item returned by the generator is the size:\n\n.. code-block:: pycon\n\n >>> [r for r in soup.query(Eq('user', 'user1'), with_size-True)]\n [1, <Record object 'None' at ...>]\n\n\nTo optimize handling of large result sets one may not to fetch the record but a generator returning light weight objects. Records are fetched on call:\n\n.. code-block:: pycon\n\n >>> lazy = [l for l in soup.lazy(Eq('name', 'name'))]\n >>> lazy\n [<souper.soup.LazyRecord object at ...>,\n\n >>> lazy[0]()\n <Record object 'None' at ...>\n\nHere the size is passed as first value of the geneartor too if ``with_size=True`` is passed.\n\n\nDelete a record\n---------------\n\nTo remove a record from the soup python ``del`` is used like one would do on\nany dict:\n\n.. code-block:: pycon\n\n >>> del soup[record]\n\n\nReindex\n-------\n\nAfter a records data changed it needs a reindex:\n\n.. code-block:: pycon\n\n >>> record.attrs['user'] = 'user1'\n >>> soup.reindex(records=[record])\n\nSometimes one may want to reindex all data. Then ``reindex`` has to be called without parameters.\nIt may take a while:\n\n.. code-block:: pycon\n\n >>> soup.reindex()\n\n\nRebuild catalog\n---------------\n\nUsally after a change of the catalog factory was made - i.e. 
some index was added - a rebuild of the catalog i needed.\nIt replaces the current catalog with a new one created by the catalog factory and reindexes all data.\nIt may take while:\n\n.. code-block:: pycon\n\n >>> soup.rebuild()\n\n\nReset (or clear) the soup\n-------------------------\n\nTo remove all data from the soup and empty and rebuild the catalog call ``clear``.\n\n**Attention**: *All data is lost!*\n\n.. code-block:: pycon\n\n >>> soup.clear()\n\n\nSource Code\n===========\n\nThe sources are in a GIT DVCS with its main branches at `github <http://github.com/bluedynamics/souper>`_.\n\nWe'd be happy to see many forks and pull-requests to make souper even better.\n\n\nContributors\n============\n\n- Robert Niederreiter <rnix [at] squarewave [dot] at>\n\n- Jens W. Klein <jk [at] kleinundpartner [dot] at>\n\n\nChangelog\n=========\n\n1.1.2 (2022-12-05)\n------------------\n\n- Release wheel.\n [rnix]\n\n\n1.1.1 (2019-09-16)\n------------------\n\n- Cleanup NodeTextIndexer (one loop is enough).\n [jensens]\n\n\n1.1.0 (2019-03-08)\n------------------\n\n- Code style (black, isort, utf8headers).\n [jensens]\n\n- Switched to tox for testing, builodut gone.\n [jensens]\n\n- Python 2/3 compatibility\n [agitator]\n\n\n1.0.2 (2015-02-25)\n------------------\n\n- fix: unicode with special chars in text indexer failed.\n [jensens, 2014-02-25]\n\n1.0.1\n-----\n\n- PEP-8.\n [rnix, 2012-10-16]\n\n- Python 2.7 Support.\n [rnix, 2012-10-16]\n\n- Fix documentation.\n\n1.0\n---\n\n- make it work\n [rnix, jensens, et al]\n\n\n",
"bugtrack_url": null,
"license": "BSD",
"summary": "Souper - Generic Indexed Storage based on ZODB",
"version": "1.1.2",
"split_keywords": [
"zodb",
"zope",
"pyramid",
"node",
"plone"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "751ae8b5de95f87fc65bba8ec7a6554e",
"sha256": "07f8bcfc858c5d764f0fde8f62636280916c08309cf872f380417e89d9d7396e"
},
"downloads": -1,
"filename": "souper-1.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "751ae8b5de95f87fc65bba8ec7a6554e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 11466,
"upload_time": "2022-12-05T11:56:37",
"upload_time_iso_8601": "2022-12-05T11:56:37.667297Z",
"url": "https://files.pythonhosted.org/packages/21/03/22dba11501592d08d43b83d3c81fb09bad776dacb9dca2ec439db41f1b71/souper-1.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "1dd6003b5728bb6841af7fa70d24b698",
"sha256": "38a0fcf8e1d1e830895483e7d3d91a03a4c465c3855051e805d518f53aa81c9d"
},
"downloads": -1,
"filename": "souper-1.1.2.tar.gz",
"has_sig": false,
"md5_digest": "1dd6003b5728bb6841af7fa70d24b698",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 25626,
"upload_time": "2022-12-05T11:56:41",
"upload_time_iso_8601": "2022-12-05T11:56:41.219477Z",
"url": "https://files.pythonhosted.org/packages/fe/14/8d08137567531fd283569079d31f38d0e3d9880f3c3855ef71d7a5f4b152/souper-1.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-12-05 11:56:41",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "souper"
}