:Name: python-disk-collections
:Version: 0.0.5
:Home page: https://github.com/thegrymek/python-disk-collections
:Summary: Provides the classes FileList and FileDeque, which behave like builtins but keep their items on disk.
:Upload time: 2023-11-30 17:34:27
:Author: thegrymek
:License: MIT
:Keywords: pickle, cache, collections, list, deque, json, zlib, disk
:Requirements: No requirements were recorded.

=======================
Python Disk Collections
=======================

.. image:: https://img.shields.io/pypi/v/python-disk-collections.svg
  :target: https://pypi.python.org/pypi/python-disk-collections

.. image:: https://img.shields.io/pypi/l/python-disk-collections.svg
  :target: https://pypi.python.org/pypi/python-disk-collections

.. image:: https://img.shields.io/pypi/pyversions/python-disk-collections.svg
  :target: https://pypi.python.org/pypi/python-disk-collections


The module provides a list-like class that stores its items on disk.
By default, items are pickled and compressed before they are saved. Use it
like a regular list!

In addition, it provides a deque with disk storage and the
same behaviour as **collections.deque**.

The intent of the package is to provide generic iterables that can hold very large collections of items
that do not fit in memory, without resorting to an external cache or a local database.


.. code-block:: python

    >>> from diskcollections.iterables import FileList
    >>> flist = FileList()
    >>> flist.extend([1, 2, 3])
    >>> flist.append(4)
    >>> flist
    [1, 2, 3, 4]
    >>> flist[2]
    3
    >>> flist2 = flist[:]  # slicing makes a new FileList
    >>> my_list = list(flist)  # now it's a plain list


.. code-block:: python

    >>> from diskcollections.iterables import FileDeque
    >>> fdeque = FileDeque()
    >>> fdeque.extend([1, 2, 3])
    >>> fdeque.append(4)
    >>> fdeque
    FileDeque([1, 2, 3, 4])
    >>> fdeque.pop()
    4
    >>> fdeque.appendleft(0)
    >>> fdeque.popleft()
    0


More ways to serialize items are available.


.. code-block:: python

    >>> from diskcollections.iterables import List, FileList, FileDeque
    >>> from diskcollections.serializers import (
    ...     PickleSerializer,  # pickle items
    ...     PickleZLibSerializer,  # pickle + compress items
    ...     JsonSerializer,  # convert items to json
    ...     JsonZLibSerializer,  # convert items to json + compress
    ... )
    >>> from functools import partial
    >>> JsonFileList = partial(List, serializer_class=JsonSerializer)
    >>> flist = JsonFileList()
    >>> flist.append({'a': 1, 'b': 2, 'c': 3})
    >>> flist[0]
    {u'a': 1, u'b': 2, u'c': 3}


Installation
------------

To install the package, run:

.. code-block:: bash

    $ pip install python-disk-collections


How it works
------------

Here is a step-by-step explanation, using an example similar to the one above:

.. code-block:: python

    >>> from diskcollections.iterables import FileList
    >>> from diskcollections.serializers import JsonZLibSerializer
    >>>
    >>> flist = FileList(serializer_class=JsonZLibSerializer)

Creating a new instance creates a new temporary directory.
With `serializer_class=JsonZLibSerializer`, each item added to the list is serialized with `json.dumps` and then compressed.

.. code-block:: python

    >>> flist.append({'a': 1, 'b': 2, 'c': 3})

So when using this serializer, keep in mind that every object you put into the list
must be JSON-serializable.
The object `{'a': 1, 'b': 2, 'c': 3}` is serialized, compressed and saved inside the temporary directory.

.. code-block:: python

    >>> flist[0]
    {u'a': 1, u'b': 2, u'c': 3}

Getting an item reads a file; because `JsonZLibSerializer` is used, the content is then decompressed and
loaded from JSON.
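
The round trip described above can be sketched with the standard library alone.
This is a minimal illustration of "convert to JSON, then compress", not the
package's actual implementation: the function names `dumps` and `loads` mirror
the serializer interface, but the UTF-8 encoding step is an assumption.

.. code-block:: python

    import json
    import zlib


    def dumps(obj):
        # Serialize to JSON text, encode to bytes, then compress.
        return zlib.compress(json.dumps(obj).encode("utf-8"))


    def loads(data):
        # Reverse the steps: decompress, decode, then parse the JSON text.
        return json.loads(zlib.decompress(data).decode("utf-8"))


    item = {'a': 1, 'b': 2, 'c': 3}
    blob = dumps(item)      # compressed bytes, as they could be written to a file
    restored = loads(blob)  # equal to the original item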

This package provides the following serializers:

* PickleSerializer - pickle items
* PickleZLibSerializer - pickle + compress items
* JsonSerializer - convert items to JSON
* JsonZLibSerializer - convert items to JSON + compress

.. code-block:: python

    from diskcollections.serializers import (
        PickleSerializer,
        PickleZLibSerializer,
        JsonSerializer,
        JsonZLibSerializer,
    )

To implement your own serializer, create a class with the methods
**dumps** and **loads**, or import the interface and subclass it.


.. code-block:: python

    >>> from diskcollections.interfaces import ISerializer

    # The interface is defined as follows:
    class ISerializer:

        @staticmethod
        def dumps(obj):
            """Converts an object to a string.

            :param obj: any python object
            :return: dumped string
            """
            raise NotImplementedError

        @staticmethod
        def loads(obj):
            """Restores a dumped string into a python object.

            :param obj: Object stored as string
            :return: python object restored from dump
            """
            raise NotImplementedError

All serializers from the example above implement the **ISerializer** interface.
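
As a rough sketch of a custom serializer, the class below follows the
**dumps**/**loads** shape of the interface. `LiteralSerializer` is a
hypothetical name, not part of the package, and whether the library expects
text or bytes from **dumps** is an assumption that should be checked against
the source. The class is kept free of package imports so it runs on its own.

.. code-block:: python

    import ast


    class LiteralSerializer:
        """Hypothetical serializer storing items as Python literals."""

        @staticmethod
        def dumps(obj):
            # repr() of plain literals (dicts, lists, tuples, numbers,
            # strings) yields text that literal_eval can parse back.
            return repr(obj)

        @staticmethod
        def loads(data):
            # literal_eval only accepts literals, so it never runs code.
            return ast.literal_eval(data)


    original = {"a": 1, "b": [2, 3], "c": (4, 5)}
    dumped = LiteralSerializer.dumps(original)
    restored = LiteralSerializer.loads(dumped)

    # Presumably usable as: FileList(serializer_class=LiteralSerializer)

Unlike JSON, this format keeps tuples as tuples, but it only works for objects
whose `repr` is a valid Python literal.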

Under the hood, **FileList** stores its items using *tempfile.mktemp* (in Python 2)
or *tempfile.TemporaryDirectory* (in Python 3). This means that every list
has its own unique directory, most likely placed in */tmp/*.
When the list is removed by the garbage collector, all items that were stored are lost.
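
The lifetime of such a directory can be illustrated with the standard library
alone (Python 3). This sketch only shows how `tempfile.TemporaryDirectory`
behaves. The file name `0` and its content are made up for the illustration,
and the package's own bookkeeping is not reproduced here.

.. code-block:: python

    import os
    import tempfile

    # Each TemporaryDirectory gets its own unique path, typically under /tmp.
    tmp = tempfile.TemporaryDirectory()
    path = os.path.join(tmp.name, "0")

    with open(path, "w") as f:
        f.write("item")

    existed_before = os.path.exists(path)    # the file is there while tmp lives

    # cleanup() removes the directory and everything stored in it. The same
    # happens when the object is finalized, e.g. by the garbage collector.
    tmp.cleanup()
    exists_after = os.path.exists(tmp.name)  # the directory is gone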

**FileDeque** stores items in the same way as **FileList**.

By default, when the program exits or when a list or deque is removed, the contents of its files are dropped as well.

To prevent this, use `PersistentDirectoryClient`:

.. code-block:: python

    >>> from functools import partial

    >>> from diskcollections.iterables import List, PersistentDirectoryClient
    >>> from diskcollections.serializers import JsonSerializer

    >>> dir_abc = partial(PersistentDirectoryClient, "abc")
    >>> persistent_list = List(client_class=dir_abc, serializer_class=JsonSerializer)
    >>> persistent_list.append({"a": 1, "b": 2})
    >>> assert len(persistent_list) == 1
    >>> assert open("abc/0").read() == '{"a": 1, "b": 2}'

On exit, the directory `abc` and its file `0`, with its contents, will still exist.
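
Because the files survive, they can be read back later without the package.
The sketch below assumes that every item lives in a file named after its index
and holds plain JSON, as the `abc/0` assertion above suggests. The helper
`read_persisted_items` is hypothetical, and the demo writes its own files to a
scratch directory instead of relying on an existing `abc`.

.. code-block:: python

    import json
    import os
    import tempfile


    def read_persisted_items(directory):
        # Assumption: files are named "0", "1", ... in insertion order.
        names = sorted(
            (name for name in os.listdir(directory) if name.isdigit()),
            key=int,
        )
        items = []
        for name in names:
            with open(os.path.join(directory, name)) as f:
                items.append(json.load(f))
        return items


    # Build a stand-in for a persisted directory, then read it back.
    demo_dir = tempfile.mkdtemp()
    for index, item in enumerate([{"a": 1, "b": 2}, {"c": 3}]):
        with open(os.path.join(demo_dir, str(index)), "w") as f:
            json.dump(item, f)

    items = read_persisted_items(demo_dir)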


Contribute
----------

#. Fork the repository on GitHub to start making your changes to the **master** branch (or branch off of it).
#. Write tests that prove the bug fix or feature works as expected.
#. Install **pyenv** together with **tox**:

.. code-block:: bash

  $ sudo apt-get install pyenv tox

#. Install other Python versions:

.. code-block:: bash

  $ pyenv install 2.7 3.5 3.6 3.7 3.8 3.9 3.10 3.11


#. Make them global for the **detox** package:

.. code-block:: bash

  $ pyenv global 2.7 3.5 3.6 3.7 3.8 3.9 3.10 3.11

#. Install **detox** globally:

.. code-block:: bash

  $ sudo pip install detox

#. Check your code and tests with **detox**:

.. code-block:: bash

  $ detox -n 1
  GLOB sdist-make: python-disk-collections/setup.py
  lint inst-nodeps: python-disk-collections/.tox/.tmp/package/7/python-disk-collections-0.0.4.zip
  lint run-test-pre: PYTHONHASHSEED='1334400931'
  lint runtests: commands[0] | flake8
  lint runtests: commands[1] | python setup.py check -r -s -m
  py27 inst-nodeps: python-disk-collections/.tox/.tmp/package/7/python-disk-collections-0.0.4.zip
  py27 run-test-pre: PYTHONHASHSEED='1334400931'
  py27 runtests: commands[0] | py.test -v --cov diskcollections --cov-config .coveragerc --cov-report term-missing --cov-fail-under 95
  ...
  py311 inst-nodeps: python-disk-collections/.tox/.tmp/package/7/python-disk-collections-0.0.4.zip
  py311 run-test-pre: PYTHONHASHSEED='1334400931'
  py311 runtests: commands[0] | py.test -v --cov diskcollections --cov-config .coveragerc --cov-report term-missing --cov-fail-under 95
  _________________________________________________________________________________________________________________ summary __________________________________________________________________________________________________________________
    lint: commands succeeded
    py27: commands succeeded
    py35: commands succeeded
    py36: commands succeeded
    py37: commands succeeded
    py38: commands succeeded
    py39: commands succeeded
    py310: commands succeeded
    py311: commands succeeded
    congratulations :)

#. Send a pull request!


License
-------

Python-Disk-Collections is released under the MIT license; see LICENSE for more details.

            
