camSort


Name: camSort
Version: 0.4.1
Summary: make sorted() fast for strings
Upload time: 2023-08-04 14:30:54
Author: cameronbae / cameron jay
Keywords: python, sorted, sort, camsort, reverse, list, strings
Requirements: No requirements were recorded.
            
# camSort 0.4.1!

Tested on Python 3.7.3! (At least I think.)

`camSort` is a Python library that makes `sorted()` look slow while still using `sorted()`. It's built on Cython and speeds up string sorting by precomputing a custom key for each string. Key calculation and object creation are normally expensive in Python, but here they are optimized with Cython >:)!

## Overview

The `camSort` library sorts a list of strings by creating a custom object for each string. This custom object, `StringWithKey`, holds the string and a numeric key calculated from it. The key is a long integer derived from the sum of the Unicode code points of the string's characters and the string's length.
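
The README does not say how the code-point sum and the length are combined, so the following is only a sketch under the assumption that they are simply added together. `code_point_key` is a hypothetical name for illustration and is not part of camSort. Under this assumption, two different strings made of the same characters share a key.

```python
def code_point_key(text: str) -> int:
    """Hypothetical key: sum of Unicode code points plus the string's length."""
    return sum(ord(ch) for ch in text) + len(text)


print(code_point_key("ab"))  # 97 + 98 + 2 = 197
print(code_point_key("ba"))  # same characters, so the same key: 197
```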

The actual sorting uses Python's built-in `sorted` function (based on Timsort), but it compares the precalculated keys rather than the strings themselves, which makes sorting noticeably faster.
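
The wrap, sort, unwrap idea can be sketched in plain Python. This is not camSort's Cython code: `KeyedString` and `sort_by_precomputed_key` are hypothetical names, and the key function is passed in by the caller rather than fixed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class KeyedString:
    """Hypothetical stand-in for StringWithKey: a string plus its precomputed key."""
    text: str
    key: int


def sort_by_precomputed_key(strings: List[str], key_func: Callable[[str], int]) -> List[str]:
    # Compute every key once, up front.
    wrapped = [KeyedString(s, key_func(s)) for s in strings]
    # sorted() then compares only the stored integers.
    ordered = sorted(wrapped, key=lambda item: item.key)
    # Unwrap back to plain strings.
    return [item.text for item in ordered]


print(sort_by_precomputed_key(["ccc", "a", "bb"], len))  # ['a', 'bb', 'ccc']
```

Python's `sorted` already calls its `key` function only once per element, so in pure Python this wrapper adds little on its own; any gain would have to come from computing and storing the keys more cheaply, which is presumably where Cython comes in.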

The primary benefit of `camSort` shows up when working with very large lists of strings; see Performance.

## Installation

```pip install camSort```, or build locally :)

## Usage

```python
from camSort import sortStrings

myList = ['your', 'list', 'of', 'strings', ':)!']

# By length
sortedStrings = sortStrings.byLength(myList)
# Alphabetically
sortedStrings = sortStrings.byAlphabet(myList)
# Alphabetically and by length
sortedStrings = sortStrings.byLengthAndAlphabet(myList)
# Reversing! (This is the exact same as Python's)
# Please read the demo.py file for an explanation.
sortedStrings = sortStrings.byLength(myList).reverse()
# Returns a list of strings that contain a given substring/key; this is a list comprehension.
# Speed increase will be looked at in future.
filteredStrings = sortedStrings.filterWithSubstring('hello')

```
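
For comparison, similar orderings can be produced with the standard library alone. This sketch makes no claim about camSort's internals: the variable names are illustrative, and the tie-breaking shown for the combined sort (length first, then alphabetical) is an assumption.

```python
my_list = ['your', 'list', 'of', 'strings', ':)!']

# Shortest first; equal lengths keep their original order (sorted() is stable).
by_length = sorted(my_list, key=len)

# Default string comparison, which orders by Unicode code point.
by_alphabet = sorted(my_list)

# Length first, then alphabetical among strings of equal length (assumed order).
by_length_and_alphabet = sorted(my_list, key=lambda s: (len(s), s))

# Longest first, without a separate reverse step.
by_length_desc = sorted(my_list, key=len, reverse=True)

# Substring filter as a list comprehension.
containing = [s for s in my_list if 'st' in s]

print(by_length)               # ['of', ':)!', 'your', 'list', 'strings']
print(by_alphabet)             # [':)!', 'list', 'of', 'strings', 'your']
print(by_length_and_alphabet)  # ['of', ':)!', 'list', 'your', 'strings']
print(by_length_desc)          # ['strings', 'your', 'list', ':)!', 'of']
print(containing)              # ['list', 'strings']
```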

## Performance

Try out the demo.py file!

Average output on my 2017 MacBook Pro
(1 million strings, each with a random length of 1 to 1000 characters):

```
Time taken by Python sorted():               52.74097275733948 seconds
    Using camSort!
Time taken sorting by length:                2.0430028438568115 seconds
By sorting alphabetically:                   3.7439329624176025 seconds
By sorting alphabetically and length:        2.3696372509002686 seconds

Not optimised (yet)

Python reverse:                              2.4335076808929443 seconds
Camsort reverse:                             2.4054980278015137 seconds
Using python list comprehension:             0.40741705894470215 seconds
Using filterWithSubstring:                   0.5404400825500488 seconds

Results pre 0.4.

Python reverse:                              5.526482105255127 seconds
Camsort reverse:                             5.083630084991455 seconds
Using Python's list comprehension:           0.4732799530029297 seconds
Using filterWithSubstring:                   0.38806891441345215 seconds
```
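
demo.py is not reproduced here. As a rough idea of how such a measurement could be set up with the standard library only, here is a minimal sketch; `random_strings` and `time_call` are hypothetical helpers, and the sizes are scaled down from the 1 million strings used above so that it finishes quickly.

```python
import random
import string
import time


def random_strings(count, max_len, seed=0):
    """Build `count` random ASCII-letter strings of length 1..max_len."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    return [
        ''.join(rng.choices(string.ascii_letters, k=rng.randint(1, max_len)))
        for _ in range(count)
    ]


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


data = random_strings(10_000, 100)
_, baseline = time_call(sorted, data)
_, by_len = time_call(lambda items: sorted(items, key=len), data)

print(f"Time taken by Python sorted():   {baseline:.4f} seconds")
print(f"Time taken by sorted(key=len):   {by_len:.4f} seconds")
```

A single timed call is noisy; repeating each measurement and taking the minimum or the mean gives steadier numbers.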


            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "camSort",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "python,sorted,sort,camSort,reverse,list,strings",
    "author": "['cameronbae / cameron jay']",
    "author_email": "<contact@camjay.io>",
    "download_url": "https://files.pythonhosted.org/packages/fe/bb/4458566aeded5717223f42c0aa81dec3806b9f8b7e44ac19aa534b181508/camSort-0.4.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "make sorted() fast for strings",
    "version": "0.4.1",
    "project_urls": null,
    "split_keywords": [
        "python",
        "sorted",
        "sort",
        "camsort",
        "reverse",
        "list",
        "strings"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b1d193b71eee2b08c5676d1071515168d13a439934df10e2ccb7c486fbf6c226",
                "md5": "df2e1f1ae47e86ef8d50cd5d64ddc53a",
                "sha256": "02b1ed74ee17924eb497b4996cafb5456bb021d9efa5ff8c00f27538da84738f"
            },
            "downloads": -1,
            "filename": "camSort-0.4.1-cp37-cp37m-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "df2e1f1ae47e86ef8d50cd5d64ddc53a",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": null,
            "size": 31064,
            "upload_time": "2023-08-04T14:30:52",
            "upload_time_iso_8601": "2023-08-04T14:30:52.889775Z",
            "url": "https://files.pythonhosted.org/packages/b1/d1/93b71eee2b08c5676d1071515168d13a439934df10e2ccb7c486fbf6c226/camSort-0.4.1-cp37-cp37m-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "febb4458566aeded5717223f42c0aa81dec3806b9f8b7e44ac19aa534b181508",
                "md5": "cc25253319632ffd6806710999003e54",
                "sha256": "695d357ad478ff4e4d6ee3502ef3cc279ece0427206f2b037abffbf55c1f0ff7"
            },
            "downloads": -1,
            "filename": "camSort-0.4.1.tar.gz",
            "has_sig": false,
            "md5_digest": "cc25253319632ffd6806710999003e54",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 50028,
            "upload_time": "2023-08-04T14:30:54",
            "upload_time_iso_8601": "2023-08-04T14:30:54.931471Z",
            "url": "https://files.pythonhosted.org/packages/fe/bb/4458566aeded5717223f42c0aa81dec3806b9f8b7e44ac19aa534b181508/camSort-0.4.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-08-04 14:30:54",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "camsort"
}
        