hg


| Field | Value |
| --- | --- |
| Name | hg |
| Version | 0.0.6 |
| home_page | None |
| Summary | Homogenous Groups (Duplication detection, Frequent Itemsets, etc.) |
| upload_time | 2025-01-11 10:06:55 |
| maintainer | None |
| docs_url | None |
| author | Thor Whalen |
| requires_python | None |
| license | mit |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# hg

Homogenous Groups, Duplication detection, Frequent Itemsets etc.

To install: `pip install hg`

# Examples

```python
>>> from hg import deduplicate_string_lines
>>> example_text = '''Lorem ipsum
... dolor sit amet
... dolor sit amet
... dolor sit amet
... Consectetur adipiscing
... Lorem ipsum
... dolor sit amet
... dolor sit amet
... Consectetur adipiscing
... Something else
... '''
>>> final_text, removed = deduplicate_string_lines(example_text, min_block_size=3)
>>> print("=== Deduplicated Lines ===")
=== Deduplicated Lines ===
>>> print(final_text)
Lorem ipsum
dolor sit amet
dolor sit amet
dolor sit amet
Consectetur adipiscing
Consectetur adipiscing
Something else
>>> print("\n=== Removed Blocks ===")
<BLANKLINE>
=== Removed Blocks ===
>>> for block in removed:
...     print(f"Removed block starting at line {block['removed_start']} with length {block['length']}:")
...     for item in block['block_items']:
...         print(f"  {item}")
...     print()
Removed block starting at line 5 with length 3:
  Lorem ipsum
  dolor sit amet
  dolor sit amet
<BLANKLINE>
```
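To make the behaviour in the example above easier to reason about, here is a minimal, standard-library-only sketch of one way repeated blocks of lines could be detected and removed. It is not `hg`'s implementation: the function name `remove_repeated_blocks` and the matching strategy (scan left to right, drop a run of at least `min_block_size` lines if the same run already appeared earlier in the text) are assumptions made for illustration. Only the keys of the returned dicts (`removed_start`, `length`, `block_items`) are taken from the example.

```python
def remove_repeated_blocks(lines, min_block_size=3):
    """Hypothetical sketch (not hg's code): drop blocks of consecutive lines
    that repeat an earlier block of at least `min_block_size` lines."""
    k = min_block_size
    if k < 1:
        raise ValueError("min_block_size must be at least 1")

    seen = {}            # window of k lines -> index where it first started
    next_to_index = 0    # next start position whose window is not indexed yet
    kept, removed = [], []
    i = 0
    while i < len(lines):
        # Index every k-line window that ends at or before position i,
        # so a match never overlaps the block being tested.
        while next_to_index + k <= i:
            window = tuple(lines[next_to_index:next_to_index + k])
            seen.setdefault(window, next_to_index)
            next_to_index += 1

        candidate = tuple(lines[i:i + k])
        j = seen.get(candidate) if len(candidate) == k else None
        if j is None:
            kept.append(lines[i])
            i += 1
            continue

        # Extend the match greedily while the earlier block keeps agreeing.
        length = k
        while (i + length < len(lines)
               and j + length < i
               and lines[j + length] == lines[i + length]):
            length += 1

        removed.append({
            "removed_start": i,   # 0-based index in the original list of lines
            "length": length,
            "block_items": lines[i:i + length],
        })
        i += length
    return kept, removed


example_lines = [
    "Lorem ipsum", "dolor sit amet", "dolor sit amet", "dolor sit amet",
    "Consectetur adipiscing", "Lorem ipsum", "dolor sit amet",
    "dolor sit amet", "Consectetur adipiscing", "Something else",
]
kept, removed = remove_repeated_blocks(example_lines, min_block_size=3)
print(removed[0]["removed_start"], removed[0]["length"])  # prints: 5 3
```

On the example input this sketch removes the same three-line block, starting at index 5, as the doctest reports. Whether `hg` resolves ties, overlaps, or longer matches the same way is not stated in the README.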

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "hg",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Thor Whalen",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/21/64/53fd71987ac4c70a9f96ee9e9faa0957ec8dfd4b0330b5232ab06d31273b/hg-0.0.6.tar.gz",
    "platform": null,
    "description": "# hg\n\nHomogenous Groups, Duplication detection, Frequent Itemsets etc.\n\nTo install:\t```pip install hg```\n\n# Examples\n\n```python\n>>> from hg import deduplicate_string_lines\n>>> example_text = '''Lorem ipsum\n... dolor sit amet\n... dolor sit amet\n... dolor sit amet\n... Consectetur adipiscing\n... Lorem ipsum\n... dolor sit amet\n... dolor sit amet\n... Consectetur adipiscing\n... Something else\n... '''\n>>> final_text, removed = deduplicate_string_lines(example_text, min_block_size=3)\n>>> print(\"=== Deduplicated Lines ===\")\n=== Deduplicated Lines ===\n>>> print(final_text)\nLorem ipsum\ndolor sit amet\ndolor sit amet\ndolor sit amet\nConsectetur adipiscing\nConsectetur adipiscing\nSomething else\n>>> print(\"\\n=== Removed Blocks ===\")\n<BLANKLINE>\n=== Removed Blocks ===\n>>> for block in removed:\n...     print(f\"Removed block starting at line {block['removed_start']} with length {block['length']}:\")\n...     for item in block['block_items']:\n...         print(f\"  {item}\")\n...     print()\nRemoved block starting at line 5 with length 3:\nLorem ipsum\ndolor sit amet\ndolor sit amet\n<BLANKLINE>\n```\n",
    "bugtrack_url": null,
    "license": "mit",
    "summary": "Homogenous Groups (Duplication detection, Frequent Itemsets, etc.)",
    "version": "0.0.6",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c46d7f7370bc37c31eb0bc15394ee03c5c49fbf2ffc1ae889cc7449d0e070500",
                "md5": "2e33e53e36db2d3aba5b60bfb39d6bb0",
                "sha256": "e3a69416f14bf9377888b872a8a5d2befafdeab96f6648c723167414f5202c7b"
            },
            "downloads": -1,
            "filename": "hg-0.0.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2e33e53e36db2d3aba5b60bfb39d6bb0",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 9890,
            "upload_time": "2025-01-11T10:06:53",
            "upload_time_iso_8601": "2025-01-11T10:06:53.279966Z",
            "url": "https://files.pythonhosted.org/packages/c4/6d/7f7370bc37c31eb0bc15394ee03c5c49fbf2ffc1ae889cc7449d0e070500/hg-0.0.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "216453fd71987ac4c70a9f96ee9e9faa0957ec8dfd4b0330b5232ab06d31273b",
                "md5": "3f9ce9b16d1b66f139c5ae6c0168450d",
                "sha256": "679f35723dec9404f1cfb8641d40929a014a477fa1b46d8da3055a6e71da0489"
            },
            "downloads": -1,
            "filename": "hg-0.0.6.tar.gz",
            "has_sig": false,
            "md5_digest": "3f9ce9b16d1b66f139c5ae6c0168450d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 9163,
            "upload_time": "2025-01-11T10:06:55",
            "upload_time_iso_8601": "2025-01-11T10:06:55.469561Z",
            "url": "https://files.pythonhosted.org/packages/21/64/53fd71987ac4c70a9f96ee9e9faa0957ec8dfd4b0330b5232ab06d31273b/hg-0.0.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-11 10:06:55",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "hg"
}
        