ngram-trie


Name: ngram-trie
Version: 1.2.6
Summary: A Rust-based n-gram trie library for Python
Upload time: 2024-12-08 04:35:56
Author: Botond Lovász
Requires Python: >=3.6
License: MIT
Requirements: maturin, pyo3, pyo3-log
# ngram-trie

`ngram-trie` is a Rust library with Python bindings that stores n-gram data efficiently in a trie. It supports fitting, saving, loading, and querying n-gram models, with smoothing techniques such as modified Kneser-Ney and stupid backoff.
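To make the trie-based approach concrete, here is a minimal pure-Python sketch of counting n-grams in a trie. It is an illustration only, not the library's Rust implementation; `TrieNode`, `insert_ngrams`, and `ngram_count` are hypothetical names, and tokens are assumed to be integers.

```python
from typing import Dict, Sequence


class TrieNode:
    """One trie node: an occurrence count plus children keyed by token."""

    def __init__(self) -> None:
        self.count = 0
        self.children: Dict[int, "TrieNode"] = {}


def insert_ngrams(root: TrieNode, tokens: Sequence[int], max_length: int) -> None:
    """Insert every n-gram of length 1..max_length; shared prefixes share nodes."""
    for start in range(len(tokens)):
        node = root
        for token in tokens[start:start + max_length]:
            node = node.children.setdefault(token, TrieNode())
            node.count += 1


def ngram_count(root: TrieNode, ngram: Sequence[int]) -> int:
    """Return how often `ngram` was inserted, or 0 if it was never seen."""
    node = root
    for token in ngram:
        node = node.children.get(token)
        if node is None:
            return 0
    return node.count


root = TrieNode()
insert_ngrams(root, [1, 2, 3, 1, 2, 4], max_length=3)
print(ngram_count(root, [1, 2]))     # 2
print(ngram_count(root, [1, 2, 3]))  # 1
print(ngram_count(root, [9]))        # 0
```

Because every n-gram is a path from the root, the count of an n-gram and the counts of all its prefixes are found in a single walk.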

## Installation Rust

1. Add it to your `Cargo.toml`:

    ```toml
    [dependencies]
    ngram-trie = { git = "https://github.com/behappiness/ngram-trie" }
    ```

## Installation Python

1. Install from pip:

    ```bash
    pip install ngram-trie
    ```


## Example Usage
```python
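from ngram_trie import PySmoothedTrie

# `tokenized_data` and `tokenized_context` are token sequences that you supply.
trie = PySmoothedTrie(n_gram_max_length=7, root_capacity=None)

# Build the trie from the training tokens.
trie.fit(tokenized_data, n_gram_max_length=7, root_capacity=None, max_tokens=None)

# Select the rule set to use.
trie.set_rule_set(["++++++", "+++++", "++++", "+++", "++", "+"])

# Fit the smoothing, then query the prediction probabilities for a context.
trie.fit_smoothing()

trie.get_prediction_probabilities(tokenized_context)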
```

#### Specify the smoothing

```python
# Pass either "modified_kneser_ney" or "stupid_backoff".
trie.fit_smoothing("modified_kneser_ney")
```
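For intuition about what a backoff scheme does, the following is a minimal sketch of the generic stupid backoff score. It is not the library's implementation: `count_ngrams` and `stupid_backoff` are hypothetical helpers, tokens are assumed to be integers, and the backoff factor 0.4 is the value commonly used in the literature.

```python
from collections import Counter
from typing import Sequence


def count_ngrams(tokens: Sequence[int], max_length: int) -> Counter:
    """Count every n-gram of length 1..max_length as a tuple key."""
    counts: Counter = Counter()
    for start in range(len(tokens)):
        for n in range(1, max_length + 1):
            if start + n <= len(tokens):
                counts[tuple(tokens[start:start + n])] += 1
    return counts


def stupid_backoff(counts: Counter, total_tokens: int,
                   context: Sequence[int], token: int,
                   alpha: float = 0.4) -> float:
    """Score `token` after `context`. Scores are not normalized probabilities."""
    context = tuple(context)
    penalty = 1.0
    while context:
        seen = counts[context + (token,)]
        if seen > 0:
            # The n-gram was observed: use its relative frequency.
            return penalty * seen / counts[context]
        # Otherwise drop the oldest context token and apply the penalty.
        context = context[1:]
        penalty *= alpha
    # No context left: fall back to the unigram frequency.
    return penalty * counts[(token,)] / total_tokens


tokens = [1, 2, 3, 1, 2, 4]
counts = count_ngrams(tokens, max_length=3)
print(stupid_backoff(counts, len(tokens), [1, 2], 3))  # seen trigram
print(stupid_backoff(counts, len(tokens), [3, 2], 4))  # backs off to the bigram (2, 4)
print(stupid_backoff(counts, len(tokens), [4], 1))     # backs off to the unigram
```

Modified Kneser-Ney instead discounts observed counts and redistributes the mass to lower orders, which yields proper probabilities at the cost of a more involved fitting step.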

#### Unsmoothed

```python
from ngram_trie import PySmoothedTrie

trie = PySmoothedTrie(n_gram_max_length=7, root_capacity=None)

trie.fit(tokenized_data, n_gram_max_length=7, root_capacity=None, max_tokens=None)

# `rules` is a list of rule strings, e.g. ["++++++", "+++++", "++++", "+++", "++", "+"].
trie.set_rule_set(rules)

trie.get_unsmoothed_probabilities(tokenized_context)
```
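As a point of reference, an unsmoothed estimate is conventionally the relative frequency of each token that follows a given context. The sketch below shows that generic calculation only; `unsmoothed_probabilities` is a hypothetical helper, it ignores rule sets, and the output of `get_unsmoothed_probabilities` in the library may be structured differently.

```python
from collections import Counter
from typing import Dict, Sequence


def unsmoothed_probabilities(tokens: Sequence[int],
                             context: Sequence[int]) -> Dict[int, float]:
    """Relative frequency of each token that directly follows `context`."""
    context = tuple(context)
    n = len(context)
    followers = Counter(
        tokens[i + n]
        for i in range(len(tokens) - n)
        if tuple(tokens[i:i + n]) == context
    )
    total = sum(followers.values())
    # An unseen context yields an empty dict: without smoothing there is no estimate.
    return {token: count / total for token, count in followers.items()}


tokens = [1, 2, 3, 1, 2, 4]
print(unsmoothed_probabilities(tokens, [1, 2]))  # {3: 0.5, 4: 0.5}
print(unsmoothed_probabilities(tokens, [9]))     # {}
```

The empty result for an unseen context is the gap that smoothing is meant to fill.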

## Dev
```bash
cargo add pyo3 --features extension-module
```

#### Build wheel
```bash
maturin build
```




            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "ngram-trie",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "Botond Lov\u00e1sz <botilovasz@gmail.com>",
    "keywords": null,
    "author": "Botond Lov\u00e1sz",
    "author_email": "botilovasz@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/64/77/d87440b8aa94fe96c4fd638a12bebbd999502d797795b7bf5cb45b237cfb/ngram_trie-1.2.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Rust-based n-gram trie library for Python",
    "version": "1.2.6",
    "project_urls": {
        "Repository": "https://github.com/behappiness/ngram-trie"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "644c3638ff82357262644637892df5a6ed3beab85aa1a0ece643a17cefa4a8a7",
                "md5": "ca677b37bd360db6d3464c627d431e2a",
                "sha256": "530c28dd14793cd660b79c34558a80401aed0b908e0af29d123efbedbf919de4"
            },
            "downloads": -1,
            "filename": "ngram_trie-1.2.6-cp312-cp312-manylinux_2_34_x86_64.whl",
            "has_sig": false,
            "md5_digest": "ca677b37bd360db6d3464c627d431e2a",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.6",
            "size": 657555,
            "upload_time": "2024-12-08T04:35:54",
            "upload_time_iso_8601": "2024-12-08T04:35:54.750529Z",
            "url": "https://files.pythonhosted.org/packages/64/4c/3638ff82357262644637892df5a6ed3beab85aa1a0ece643a17cefa4a8a7/ngram_trie-1.2.6-cp312-cp312-manylinux_2_34_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6477d87440b8aa94fe96c4fd638a12bebbd999502d797795b7bf5cb45b237cfb",
                "md5": "5318592bdeaad4a937d9eb356f26138c",
                "sha256": "8599f3c87f77748097d1eebf56f368777e423705ef8552d5526816e4587b35d1"
            },
            "downloads": -1,
            "filename": "ngram_trie-1.2.6.tar.gz",
            "has_sig": false,
            "md5_digest": "5318592bdeaad4a937d9eb356f26138c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 31678,
            "upload_time": "2024-12-08T04:35:56",
            "upload_time_iso_8601": "2024-12-08T04:35:56.738254Z",
            "url": "https://files.pythonhosted.org/packages/64/77/d87440b8aa94fe96c4fd638a12bebbd999502d797795b7bf5cb45b237cfb/ngram_trie-1.2.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-08 04:35:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "behappiness",
    "github_project": "ngram-trie",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "maturin",
            "specs": []
        },
        {
            "name": "pyo3",
            "specs": []
        },
        {
            "name": "pyo3-log",
            "specs": []
        }
    ],
    "lcname": "ngram-trie"
}
        