lingpatlab


- **Name**: lingpatlab
- **Version**: 0.1.10
- **Home page**: https://github.com/craigtrim/lingpatlab
- **Summary**: Linguistic Pattern Lab using spaCy
- **Upload time**: 2024-04-16 17:40:17
- **Maintainer**: Craig Trim
- **Author**: Craig Trim
- **Requires Python**: <4.0,>=3.11
- **License**: MIT
- **Keywords**: nlp, spacy, text-analysis, linguistic-patterns, natural-language-processing, machine-learning, api, cloud, aws, microservice, utility
- **Requirements**: No requirements were recorded.

# LingPatLab: Linguistic Pattern Laboratory

## Overview

LingPatLab is an API for Natural Language Processing (NLP) tasks, built on the spaCy library. It converts raw text into structured, analyzable forms, and is aimed at developers, researchers, and linguists who need processing that ranges from tokenization to text summarization.

## Features

- **Tokenization**: Splits raw text into individual tokens.
- **Parsing**: Analyzes tokens to construct sentences with detailed linguistic annotations.
- **Phrase Extraction**: Identifies and extracts significant phrases from sentences.
- **Text Summarization**: Produces concise summaries of input text, optionally leveraging extracted phrases.

## Usage

To get started with LingPatLab, you can set up the API as follows:

```python
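# Assumption: LingPatLab is importable from the package's top level.
# The original snippet imported SpacyCoreAPI from spacy_core.api but never used it.
from lingpatlab import LingPatLab

api = LingPatLab()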
```

### Tokenization and Parsing

To tokenize and parse input text into structured sentences:

```python
# Returns a `Sentence` (see Data Classes below)
parsed_sentence = api.parse_input_text("Your input text here.")
print(parsed_sentence.to_string())
```

### Phrase Extraction

To extract phrases from a `Sentences` object:

```python
# `parsed_sentences` is a `Sentences` object produced by an earlier parsing step
phrases: list[str] = api.extract_topics(parsed_sentences)
for phrase in phrases:
    print(phrase)
```

### Summarization

To generate a summary of the input text:

```python
summary: str = api.generate_summary("Your input text here.")
print(summary)
```
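
How `generate_summary` works internally is not described here. As a rough, standalone illustration of how extracted phrases could steer an extractive summary, the sketch below scores each sentence by how often it mentions the given phrases and keeps the top-scoring ones. The function `summarize` and its parameters are hypothetical and are not part of LingPatLab.

```python
import re


def summarize(text: str, phrases: list[str], max_sentences: int = 1) -> str:
    """Toy extractive summary: keep the sentences that mention the most phrases.

    Hypothetical helper for illustration only; not LingPatLab's algorithm.
    """
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def score(sentence: str) -> int:
        # Case-insensitive count of phrase occurrences; empty phrases are skipped.
        lowered = sentence.lower()
        return sum(lowered.count(p.lower()) for p in phrases if p)

    # sorted() is stable, so sentences with equal scores keep document order.
    ranked = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))

    # Restore document order among the selected sentences.
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

For example, calling `summarize(text, phrases, max_sentences=2)` with the phrases returned by a phrase-extraction step would return the two sentences that mention those phrases most often, in their original order.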

### Data Classes

LingPatLab uses several custom data classes to structure data throughout the NLP process:

- `Sentence`: Represents a single sentence, containing a list of tokens (`SpacyResult` objects).
- `Sentences`: Represents a collection of sentences, useful for processing paragraphs or multiple lines of text.
- `SpacyResult`: Encapsulates the detailed analysis of a single token, including part of speech, dependency relations, and additional linguistic features.
- `OtherInfo`: Contains additional information about a token, particularly in relation to its syntactic head.
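
The relationship between these classes can be pictured with a minimal sketch. Only the class names and `Sentence.to_string()` come from this README; every field name below, and the output format of `to_string()`, is an assumption made for illustration and may differ from the library's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OtherInfo:
    """Hypothetical fields: information about a token's syntactic head."""
    head_text: str
    head_pos: str


@dataclass
class SpacyResult:
    """Hypothetical fields: the analysis of a single token."""
    text: str
    pos: str                      # part of speech, e.g. "NOUN"
    dep: str                      # dependency relation, e.g. "nsubj"
    other: Optional[OtherInfo] = None


@dataclass
class Sentence:
    """A single sentence: a list of token results."""
    tokens: list[SpacyResult] = field(default_factory=list)

    def to_string(self) -> str:
        # Assumed format: token texts joined by single spaces.
        return " ".join(t.text for t in self.tokens)


@dataclass
class Sentences:
    """A collection of sentences, e.g. a paragraph."""
    sentences: list[Sentence] = field(default_factory=list)


def build_example() -> Sentences:
    """Assemble one hand-annotated sentence to show how the classes nest."""
    tokens = [
        SpacyResult("The", "DET", "det", OtherInfo("dog", "NOUN")),
        SpacyResult("dog", "NOUN", "nsubj", OtherInfo("barks", "VERB")),
        SpacyResult("barks", "VERB", "ROOT"),
    ]
    return Sentences([Sentence(tokens)])
```

Nesting the head information in a separate `OtherInfo` object keeps each token's own attributes apart from those of its syntactic head, which mirrors the description above.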


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/craigtrim/lingpatlab",
    "name": "lingpatlab",
    "maintainer": "Craig Trim",
    "docs_url": null,
    "requires_python": "<4.0,>=3.11",
    "maintainer_email": "craigtrim@gmail.com",
    "keywords": "nlp, spacy, text-analysis, linguistic-patterns, natural-language-processing, machine-learning, api, cloud, AWS, microservice, utility",
    "author": "Craig Trim",
    "author_email": "craigtrim@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/c7/e4/10b3654f5c8bd1c9d536e5a56337b384372cfe69b14e1de7d7881e328078/lingpatlab-0.1.10.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Linguistic Pattern Lab using spaCy",
    "version": "0.1.10",
    "project_urls": {
        "Bug Tracker": "https://github.com/craigtrim/lingpatlab/issues",
        "Homepage": "https://github.com/craigtrim/lingpatlab",
        "Repository": "https://github.com/craigtrim/lingpatlab"
    },
    "split_keywords": [
        "nlp",
        " spacy",
        " text-analysis",
        " linguistic-patterns",
        " natural-language-processing",
        " machine-learning",
        " api",
        " cloud",
        " aws",
        " microservice",
        " utility"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "067ff86216564b78304202e3789c8d3c29ead8a99cfd3b71cf5afc5bf1ffb004",
                "md5": "fa03f949e8b84007e0962859dcf4d20e",
                "sha256": "1445edb56de32dc261c7bc30cc78cb3532c9a64f935ef6098e5e56539882267b"
            },
            "downloads": -1,
            "filename": "lingpatlab-0.1.10-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fa03f949e8b84007e0962859dcf4d20e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.11",
            "size": 366656,
            "upload_time": "2024-04-16T17:40:13",
            "upload_time_iso_8601": "2024-04-16T17:40:13.905073Z",
            "url": "https://files.pythonhosted.org/packages/06/7f/f86216564b78304202e3789c8d3c29ead8a99cfd3b71cf5afc5bf1ffb004/lingpatlab-0.1.10-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c7e410b3654f5c8bd1c9d536e5a56337b384372cfe69b14e1de7d7881e328078",
                "md5": "c85a27043b2c976d9751e1a795ccd62e",
                "sha256": "19cdb08326561f0b945049aee717a554d0068fefdf26c8e041b7c2d88f93ae0e"
            },
            "downloads": -1,
            "filename": "lingpatlab-0.1.10.tar.gz",
            "has_sig": false,
            "md5_digest": "c85a27043b2c976d9751e1a795ccd62e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.11",
            "size": 337945,
            "upload_time": "2024-04-16T17:40:17",
            "upload_time_iso_8601": "2024-04-16T17:40:17.861707Z",
            "url": "https://files.pythonhosted.org/packages/c7/e4/10b3654f5c8bd1c9d536e5a56337b384372cfe69b14e1de7d7881e328078/lingpatlab-0.1.10.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-16 17:40:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "craigtrim",
    "github_project": "lingpatlab",
    "github_not_found": true,
    "lcname": "lingpatlab"
}
        