# 🧠 Centering-Lgram: Advanced Language Model with Centering Theory
A natural language processing library that combines **N-gram language models** with **Centering Theory** to generate coherent, contextually appropriate text. Lgram provides discourse coherence analysis and text generation.
## ✨ Key Features
- **🎯 Coherent Text Generation**: Advanced n-gram models (2-gram to 6-gram) with centering theory
- **🧠 Discourse Analysis**: Implementation of centering theory for coherence evaluation
- **🔧 Grammar Correction**: T5 transformer-based grammar and style correction
- **🌐 Semantic Analysis**: spaCy-powered semantic relationship detection
- **📊 Collocation Analysis**: Statistical collocation and thematic consistency
- **⚡ Django Ready**: Production-ready integration with Django framework
- **🎨 CLI Interface**: Easy-to-use command line tools
- **📈 Progress Tracking**: Visual progress bars and detailed logging
## 🚀 Quick Start
### Installation
```bash
# Install from PyPI
pip install centering-lgram
# Install with all optional dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "centering-lgram[full]"
# Install for Django projects
pip install "centering-lgram[django]"
```
### Basic Usage
```python
from lgram.models.simple_language_model import create_default_language_model
# Start directly from the bundled pre-trained model and data
model = create_default_language_model()
# Generate text
text = model.generate_text(
    num_sentences=3,
    input_words=["The", "weather"],
    length=12,
    use_progress_bar=True
)
print(text)
```
### Initializing a Model with Your Own Data
```python
from lgram import create_language_model
model = create_language_model(model_file="ngrams/bigram_model.pkl", text_file="ngrams/text_data.txt")
```
### Advanced Usage
```python
from lgram.models.simple_language_model import EnhancedLanguageModel
model = EnhancedLanguageModel("Some training text.")
sentence = model.generate_sentence(start_words=["The", "man"], length=12)
print("Generated:", sentence)
```
### Command Line Interface
```bash
# Generate text from command line
centering-lgram generate --input "The weather today" --sentences 3 --correct
# Use centering theory
centering-lgram generate --input "She founded" --sentences 5 --centering --progress
# Show system information
centering-lgram info
# Train a new model
centering-lgram train --text-file data.txt --model-file my_model.pkl
# Backward compatibility - old command still works
lgram generate --input "Hello world" --sentences 3
```
- Output: Transition scores, types, and detailed pairwise information
### 2. `TransitionAnalyzer`
Analyzes sentence pairs to extract:
- **Noun phrases** (`noun_chunks`)
- **Anaphoric relations**
- **Transition types**
This analysis supports both statistical and linguistic evaluation.
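As an illustration of what "transition types" usually means in Centering Theory, the sketch below applies the standard four-way table (Continue, Retain, Smooth-Shift, Rough-Shift) to already-extracted centers. It is a minimal sketch, not the `TransitionAnalyzer` implementation: the function name `classify_transition`, its parameters, and the `"No-Cb"` label are assumptions made for this example.

```python
from typing import Optional


def classify_transition(cb_prev: Optional[str],
                        cb_cur: Optional[str],
                        cp_cur: str) -> str:
    """Classify one utterance-to-utterance transition (hypothetical helper).

    cb_prev: backward-looking center of the previous utterance
             (None at the start of a discourse)
    cb_cur:  backward-looking center of the current utterance
             (None when no entity is carried over)
    cp_cur:  preferred center, i.e. the highest-ranked entity
             of the current utterance
    """
    if cb_cur is None:
        # No shared entity at all; the label is an assumption of this sketch.
        return "No-Cb"
    same_cb = cb_prev is None or cb_cur == cb_prev
    cb_is_cp = cb_cur == cp_cur
    if same_cb:
        return "Continue" if cb_is_cp else "Retain"
    return "Smooth-Shift" if cb_is_cp else "Rough-Shift"


print(classify_transition("John", "John", "John"))
print(classify_transition("John", "John", "Mary"))
print(classify_transition("John", "Mary", "Mary"))
print(classify_transition("John", "Mary", "Bill"))
```

In practice the centers would come from the noun phrases and anaphoric relations listed above; here they are passed in as plain strings to keep the example self-contained.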
### 3. `EnhancedLanguageModel`
Generates context-aware, fluent sentences using a **Kneser-Ney smoothed n-gram model** enhanced with POS tagging.
#### Key Features:
- Generation using 2- to 6-gram models
- Syntactic analysis and centering using `spaCy`
- Linguistic center tracking via `get_center_from_sentence`
- Contextual word selection via `choose_word_with_context`
- Completeness check via `is_complete_thought`
- Theme consistency via `post_process_sentences`
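To make the smoothing idea concrete, here is a minimal sketch of interpolated Kneser-Ney for bigrams only. It is not the code used by `EnhancedLanguageModel`, which works with 2- to 6-grams and POS tags; the function name `train_bigram_kn`, the fixed discount of 0.75, and the toy corpus are assumptions for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_kn(tokens, discount=0.75):
    """Return prob(w, v) = P_KN(w | v) for an interpolated Kneser-Ney bigram model."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter()    # how often v occurs as a bigram context
    followers = defaultdict(set)  # distinct words seen after v
    preceders = defaultdict(set)  # distinct words seen before w
    for (v, w), count in bigrams.items():
        context_counts[v] += count
        followers[v].add(w)
        preceders[w].add(v)
    bigram_types = len(bigrams)

    def prob(w, v):
        # Continuation probability: in how many distinct contexts does w appear?
        p_cont = len(preceders.get(w, ())) / bigram_types
        if context_counts[v] == 0:
            return p_cont  # unseen context: fall back to continuation probability
        discounted = max(bigrams[(v, w)] - discount, 0) / context_counts[v]
        backoff_weight = discount * len(followers[v]) / context_counts[v]
        return discounted + backoff_weight * p_cont

    return prob


tokens = "the man left the house and the man stayed".split()
prob = train_bigram_kn(tokens)
vocab = set(tokens)
print({w: round(prob(w, "the"), 3) for w in sorted(vocab)})
```

The discount removes a little probability mass from every observed bigram and redistributes it according to how many distinct contexts a word appears in, which is why the probabilities for a seen context still sum to one.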
### 4. `dynamicngramparaphraser.py`
Performs **contextual paraphrasing** based on n-grams. Selects the **best alternative match** for each word depending on its position and syntactic role.
- Supports **dependency-based reordering** (`reorder_sentence`)
- Combines vector similarity and frequency with `select_best_match`
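One way to picture combining vector similarity with frequency is a weighted sum of cosine similarity and normalized frequency. The sketch below is an assumption about the general idea, not the actual `select_best_match` logic: the function `select_best_match_sketch`, the weight `alpha`, and the toy vectors and counts are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_best_match_sketch(target_vec, candidates, alpha=0.7):
    """Pick the candidate with the best blend of similarity and frequency.

    candidates maps word -> (vector, frequency); alpha weights similarity,
    (1 - alpha) weights frequency scaled by the largest frequency.
    """
    max_freq = max(freq for _, freq in candidates.values())

    def score(item):
        _, (vec, freq) = item
        return alpha * cosine(target_vec, vec) + (1 - alpha) * freq / max_freq

    return max(candidates.items(), key=score)[0]


# Toy data, invented for this example.
candidates = {
    "house": ([1.0, 0.0], 10),
    "home": ([0.9, 0.1], 50),
    "car": ([0.0, 1.0], 100),
}
print(select_best_match_sketch([1.0, 0.0], candidates))
print(select_best_match_sketch([1.0, 0.0], candidates, alpha=1.0))
```

Raising `alpha` favors the semantically closest word, while lowering it favors words that are common in the n-gram data.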
### 5. `analyze_transitions.py`
Invokes the `CenteringModel` to analyze all sentence transitions in a text and returns the results as a `DataFrame`, including:
- `current_sentence`
- `next_sentence`
- `transition_type`
- `score`
- `total_score`
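A small sketch of how results with these columns might be summarized once they are available as plain records (for example via pandas' `DataFrame.to_dict("records")`). Only the column names come from the list above; the sample rows, the transition labels, and the score values are invented for illustration, and `total_score` is left out because its exact definition is not stated here.

```python
from collections import Counter
from statistics import mean

# Hypothetical rows shaped like the columns listed above; all values are made up.
rows = [
    {"current_sentence": "The man left.",
     "next_sentence": "He took his coat.",
     "transition_type": "Continue", "score": 3},
    {"current_sentence": "He took his coat.",
     "next_sentence": "He closed the door.",
     "transition_type": "Continue", "score": 3},
    {"current_sentence": "He closed the door.",
     "next_sentence": "The weather was cold.",
     "transition_type": "Rough-Shift", "score": 0},
]


def summarize(records):
    """Count transition types and average the per-pair score."""
    counts = Counter(r["transition_type"] for r in records)
    return {"counts": dict(counts),
            "mean_score": mean(r["score"] for r in records)}


summary = summarize(rows)
print(summary)
```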
## 🗂 File Structure
.
├── analyze_transitions.py
├── centering_model.py
├── chunk.py
├── dynamicngramparaphraser.py
├── simple_language_model.py
├── get_gender.py
├── transition_analyzer.py
├── corrections.json
├── ngrams/
│ ├── bigram_model.pkl
│ ├── trigram_model.pkl
│ ├── fourgram_model.pkl
│ ├── fivegram_model.pkl
│ ├── sixgram_model.pkl
│ └── text_data.txt
## 🚀 Usage Example
### Transition Analysis
```python
from analyze_transitions import analyze_transitions
text = "Least of all do they thus dispose of the murdered. Guardsman take small farmer well who loathe every precaution the officer."
df = analyze_transitions(text)
print(df)
```
### Sentence Generation
```python
from simple_language_model import EnhancedLanguageModel
model = EnhancedLanguageModel("Some training text.")
sentence = model.generate_sentence(start_words=["The", "man"], length=12)
print("Generated:", sentence)
```
### Coherence Report
```python
# Reuses the `model` instance created in the Sentence Generation example above
sentences = ["The man left.", "She stayed at home."]
cleaned_sentences, report = model.post_process_sentences(sentences)
print(report)
```
## 🛠 Requirements
- Python 3.8+
- `spacy`
- `numpy`
- `scikit-learn`
- `tqdm`
- `pandas`
### spaCy Model
```bash
python -m spacy download en_core_web_lg
```
## 🎯 Purpose
This project provides infrastructure for researchers, developers, and linguistics enthusiasts working on **textual coherence**, **discourse transition**, and **automated sentence generation**. Whether you are generating text, analyzing transitions, or evaluating textual consistency, it is meant to serve as a **linguistic lab**.
## Outputs
The victim must get to go to the woods and learn the true meaning of the word, "she continued. " According to Gallup, in 2002, they would not say willingly. Your opinion is needed. At some moment he might not think of himself as a genius, but as the mere fact of being alive. He chose art.
The murder of Agamemnon would send shivers down your backbone. It is no use trying to suppress that side of myself. In her dead mind there is nothing which appears to her as being outside, and what is outside is what He has left behind. There is evidence to support the view that he has seen such a mention as an occasional burst of electricity and, I am sure, no trace of it. His plan would have been to attack the house and burn down the garden with some kind of fire. While her husband was away, she had made an excuse for being late. The size of these ships is unknown.
The murderer will say that this way if indeed the former is the case. In spring of 1992, I would have noticed. The gun swung and he asked the woman what she wanted, but she did not say anything. With the information that we have, we are able to attach more importance to what we do not need. To be near the rest of the world in this case is a matter of great importance.
The crime was committed by an old man. Indeed, so bad is the weather that we sometimes talk about the reason why he was in Berlin and the city itself. The reason is that he is a rich man. Grass has not grown in this day and age, so it is not suitable for making friends or building communities. I find myself in a bar and ask for a drink when I see a new world around me.
The time has come for you to do something in return and to observe what is happening in the present as closely as possible to one possibility. Mr. Phillips's protest was that softening the blow by a gentle breeze and creating a myriad of other sounds was wrong for the foot to be moved. He just pointed out that the world around him had gone wrong and that the course of events in his country would be too difficult for him to change. Of course, Miss Diana told me that she found it very pleasant to hear that he would return to private life. So say much as well as a bird would be a woman's best friend.
The victims of the fire could not have been more remarkable and sad to see. The victim wants to say that he has a young wife who loves him. It seems to him that his life has been entirely smothered by his work. He is on the verge of death and reaches for the moon.
The murder of Lady Godfrey does not make the heart race as much as anybody else. The murderer is the light sleeper and nothing less than Mr. Bingley will do. It is out of the question whether his own vanity has done much good for him. Is it rather subtler to say that some things are better than others? I like it pretty well and am willing to give it a try.
Raw data
{
"_id": null,
"home_page": "https://github.com/iatagun/Lgram",
"name": "centering-lgram",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "nlp, natural language processing, text generation, centering theory, coherence, language model, n-gram, discourse analysis, computational linguistics",
"author": "\u0130lker Atag\u00fcn",
"author_email": "\u0130lker Atag\u00fcn <ilker.atagun@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/82/de/5777abfa8a73c9c55df24211aebacd0610fbcbdbd5e1180106c90c8ddf29/centering_lgram-1.0.47.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Advanced Language Model with Centering Theory for Coherent Text Generation",
"version": "1.0.47",
"project_urls": {
"Bug Reports": "https://github.com/iatagun/Lgram/issues",
"Documentation": "https://github.com/iatagun/Lgram/blob/main/README.md",
"Homepage": "https://github.com/iatagun/Lgram",
"Source": "https://github.com/iatagun/Lgram"
},
"split_keywords": [
"nlp",
" natural language processing",
" text generation",
" centering theory",
" coherence",
" language model",
" n-gram",
" discourse analysis",
" computational linguistics"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "9c939ea7b773d082e437a6aad8558fab6e6363177bb3a0471236c3162afab107",
"md5": "1e151eca4a6a7f5c1944e43cd012a7d6",
"sha256": "c912ca0c842036dab9127b0d49606f45a765c86d49cc4ffc92f5ee03806a692d"
},
"downloads": -1,
"filename": "centering_lgram-1.0.47-py3-none-any.whl",
"has_sig": false,
"md5_digest": "1e151eca4a6a7f5c1944e43cd012a7d6",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 4816808,
"upload_time": "2025-08-20T19:41:12",
"upload_time_iso_8601": "2025-08-20T19:41:12.800806Z",
"url": "https://files.pythonhosted.org/packages/9c/93/9ea7b773d082e437a6aad8558fab6e6363177bb3a0471236c3162afab107/centering_lgram-1.0.47-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "82de5777abfa8a73c9c55df24211aebacd0610fbcbdbd5e1180106c90c8ddf29",
"md5": "f0b8b6976446ef45de9363f8fed3e7ed",
"sha256": "7a5f24bd9f3154da0786e02fa1a2d35c6a9990339e76b2eb63eb4e0cfa7673ab"
},
"downloads": -1,
"filename": "centering_lgram-1.0.47.tar.gz",
"has_sig": false,
"md5_digest": "f0b8b6976446ef45de9363f8fed3e7ed",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 4724444,
"upload_time": "2025-08-20T19:41:19",
"upload_time_iso_8601": "2025-08-20T19:41:19.362467Z",
"url": "https://files.pythonhosted.org/packages/82/de/5777abfa8a73c9c55df24211aebacd0610fbcbdbd5e1180106c90c8ddf29/centering_lgram-1.0.47.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-20 19:41:19",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "iatagun",
"github_project": "Lgram",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "centering-lgram"
}