| Field | Value |
| --- | --- |
| Name | contraction-fix |
| Version | 0.2.1 |
| home_page | None |
| Summary | A fast and efficient library for fixing contractions in text with reverse functionality and batch processing support |
| upload_time | 2025-07-22 03:03:12 |
| maintainer | None |
| docs_url | None |
| author | Sean Gao |
| requires_python | >=3.8 |
| license | MIT |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Contraction Fix
A fast and efficient library for fixing contractions in text. This package provides tools to expand contractions in English text while maintaining high performance and accuracy. **NEW in v0.2.1: Reverse functionality to contract expanded forms back to contractions!**
## Features
- Fast text processing using precompiled regex patterns
- **Batch processing for multiple texts with optimized performance**
- **NEW: Reverse functionality to contract expanded forms back to contractions**
- Support for standard contractions, informal contractions, and internet slang
- Configurable dictionary usage
- Optimized caching for improved performance
- Preview functionality to see contractions before fixing
- Easy addition and removal of custom contractions
- Thread-safe operations
## Installation
```bash
pip install contraction-fix
```
## Usage
### Basic Usage
#### Expanding Contractions
```python
from contraction_fix import fix
text = "I can't believe it's not butter!"
fixed_text = fix(text)
print(fixed_text) # "I cannot believe it is not butter!"
```
#### Contracting Expanded Forms (NEW!)
```python
from contraction_fix import contract
text = "I cannot believe it is not butter!"
contracted_text = contract(text)
print(contracted_text) # "I can't believe it's not butter!"
```
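One way to picture the reverse direction is as an inverted lookup table. The sketch below illustrates that general idea under stated assumptions and is not taken from the package's source: `EXPANSIONS`, `build_reverse_map`, and `contract_sketch` are hypothetical names, and the three-entry table is a made-up sample.

```python
import re

# Hypothetical sample of an expansion table (contraction -> expanded form).
EXPANSIONS = {
    "can't": "cannot",
    "it's": "it is",
    "we'll": "we will",
}


def build_reverse_map(expansions):
    """Invert the table; the first contraction seen for an expansion wins."""
    reverse = {}
    for contraction, expanded in expansions.items():
        reverse.setdefault(expanded.lower(), contraction)
    return reverse


REVERSE = build_reverse_map(EXPANSIONS)

# Longest phrases first so multi-word forms are tried before shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(REVERSE, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def contract_sketch(text):
    """Replace each known expanded form with its contraction."""
    def _swap(match):
        found = match.group(0)
        replacement = REVERSE[found.lower()]
        # Keep a leading capital if the original phrase had one.
        if found[0].isupper():
            replacement = replacement[0].upper() + replacement[1:]
        return replacement

    return _PATTERN.sub(_swap, text)


print(contract_sketch("I cannot believe it is not butter!"))
```

Several contractions can share one expansion, so any inverted table has to pick a winner; this sketch simply keeps the first one it encounters.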
### Batch Processing
#### Expanding Contractions in Batch
For processing multiple texts efficiently:
```python
from contraction_fix import fix_batch
texts = [
    "I can't believe it's working!",
    "They're going to the store",
    "We'll see what happens",
]
fixed_texts = fix_batch(texts)
print(fixed_texts)
# Output: ["I cannot believe it is working!", "They are going to the store", "We will see what happens"]
```
#### Contracting Expanded Forms in Batch (NEW!)
```python
from contraction_fix import contract_batch
texts = [
    "I cannot believe it is working!",
    "They are going to the store",
    "We will see what happens",
]
contracted_texts = contract_batch(texts)
print(contracted_texts)
# Output: ["I can't believe it's working!", "They're goin' to the store", "We'll see what happens"]
```
### Instantiating `ContractionFixer`
Start by creating an instance of the `ContractionFixer` class:
```python
from contraction_fix import ContractionFixer
fixer = ContractionFixer()
```
### Optional Parameters:
- **`use_informal: bool = True`**
  - Enables informal contractions like `"gonna"` → `"going to"`.
  - Set to `False` to avoid informal style expansions.
- **`use_slang: bool = True`**
  - Enables slang contractions like `"brb"` → `"be right back"`.
  - Set to `False` for more formal or academic applications.
- **`cache_size: int = 1024`**
  - Sets the LRU cache size for memoization. Improves performance when processing repeated inputs.
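
To make the `cache_size` idea concrete, here is a minimal sketch of a per-instance LRU cache built on `functools.lru_cache`. It is an illustration only and says nothing about how the package wires its cache internally: `CachedExpander` and its methods are hypothetical names.

```python
from functools import lru_cache


class CachedExpander:
    """Hypothetical class showing a per-instance LRU cache of results."""

    def __init__(self, table, cache_size=1024):
        self._table = dict(table)
        # Wrap the uncached method so each instance owns its own cache.
        self._cached = lru_cache(maxsize=cache_size)(self._expand_uncached)

    def _expand_uncached(self, text):
        # Naive word-by-word lookup, enough to show the caching behaviour.
        return " ".join(self._table.get(w.lower(), w) for w in text.split())

    def expand(self, text):
        return self._cached(text)

    def cache_info(self):
        return self._cached.cache_info()


expander = CachedExpander({"can't": "cannot"}, cache_size=2)
expander.expand("I can't go")
expander.expand("I can't go")  # second call is served from the cache
print(expander.cache_info())
```

A larger cache helps only when the same strings recur; for a stream of unique texts the cache adds memory without saving work.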
#### Example – Disabling slang:
```python
fixer = ContractionFixer(use_slang=False)
print(fixer.fix("brb, idk what's up"))
# Output: "brb, I don't know what is up" (brb is skipped because use_slang=False)
```
### Contractions vs. Possessives
The package intelligently differentiates between contractions and possessive forms:
```python
from contraction_fix import fix
text = "I can't find Sarah's keys, and she won't be at her brother's house until it's dark."
fixed_text = fix(text)
print(fixed_text) # "I cannot find Sarah's keys, and she will not be at her brother's house until it is dark."
```
Notice how the package:
- Expands contractions: "can't" → "cannot", "won't" → "will not", "it's" → "it is"
- Preserves possessives: "Sarah's" and "brother's" remain unchanged
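
A simple heuristic for this distinction is to expand `'s` only after a small closed set of host words. The sketch below assumes that heuristic for illustration; the package may use a different rule, and `IS_HOSTS` and `expand_is` are hypothetical names.

```python
import re

# Assumption: only a small closed set of words takes "'s" as a contraction
# of "is"; anything else ending in "'s" is treated as a possessive.
IS_HOSTS = frozenset({"it", "he", "she", "that", "what", "who", "there", "here"})

_APOSTROPHE_S = re.compile(r"\b([A-Za-z]+)'s\b")


def expand_is(text):
    """Expand "'s" to " is" after known host words, leave possessives alone."""
    def _swap(match):
        word = match.group(1)
        if word.lower() in IS_HOSTS:
            return word + " is"
        return match.group(0)  # possessive such as "Sarah's"

    return _APOSTROPHE_S.sub(_swap, text)


print(expand_is("Sarah's keys are where it's dark."))
```

The heuristic is deliberately crude: it always reads `it's` as "it is" and never as "it has", which a fuller implementation would need to disambiguate.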
### Advanced Usage
```python
from contraction_fix import ContractionFixer
# Create a custom fixer instance
fixer = ContractionFixer(use_informal=True, use_slang=False)
# Fix single text
text = "I'd like to see y'all tomorrow"
fixed_text = fixer.fix(text)
print(fixed_text) # "I would like to see you all tomorrow"
# Contract single text (NEW!)
expanded_text = "I would like to see you all tomorrow"
contracted_text = fixer.contract(expanded_text)
print(contracted_text) # "I would like to see y'all tomorrow"
# Fix multiple texts efficiently
texts = [
    "I can't believe it's working",
    "They're going home",
    "We'll see what happens",
]
fixed_texts = fixer.fix_batch(texts)
print(fixed_texts) # ["I cannot believe it is working", "They are going home", "We will see what happens"]
# Contract multiple texts efficiently (NEW!)
expanded_texts = [
    "I cannot believe it is working",
    "They are going home",
    "We will see what happens",
]
contracted_texts = fixer.contract_batch(expanded_texts)
print(contracted_texts) # ["I can't believe it's working", "They're goin' home", "We'll see what happens"]
# Preview contractions
matches = fixer.preview(text, context_size=5)
for match in matches:
    print(f"Found '{match.text}' at position {match.start}")
    print(f"Context: '{match.context}'")
    print(f"Will be replaced with: '{match.replacement}'")
# Add custom contraction
fixer.add_contraction("gonna", "going to")
# Remove contraction
fixer.remove_contraction("won't")
```
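The preview step can be pictured as a regex scan that records each hit with some surrounding text. The following is a minimal sketch of that idea, not the package's implementation: `PreviewMatch`, `TABLE`, and `preview_sketch` are hypothetical, and the `end` field is an addition for illustration that the real match object may not have.

```python
import re
from dataclasses import dataclass


@dataclass
class PreviewMatch:
    """Hypothetical stand-in for the package's match object."""
    text: str
    start: int
    end: int
    replacement: str
    context: str


# Made-up two-entry table, keyed by lower-cased contraction.
TABLE = {"i'd": "I would", "y'all": "you all"}
_PATTERN = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(k) for k in TABLE) + r")(?!\w)",
    re.IGNORECASE,
)


def preview_sketch(text, context_size=10):
    """List every match with `context_size` characters on each side."""
    results = []
    for m in _PATTERN.finditer(text):
        lo = max(0, m.start() - context_size)
        hi = min(len(text), m.end() + context_size)
        results.append(PreviewMatch(
            text=m.group(0),
            start=m.start(),
            end=m.end(),
            replacement=TABLE[m.group(0).lower()],
            context=text[lo:hi],
        ))
    return results


for item in preview_sketch("I'd like to see y'all tomorrow", context_size=5):
    print(item)
```

Clamping the window with `max` and `min` keeps the context slice valid when a match sits at the very start or end of the text.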
## Dictionary Types
The package uses three types of dictionaries:
1. **Standard Contractions**: Common English contractions like "can't", "won't", etc.
2. **Informal Contractions**: Less formal contractions and patterns like "goin'", "doin'", etc.
3. **Internet Slang**: Modern internet slang and abbreviations like "lol", "btw", etc.
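
The three dictionaries and the `use_informal` / `use_slang` flags suggest a merge step along these lines. This is a sketch under that assumption: the miniature dictionaries and the `build_table` helper are hypothetical and only mirror the behaviour described above.

```python
# Hypothetical miniature dictionaries; the real package ships larger ones.
STANDARD = {"can't": "cannot", "won't": "will not"}
INFORMAL = {"gonna": "going to", "goin'": "going"}
SLANG = {"brb": "be right back", "btw": "by the way"}


def build_table(use_informal=True, use_slang=True):
    """Merge the enabled dictionaries into one lookup table."""
    table = dict(STANDARD)  # the standard set is always included
    if use_informal:
        table.update(INFORMAL)
    if use_slang:
        table.update(SLANG)
    return table


formal_only = build_table(use_informal=False, use_slang=False)
print(sorted(formal_only))
```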
## Performance
The package is optimized for speed through:
- Precompiled regex patterns with cached compilation
- LRU caching of results for repeated inputs
- Efficient dictionary lookups with optimized key ordering
- **Batch processing for multiple texts**
- Minimal memory usage with frozenset constants
- Thread-safe operations
### Batch Processing Performance
When processing multiple texts, use `fix_batch()` or `contract_batch()` for better performance:
```python
from contraction_fix import fix, fix_batch, contract_batch
# More efficient for multiple texts
texts = ["I can't go", "They're here", "We'll see"]
results = fix_batch(texts) # Uses shared cache and optimized processing
# For reverse processing
expanded_texts = ["I cannot go", "They are here", "We will see"]
results = contract_batch(expanded_texts) # Uses shared cache and optimized processing
# Less efficient for multiple texts
results = [fix(text) for text in texts] # Creates new instances
```
## API Reference
### Functions
- `fix(text: str, use_informal: bool = True, use_slang: bool = True) -> str`
- `fix_batch(texts: List[str], use_informal: bool = True, use_slang: bool = True) -> List[str]`
- `contract(text: str, use_informal: bool = True, use_slang: bool = True) -> str` **(NEW!)**
- `contract_batch(texts: List[str], use_informal: bool = True, use_slang: bool = True) -> List[str]` **(NEW!)**
### Classes
- `ContractionFixer(use_informal: bool = True, use_slang: bool = True, cache_size: int = 1024)`
  - `fix(text: str) -> str`
  - `fix_batch(texts: List[str]) -> List[str]`
  - `contract(text: str) -> str` **(NEW!)**
  - `contract_batch(texts: List[str]) -> List[str]` **(NEW!)**
  - `preview(text: str, context_size: int = 10) -> List[Match]`
  - `add_contraction(contraction: str, expansion: str) -> None`
  - `remove_contraction(contraction: str) -> None`
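
For readers curious how `add_contraction` and `remove_contraction` could coexist with thread-safe lookups, here is a minimal sketch using a lock around the table and its compiled pattern. It is an assumption about one workable design, not a description of the package's internals; `MutableExpander` is a hypothetical class.

```python
import re
import threading


class MutableExpander:
    """Hypothetical class: a lock guards the table and its compiled pattern."""

    def __init__(self, table):
        self._lock = threading.RLock()
        self._table = {k.lower(): v for k, v in table.items()}
        self._rebuild()

    def _rebuild(self):
        # Recompile after every change so lookups never see a stale pattern.
        keys = sorted(self._table, key=len, reverse=True)
        if keys:
            body = "|".join(re.escape(k) for k in keys)
            self._pattern = re.compile(
                r"(?<!\w)(" + body + r")(?!\w)", re.IGNORECASE
            )
        else:
            self._pattern = None

    def add_contraction(self, contraction, expansion):
        with self._lock:
            self._table[contraction.lower()] = expansion
            self._rebuild()

    def remove_contraction(self, contraction):
        with self._lock:
            self._table.pop(contraction.lower(), None)
            self._rebuild()

    def fix(self, text):
        # Take a consistent snapshot, then do the substitution unlocked.
        with self._lock:
            pattern, table = self._pattern, dict(self._table)
        if pattern is None:
            return text
        return pattern.sub(lambda m: table[m.group(0).lower()], text)


fixer = MutableExpander({"won't": "will not"})
fixer.add_contraction("gonna", "going to")
fixer.remove_contraction("won't")
print(fixer.fix("I won't say I'm gonna win"))
```

Copying the table on every `fix` call keeps the sketch simple; a production design would more likely swap in immutable snapshots to avoid the per-call copy.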
## What's New in v0.2.1
- **Reverse Functionality**: New `contract()` and `contract_batch()` methods to convert expanded forms back to contractions
- **Enhanced API**: Package-level convenience functions for reverse functionality
- **Comprehensive Testing**: Extensive test coverage for all new functionality
- **Improved Performance**: Optimizations for both expansion and contraction operations
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "contraction-fix",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Sean Gao",
"author_email": "seangaoxy@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/ed/84/6b9ad752706510156f0cf7131a24eda5f74892a40b858b745c62e8c6e91b/contraction_fix-0.2.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A fast and efficient library for fixing contractions in text with reverse functionality and batch processing support",
"version": "0.2.1",
"project_urls": {
"Homepage": "https://github.com/xga0/contraction_fix"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "de872ed65e5407b0a07aa1236ec34978e88c75ebfa365060c8e8ab8fbabdb516",
"md5": "fee3b70d34ec97ae980efc5b6d488f83",
"sha256": "ac8999117dc702fd9324c471c3fbd965b92605a3ef57275da573a0c9bde4d180"
},
"downloads": -1,
"filename": "contraction_fix-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "fee3b70d34ec97ae980efc5b6d488f83",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 11683,
"upload_time": "2025-07-22T03:03:11",
"upload_time_iso_8601": "2025-07-22T03:03:11.057501Z",
"url": "https://files.pythonhosted.org/packages/de/87/2ed65e5407b0a07aa1236ec34978e88c75ebfa365060c8e8ab8fbabdb516/contraction_fix-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "ed846b9ad752706510156f0cf7131a24eda5f74892a40b858b745c62e8c6e91b",
"md5": "e086b965b10c5f96dc6a26b1e107e727",
"sha256": "72a46894e1de8dcde233bb858b09b3aeebe0a69c74c5417b291d47e44f200ae5"
},
"downloads": -1,
"filename": "contraction_fix-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "e086b965b10c5f96dc6a26b1e107e727",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 16679,
"upload_time": "2025-07-22T03:03:12",
"upload_time_iso_8601": "2025-07-22T03:03:12.205576Z",
"url": "https://files.pythonhosted.org/packages/ed/84/6b9ad752706510156f0cf7131a24eda5f74892a40b858b745c62e8c6e91b/contraction_fix-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-22 03:03:12",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "xga0",
"github_project": "contraction_fix",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "contraction-fix"
}