# Email Typo Fixer
A Python library that automatically detects and fixes common typos in email addresses, using Levenshtein distance, the Public Suffix List, and a dictionary of known provider typos.
## Features
- **Email Normalization**: Lowercases the address, strips surrounding whitespace, and removes invalid characters
- **Extension Validation**: Validates and corrects TLDs using the official [PublicSuffixList](https://pypi.org/project/publicsuffixlist/) (parses `.dat` file directly)
- **Smart Typo Detection**: Uses Levenshtein distance to detect and correct TLD and domain name typos
- **Domain Correction**: Fixes common domain typos (e.g., `gamil.com` → `gmail.com`)
- **Configurable**: Custom typo dictionary and distance thresholds
- **Logging Support**: Built-in logging for debugging and monitoring
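The distance-based suffix correction listed above can be sketched in plain Python. This is an illustration only, not the library's implementation: `levenshtein`, `closest_suffix`, and the tiny `KNOWN_SUFFIXES` set are hypothetical names, whereas the library takes its suffixes from the Public Suffix List and lists RapidFuzz as its dependency for distance calculations.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


# Hypothetical, deliberately tiny suffix set for illustration only.
KNOWN_SUFFIXES = {"com", "org", "net", "io"}


def closest_suffix(suffix: str, max_distance: int = 1) -> Optional[str]:
    """Return the suffix if known, else the nearest known suffix within max_distance."""
    if suffix in KNOWN_SUFFIXES:
        return suffix
    best = min(sorted(KNOWN_SUFFIXES), key=lambda known: levenshtein(suffix, known))
    return best if levenshtein(suffix, best) <= max_distance else None


print(closest_suffix("con"))   # com
print(closest_suffix("orgg"))  # org
print(closest_suffix("xyz"))   # None
```

A low `max_distance` keeps corrections conservative: a suffix that is not within one edit of any known suffix is left for the caller to reject.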
## Installation
```bash
pip install email-typo-fixer
```
## Quick Start
```python
from email_typo_fixer import normalize_email, EmailTypoFixer
# Simple function interface
corrected_email = normalize_email("user@gamil.com")
print(corrected_email) # user@gmail.com
# Class interface for more control
fixer = EmailTypoFixer(max_distance=1)
corrected_email = fixer.normalize("user@yaho.com")
print(corrected_email) # user@yahoo.com
```
## Limitations
### TLD '.co' False Positives
By default, the library may correct emails ending in `.co` (such as `user@example.co`) to `.com`, because `co` is within a Levenshtein distance of 1 of `com`, which falls inside the default threshold. This can produce false positives, since `.co` is a valid TLD in its own right (it is Colombia's country-code TLD and is also registered by organizations elsewhere).
**How to control this behavior:**
- The `normalize` method and the `normalize_email` function accept an optional parameter `fix_tld_co: bool` (default: `True`).
- If you want to prevent `.co` domains from being auto-corrected to `.com`, call:
```python
from email_typo_fixer import normalize_email
normalize_email("user@example.co", fix_tld_co=False) # Will NOT change .co to .com
```
Or, with the class:
```python
from email_typo_fixer import EmailTypoFixer
fixer = EmailTypoFixer()
fixer.normalize("user@example.co", fix_tld_co=False)  # Will NOT change .co to .com
```
This gives you control to avoid unwanted corrections for `.co` domains.
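One way to apply this per address is to derive `fix_tld_co` from an allow-list of domains you already know are genuine `.co` addresses. The sketch below is a hedged illustration: `KNOWN_CO_DOMAINS` and `should_fix_co` are hypothetical names that are not part of the library, and only the `fix_tld_co` keyword comes from the documentation above.

```python
# Hypothetical allow-list of domains known to be real `.co` addresses.
KNOWN_CO_DOMAINS = {"example.co", "startup.co"}


def should_fix_co(email: str, known_co_domains: set[str] = KNOWN_CO_DOMAINS) -> bool:
    """Return False when the address belongs to a domain known to be a valid `.co`."""
    _, _, domain = email.strip().lower().rpartition("@")
    return domain not in known_co_domains


# Intended use with the library (requires email-typo-fixer to be installed):
#   from email_typo_fixer import normalize_email
#   normalize_email(address, fix_tld_co=should_fix_co(address))
print(should_fix_co("User@Example.co"))  # False
print(should_fix_co("user@gmail.co"))    # True
```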
## Usage Examples
### Basic Email Correction
```python
from email_typo_fixer import normalize_email
# Fix common domain typos
normalize_email("john.doe@gamil.com") # → john.doe@gmail.com
normalize_email("jane@yaho.com") # → jane@yahoo.com
normalize_email("user@outlok.com") # → user@outlook.com
normalize_email("test@hotmal.com") # → test@hotmail.com
# Fix extension typos (using up-to-date public suffix list)
normalize_email("user@example.co") # → user@example.com
normalize_email("user@site.rog") # → user@site.org
```
### Robust Suffix Handling
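This library parses the official `public_suffix_list.dat` file at runtime rather than relying on a hardcoded suffix list, so the recognized TLDs and public suffixes are as current as the copy of the list available to it (for example, the one shipped with the installed `publicsuffixlist` package).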
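To make the file format concrete, here is a minimal sketch of reading rules from Public Suffix List text. It assumes only the documented format of the list (`//` comments, blank lines, one rule per line, `*.` wildcards, and `!` exceptions); `SAMPLE_PSL` and `parse_suffix_rules` are hypothetical names, and the library's actual loader may work differently.

```python
# A short sample in the Public Suffix List format, used instead of the real file.
SAMPLE_PSL = """\
// ===BEGIN ICANN DOMAINS===
com
org

// Colombia
co
com.co
*.ck
!www.ck
"""


def parse_suffix_rules(text: str) -> set[str]:
    """Collect rules from Public Suffix List text, skipping comments and blank lines."""
    rules = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        # Only the text up to the first whitespace is part of the rule.
        rules.add(line.split()[0])
    return rules


rules = parse_suffix_rules(SAMPLE_PSL)
# Plain top-level suffixes: no dot, no wildcard (*.) or exception (!) marker.
tlds = sorted(r for r in rules if "." not in r and not r.startswith(("*", "!")))
print(tlds)  # ['co', 'com', 'org']
```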
### Advanced Usage with Custom Configuration
```python
from email_typo_fixer import EmailTypoFixer
import logging
# Create a custom logger
logger = logging.getLogger("email_fixer")
logger.setLevel(logging.INFO)
# Custom typo dictionary
custom_typos = {
    'companytypo': 'company',
    'orgtypo': 'org',
}
# Initialize with custom settings
fixer = EmailTypoFixer(
    max_distance=2,             # Allow more distant corrections
    typo_domains=custom_typos,  # Use custom typo dictionary
    logger=logger,              # Use custom logger
)
# Fix emails with custom rules
corrected = fixer.normalize("user@companytypo.com")
print(corrected) # user@company.com
```
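In Python's standard `logging` module, setting a level on a logger does not attach a handler. Unless a handler is configured elsewhere (for example on the root logger), only messages at `WARNING` or above reach the console. The sketch below attaches a handler so that `INFO` messages become visible. The logger name matches the example above; the message itself is illustrative, since this README does not specify what the library logs or at which level.

```python
import io
import logging

logger = logging.getLogger("email_fixer")
logger.setLevel(logging.INFO)

# Write to an in-memory stream here so the output can be inspected;
# use logging.StreamHandler() with no argument to log to stderr instead.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

# Illustrative message; the library's own log messages may look different.
logger.info("corrected %s -> %s", "user@gamil.com", "user@gmail.com")
print(stream.getvalue().strip())
# INFO email_fixer: corrected user@gamil.com -> user@gmail.com
```

The configured `logger` can then be passed to `EmailTypoFixer(logger=logger)` as shown above.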
### Email Validation and Normalization
```python
from email_typo_fixer import EmailTypoFixer
fixer = EmailTypoFixer()
try:
    # Normalize and validate
    email = fixer.normalize(" USER@EXAMPLE.COM ")
    print(email)  # user@example.com
    # Remove invalid characters
    email = fixer.normalize("us*er@exam!ple.com")
    print(email)  # user@example.com
except ValueError as e:
    print(f"Invalid email: {e}")
```
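The cleaning step shown above can be approximated with the standard library alone. This is a rough sketch under stated assumptions, not the library's code: `basic_clean` is a hypothetical name, and both the set of characters treated as valid and the shape check are guesses made for illustration.

```python
import re

# Assumption: keep lowercase letters, digits, and a few symbols common in addresses.
_INVALID_CHARS = re.compile(r"[^a-z0-9._%+\-@]")


def basic_clean(email: str) -> str:
    """Strip whitespace, lowercase, drop disallowed characters, and check the shape."""
    cleaned = _INVALID_CHARS.sub("", email.strip().lower())
    local, sep, domain = cleaned.rpartition("@")
    if not sep or not local or "." not in domain:
        raise ValueError(f"Invalid email: {email!r}")
    return cleaned


print(basic_clean("  USER@EXAMPLE.COM  "))  # user@example.com
print(basic_clean("us*er@exam!ple.com"))    # user@example.com
```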
## API Reference
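### `normalize_email(email: str, fix_tld_co: bool = True) -> str`
Simple function interface for email normalization.
**Parameters:**
- `email` (str): The email address to normalize
- `fix_tld_co` (bool): Whether a `.co` TLD may be corrected to `.com` (default: `True`)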
**Returns:**
- `str`: The corrected and normalized email address
**Raises:**
- `ValueError`: If the email cannot be fixed or is invalid
### `EmailTypoFixer`
Main class for email typo correction with customizable options.
#### `__init__(max_distance=1, typo_domains=None, logger=None)`
**Parameters:**
- `max_distance` (int): Maximum Levenshtein distance for extension corrections (default: 1)
- `typo_domains` (dict): Custom dictionary of domain typos to corrections
- `logger` (logging.Logger): Custom logger instance
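#### `normalize(email: str, fix_tld_co: bool = True) -> str`
Normalize and fix typos in an email address.
**Parameters:**
- `email` (str): The email address to normalize
- `fix_tld_co` (bool): Whether a `.co` TLD may be corrected to `.com` (default: `True`)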
**Returns:**
- `str`: The corrected and normalized email address
**Raises:**
- `ValueError`: If the email cannot be fixed or is invalid
## Default Typo Corrections
The library includes built-in corrections for common email provider typos:
| Typo | Correction |
|------|------------|
| gamil | gmail |
| gmial | gmail |
| gnail | gmail |
| gmaill | gmail |
| yaho | yahoo |
| yahho | yahoo |
| outlok | outlook |
| outllok | outlook |
| outlokk | outlook |
| hotmal | hotmail |
| hotmial | hotmail |
| homtail | hotmail |
| hotmaill | hotmail |
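A dictionary lookup like the table above can be sketched as follows. The sketch makes two assumptions that the README does not confirm: that the typo sits in the first label of the domain, and that custom entries are merged on top of the defaults. `DEFAULT_TYPOS` and `fix_domain_label` are hypothetical names, and only a few table rows are copied here.

```python
from typing import Optional

# A few entries copied from the table above; the library's full mapping may differ.
DEFAULT_TYPOS = {
    "gamil": "gmail",
    "gmial": "gmail",
    "yaho": "yahoo",
    "outlok": "outlook",
    "hotmal": "hotmail",
}


def fix_domain_label(email: str, extra_typos: Optional[dict[str, str]] = None) -> str:
    """Replace the first domain label if it appears in the typo dictionary."""
    # Assumption: custom entries extend (and may override) the defaults.
    typos = {**DEFAULT_TYPOS, **(extra_typos or {})}
    local, _, domain = email.rpartition("@")
    label, dot, rest = domain.partition(".")
    return f"{local}@{typos.get(label, label)}{dot}{rest}"


print(fix_domain_label("john.doe@gamil.com"))  # john.doe@gmail.com
print(fix_domain_label("user@companytypo.com", {"companytypo": "company"}))  # user@company.com
```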
## Error Handling
The library raises `ValueError` exceptions for emails that cannot be corrected:
```python
from email_typo_fixer import normalize_email
try:
    normalize_email("invalid.email")  # Missing @ symbol
except ValueError as e:
    print(f"Cannot fix email: {e}")
try:
    normalize_email("user@")  # Missing domain
except ValueError as e:
    print(f"Cannot fix email: {e}")
```
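When cleaning a whole list, it is often more convenient to collect failures than to stop at the first `ValueError`. The helper below is a sketch of that pattern: `normalize_many` and `fake_normalize` are hypothetical names, and the stand-in normalizer exists only so the example runs without the library installed. In real use, pass `normalize_email` or `fixer.normalize` as the second argument.

```python
from typing import Callable, Iterable


def normalize_many(
    emails: Iterable[str],
    normalize: Callable[[str], str],
) -> tuple[dict[str, str], dict[str, str]]:
    """Return (fixed, failed): original -> corrected, and original -> error message."""
    fixed: dict[str, str] = {}
    failed: dict[str, str] = {}
    for email in emails:
        try:
            fixed[email] = normalize(email)
        except ValueError as exc:
            failed[email] = str(exc)
    return fixed, failed


# Stand-in normalizer so the sketch runs without the library installed.
def fake_normalize(email: str) -> str:
    if "@" not in email:
        raise ValueError("missing @ symbol")
    return email.strip().lower()


fixed, failed = normalize_many(["A@Example.com", "invalid.email"], fake_normalize)
print(fixed)   # {'A@Example.com': 'a@example.com'}
print(failed)  # {'invalid.email': 'missing @ symbol'}
```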
## Requirements
- Python 3.10+
- RapidFuzz >= 3.13.0
- publicsuffixlist >= 1.0.2
## Development
### Setting up for Development
```bash
# Clone the repository (or substitute the URL of your own fork)
git clone https://github.com/machado000/email-typo-fixer.git
cd email-typo-fixer
# Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
poetry install
# Activate the virtual environment (Poetry 1.x; Poetry 2.x needs the poetry-plugin-shell plugin or `poetry env activate`)
poetry shell
```
### Running Tests
```bash
# Run tests with coverage
poetry run pytest
# Run tests with verbose output
poetry run pytest -v
# Run specific test file
poetry run pytest tests/test_email_typo_fixer.py
```
### Code Quality
```bash
# Lint with flake8
poetry run flake8 email_typo_fixer tests
# Type checking with mypy
poetry run mypy email_typo_fixer
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Uses the [Levenshtein](https://github.com/maxbachmann/Levenshtein) and [RapidFuzz](https://github.com/rapidfuzz/RapidFuzz) libraries for string distance calculations
- Uses [publicsuffixlist](https://github.com/ko-zu/psl) for TLD (Top Level Domain) validation
- Inspired by various email validation libraries in the Python ecosystem
Raw data
{
"_id": null,
"home_page": "https://github.com/machado000/email-typo-fixer",
"name": "email-typo-fixer",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.10",
"maintainer_email": null,
"keywords": "email, typo, correction, validation, normalization",
"author": "Joao Brito",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/7b/e6/42863b043ace8adef05c3ef9890395af1527a8e2a4432e837c5ae31d57ad/email_typo_fixer-1.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A Python library to automatically detect and fix common typos in email addresses",
"version": "1.1.0",
"project_urls": {
"Homepage": "https://github.com/machado000/email-typo-fixer",
"Issues": "https://github.com/machado000/email-typo-fixer/issues"
},
"split_keywords": [
"email",
" typo",
" correction",
" validation",
" normalization"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "3ad6a19f43c7ddc780e70994bb68df97acc86dcbb30cbca2b43a62a03c326327",
"md5": "ecdd2413ca07c0edba2ea31a4d614105",
"sha256": "40a4d8677161b9554360236377e5097043f1f6fcebe5c863eb6ae8541b87abda"
},
"downloads": -1,
"filename": "email_typo_fixer-1.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ecdd2413ca07c0edba2ea31a4d614105",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.10",
"size": 8899,
"upload_time": "2025-08-13T11:24:08",
"upload_time_iso_8601": "2025-08-13T11:24:08.712336Z",
"url": "https://files.pythonhosted.org/packages/3a/d6/a19f43c7ddc780e70994bb68df97acc86dcbb30cbca2b43a62a03c326327/email_typo_fixer-1.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7be642863b043ace8adef05c3ef9890395af1527a8e2a4432e837c5ae31d57ad",
"md5": "9b36aa69cc58ce58ab93e918496e2e63",
"sha256": "9018213050d9685effb8be45bcb345836b33e4a9971e539a9db27efc835b438d"
},
"downloads": -1,
"filename": "email_typo_fixer-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "9b36aa69cc58ce58ab93e918496e2e63",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.10",
"size": 8112,
"upload_time": "2025-08-13T11:24:09",
"upload_time_iso_8601": "2025-08-13T11:24:09.772249Z",
"url": "https://files.pythonhosted.org/packages/7b/e6/42863b043ace8adef05c3ef9890395af1527a8e2a4432e837c5ae31d57ad/email_typo_fixer-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-13 11:24:09",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "machado000",
"github_project": "email-typo-fixer",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "email-typo-fixer"
}