Turkish Spell Checker
============
This tool is a spelling checker for Modern Turkish. It detects spelling errors and corrects them by consulting a list of known misspellings and by matching words against a Turkish dictionary.
Video Lectures
============
[<img src="https://github.com/StarlangSoftware/TurkishSpellChecker/blob/master/video.jpg" width="50%">](https://youtu.be/wKwTKv6Jlo0)
For Developers
============
This library is also available in the [Cython](https://github.com/starlangsoftware/TurkishSpellChecker-Cy), [Java](https://github.com/starlangsoftware/TurkishSpellChecker), [C++](https://github.com/starlangsoftware/TurkishSpellChecker-CPP), [Swift](https://github.com/starlangsoftware/TurkishSpellChecker-Swift), [Js](https://github.com/starlangsoftware/TurkishSpellChecker-Js), and [C#](https://github.com/starlangsoftware/TurkishSpellChecker-CS) repositories.
## Requirements
* [Python 3.7 or higher](#python)
* [Git](#git)
### Python
To check if you have a compatible version of Python installed, use the following command:

    python -V

You can find the latest version of Python [here](https://www.python.org/downloads/).
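
The same check can also be done from inside Python, which may be convenient in a setup script. The following is only a minimal sketch; the helper name `is_compatible` is hypothetical and is not part of this library.

```python
import sys

# Minimum version stated in the requirements above.
MIN_VERSION = (3, 7)


def is_compatible(version_info=None):
    """Return True if a (major, minor, ...) tuple meets the minimum version.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    print("Compatible:", is_compatible())
```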
### Git
Install the [latest version of Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).
## Pip Install

    pip3 install NlpToolkit-SpellChecker

## Download Code
To work on the code, create a fork from the GitHub page. Then use Git to clone your fork to your local machine:

    git clone <your-fork-git-link>

A directory named after the repository (`TurkishSpellChecker-Py` by default) will be created. Alternatively, to explore the code without forking, clone the original repository:

    git clone https://github.com/starlangsoftware/TurkishSpellChecker-Py.git

## Open project with PyCharm IDE
Steps for opening the cloned project:
* Start the IDE
* Select **File | Open** from the main menu
* Choose the cloned `TurkishSpellChecker-Py` directory
* Select the open as project option
* After a few seconds, the project will be loaded.
Detailed Description
============
+ [Creating SpellChecker](#creating-spellchecker)
+ [Spell Correction](#spell-correction)
## Creating SpellChecker
SpellChecker finds spelling errors and corrects them in Turkish. There are two types of spell checker available:
* `SimpleSpellChecker`

    * To instantiate this, a `FsmMorphologicalAnalyzer` is needed.

            fsm = FsmMorphologicalAnalyzer()
            spellChecker = SimpleSpellChecker(fsm)

* `NGramSpellChecker`

    * To create an instance of this, both a `FsmMorphologicalAnalyzer` and an `NGram` are required.

    * `FsmMorphologicalAnalyzer` can be instantiated as follows:

            fsm = FsmMorphologicalAnalyzer()

    * `NGram` can be either trained from scratch or loaded from an existing model.

        * Training from scratch:

                corpus = Corpus("corpus.txt")
                ngram = NGram(corpus.getAllWordsAsArrayList(), 1)
                ngram.calculateNGramProbabilities(LaplaceSmoothing())

            *There are many smoothing methods available. For other smoothing methods, check [here](https://github.com/olcaytaner/NGram).*

        * Loading from an existing model:

                ngram = NGram("ngram.txt")

            *For further details, please check [here](https://github.com/starlangsoftware/NGram).*

    * Afterwards, `NGramSpellChecker` can be created as below:

            spellChecker = NGramSpellChecker(fsm, ngram)

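
To give a feel for why an n-gram model helps here, the sketch below ranks correction candidates with a toy unigram model using add-one (Laplace) smoothing. It is a plain-Python illustration under stated assumptions, not the implementation used by `NGram` or `NGramSpellChecker`; the names `UnigramModel` and `best_candidate` and the tiny corpus are hypothetical.

```python
from collections import Counter


class UnigramModel:
    """Toy unigram model with add-one (Laplace) smoothing (illustrative only)."""

    def __init__(self, words):
        self.counts = Counter(words)
        self.total = sum(self.counts.values())
        # Reserve one extra vocabulary slot for words never seen in training.
        self.vocabulary_size = len(self.counts) + 1

    def probability(self, word):
        # Add-one smoothing: unseen words get a small non-zero probability.
        return (self.counts[word] + 1) / (self.total + self.vocabulary_size)


def best_candidate(candidates, model):
    """Pick the candidate with the highest smoothed unigram probability."""
    return max(candidates, key=model.probability)


# Hypothetical training text standing in for a real corpus file.
corpus_words = "doktor ilaç yazdı doktor hasta gördü hasta ilaç aldı doktor".split()
model = UnigramModel(corpus_words)
print(best_candidate(["doktor", "diktör"], model))
```

A word that is frequent in the training corpus outranks a rare or unseen alternative, which is the basic intuition behind preferring one correction over another.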
## Spell Correction
Spell correction can be done as follows:

    sentence = Sentence("Dıktor olaç yazdı")
    corrected = spellChecker.spellCheck(sentence)
    print(corrected)

Output:

    Doktor ilaç yazdı

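
The general idea of correcting a word through a misspelling list and dictionary matching can be sketched in a few lines of plain Python. This is a simplified illustration, not the library's actual algorithm: the dictionary, the misspelling list, and the function names below are hypothetical toy stand-ins, and candidates are limited to a single character edit.

```python
TURKISH_LETTERS = "abcçdefgğhıijklmnoöprsştuüvyz"

# Toy data for illustration; the library's own resources are not reproduced here.
DICTIONARY = {"doktor", "ilaç", "yazdı"}
MISSPELLINGS = {"herkez": "herkes"}


def candidates(word):
    """Generate every string that is one edit away from the word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in TURKISH_LETTERS]
    inserts = [a + c + b for a, b in splits for c in TURKISH_LETTERS]
    return set(deletes + swaps + replaces + inserts)


def correct_word(word):
    """Return the word itself, a listed correction, or a dictionary match."""
    if word in DICTIONARY:
        return word
    if word in MISSPELLINGS:
        return MISSPELLINGS[word]
    matches = sorted(candidates(word) & DICTIONARY)
    # Leave the word unchanged when no dictionary entry is one edit away.
    return matches[0] if matches else word


def correct_sentence(sentence):
    """Correct each whitespace-separated (lowercase) word independently."""
    return " ".join(correct_word(w) for w in sentence.split())


print(correct_sentence("dıktor olaç yazdı"))
```

The sketch works on lowercase input only, since Turkish casing rules (`I`/`ı`, `İ`/`i`) differ from Python's default `str.lower()` behavior.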
Raw data
{
"_id": null,
"home_page": "https://github.com/StarlangSoftware/TurkishSpellChecker-Py",
"name": "NlpToolkit-SpellChecker",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "olcaytaner",
"author_email": "olcay.yildiz@ozyegin.edu.tr",
"download_url": "https://files.pythonhosted.org/packages/37/ac/371e8d790e9f91c4bbb0151b75b03ddf7db4dad62fb15ad1ecc42fdcca6a/NlpToolkit-SpellChecker-1.0.26.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Turkish Spell Checker Library",
"version": "1.0.26",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "37ac371e8d790e9f91c4bbb0151b75b03ddf7db4dad62fb15ad1ecc42fdcca6a",
"md5": "8654c6373f0033760ca46721d22c7eb0",
"sha256": "05d044b33108907e6b3d5f388fa3b742bf3bfd15c726559b00400b59bb06d897"
},
"downloads": -1,
"filename": "NlpToolkit-SpellChecker-1.0.26.tar.gz",
"has_sig": false,
"md5_digest": "8654c6373f0033760ca46721d22c7eb0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 25641,
"upload_time": "2023-03-25T17:37:13",
"upload_time_iso_8601": "2023-03-25T17:37:13.020788Z",
"url": "https://files.pythonhosted.org/packages/37/ac/371e8d790e9f91c4bbb0151b75b03ddf7db4dad62fb15ad1ecc42fdcca6a/NlpToolkit-SpellChecker-1.0.26.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-25 17:37:13",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "StarlangSoftware",
"github_project": "TurkishSpellChecker-Py",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "nlptoolkit-spellchecker"
}