# Bengali Plagiarism Checker
A Python library for detecting plagiarism in Bengali text. It bundles a reference corpus of 200 Bengali books (approximately 4,100 pages) sourced from the National Digital Library and digitized with the Tesseract OCR engine. Two lines of code are enough to check a Bengali passage against this corpus. If a high degree of similarity is found, the library reports the book ID, the author or publisher, the book title, the page number, and a similarity score. The package metadata lists the distribution name on PyPI as `BengaliPlagiarismCheckerTool`, while the module is imported as `BengaliPlagiarismChecker`. Install the library from the terminal:
```
pip install BengaliPlagiarismCheckerTool
```
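To confirm that the install succeeded, one option is to query the installed distribution's version from Python. The sketch below is only an illustration: the helper `installed_version` is a hypothetical name, not part of this library, and it relies on the standard-library `importlib.metadata` module, which requires Python 3.8 or later (the package itself declares `>=3.6`). The distribution name is taken from the package metadata.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of BengaliPlagiarismChecker.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution name as listed in the package metadata.
version = installed_version("BengaliPlagiarismCheckerTool")
print(version if version else "not installed")
```

If this prints `not installed`, rerun the `pip` command in the same environment that runs this script.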
<hr>
<br>
## Sample Usage
```
import BengaliPlagiarismChecker as bpc  # import the package

# Input text to check
text = """
বসন্তাগমে কামিনী রায় বসন্ত কি সহসা এ নির্জন আবাসে পশিয়াছ চুপি চুপি? নবীন পল্পবে
সাজিয়াছে তরুরাজি। ঝেড়ে দিলে কবে পুরাতন জীর্ণপত্র শীতল বাতাসে বাতাবি ফুলের গন্ধ ধীরে ধীরে ভেসে আসে আমার গবাক্ষপথে ঘন কুহুরবে মুখরিত আম্রবন বসন্তই হবে উদ্যান উজ্জল শত শ্বেত পুস্প হাসে আজিও ধরনি মরে রেখেছে ধরিয়া তার স্বর্ণ কারাগারে বর্ণ গন্ধ গানে রসে স্পর্শে দিতে চাহে দেহে আর চিতে নব প্রাণ, কিন্তু হায় নিঃশেষে ভরিয়া কই দিতে পারে, মধু? দূরে কোন্খানে থাকে অদেহীরা, বধু, পারো বলে দিতে?
"""
# Check the text against the bundled corpus
bpc.check(text)

'''
OUTPUT

[[194,  # Book ID
  'State Council of Educational Research and Training (SCERT)',  # Author or publisher
  'সাহিত্য মালঞ্চ',  # Book name
  14,  # Page number
  23.88]]  # Similarity score
'''
```
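The positional rows in the sample output are easier to work with once they have named fields. The sketch below assumes that `bpc.check` returns a list of five-element rows in exactly the order shown in the sample output (the README does not state whether the result is returned or only displayed). The `Match` type, the helpers `to_matches` and `best_match`, and the threshold of `20.0` are illustrative assumptions, not part of the library. The block runs on the sample row itself, so it does not need the package installed.

```python
from typing import List, NamedTuple, Optional


class Match(NamedTuple):
    """One result row, with fields in the order shown in the sample output."""
    book_id: int
    author_or_publisher: str
    book_name: str
    page_number: int
    similarity: float


def to_matches(rows) -> List[Match]:
    """Convert positional rows (assumed five elements each) into Match records."""
    return [Match(*row) for row in rows]


def best_match(matches: List[Match], min_score: float = 0.0) -> Optional[Match]:
    """Return the highest-scoring match at or above min_score, or None."""
    candidates = [m for m in matches if m.similarity >= min_score]
    return max(candidates, key=lambda m: m.similarity, default=None)


# Sample row copied from the README output; in real use this would be
# the value of bpc.check(text), assuming it returns the list.
sample_rows = [[194,
                'State Council of Educational Research and Training (SCERT)',
                'সাহিত্য মালঞ্চ',
                14,
                23.88]]

matches = to_matches(sample_rows)
top = best_match(matches, min_score=20.0)  # arbitrary illustrative threshold
if top is not None:
    print(f"Book {top.book_id}, page {top.page_number}: score {top.similarity}")
```

The README does not document the scale of the similarity score, so any threshold should be chosen by experimenting with known matching and non-matching passages.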
<hr>
<br>
<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.
Raw data
{
"_id": null,
"home_page": "https://github.com/SATYAJIT1910/Bengali-Plagiarism-Checker-pip",
"name": "BengaliPlagiarismCheckerTool",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Satyajit Ghosh",
"author_email": "satyajit.ghosh@outlook.in",
"download_url": "https://files.pythonhosted.org/packages/57/be/20201aa45fc525654d7a77420c33c6398fb8ab5be5fe4c9898816c0c6a80/BengaliPlagiarismCheckerTool-0.0.3.tar.gz",
"platform": null,
"description": "# Bengali Plagiarism Checker\r\n\r\nIntroducing a Python library for detecting plagiarism in Bengali texts. This library comprises 200 Bengali books with approximately 4100 pages sourced from the National Digital Library, processed using the Tesseract OCR engine. With just two lines of code, you can check for similarities in Bengali written content. If a high degree of similarity is found, it will display the book title, author name, and other details. You can install the library using the following command in the terminal: \r\n```\r\npip install BengaliPlagiarismChecker\r\n```\r\n\r\n\r\n<hr>\r\n<br>\r\n\r\n## Sample Usage\r\n\r\n```\r\nimport BengaliPlagiarismChecker as bpc #importing package\r\n\r\n#input text\r\ntext=\"\"\"\r\n\r\n\u09ac\u09b8\u09a8\u09cd\u09a4\u09be\u0997\u09ae\u09c7 \u0995\u09be\u09ae\u09bf\u09a8\u09c0 \u09b0\u09be\u09af\u09bc \u09ac\u09b8\u09a8\u09cd\u09a4 \u0995\u09bf \u09b8\u09b9\u09b8\u09be \u098f \u09a8\u09bf\u09b0\u09cd\u099c\u09a8 \u0986\u09ac\u09be\u09b8\u09c7 \u09aa\u09b6\u09bf\u09af\u09bc\u09be\u099b \u099a\u09c1\u09aa\u09bf \u099a\u09c1\u09aa\u09bf? 
\u09a8\u09ac\u09c0\u09a8 \u09aa\u09b2\u09cd\u09aa\u09ac\u09c7\r\n\u09b8\u09be\u099c\u09bf\u09af\u09bc\u09be\u099b\u09c7 \u09a4\u09b0\u09c1\u09b0\u09be\u099c\u09bf\u0964 \u099d\u09c7\u09a1\u09bc\u09c7 \u09a6\u09bf\u09b2\u09c7 \u0995\u09ac\u09c7 \u09aa\u09c1\u09b0\u09be\u09a4\u09a8 \u099c\u09c0\u09b0\u09cd\u09a3\u09aa\u09a4\u09cd\u09b0 \u09b6\u09c0\u09a4\u09b2 \u09ac\u09be\u09a4\u09be\u09b8\u09c7 \u09ac\u09be\u09a4\u09be\u09ac\u09bf \u09ab\u09c1\u09b2\u09c7\u09b0 \u0997\u09a8\u09cd\u09a7 \u09a7\u09c0\u09b0\u09c7 \u09a7\u09c0\u09b0\u09c7 \u09ad\u09c7\u09b8\u09c7 \u0986\u09b8\u09c7 \u0986\u09ae\u09be\u09b0 \u0997\u09ac\u09be\u0995\u09cd\u09b7\u09aa\u09a5\u09c7 \u0998\u09a8 \u0995\u09c1\u09b9\u09c1\u09b0\u09ac\u09c7 \u09ae\u09c1\u0996\u09b0\u09bf\u09a4 \u0986\u09ae\u09cd\u09b0\u09ac\u09a8 \u09ac\u09b8\u09a8\u09cd\u09a4\u0987 \u09b9\u09ac\u09c7 \u0989\u09a6\u09cd\u09af\u09be\u09a8 \u0989\u099c\u09cd\u099c\u09b2 \u09b6\u09a4 \u09b6\u09cd\u09ac\u09c7\u09a4 \u09aa\u09c1\u09b8\u09cd\u09aa \u09b9\u09be\u09b8\u09c7 \u0986\u099c\u09bf\u0993 \u09a7\u09b0\u09a8\u09bf \u09ae\u09b0\u09c7 \u09b0\u09c7\u0996\u09c7\u099b\u09c7 \u09a7\u09b0\u09bf\u09df\u09be \u09a4\u09be\u09b0 \u09b8\u09cd\u09ac\u09b0\u09cd\u09a3 \u0995\u09be\u09b0\u09be\u0997\u09be\u09b0\u09c7 \u09ac\u09b0\u09cd\u09a3 \u0997\u09a8\u09cd\u09a7 \u0997\u09be\u09a8\u09c7 \u09b0\u09b8\u09c7 \u09b8\u09cd\u09aa\u09b0\u09cd\u09b6\u09c7 \u09a6\u09bf\u09a4\u09c7 \u099a\u09be\u09b9\u09c7 \u09a6\u09c7\u09b9\u09c7 \u0986\u09b0 \u099a\u09bf\u09a4\u09c7 \u09a8\u09ac \u09aa\u09cd\u09b0\u09be\u09a3, \u0995\u09bf\u09a8\u09cd\u09a4\u09c1 \u09b9\u09be\u09af\u09bc \u09a8\u09bf\u0983\u09b6\u09c7\u09b7\u09c7 \u09ad\u09b0\u09bf\u09af\u09bc\u09be \u0995\u0987 \u09a6\u09bf\u09a4\u09c7 \u09aa\u09be\u09b0\u09c7, \u09ae\u09a7\u09c1? 
\u09a6\u09c2\u09b0\u09c7 \u0995\u09cb\u09a8\u09cd\u0996\u09be\u09a8\u09c7 \u09a5\u09be\u0995\u09c7 \u0985\u09a6\u09c7\u09b9\u09c0\u09b0\u09be, \u09ac\u09a7\u09c1, \u09aa\u09be\u09b0\u09cb \u09ac\u09b2\u09c7 \u09a6\u09bf\u09a4\u09c7?\r\n\r\n\"\"\"\r\n#method to find out plagiarism\r\nbpc.check(text)\r\n\r\n'''\r\nOUTPUT\r\n\r\n[[194, #BookID\r\n 'State Council of Educational Research and Training (SCERT)', #Author or Publisher\r\n '\u09b8\u09be\u09b9\u09bf\u09a4\u09cd\u09af \u09ae\u09be\u09b2\u099e\u09cd\u099a', #Book name\r\n 14, # Page number\r\n 23.88]] #Similarity Score\r\n\r\n\r\n'''\r\n\r\n```\r\n<hr>\r\n<br>\r\n<a rel=\"license\" href=\"http://creativecommons.org/licenses/by/4.0/\"><img alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https://i.creativecommons.org/l/by/4.0/88x31.png\" /></a><br />This work is licensed under a <a rel=\"license\" href=\"http://creativecommons.org/licenses/by/4.0/\">Creative Commons Attribution 4.0 International License</a>.\r\n",
"bugtrack_url": null,
"license": "",
"summary": "A package for plagiarism detection on Bengali texts.",
"version": "0.0.3",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5f0d9225c25acfa6bb40b50b0e5e0fa0a4439e2ddaad6bd0314b70d42683b68b",
"md5": "c156ff27b5499234a852602c9233d13f",
"sha256": "e9a6f6aec7f1293765298e263c0fb39fcdefae96e89026e26133e19fcb8191d2"
},
"downloads": -1,
"filename": "BengaliPlagiarismCheckerTool-0.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "c156ff27b5499234a852602c9233d13f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 2793469,
"upload_time": "2023-01-31T07:52:36",
"upload_time_iso_8601": "2023-01-31T07:52:36.856718Z",
"url": "https://files.pythonhosted.org/packages/5f/0d/9225c25acfa6bb40b50b0e5e0fa0a4439e2ddaad6bd0314b70d42683b68b/BengaliPlagiarismCheckerTool-0.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "57be20201aa45fc525654d7a77420c33c6398fb8ab5be5fe4c9898816c0c6a80",
"md5": "0244f6dfd3068943d5814b9f46619cd6",
"sha256": "5ccc5e9d35930ba05ffc16ca97b86e97d21d1eba4dc01d54cdcf95c1a0b37f4a"
},
"downloads": -1,
"filename": "BengaliPlagiarismCheckerTool-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "0244f6dfd3068943d5814b9f46619cd6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 2663273,
"upload_time": "2023-01-31T07:52:47",
"upload_time_iso_8601": "2023-01-31T07:52:47.991451Z",
"url": "https://files.pythonhosted.org/packages/57/be/20201aa45fc525654d7a77420c33c6398fb8ab5be5fe4c9898816c0c6a80/BengaliPlagiarismCheckerTool-0.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-31 07:52:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "SATYAJIT1910",
"github_project": "Bengali-Plagiarism-Checker-pip",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "bengaliplagiarismcheckertool"
}