# KSSDS
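Sentence splitter for Korean dialogue systems
KSSDS is a deep-learning-based sentence splitter designed for Korean dialogue systems.
Existing Korean sentence splitters are rule-based or statistical models that tend to rely heavily on sentence-final endings or punctuation.
As a result, they struggle with the irregular cases that frequently occur in text generated by STT (Speech-to-Text) models,
such as omitted punctuation and inverted word order.
KSSDS was developed to overcome these limitations:
it uses Transformer-based deep learning to aim for stable, flexible sentence splitting even in dialogue systems.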
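To make the punctuation dependence concrete, here is a minimal sketch of a naive rule-based splitter. It is not KSSDS and not any particular existing library; `split_on_punctuation` is a hypothetical helper, and the STT-like input is simply the Quickstart sentence with its punctuation removed for illustration.

```python
import re


def split_on_punctuation(text):
    """Naive baseline: split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


# Written text with punctuation: the rule finds every boundary.
punctuated = "안녕하세요. 오늘 날씨가 참 좋네요. 저는 산책을 나갈 예정입니다."

# STT-like text without punctuation (illustrative input): the rule finds none.
stt_like = "안녕하세요 오늘 날씨가 참 좋네요 저는 산책을 나갈 예정입니다"

print(len(split_on_punctuation(punctuated)))  # three pieces
print(len(split_on_punctuation(stt_like)))    # one unsplit piece
```

A rule like this has nothing to anchor on once punctuation disappears, which is the gap a learned model is meant to close.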
---
## Installation
To install KSSDS, simply use pip:
```bash
pip install KSSDS
```
---
## Quickstart
The following is a simple example that uses KSSDS to split Korean text into sentences:
```python
from KSSDS import KSSDS
# Initialize the model
kssds = KSSDS()
# Split sentences
input_text = "안녕하세요. 오늘 날씨가 참 좋네요. 저는 산책을 나갈 예정입니다."
split_sentences = kssds.split_sentences(input_text)
# Print results
for idx, sentence in enumerate(split_sentences):
    print(f"{idx + 1}: {sentence}")
```
<pre style="background-color:#F5EDED; color:white; padding:10px; border-radius:5px; font-family:monospace;">
<span style="color:#a29acb;">1: 안녕하세요.</span>
<span style="color:#c3adad;">2: 오늘 날씨가 참 좋네요.</span>
<span style="color:YellowGreen;">3: 저는 산책을 나갈 예정입니다.</span>
</pre>
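In a dialogue pipeline the splitter is typically applied one utterance at a time. The sketch below assumes only what the Quickstart shows, namely an object with a `split_sentences(text)` method that returns an iterable of strings. `split_utterances` and `PeriodStub` are hypothetical helpers, not part of KSSDS; the stub exists so the example runs without downloading a model.

```python
from typing import Iterable, List, Protocol, Tuple


class SentenceSplitter(Protocol):
    """Anything exposing split_sentences(text), as in the Quickstart."""

    def split_sentences(self, text: str) -> Iterable[str]: ...


def split_utterances(
    splitter: SentenceSplitter, utterances: Iterable[str]
) -> List[Tuple[int, str]]:
    """Return (utterance_index, sentence) pairs, skipping blank utterances."""
    results = []
    for turn, text in enumerate(utterances):
        text = text.strip()
        if not text:
            continue  # nothing to split
        for sentence in splitter.split_sentences(text):
            results.append((turn, sentence))
    return results


class PeriodStub:
    """Stand-in splitter for this sketch only: breaks after '. '."""

    def split_sentences(self, text: str) -> List[str]:
        pieces = text.replace(". ", ".\n").split("\n")
        return [piece.strip() for piece in pieces if piece.strip()]


dialogue = ["안녕하세요. 반갑습니다.", "   ", "네."]
for turn, sentence in split_utterances(PeriodStub(), dialogue):
    print(turn, sentence)
```

With KSSDS installed, passing `KSSDS()` in place of `PeriodStub()` should work as long as the method behaves as in the Quickstart.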
---
## Documentation and Further Information
For advanced usage, training scripts, or contributing, please visit the [GitHub Repository](https://github.com/ggomarobot/KSSDS).
Raw data
{
"_id": null,
"home_page": "https://github.com/ggomarobot/KSSDS",
"name": "KSSDS",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "Korean NLP sentence splitter dialogue systems",
"author": "Gun Yang",
"author_email": "ggomarobot@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/94/20/05bddaea0e757eed922cf760df14e0feeb151940fa754934f665866b57c1/kssds-1.0.6.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Korean Sentence Splitter for Dialogue Systems",
"version": "1.0.6",
"project_urls": {
"Homepage": "https://github.com/ggomarobot/KSSDS"
},
"split_keywords": [
"korean",
"nlp",
"sentence",
"splitter",
"dialogue",
"systems"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "77dd3ccc41222f01849db1fd8e2661e2f31d1726c7dcf56a9abfd0820b5e491c",
"md5": "9f51c530b94f8af6fc574e5af67b2429",
"sha256": "441ff62e89bcc00f477813c00a3376675d541501c28ad094f830a0a829ed97de"
},
"downloads": -1,
"filename": "KSSDS-1.0.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9f51c530b94f8af6fc574e5af67b2429",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 8543,
"upload_time": "2025-01-16T01:06:45",
"upload_time_iso_8601": "2025-01-16T01:06:45.965666Z",
"url": "https://files.pythonhosted.org/packages/77/dd/3ccc41222f01849db1fd8e2661e2f31d1726c7dcf56a9abfd0820b5e491c/KSSDS-1.0.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "942005bddaea0e757eed922cf760df14e0feeb151940fa754934f665866b57c1",
"md5": "91ed5504bb78cc674569493e8f59c26f",
"sha256": "aa8a17d5c5f0b65dbad0a03c063b14c1a1c4bc5a389389b3c1da8834a7e48cb4"
},
"downloads": -1,
"filename": "kssds-1.0.6.tar.gz",
"has_sig": false,
"md5_digest": "91ed5504bb78cc674569493e8f59c26f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 13154,
"upload_time": "2025-01-16T01:06:48",
"upload_time_iso_8601": "2025-01-16T01:06:48.392445Z",
"url": "https://files.pythonhosted.org/packages/94/20/05bddaea0e757eed922cf760df14e0feeb151940fa754934f665866b57c1/kssds-1.0.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-16 01:06:48",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ggomarobot",
"github_project": "KSSDS",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "GPUtil",
"specs": [
[
">=",
"1.4"
],
[
"<",
"2.0"
]
]
},
{
"name": "numpy",
"specs": [
[
">=",
"1.19.5"
],
[
"<",
"2.0"
]
]
},
{
"name": "PyYAML",
"specs": [
[
">=",
"6.0"
],
[
"<",
"7.0"
]
]
},
{
"name": "scikit_learn",
"specs": [
[
">=",
"1.6"
],
[
"<",
"2.0"
]
]
},
{
"name": "torch",
"specs": [
[
">=",
"2.5"
],
[
"<",
"3.0"
]
]
},
{
"name": "transformers",
"specs": [
[
">=",
"4.42"
],
[
"<",
"5.0"
]
]
}
],
"lcname": "kssds"
}
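The version ranges in the `requirements` list above can be checked against a local environment. The following is a rough sketch using only the standard library; `release_tuple`, `satisfies`, and `check_environment` are hypothetical helpers, and the version comparison is a simplification that looks only at the leading numeric release segment rather than implementing full PEP 440 ordering.

```python
import re
from importlib import metadata

# Constraints copied from the "requirements" list in the raw data above.
REQUIREMENTS = {
    "GPUtil": [(">=", "1.4"), ("<", "2.0")],
    "numpy": [(">=", "1.19.5"), ("<", "2.0")],
    "PyYAML": [(">=", "6.0"), ("<", "7.0")],
    "scikit_learn": [(">=", "1.6"), ("<", "2.0")],
    "torch": [(">=", "2.5"), ("<", "3.0")],
    "transformers": [(">=", "4.42"), ("<", "5.0")],
}


def release_tuple(version):
    """Parse the leading dotted-integer part, e.g. '2.5.1+cu121' -> (2, 5, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def satisfies(version, specs):
    """Check a version string against a list of ('>=' or '<', bound) pairs."""
    installed = release_tuple(version)
    for op, bound in specs:
        limit = release_tuple(bound)
        # Pad with zeros so that (2, 0) and (2, 0, 0) compare as equal.
        width = max(len(installed), len(limit))
        left = installed + (0,) * (width - len(installed))
        right = limit + (0,) * (width - len(limit))
        if op == ">=":
            ok = left >= right
        elif op == "<":
            ok = left < right
        else:
            raise ValueError(f"unsupported operator: {op!r}")
        if not ok:
            return False
    return True


def check_environment(requirements=REQUIREMENTS):
    """Map each package to True/False, or None if it is not installed."""
    report = {}
    for name, specs in requirements.items():
        try:
            report[name] = satisfies(metadata.version(name), specs)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

Calling `check_environment()` returns a dictionary such as `{"torch": True, "numpy": None, ...}`; for production use, a full PEP 440 implementation is the safer choice.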