# yolo-splitter
A tool to create and modify YOLO datasets.
## Installation
```bash
pip install yolosplitter
```
## Usage
```python
from yolosplitter import YoloSplitter
ys = YoloSplitter(imgFormat=['.jpg', '.jpeg', '.png'], labelFormat=['.txt'])
# If you already have a YOLO-format dataset on disk
df = ys.from_yolo_dir(input_dir="yolo_dataset", ratio=(0.7, 0.2, 0.1), return_df=True)
# If images and labels are mixed in the same directory
df = ys.from_mixed_dir(input_dir="mydataset", ratio=(0.7, 0.2, 0.1), return_df=True)
# Show the train/test/val split sizes, the number of error files, and all class names found in the annotation files
ys.info()
# Note: show_dataframe was renamed to get_dataframe()
# Show the dataframe
ys.get_dataframe()
```
![2024-01-30_08-19](https://github.com/sandeshkharat87/yolo-splitter/assets/47347413/b2475cde-cbb7-410f-a4df-dd2622698ee1)
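To make the `ratio` argument concrete, here is a minimal standalone sketch of what a ratio-based split does. It is not yolosplitter's implementation: `split_by_ratio`, the `seed` parameter, and the assumption that the three ratio values mean train, val, and test in that order are all hypothetical and chosen for illustration.

```python
import random


def split_by_ratio(items, ratio=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items and cut them into train/val/test lists.

    Hypothetical helper: the (train, val, test) order of `ratio`
    is an assumption, not something the README states.
    """
    if abs(sum(ratio) - 1.0) > 1e-9:
        raise ValueError("ratio must sum to 1")
    items = list(items)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratio[0])
    n_val = int(len(items) * ratio[1])
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        # The last split takes the remainder, so rounding never drops items.
        "test": items[n_train + n_val:],
    }


# File stems 02..11, matching the example dataset in this README.
splits = split_by_ratio([f"{i:02d}" for i in range(2, 12)])
print({name: len(stems) for name, stems in splits.items()})
```

Splitting on file stems rather than on individual files keeps each image together with its label file.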
```python
ys.save_split(output_dir="potholes")
```
```bash
Saving New split in 'potholes' dir
100%|██████████| 118/118 [00:00<00:00, 1352.79it/s]
```
```python
# Use ys.show_errors() to list the files that have errors
ys.show_errors()
# Use ys.get_dataframe() to see the dataframe built from the dataset
ys.get_dataframe()
# Show the train/test/val split sizes, the number of error files, and all class names found in the annotation files
ys.info()
```
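As a rough illustration of how class information can be read from annotation files, the sketch below collects the class IDs from YOLO-format label files, where each non-empty line starts with an integer class ID followed by the normalized box coordinates. The helper names `read_class_ids` and `collect_class_ids` are hypothetical; this is not how `ys.info()` is implemented.

```python
from pathlib import Path


def read_class_ids(label_path):
    """Return the class IDs listed in one YOLO-format label file."""
    class_ids = []
    for line in Path(label_path).read_text().splitlines():
        fields = line.split()
        if not fields:
            # Skip blank lines, such as one left by a trailing newline;
            # calling int("") on them would raise ValueError.
            continue
        class_ids.append(int(fields[0]))
    return class_ids


def collect_class_ids(label_dir):
    """Return the sorted set of class IDs used across a directory of labels."""
    found = set()
    for label_path in sorted(Path(label_dir).glob("*.txt")):
        found.update(read_class_ids(label_path))
    return sorted(found)
```

Calling `collect_class_ids("mydataset")` on a directory of `.txt` labels would return the distinct class IDs in ascending order.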
### Input Directory
```
MyDataset/
├── 02.png
├── 02.txt
├── 03.png
├── 03.txt
├── 04.png
├── 04.txt
├── 05.png
├── 05.txt
├── 06.png
├── 06.txt
├── 07.png
├── 07.txt
├── 08.png
├── 08.txt
├── 09.png
├── 09.txt
├── 10.png
├── 10.txt
├── 11.png
└── 11.txt
```
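In a mixed directory like this one, each image is matched to the label file with the same stem. The sketch below shows one way to do that pairing with `pathlib`. The function name and the idea of reporting unmatched files are assumptions made for illustration; the README does not say how yolosplitter decides which files count as errors.

```python
from pathlib import Path

# Same extensions as the imgFormat example above.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pair_images_with_labels(input_dir, label_suffix=".txt"):
    """Return (pairs, unpaired) for a directory of mixed images and labels.

    pairs is a list of (image_path, label_path) tuples sharing a stem;
    unpaired lists files that have no counterpart.
    """
    images, labels = {}, {}
    for path in Path(input_dir).iterdir():
        suffix = path.suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            images[path.stem] = path
        elif suffix == label_suffix:
            labels[path.stem] = path
    paired_stems = sorted(images.keys() & labels.keys())
    pairs = [(images[stem], labels[stem]) for stem in paired_stems]
    unpaired = sorted(
        [p for stem, p in images.items() if stem not in labels]
        + [p for stem, p in labels.items() if stem not in images]
    )
    return pairs, unpaired
```

For the `MyDataset/` tree above, this would yield ten pairs (`02` through `11`) and an empty `unpaired` list.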
### Output Directory
```
MyDataset-splitted/
├── data.yaml
├── train
│   ├── images
│   │   ├── 03.png
│   │   ├── 04.png
│   │   ├── 05.png
│   │   ├── 07.png
│   │   ├── 08.png
│   │   ├── 09.png
│   │   └── 10.png
│   └── labels
│       ├── 03.txt
│       ├── 04.txt
│       ├── 05.txt
│       ├── 07.txt
│       ├── 08.txt
│       ├── 09.txt
│       └── 10.txt
└── val
    ├── images
    │   ├── 02.png
    │   ├── 06.png
    │   └── 11.png
    └── labels
        ├── 02.txt
        ├── 06.txt
        └── 11.txt
```
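The layout above puts an `images` and a `labels` folder under each split. The following sketch copies already-paired files into that structure. `write_split` and its arguments are hypothetical names and this is not the code behind `ys.save_split()`. It also leaves out `data.yaml`, since the README does not show that file's contents.

```python
import shutil
from pathlib import Path


def write_split(pairs_by_split, output_dir):
    """Copy (image, label) pairs into <output_dir>/<split>/{images,labels}.

    pairs_by_split maps a split name such as "train" or "val" to a list
    of (image_path, label_path) tuples.
    """
    output_dir = Path(output_dir)
    for split_name, pairs in pairs_by_split.items():
        image_dir = output_dir / split_name / "images"
        label_dir = output_dir / split_name / "labels"
        # Create both folders even when a split is empty.
        image_dir.mkdir(parents=True, exist_ok=True)
        label_dir.mkdir(parents=True, exist_ok=True)
        for image_path, label_path in pairs:
            # copy2 leaves the source dataset untouched.
            shutil.copy2(image_path, image_dir / Path(image_path).name)
            shutil.copy2(label_path, label_dir / Path(label_path).name)
```

Copying rather than moving keeps the input directory intact, so a split can be regenerated with a different ratio.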
# Change Log
## Stable
* 2024-08-26 version 5.0.0
  * Optimized the code and sped up execution. Thanks to [https://github.com/MarcelloCuoghi] for the incredible work.
* 2023-04-25 version 4.9.1
  * Fixed: a newline at the end of the file caused `ValueError("invalid literal for int() with base 10: ''")`. Thanks to [https://github.com/Maxvgrad] for finding the bug.
* 2023-01-30 version 4.9
  * Fixed an annotation parse error. Thanks to [https://github.com/Xiteed].
* 2023-12-20 version 4.8
  * Changed the yaml file style.
* 2023-12-19 version 4.7
  * Changed the output directory `val` to `valid`. Thanks to [https://github.com/AndreasFridh].
  * Added `ys.info()` to show the train/test/val split sizes, the number of error files, and all class names found in the annotation files.
  * Renamed `ys.show_dataframe` to `ys.get_dataframe()`.
  * Small bug fixes.
Raw data
{
"_id": null,
"home_page": "https://github.com/sandeshkharat87/yolo-splitter",
"name": "yolosplitter",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "yolo splitter, split datasets, yolo split, yolos split dataset, yolo, yolosplitter, yolo-splitter",
"author": "wpnx",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/31/13/c5e86e2e97a175a70aa03120780cdf75e6849a111666cb85c88d57feac9d/yolosplitter-0.5.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Tool to Create,Modify YOLO dataset and much more...",
"version": "0.5.0.0",
"project_urls": {
"Homepage": "https://github.com/sandeshkharat87/yolo-splitter"
},
"split_keywords": [
"yolo splitter",
" split datasets",
" yolo split",
" yolos split dataset",
" yolo",
" yolosplitter",
" yolo-splitter"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "22bc7166da446532864ed6f78ccd6872e761fc3deb74b3fe5035dd1fa0cba2cf",
"md5": "8476719bed127e91654d175175bdbe60",
"sha256": "2a3989cb9e0017ba54efc06ee84d2ff4f2268de42dcd317caba835315c435c17"
},
"downloads": -1,
"filename": "yolosplitter-0.5.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8476719bed127e91654d175175bdbe60",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 5248,
"upload_time": "2024-08-26T06:14:19",
"upload_time_iso_8601": "2024-08-26T06:14:19.271592Z",
"url": "https://files.pythonhosted.org/packages/22/bc/7166da446532864ed6f78ccd6872e761fc3deb74b3fe5035dd1fa0cba2cf/yolosplitter-0.5.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3113c5e86e2e97a175a70aa03120780cdf75e6849a111666cb85c88d57feac9d",
"md5": "11e4f4c520fadaf8450aa885b754f229",
"sha256": "ded06850f6dc72f012ba59304a7791c937cd29624e25d767d80cc9f586de120b"
},
"downloads": -1,
"filename": "yolosplitter-0.5.0.0.tar.gz",
"has_sig": false,
"md5_digest": "11e4f4c520fadaf8450aa885b754f229",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 5206,
"upload_time": "2024-08-26T06:14:21",
"upload_time_iso_8601": "2024-08-26T06:14:21.317084Z",
"url": "https://files.pythonhosted.org/packages/31/13/c5e86e2e97a175a70aa03120780cdf75e6849a111666cb85c88d57feac9d/yolosplitter-0.5.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-26 06:14:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "sandeshkharat87",
"github_project": "yolo-splitter",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "yolosplitter"
}