.. image:: https://badge.fury.io/py/filesplit.png
   :target: https://badge.fury.io/py/filesplit

filesplit
==========
File splitting and merging made easy for Python programmers!

This module:

* Can split files of any size into multiple chunks and merge them back.
* Can handle both structured and unstructured files.

System Requirements
--------------------
**Operating System**: Windows/Linux/Mac
**Python version**: 3.x.x
Installation
------------
The module is available on PyPI and can be installed with ``pip``:

::

    pip install filesplit

Split
-----
Create an instance:

.. code-block:: python

    from filesplit.split import Split

    # Placeholder paths; both arguments are required strings.
    split = Split(inputfile="path/to/original.ext", outputdir="path/to/output_dir")

``inputfile`` (str, Required) - Path to the original file.
``outputdir`` (str, Required) - Output directory path to write the file splits.
Once the instance is created, the following methods can be called on it:
bysize(size: int, newline: Optional[bool] = False, includeheader: Optional[bool] = False, callback: Optional[Callable] = None) -> None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Splits file by size.
Args:
``size`` (int, Required): Max size in bytes that is allowed in each split.
``newline`` (bool, Optional): If True, no split ends with an incomplete line. Defaults to False.
``includeheader`` (bool, Optional): If True, the header is included in every split. The first line of the file is treated as the header. Defaults to False.
``callback`` (Callable, Optional): Callback function to invoke after each split. It should accept two arguments, ``func(str, int)``: the full path to the split file and the split file size in bytes. Defaults to None.
Returns:
``None``
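As a minimal sketch of the call described above, the following wraps ``bysize`` in a helper and records each split through a callback. The names ``split_by_size``, ``record_split`` and ``results`` are illustrative and not part of filesplit, and the sketch assumes the output directory already exists.

```python
from typing import List, Tuple

# (path, size) pairs reported by the callback, one per generated split.
results: List[Tuple[str, int]] = []


def record_split(path: str, size: int) -> None:
    """Callback with the documented (str, int) shape: split path, size in bytes."""
    results.append((path, size))


def split_by_size(inputfile: str, outputdir: str, max_bytes: int) -> None:
    """Hypothetical helper: split ``inputfile`` into chunks of at most ``max_bytes``."""
    # Imported here so the callback above can be tried without filesplit installed.
    from filesplit.split import Split

    split = Split(inputfile, outputdir)
    # newline=True asks for splits that do not end in an incomplete line.
    split.bysize(size=max_bytes, newline=True, callback=record_split)
```

A call such as ``split_by_size("big.log", "out", 1_000_000)`` (hypothetical paths) should then leave one entry in ``results`` for each split written to ``out``.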
bylinecount(linecount: int, includeheader: Optional[bool] = False, callback: Optional[Callable] = None) -> None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Splits file by line count.
Args:
``linecount`` (int, Required): Maximum number of lines allowed in each split.
``includeheader`` (bool, Optional): If True, the header is included in every split. The first line of the file is treated as the header. Defaults to False.
``callback`` (Callable, Optional): Callback function to invoke after each split. It should accept two arguments, ``func(str, int)``: the full path to the split file and the split file size in bytes. Defaults to None.
Returns:
``None``
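The callback can be any callable that takes the two documented arguments. The sketch below uses a small callable class to count splits and sum their sizes while splitting by line count; ``SplitTotals`` and ``split_csv_by_lines`` are hypothetical names introduced only for this illustration.

```python
class SplitTotals:
    """Accumulates the number of splits and their combined size in bytes."""

    def __init__(self) -> None:
        self.count = 0
        self.total_bytes = 0

    def __call__(self, path: str, size: int) -> None:
        # Invoked once per split with the documented (path, size) arguments.
        self.count += 1
        self.total_bytes += size


def split_csv_by_lines(inputfile: str, outputdir: str, lines_per_split: int) -> SplitTotals:
    """Hypothetical helper: split a CSV-like file, repeating its first line in every split."""
    from filesplit.split import Split  # lazy import; requires filesplit to be installed

    totals = SplitTotals()
    split = Split(inputfile, outputdir)
    split.bylinecount(linecount=lines_per_split, includeheader=True, callback=totals)
    return totals
```

Using an object rather than a bare function keeps the running totals next to the code that updates them, which can be convenient when several files are split in one program.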
The file splits are named following the pattern ``[original_filename]_0001.ext, [original_filename]_0002.ext, .., [original_filename]_n.ext``.
A manifest file is also created in the output directory to keep track of the file splits. This manifest file is required for the merge operation.
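To make the naming pattern concrete, here is a small stand-alone sketch that builds names of the documented form. It is an illustration only and not the library's own code; the ``delimiter`` and ``zerofill`` parameters are assumptions that mirror the ``splitdelimiter`` and ``splitzerofill`` properties described next, and file names with several dots are not considered.

```python
import os


def split_name(original: str, index: int, delimiter: str = "_", zerofill: int = 4) -> str:
    """Illustrative only: build a split file name following the documented pattern."""
    stem, ext = os.path.splitext(os.path.basename(original))
    return f"{stem}{delimiter}{str(index).zfill(zerofill)}{ext}"


print(split_name("data.csv", 1))           # data_0001.csv
print(split_name("data.csv", 2, "$", 10))  # data$0000000002.csv
```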
In addition:

* The delimiter for the generated splits can be changed by setting the ``splitdelimiter`` property, for example ``split.splitdelimiter='$'``. Default is ``_`` (underscore).
* The number of zero-fill digits for the generated splits can be changed by setting the ``splitzerofill`` property, for example ``split.splitzerofill=10``. Default is 4.
* The manifest file name for the generated splits can be changed by setting the ``manfilename`` property, for example ``split.manfilename='man'``. Default is ``manifest``.
* To forcefully but safely terminate the process, set the ``terminate`` property to True while the process is running.

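The properties above are plain attributes set on the instance. The following sketch groups them in a helper and shows one possible way to request termination from inside a callback. ``configure`` and ``stop_after`` are hypothetical helpers, and it is an assumption that the library checks ``terminate`` soon after each callback returns.

```python
from typing import Any, Callable


def configure(split: Any, delimiter: str = "-", zerofill: int = 6, manifest: str = "man") -> Any:
    """Set the documented naming properties on a Split instance and return it."""
    split.splitdelimiter = delimiter
    split.splitzerofill = zerofill
    split.manfilename = manifest
    return split


def stop_after(split: Any, max_splits: int) -> Callable[[str, int], None]:
    """Build a callback that sets ``terminate`` once ``max_splits`` splits are reported."""
    seen = 0

    def callback(path: str, size: int) -> None:
        nonlocal seen
        seen += 1
        if seen >= max_splits:
            # Documented way to request termination while the process is running.
            split.terminate = True

    return callback
```

With a real instance this might be used as ``split = configure(Split(inputfile, outputdir))`` followed by ``split.bysize(size, callback=stop_after(split, 3))``. Setting ``terminate`` from another thread is an equally plausible approach.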
Merge
-----
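Create an instance:

.. code-block:: python

    from filesplit.merge import Merge

    # Placeholder paths and name; all three arguments are required strings.
    merge = Merge(inputdir="path/to/splits_dir", outputdir="path/to/output_dir", outputfilename="merged.ext")
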
``inputdir`` (str, Required) - Path to the directory containing file splits.
``outputdir`` (str, Required) - Output directory path to write the merged file.
``outputfilename`` (str, Required) - Name to use for the merged file.
Once the instance is created, the following method can be called on it:
merge(cleanup: Optional[bool] = False, callback: Optional[Callable] = None) -> None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Merges the split files back into a single file.
Args:
``cleanup`` (bool, Optional): If True, all the split files and manifest file will be purged after successful merge. Defaults to False.
``callback`` (Callable, Optional): Callback function to invoke after the merge. It should accept two arguments, ``func(str, int)``: the full path to the merged file and the merged file size in bytes. Defaults to None.
Returns:
``None``
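One way to use ``merge`` is to keep the splits until the merged file has been checked against the original. The sketch below does this with a SHA-256 comparison. ``file_digest`` and ``merge_and_verify`` are hypothetical helpers, and the sketch assumes that a split followed by a merge is meant to reproduce the original bytes.

```python
import hashlib


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def merge_and_verify(inputdir: str, outputdir: str, outputfilename: str, original: str) -> bool:
    """Hypothetical helper: merge the splits, then compare the result with ``original``."""
    from filesplit.merge import Merge  # lazy import; requires filesplit to be installed

    merged_paths = []
    merge = Merge(inputdir, outputdir, outputfilename)
    # cleanup stays False so the splits survive if the comparison fails.
    merge.merge(cleanup=False, callback=lambda path, size: merged_paths.append(path))
    return bool(merged_paths) and file_digest(merged_paths[0]) == file_digest(original)
```

Once the comparison succeeds, the splits and manifest can be removed by hand, or a later run can pass ``cleanup=True``.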
In addition:

* The manifest file name can be changed by setting the ``manfilename`` property, for example ``merge.manfilename='man'``.
  The manifest file name must match the one used during the split and the manifest must be in the same directory as the file splits. Default is ``manifest``.
* To forcefully but safely terminate the process, set the ``terminate`` property to True while the process is running.

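Because the merge depends on finding the manifest next to the splits, a quick check before merging can give a clearer error. The sketch below is a hedged illustration: ``manifest_present`` and ``merge_with_manifest`` are hypothetical helpers, and it assumes the manifest is stored under exactly the name given by ``manfilename``, with no added extension.

```python
import os


def manifest_present(inputdir: str, manfilename: str = "manifest") -> bool:
    """Check that a manifest with the given name sits next to the splits."""
    return os.path.isfile(os.path.join(inputdir, manfilename))


def merge_with_manifest(inputdir: str, outputdir: str, outputfilename: str, manfilename: str) -> None:
    """Hypothetical helper: merge splits that were created with a custom manifest name."""
    if not manifest_present(inputdir, manfilename):
        raise FileNotFoundError(f"no manifest named {manfilename!r} in {inputdir!r}")

    from filesplit.merge import Merge  # lazy import; requires filesplit to be installed

    merge = Merge(inputdir, outputdir, outputfilename)
    merge.manfilename = manfilename  # must match the name used while splitting
    merge.merge()
```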
Raw data
{
"_id": null,
"home_page": "https://github.com/ram-jayapalan/filesplit",
"name": "filesplit",
"maintainer": null,
"docs_url": null,
"requires_python": "<4,>=3",
"maintainer_email": null,
"keywords": "file split, filesplit, split file, splitfile",
"author": "Ramprakash Jayapalan",
"author_email": "ramp16888@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/0f/17/39439b12d77c4ca76e795832b3d3209609b58bc5a0a375630e271b5d7b88/filesplit-4.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Python module that is capable of splitting files and merging it back.",
"version": "4.1.0",
"project_urls": {
"Bug Reports": "https://github.com/ram-jayapalan/filesplit/issues",
"Homepage": "https://github.com/ram-jayapalan/filesplit",
"Source": "https://github.com/ram-jayapalan/filesplit"
},
"split_keywords": [
"file split",
" filesplit",
" split file",
" splitfile"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ee8b8381669a91a04834c5111e0ff1d56efb5c2779ba6e7410678f4ee4799083",
"md5": "7029ee516a1905807c5dbe6a608ee144",
"sha256": "5244718d37302b5741a7ffe11e7379bd178bcf31d8350632be200ba94c74a12c"
},
"downloads": -1,
"filename": "filesplit-4.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7029ee516a1905807c5dbe6a608ee144",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4,>=3",
"size": 9460,
"upload_time": "2024-10-20T01:37:11",
"upload_time_iso_8601": "2024-10-20T01:37:11.086357Z",
"url": "https://files.pythonhosted.org/packages/ee/8b/8381669a91a04834c5111e0ff1d56efb5c2779ba6e7410678f4ee4799083/filesplit-4.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0f1739439b12d77c4ca76e795832b3d3209609b58bc5a0a375630e271b5d7b88",
"md5": "e652e13dd8e8d30117694ba054e6ffc4",
"sha256": "1aceb3a8bea84743254683e6b97056aa24593783f3b7e35dac10bac706e184b3"
},
"downloads": -1,
"filename": "filesplit-4.1.0.tar.gz",
"has_sig": false,
"md5_digest": "e652e13dd8e8d30117694ba054e6ffc4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4,>=3",
"size": 7395,
"upload_time": "2024-10-20T01:37:12",
"upload_time_iso_8601": "2024-10-20T01:37:12.666798Z",
"url": "https://files.pythonhosted.org/packages/0f/17/39439b12d77c4ca76e795832b3d3209609b58bc5a0a375630e271b5d7b88/filesplit-4.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-20 01:37:12",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ram-jayapalan",
"github_project": "filesplit",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "filesplit"
}