sabia-utils


Name: sabia-utils
Version: 0.2.0
Summary: Group of utilities for Sabia
Upload time: 2023-06-07 12:51:06
Author: AI Lab Unb
License: MIT License
Keywords: sabia, utils
Requirements: No requirements were recorded.
# Sabia Utils

This is a collection of utilities for Sabia.

## Concat Module

This module is used to concatenate files.


### Concatenate all files from a path

* Returns a concatenated dataframe from all files in a path.
* Can save the concatenated dataframe to a file.

```python
from sabia_utils import concat

concat.concatenate_all_from_path(
    path='path\\to\\files',
    output_file='output\\file\\path', # optional
    fine_name='file_name'           # optional 
)
```
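
Since the call returns a dataframe when no output file is given, the result can be captured and inspected directly. A minimal sketch, assuming the return value is a pandas DataFrame as described above:

```python
from sabia_utils import concat

# Assumption: concatenate_all_from_path returns a pandas DataFrame
# when output_file is omitted.
df = concat.concatenate_all_from_path(path='path\\to\\files')

print(df.shape)   # quick sanity check on the combined rows/columns
print(df.head())  # preview the first rows
```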

### Concatenate some files from a path

* Returns a concatenated dataframe from some files in a path.
* Can save the concatenated dataframe to a file.

```python
from sabia_utils import concat

concat.concatenate_files(
    path='path\\to\\files',
    files=['file1', 'file2'],
    output_file='output\\file\\path', # optional
    fine_name='file_name'           # optional 
)
```


## Group Module

This module is used to group files.

### Copy files from one path to another

* Verifies that the files exist in the source path before copying them.

```python
from sabia_utils import group

group.copy_new_files(
    PATH_IN='path\\to\\files1',
    PATH_OUT='path\\to\\files2'
)
```

### Process files in both paths

* Verifies that the files exist in both paths before processing them.

```python
from sabia_utils import group

group.process_existent_files(
    PATH_IN='path\\to\\files1',
    PATH_OUT='path\\to\\files2'
)
```

### Process all files

* Applies both operations above: copies new files and processes the files shared between the paths.

```python
from sabia_utils import group

group.process_all_files(
    PATH_IN='path\\to\\files1',
    PATH_OUT='path\\to\\files2'
)
```
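
If `process_all_files` is indeed the combination of the two calls documented above (an assumption based on the bullet, not confirmed by the library), it is roughly equivalent to:

```python
from sabia_utils import group

# Assumed equivalence, based on the descriptions above: first copy the
# new files, then process the files present in both paths.
group.copy_new_files(PATH_IN='path\\to\\files1', PATH_OUT='path\\to\\files2')
group.process_existent_files(PATH_IN='path\\to\\files1', PATH_OUT='path\\to\\files2')
```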

## Pre_process Module

This module is used to pre-process parquet files.

### Process all parquet files

* Define a class that inherits from `sabia_utils.pre_process.Processing`.
* Override the method `apply_to_df(self, df, column)`, defining the pre-processing to be applied.
* Call `pre_process.pre_process_parquets` on a folder containing parquet files to process them, as shown below.

```python
from sabia_utils.pre_process import Processing
from sabia_utils import pre_process

class MyProcessor(Processing):
    def apply_to_df(self, df, column):
        # Your pre-processing steps; as an illustration (an assumption,
        # not the library's own example), lower-case the text column
        # and return the modified dataframe.
        df[column] = df[column].str.lower()
        return df

pre_process.pre_process_parquets(
    folder_path='path\\to\\folder',
    colomun_to_pre_process='column_name_to_be_processed',
    pre_processed_column='processed_column_name',
    processor=MyProcessor()
)
```
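
To confirm the result, one of the processed parquet files can be read back with pandas. This is a hedged sketch: it assumes the processed files are written back to the same folder, and the file name `some_file.parquet` is hypothetical.

```python
import pandas as pd

# Hypothetical file name; assumes the processed parquet files stay in
# the same folder and gain the new column.
df = pd.read_parquet('path\\to\\folder\\some_file.parquet')
print(df[['column_name_to_be_processed', 'processed_column_name']].head())
```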

            
