dataprocessor_vb

Name: dataprocessor_vb
Version: 0.1.1 (PyPI)
Home page: None
Summary: A comprehensive data processing library.
Upload time: 2024-12-21 14:02:12
Maintainer: None
Docs URL: None
Author: Vicba
Requires Python: <4.0,>=3.12
License: None
Keywords: data, processing, cleaning, visualization, feature engineering
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.
# Data Tools Package

A comprehensive library for data preprocessing in AI development, focusing on scalability, usability, and modular design.

## Features

- **Data Loading**: Efficiently load datasets in various formats.
- **Data Cleaning**: Handle missing values, outliers, and duplicates.
- **Feature Engineering**: Create new features using advanced techniques.
- **Categorical Processing**: One-hot and label encoding for categorical variables.
- **Scaling**: Normalize and standardize numerical features.
- **Outlier Handling**: Detect and remove outliers using IQR.
- **Text Processing**: Clean, tokenize, and vectorize text data.
- **Time Series Processing**: Create time-based features and resample data.
- **Image Processing**: Load, resize, normalize, and convert images.
- **Image Augmentation**: Apply transformations to increase the diversity of your training dataset.
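
The outlier handling listed above is based on the interquartile range. As a rough, library-independent sketch of Tukey's IQR rule in plain Python (a hypothetical helper, not this package's API):

```python
from statistics import quantiles

def remove_outliers_iqr(values, k=1.5):
    """Drop points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule, k=1.5 by default)."""
    q1, _, q3 = quantiles(values, n=4)  # first and third quartiles
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lo <= v <= hi]

data = [10, 12, 11, 13, 12, 98]  # 98 is an obvious outlier
print(remove_outliers_iqr(data))  # -> [10, 12, 11, 13, 12]
```

Note that `statistics.quantiles` uses the exclusive method by default; libraries that compute quartiles differently will draw the fences at slightly different values.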

## Usage

```py
from dataprocessor import DataLoader, DataCleaner, FeatureEngineer, ImageProcessor, ImageAugmenter

# Example usage of the package
loader = DataLoader()
data = loader.load_csv("data.csv")

cleaner = DataCleaner()
cleaned_data = cleaner.clean(data)

# Image processing example
image = ImageProcessor.load_image("path/to/image.jpg")
resized_image = ImageProcessor.resize_image(image, (224, 224))
normalized_image = ImageProcessor.normalize_image(resized_image)

# Image augmentation example
augmented_image = ImageAugmenter.augment_image(normalized_image)

```
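
The example above does not exercise the categorical encoders. As a minimal plain-Python illustration of what one-hot encoding produces (a hypothetical `one_hot` helper for illustration, not this package's API):

```python
def one_hot(values):
    """Map each distinct category to a 0/1 indicator vector."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows

cats, encoded = one_hot(["red", "green", "red", "blue"])
print(cats)     # -> ['blue', 'green', 'red']
print(encoded)  # -> [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Label encoding, by contrast, would map each category to a single integer index rather than an indicator vector.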

## Testing
```bash
poetry run pytest
```

# TODO

- Fix the file structure

# Package

[dataprocessor_vb on PyPI](https://pypi.org/project/dataprocessor_vb/)

1. Configure your PyPI credentials, if not already done:
```bash
poetry config pypi-token.pypi <your-api-token>
```

2. Build and publish the package:
```bash
poetry publish --build
```

3. Also make sure you add the token to your repository secrets under the repo settings on GitHub.

The version should probably be bumped manually; right now the patch number is incremented on every commit.
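
If the automatic bump is dropped, Poetry's standard `version` command can update `pyproject.toml` by hand (standard Poetry CLI, nothing specific to this repo):

```bash
# Bump the patch component (e.g. 0.1.1 -> 0.1.2)
poetry version patch

# Or set an explicit version
poetry version 0.2.0
```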
            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "dataprocessor_vb",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.12",
    "maintainer_email": null,
    "keywords": "data, processing, cleaning, visualization, feature engineering",
    "author": "Vicba",
    "author_email": "victor.barra@live.be",
    "download_url": "https://files.pythonhosted.org/packages/d6/04/df2534725b5491e62ce09ca4b1b8fc6a0a3ee3ec461e327db7093943ba8b/dataprocessor_vb-0.1.1.tar.gz",
    "platform": null,
    "description": "# Data Tools Package\n\nA comprehensive library for data preprocessing in AI development, focusing on scalability, usability, and modular design.\n\n## Features\n\n## Features\n\n- **Data Loading**: Efficiently load datasets in various formats.\n- **Data Cleaning**: Handle missing values, outliers, and duplicates.\n- **Feature Engineering**: Create new features using advanced techniques.\n- **Categorical Processing**: One-hot and label encoding for categorical variables.\n- **Scaling**: Normalize and standardize numerical features.\n- **Outlier Handling**: Detect and remove outliers using IQR.\n- **Text Processing**: Clean, tokenize, and vectorize text data.\n- **Time Series Processing**: Create time-based features and resample data.\n- **Image Processing**: Load, resize, normalize, and convert images.\n- **Image Augmentation**: Apply transformations to increase the diversity of your training dataset.\n\n## usage\n\n```py\nfrom dataprocessor import DataLoader, DataCleaner, FeatureEngineer, ImageProcessor, ImageAugmenter\n\n# Example usage of the package\nloader = DataLoader()\ndata = loader.load_csv(\"data.csv\")\n\ncleaner = DataCleaner()\ncleaned_data = cleaner.clean(data)\n\n# Image processing example\nimage = ImageProcessor.load_image(\"path/to/image.jpg\")\nresized_image = ImageProcessor.resize_image(image, (224, 224))\nnormalized_image = ImageProcessor.normalize_image(resized_image)\n\n# Image augmentation example\naugmented_image = ImageAugmenter.augment_image(normalized_image)\n\n```\n\n## testing\n```bash\npoetry run pytest\n```\n\n# TODO:\n- Fix file structure\n\n# Package\n\n[dataprocessor_vb pypi](https://pypi.org/project/dataprocessor_vb/)\n\n1. configure pypi credentials if not already done\n```bash\npoetry config pypi-token.pypi <your-api-token>\n```\n\n2. publish the package\n```bash\npoetry publish --build\n```\n\n3. make also sure you add token to secrets under your repo settings in github\n\nI think that the version should be updated manually, because now it updates the patch every commit.",
    "bugtrack_url": null,
    "license": null,
    "summary": "A comprehensive data processing library.",
    "version": "0.1.1",
    "project_urls": {
        "repository": "https://github.com/Vicba/data-preprocessing-package"
    },
    "split_keywords": [
        "data",
        " processing",
        " cleaning",
        " visualization",
        " feature engineering"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "838e52e682e0d0c10ca971efea3503c80034c30cbcbad9bebc6d8011e265a86a",
                "md5": "f8891061614444128f4a4bf75989e901",
                "sha256": "12b311487a80c0f71547d53acb46be8eb86440beb4a6cb34ed3a527921e57cec"
            },
            "downloads": -1,
            "filename": "dataprocessor_vb-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f8891061614444128f4a4bf75989e901",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.12",
            "size": 8592,
            "upload_time": "2024-12-21T14:02:10",
            "upload_time_iso_8601": "2024-12-21T14:02:10.653161Z",
            "url": "https://files.pythonhosted.org/packages/83/8e/52e682e0d0c10ca971efea3503c80034c30cbcbad9bebc6d8011e265a86a/dataprocessor_vb-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d604df2534725b5491e62ce09ca4b1b8fc6a0a3ee3ec461e327db7093943ba8b",
                "md5": "d55a694e12dc57f70a775e3960fce06b",
                "sha256": "71dd0a4153563127babbd5438a25470b0826a66673ec6861270a8212eab93da4"
            },
            "downloads": -1,
            "filename": "dataprocessor_vb-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "d55a694e12dc57f70a775e3960fce06b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.12",
            "size": 6183,
            "upload_time": "2024-12-21T14:02:12",
            "upload_time_iso_8601": "2024-12-21T14:02:12.972786Z",
            "url": "https://files.pythonhosted.org/packages/d6/04/df2534725b5491e62ce09ca4b1b8fc6a0a3ee3ec461e327db7093943ba8b/dataprocessor_vb-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-21 14:02:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Vicba",
    "github_project": "data-preprocessing-package",
    "github_not_found": true,
    "lcname": "dataprocessor_vb"
}
        