financial-dataset-preprocessor


- Name: financial-dataset-preprocessor
- Version: 0.4.4
- Home page: https://github.com/nailen1/financial_dataset_preprocessor
- Summary: A package for preprocessing financial datasets, powering the Life Asset Management development team.
- Upload time: 2025-07-31 00:59:43
- Author: June Young Park
- Requires Python: >=3.11
- Requirements: pandas, tqdm, string_date_controller, financial_dataset_loader, canonical_transformer, mongodb_controller, aws_s3_controller, universal_timeseries_transformer

# Financial Dataset Preprocessor

A Python package for preprocessing financial datasets from various sources. This package provides tools and utilities for cleaning, transforming, and preparing financial data for analysis.

## Version Updates

### v0.4.0 (2025-01-27)

- Removed virtual environment folder from Git repository
- Cleaned up unnecessary number conversion columns for Menu 2206
- Enhanced data preprocessing accuracy and consistency

### v0.3.9 (2025-01-27)

- Updated industry classification column preprocessing format for Menu 2206
- Enhanced data consistency and standardization

### v0.3.8 (2025-06-26)

- Enhanced stability of Menu 3233 preprocessor
- Improved custom index creation functionality

### v0.3.7 (2025-05-26)

- Refactored parse_utils module for better modularity
- Added type hints for improved code quality
- Enhanced number parsing functionality

### v0.3.6 (2025-05-23)

- Added filtering functionality to Menu 2206 preprocessor
- Improved data quality by removing summary rows

### v0.3.5 (2025-05-20)

- Enhanced Menu 2206 preprocessor with column renaming functionality
- Improved fund code handling in portfolio analysis

### v0.3.4 (2025-05-20)

- Added Menu 2206 preprocessor module
- Added time series basis utilities
- Added Bloomberg time series preprocessor
- Enhanced integration with universal_timeseries_transformer

## Features

- Menu 2205 Preprocessor
  - Corporation Name Finder
  - Domestic Beneficiary Certificates Processing
  - Domestic Bonds Analysis
  - Repo Agreement Processing
  - Borrowings Management
- Menu 2206 Preprocessor
  - Fund Portfolio Analysis
  - Investment Asset Classification
- Bloomberg Time Series Preprocessor
  - Index Data Processing
  - Currency Data Processing
- Time Series Utilities
  - Date Range Operations
  - Time Series Extension
- Additional preprocessors for other financial datasets (coming soon)
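To give a feel for what "Date Range Operations" and "Time Series Extension" can mean in practice, here is a minimal standard-library sketch. It does not use or reproduce this package's API: `date_range` and `extend_series` are hypothetical names, and forward-filling is only one plausible way to extend a series.

```python
from __future__ import annotations

from datetime import date, timedelta


def date_range(start: str, end: str) -> list[str]:
    """Every calendar day from start to end (inclusive) as YYYY-MM-DD strings."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    days = (last - first).days
    return [(first + timedelta(days=i)).isoformat() for i in range(days + 1)]


def extend_series(series: dict[str, float], start: str, end: str) -> dict[str, float | None]:
    """Reindex a date-keyed series onto every day in [start, end].

    Missing days take the most recent earlier value seen inside the range;
    days before the first observation in the range stay None.
    """
    extended: dict[str, float | None] = {}
    last_value = None
    for day in date_range(start, end):
        if day in series:
            last_value = series[day]
        extended[day] = last_value
    return extended


# Hypothetical prices with a weekend gap between the two observations
prices = {"2025-02-21": 100.0, "2025-02-24": 101.5}
filled = extend_series(prices, "2025-02-20", "2025-02-25")
```

Here the weekend days inherit Friday's value, while the day before the first observation has no value to inherit.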

## Installation

You can install the package using pip:

```bash
pip install financial_dataset_preprocessor
```

## Requirements

- Python >= 3.11
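- Dependencies are listed in requirements.txt. The published package metadata declares: pandas, tqdm, string_date_controller (>= 0.3.0), financial_dataset_loader (>= 0.2.7), canonical_transformer (>= 0.2.4), mongodb_controller (>= 0.2.1), aws_s3_controller (>= 0.7.5), and universal_timeseries_transformer (>= 0.3.7)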
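If you want to confirm that an environment meets these requirements before importing the package, a small check along the following lines may help. This is a sketch, not part of the package: `check_environment` and `parse_version` are hypothetical helpers, the minimum versions are copied from the published package metadata, and the version comparison is deliberately simple (numeric components only).

```python
import sys
from importlib import metadata

# Minimum versions copied from the published package metadata.
# None means the dependency is declared without a version pin.
REQUIRED = {
    "pandas": None,
    "tqdm": None,
    "string_date_controller": "0.3.0",
    "financial_dataset_loader": "0.2.7",
    "canonical_transformer": "0.2.4",
    "mongodb_controller": "0.2.1",
    "aws_s3_controller": "0.7.5",
    "universal_timeseries_transformer": "0.3.7",
}


def parse_version(text: str) -> tuple:
    """Turn '0.3.10' into (0, 3, 10), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment(required=REQUIRED, installed_version=metadata.version,
                      python_version=tuple(sys.version_info[:2])):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if python_version < (3, 11):
        problems.append(f"Python >= 3.11 required, found {python_version[0]}.{python_version[1]}")
    for name, minimum in required.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if minimum and parse_version(found) < parse_version(minimum):
            problems.append(f"{name}: found {found}, need >= {minimum}")
    return problems
```

Calling `check_environment()` with no arguments inspects the current interpreter. The lookup function and Python version are parameters only so the check can be exercised without touching the real environment.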

## Usage Examples

### 1. Search for Funds with Bonds

```python
from financial_dataset_preprocessor import (
    search_funds_having_domestic_bonds,
    get_domestic_bonds_by_fund
)

# Get all funds that have domestic bonds
fund_bonds = search_funds_having_domestic_bonds(date_ref='2025-02-21')

# Get bond details for a specific fund
fund_code = '100075'
bond_details = get_domestic_bonds_by_fund(fund_code=fund_code, date_ref='2025-02-21')
```

### 2. Analyze Fund Borrowings

```python
from financial_dataset_preprocessor import (
    search_funds_having_borrowings,
    # Spelled "borriwings" as published; verify the name your installed version exports
    get_borriwings_by_fund
)

# Find funds with borrowings
funds_with_borrowings = search_funds_having_borrowings(date_ref='2025-02-21')

# Get borrowing details
fund_code = '100075'
borrowing_details = get_borriwings_by_fund(fund_code=fund_code, date_ref='2025-02-21')
```

### 3. Check Repo Agreements

```python
from financial_dataset_preprocessor import (
    search_funds_having_repos,
    get_repos_by_fund
)

# Find funds with repos
funds_with_repos = search_funds_having_repos(date_ref='2025-02-21')

# Get repo details for a specific fund
fund_code = '100075'
repo_details = get_repos_by_fund(fund_code=fund_code, date_ref='2025-02-21')
```
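The three examples above share one calling pattern: a per-fund getter called with `fund_code` and `date_ref` keyword arguments. A thin wrapper can run several such getters for one fund in a single call. This is a sketch under stated assumptions: `validate_date_ref` and `collect_fund_details` are hypothetical helpers, not package functions; the `YYYY-MM-DD` format is inferred from the examples; and the getters' return values are treated as opaque because their types are not documented here.

```python
from datetime import date
from typing import Any, Callable, Dict


def validate_date_ref(date_ref: str) -> str:
    """Return date_ref unchanged if it is a real date written as YYYY-MM-DD, else raise ValueError."""
    parsed = date.fromisoformat(date_ref)  # raises ValueError on malformed input
    if parsed.isoformat() != date_ref:     # reject other ISO spellings such as '20250221'
        raise ValueError(f"expected YYYY-MM-DD, got {date_ref!r}")
    return date_ref


def collect_fund_details(fund_code: str, date_ref: str,
                         getters: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Call each getter with the same fund_code/date_ref and gather the results by label."""
    date_ref = validate_date_ref(date_ref)
    return {
        label: getter(fund_code=fund_code, date_ref=date_ref)
        for label, getter in getters.items()
    }


# Intended use with the functions imported in the examples above:
#
#   details = collect_fund_details(
#       fund_code='100075',
#       date_ref='2025-02-21',
#       getters={
#           'bonds': get_domestic_bonds_by_fund,
#           'repos': get_repos_by_fund,
#       },
#   )
```

Validating `date_ref` once up front keeps a malformed date from reaching any getter.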

## Development

To set up the development environment:

1. Clone the repository
2. Create a virtual environment
3. Install dependencies:

```bash
pip install -r requirements.txt
```

## License

This project is licensed under a proprietary license. All rights reserved.

### Terms of Use

- Viewing and forking the source code are allowed
- Commercial use is prohibited without explicit permission
- Redistribution or modification of the code is prohibited
- Academic and research use is allowed with proper attribution

## Author

**June Young Park**  
AI Management Development Team Lead & Quant Strategist at LIFE Asset Management

LIFE Asset Management is a hedge fund management firm that integrates value investing and engagement strategies with quantitative approaches and financial technology, headquartered in Seoul, South Korea.

### Contact

- Email: juneyoungpaak@gmail.com
- Location: TWO IFC, Yeouido, Seoul

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/nailen1/financial_dataset_preprocessor",
    "name": "financial-dataset-preprocessor",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": null,
    "author": "June Young Park",
    "author_email": "juneyoungpaak@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/a3/1d/209be185203f558534382f04984447f07f84764b83327e52c66d7a693714/financial_dataset_preprocessor-0.4.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A package for preprocessing financial datasets, powering the Life Asset Management development team.",
    "version": "0.4.4",
    "project_urls": {
        "Homepage": "https://github.com/nailen1/financial_dataset_preprocessor"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2f2c7c804c050bc8a2b175d6a1928e90350bb54b6f04b62a9557a7c4c07c734f",
                "md5": "e78481e94f1759f817f2e384fd13d147",
                "sha256": "8ce554b7ddc7e4499fb47149bed724f27337a586cbb7e8efadc6bcdd067237d8"
            },
            "downloads": -1,
            "filename": "financial_dataset_preprocessor-0.4.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e78481e94f1759f817f2e384fd13d147",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 64496,
            "upload_time": "2025-07-31T00:59:41",
            "upload_time_iso_8601": "2025-07-31T00:59:41.874888Z",
            "url": "https://files.pythonhosted.org/packages/2f/2c/7c804c050bc8a2b175d6a1928e90350bb54b6f04b62a9557a7c4c07c734f/financial_dataset_preprocessor-0.4.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a31d209be185203f558534382f04984447f07f84764b83327e52c66d7a693714",
                "md5": "7a6ed2a9e689102b98dc2118e196d5ab",
                "sha256": "3dc412e5bb1d2319c43aa246b7d7e2b8d748236eebdcef3fb7a80dc259b10960"
            },
            "downloads": -1,
            "filename": "financial_dataset_preprocessor-0.4.4.tar.gz",
            "has_sig": false,
            "md5_digest": "7a6ed2a9e689102b98dc2118e196d5ab",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 30429,
            "upload_time": "2025-07-31T00:59:43",
            "upload_time_iso_8601": "2025-07-31T00:59:43.142719Z",
            "url": "https://files.pythonhosted.org/packages/a3/1d/209be185203f558534382f04984447f07f84764b83327e52c66d7a693714/financial_dataset_preprocessor-0.4.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-31 00:59:43",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nailen1",
    "github_project": "financial_dataset_preprocessor",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "string_date_controller",
            "specs": [
                [
                    ">=",
                    "0.3.0"
                ]
            ]
        },
        {
            "name": "financial_dataset_loader",
            "specs": [
                [
                    ">=",
                    "0.2.7"
                ]
            ]
        },
        {
            "name": "canonical_transformer",
            "specs": [
                [
                    ">=",
                    "0.2.4"
                ]
            ]
        },
        {
            "name": "mongodb_controller",
            "specs": [
                [
                    ">=",
                    "0.2.1"
                ]
            ]
        },
        {
            "name": "aws_s3_controller",
            "specs": [
                [
                    ">=",
                    "0.7.5"
                ]
            ]
        },
        {
            "name": "universal_timeseries_transformer",
            "specs": [
                [
                    ">=",
                    "0.3.7"
                ]
            ]
        }
    ],
    "lcname": "financial-dataset-preprocessor"
}
        