# Financial Dataset Preprocessor
A Python package for preprocessing financial datasets from various sources. It provides tools for cleaning, transforming, and preparing financial data for analysis.
## Version Updates
### v0.4.0 (2025-01-27)
- Removed virtual environment folder from Git repository
- Cleaned up unnecessary number conversion columns for Menu 2206
- Enhanced data preprocessing accuracy and consistency
### v0.3.9 (2025-01-27)
- Updated industry classification column preprocessing format for Menu 2206
- Enhanced data consistency and standardization
### v0.3.8 (2025-06-26)
- Enhanced stability of Menu 3233 preprocessor
- Improved custom index creation functionality
### v0.3.7 (2025-05-26)
- Refactored parse_utils module for better modularity
- Added type hints for improved code quality
- Enhanced number parsing functionality
### v0.3.6 (2025-05-23)
- Added filtering functionality to Menu 2206 preprocessor
- Improved data quality by removing summary rows
### v0.3.5 (2025-05-20)
- Enhanced Menu 2206 preprocessor with column renaming functionality
- Improved fund code handling in portfolio analysis
### v0.3.4 (2025-05-20)
- Added Menu 2206 preprocessor module
- Added time series basis utilities
- Added Bloomberg time series preprocessor
- Enhanced integration with universal_timeseries_transformer
## Features
- Menu 2205 Preprocessor
  - Corporation Name Finder
  - Domestic Beneficiary Certificates Processing
  - Domestic Bonds Analysis
  - Repo Agreement Processing
  - Borrowings Management
- Menu 2206 Preprocessor
  - Fund Portfolio Analysis
  - Investment Asset Classification
- Bloomberg Time Series Preprocessor
  - Index Data Processing
  - Currency Data Processing
- Time Series Utilities
  - Date Range Operations
  - Time Series Extension
- Additional preprocessors for other financial datasets (coming soon)
## Installation
You can install the package using pip:
```bash
pip install financial_dataset_preprocessor
```
## Requirements
- Python >= 3.11
- Dependencies are listed in requirements.txt
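
The requirements above can be checked from Python before installing or upgrading. The following is a minimal sketch, not part of this package: `MINIMUMS`, `parse_version`, and `check_environment` are hypothetical names, and the minimum versions are the ones declared in the 0.4.4 package metadata. The version comparison is deliberately simple and only understands plain dotted numbers.

```python
import sys
from importlib import metadata

# Minimum versions as declared in the 0.4.4 package metadata (assumption:
# your requirements.txt matches). None means no minimum is pinned.
MINIMUMS = {
    "pandas": None,
    "tqdm": None,
    "string_date_controller": "0.3.0",
    "financial_dataset_loader": "0.2.7",
    "canonical_transformer": "0.2.4",
    "mongodb_controller": "0.2.1",
    "aws_s3_controller": "0.7.5",
    "universal_timeseries_transformer": "0.3.7",
}


def parse_version(text):
    """Turn '0.3.10' into (0, 3, 10); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_environment(minimums=MINIMUMS):
    """Return a dict mapping 'python' and each dependency to a status string."""
    report = {"python": "ok" if sys.version_info >= (3, 11) else "too old"}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum and parse_version(installed) < parse_version(minimum):
            report[name] = f"too old ({installed} < {minimum})"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Comparing versions as integer tuples avoids ordering `0.3.10` before `0.3.7`, which a plain string comparison would do.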
## Usage Examples
### 1. Search for Funds with Bonds
```python
from financial_dataset_preprocessor import (
    search_funds_having_domestic_bonds,
    get_domestic_bonds_by_fund
)

# Get all funds that have domestic bonds
fund_bonds = search_funds_having_domestic_bonds(date_ref='2025-02-21')

# Get bond details for a specific fund
fund_code = '100075'
bond_details = get_domestic_bonds_by_fund(fund_code=fund_code, date_ref='2025-02-21')
```
### 2. Analyze Fund Borrowings
```python
from financial_dataset_preprocessor import (
    search_funds_having_borrowings,
    get_borriwings_by_fund
)

# Find funds with borrowings
funds_with_borrowings = search_funds_having_borrowings(date_ref='2025-02-21')

# Get borrowing details
fund_code = '100075'
borrowing_details = get_borriwings_by_fund(fund_code=fund_code, date_ref='2025-02-21')
```
### 3. Check Repo Agreements
```python
from financial_dataset_preprocessor import (
    search_funds_having_repos,
    get_repos_by_fund
)

# Find funds with repos
funds_with_repos = search_funds_having_repos(date_ref='2025-02-21')

# Get repo details for a specific fund
fund_code = '100075'
repo_details = get_repos_by_fund(fund_code=fund_code, date_ref='2025-02-21')
```
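
The three `search_funds_having_*` calls above all take the same `date_ref` keyword, so they can be run together for one reference date. The helper below is a minimal sketch under that assumption: `run_lookups` is a hypothetical name and not part of this package, and the lookup functions are passed in as callables so nothing is assumed about what they return.

```python
from datetime import date


def run_lookups(date_ref, lookups):
    """Call each lookup with the same date_ref and collect results by label.

    `lookups` maps a label of your choice to a callable that accepts a
    `date_ref` keyword argument.
    """
    # Fail early if the string is not a valid ISO date such as '2025-02-21'.
    date.fromisoformat(date_ref)
    return {label: func(date_ref=date_ref) for label, func in lookups.items()}


# With the package installed, it could be used like this:
#
# from financial_dataset_preprocessor import (
#     search_funds_having_domestic_bonds,
#     search_funds_having_borrowings,
#     search_funds_having_repos,
# )
# results = run_lookups('2025-02-21', {
#     'bonds': search_funds_having_domestic_bonds,
#     'borrowings': search_funds_having_borrowings,
#     'repos': search_funds_having_repos,
# })
```

Validating the date once up front means a mistyped `date_ref` is reported before any data is loaded.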
## Development
To set up the development environment:
1. Clone the repository
2. Create a virtual environment
3. Install dependencies:
```bash
pip install -r requirements.txt
```
## License
This project is licensed under a proprietary license. All rights reserved.
### Terms of Use
- Viewing and forking the source code are allowed
- Commercial use is prohibited without explicit permission
- Redistribution or modification of the code is prohibited
- Academic and research use is allowed with proper attribution
## Author
**June Young Park**
AI Management Development Team Lead & Quant Strategist at LIFE Asset Management
LIFE Asset Management is a hedge fund management firm that integrates value investing and engagement strategies with quantitative approaches and financial technology, headquartered in Seoul, South Korea.
### Contact
- Email: juneyoungpaak@gmail.com
- Location: TWO IFC, Yeouido, Seoul
Raw data
{
"_id": null,
"home_page": "https://github.com/nailen1/financial_dataset_preprocessor",
"name": "financial-dataset-preprocessor",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": null,
"author": "June Young Park",
"author_email": "juneyoungpaak@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/a3/1d/209be185203f558534382f04984447f07f84764b83327e52c66d7a693714/financial_dataset_preprocessor-0.4.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A package for preprocessing financial datasets, powering the Life Asset Management development team.",
"version": "0.4.4",
"project_urls": {
"Homepage": "https://github.com/nailen1/financial_dataset_preprocessor"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "2f2c7c804c050bc8a2b175d6a1928e90350bb54b6f04b62a9557a7c4c07c734f",
"md5": "e78481e94f1759f817f2e384fd13d147",
"sha256": "8ce554b7ddc7e4499fb47149bed724f27337a586cbb7e8efadc6bcdd067237d8"
},
"downloads": -1,
"filename": "financial_dataset_preprocessor-0.4.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e78481e94f1759f817f2e384fd13d147",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 64496,
"upload_time": "2025-07-31T00:59:41",
"upload_time_iso_8601": "2025-07-31T00:59:41.874888Z",
"url": "https://files.pythonhosted.org/packages/2f/2c/7c804c050bc8a2b175d6a1928e90350bb54b6f04b62a9557a7c4c07c734f/financial_dataset_preprocessor-0.4.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a31d209be185203f558534382f04984447f07f84764b83327e52c66d7a693714",
"md5": "7a6ed2a9e689102b98dc2118e196d5ab",
"sha256": "3dc412e5bb1d2319c43aa246b7d7e2b8d748236eebdcef3fb7a80dc259b10960"
},
"downloads": -1,
"filename": "financial_dataset_preprocessor-0.4.4.tar.gz",
"has_sig": false,
"md5_digest": "7a6ed2a9e689102b98dc2118e196d5ab",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 30429,
"upload_time": "2025-07-31T00:59:43",
"upload_time_iso_8601": "2025-07-31T00:59:43.142719Z",
"url": "https://files.pythonhosted.org/packages/a3/1d/209be185203f558534382f04984447f07f84764b83327e52c66d7a693714/financial_dataset_preprocessor-0.4.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-31 00:59:43",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nailen1",
"github_project": "financial_dataset_preprocessor",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "pandas",
"specs": []
},
{
"name": "tqdm",
"specs": []
},
{
"name": "string_date_controller",
"specs": [
[
">=",
"0.3.0"
]
]
},
{
"name": "financial_dataset_loader",
"specs": [
[
">=",
"0.2.7"
]
]
},
{
"name": "canonical_transformer",
"specs": [
[
">=",
"0.2.4"
]
]
},
{
"name": "mongodb_controller",
"specs": [
[
">=",
"0.2.1"
]
]
},
{
"name": "aws_s3_controller",
"specs": [
[
">=",
"0.7.5"
]
]
},
{
"name": "universal_timeseries_transformer",
"specs": [
[
">=",
"0.3.7"
]
]
}
],
"lcname": "financial-dataset-preprocessor"
}