# AWS S3 Controller
- A collection of natural-language-like utility functions for controlling S3, AWS's cloud object storage service, intuitively and easily.
- Control S3: manage, interact with, and handle S3 just like your local storage.
- AWS: Amazon Web Services; S3: Simple Storage Service in AWS.
## Features
- **File Scanning**: Search files in S3 buckets and local directories using regex patterns
- **File Transfer**: Upload, download, and relocate files between S3 buckets and local directories
- **Data Processing**: Read CSV and Excel files directly from S3 into pandas DataFrames
- **Bucket Management**: Create and manage S3 bucket structure
- **Special Operations**: Handle specific use cases like timeseries data processing
## Installation
```bash
pip install -r requirements.txt
```
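The package is also published on PyPI as `aws-s3-controller`, so it can be installed directly (pip resolves the dependencies automatically):

```bash
pip install aws-s3-controller
```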
## Module Structure
The module is organized into several specialized components:
- `s3_scanner.py`: File search functionality in S3 buckets and local directories
- `s3_transfer.py`: File transfer operations between S3 and local storage
- `s3_dataframe_reader.py`: Functions for reading files into pandas DataFrames
- `s3_structure.py`: S3 bucket structure management
- `s3_special_operations.py`: Special purpose functions for specific operations
## Usage Examples
### Scanning Files
```python
from aws_s3_controller import scan_files_in_bucket_by_regex
# Find all CSV files in a bucket
files = scan_files_in_bucket_by_regex(
bucket="my-bucket",
bucket_prefix="data",
regex=r".*\.csv$",
option="key"
)
```
### Transferring Files
```python
from aws_s3_controller import download_files_from_s3, upload_files_to_s3
# Download files matching a pattern
download_files_from_s3(
bucket="my-bucket",
regex=r".*\.csv$",
file_folder_local="./downloads",
bucket_prefix="data"
)
# Upload files to S3
upload_files_to_s3(
    file_folder_local="./uploads",
    regex=r".*\.xlsx$",
    bucket="my-bucket",
    bucket_prefix="excel-files"
)
```
### Reading Data
```python
from aws_s3_controller import open_df_in_bucket, open_excel_in_bucket
# Read CSV file
df = open_df_in_bucket(
bucket="my-bucket",
bucket_prefix="data",
file_name="example.csv"
)
# Read Excel file
df = open_excel_in_bucket(
bucket="my-bucket",
bucket_prefix="excel",
file_name="example.xlsx"
)
```
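The scanning, transfer, and reading helpers compose naturally. Below is a minimal end-to-end sketch using only the functions shown above; it assumes that `option="key"` makes the scanner return a list of matching object keys:

```python
from aws_s3_controller import (
    scan_files_in_bucket_by_regex,
    download_files_from_s3,
    open_df_in_bucket,
)

# List matching object keys first (assumes option="key" returns a list of keys)
keys = scan_files_in_bucket_by_regex(
    bucket="my-bucket",
    bucket_prefix="data",
    regex=r".*\.csv$",
    option="key",
)
print(f"Found {len(keys)} CSV files")

# Mirror the matching files locally, then load one directly into a DataFrame
download_files_from_s3(
    bucket="my-bucket",
    regex=r".*\.csv$",
    file_folder_local="./downloads",
    bucket_prefix="data",
)
df = open_df_in_bucket(
    bucket="my-bucket",
    bucket_prefix="data",
    file_name="example.csv",
)
print(df.head())
```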
## Dependencies
- boto3 (>= 1.26.0)
- pandas
- python-dotenv (>= 1.0.0)
- xlrd (>= 2.0.1; required for Excel file support)
- shining_pebbles
## Configuration
1. Create a `.env` file in your project root
2. Add your AWS credentials:
```
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_DEFAULT_REGION=your_region
```
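As a quick check that the credentials are picked up, here is a minimal sketch (not part of this package) that loads the `.env` file with python-dotenv; boto3 reads the `AWS_*` variables from the environment by default:

```python
import boto3
from dotenv import load_dotenv

# Populate the process environment from .env; boto3 reads
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION
# from the environment automatically.
load_dotenv()

s3 = boto3.client("s3")
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```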
## Documentation
Detailed documentation is available in the `doc` directory:
- `design.md`: Project design documentation
- `context.md`: Project context and progress
- `commands-cascade.md`: Command history and functionality
## Contributing
1. Fork the repository
2. Create your feature branch
3. Commit your changes with descriptive commit messages
4. Push to your branch
5. Create a Pull Request
### Author
**June Young Park**\
AI Management Development Team Lead\
LIFE Asset Management
### Expertise
- Advanced Data Architecture Design
- Financial Technology Integration
- Enterprise AI/ML Systems
- Investment Analytics Platforms
### Organization
**LIFE Asset Management**\
A premier investment management firm specializing in hedge fund and private equity solutions, headquartered in TWO IFC, Yeouido, Seoul, South Korea.
_Focus Areas:_
- Quantitative Investment Strategies
- Corporate Value Enhancement
- Sustainable Shareholder Returns
- Innovation in Financial Technology
### Contact Information
- **Professional Email**: juneyoungpaak@gmail.com
- **Location**: TWO IFC, Yeouido, Seoul, South Korea