# pgsql_upserter
A powerful, production-ready PostgreSQL upsert utility with automatic schema introspection and intelligent conflict resolution. Perfect for serverless ETL pipelines and data integration workflows.
## 🚀 Key Features
- **Zero Configuration**: Automatic schema detection and column matching
- **Intelligent Conflict Resolution**: Automatically detects primary keys and unique constraints
- **Production Tested**: Handles deduplication, data validation, and error recovery
- **Flexible Input**: Supports both direct data (API responses) and CSV files
## 📦 Installation
```bash
pip install pgsql-upserter
```
## 🎯 Quick Start
### Serverless ETL (Recommended)
Perfect for AWS Lambda, Google Cloud Functions, or any API-driven ETL:
```python
from pgsql_upserter import execute_upsert_workflow, create_connection_from_env
# Your API response data (Facebook Ads, Google Ads, etc.)
api_data = [
    {
        'account_id': '123456789',
        'campaign_id': 'camp_001',
        'impressions': 1000,
        'clicks': 50,
        'spend': 25.50,
        'date_start': '2025-08-31'
    }
    # ... more records
]

# One function call does everything!
connection = create_connection_from_env()
result = execute_upsert_workflow(
    connection=connection,
    data=api_data,  # Direct API data
    target_table='ads_metrics'
)

print(f"✅ {result.total_affected} rows processed")
print(f"📈 {result.rows_inserted} inserted, {result.rows_updated} updated")
```
### CSV File Processing
```python
# Automatic CSV processing
result = execute_upsert_workflow(
    connection=connection,
    data='path/to/data.csv',  # File path
    target_table='ads_metrics'
)
```
## 🔧 Environment Setup
Set your PostgreSQL connection via environment variables:
```bash
export PGHOST=your-host
export PGPORT=5432
export PGDATABASE=your-db
export PGUSER=your-user
export PGPASSWORD=your-password
```
Or use a connection string:
```bash
export DATABASE_URL=postgresql://user:pass@host:port/dbname
```
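For illustration, here is a minimal sketch of what building a connection from these variables can look like with plain `psycopg2`. The helper name `connect_from_env` and the precedence (prefer `DATABASE_URL`, then fall back to the `PG*` variables) are assumptions for this sketch, not the library's documented behavior; in practice you would simply call `create_connection_from_env()`.

```python
import os
import psycopg2

def connect_from_env():
    """Hypothetical equivalent of create_connection_from_env.
    Assumed precedence: DATABASE_URL first, then individual PG* variables."""
    dsn = os.environ.get("DATABASE_URL")
    if dsn:
        return psycopg2.connect(dsn)
    return psycopg2.connect(
        host=os.environ.get("PGHOST", "localhost"),
        port=int(os.environ.get("PGPORT", "5432")),
        dbname=os.environ["PGDATABASE"],
        user=os.environ["PGUSER"],
        password=os.environ.get("PGPASSWORD"),
    )
```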
## 🧠 How It Works
1. **Schema Introspection**: Analyzes your table structure automatically
2. **Column Matching**: Maps your data columns to table columns
3. **Conflict Detection**: Finds primary keys and unique constraints
4. **Data Deduplication**: Removes duplicates using conflict resolution strategy
5. **Intelligent Upsert**: Uses PostgreSQL's native `INSERT...ON CONFLICT`
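To give a feel for the introspection and conflict-detection steps, the following standalone sketch queries the PostgreSQL catalog for a table's primary-key and unique-constraint columns. It is illustrative only, not the library's actual query; it reuses the `connection` from the Quick Start and assumes a `public.ads_metrics` table.

```python
# Illustrative only: list the constraint columns that could serve as
# conflict targets for an INSERT ... ON CONFLICT statement.
CONSTRAINT_SQL = """
    SELECT con.conname,
           con.contype,  -- 'p' = primary key, 'u' = unique
           array_agg(att.attname ORDER BY att.attnum) AS columns
    FROM pg_constraint con
    JOIN pg_attribute att
      ON att.attrelid = con.conrelid
     AND att.attnum = ANY (con.conkey)
    WHERE con.conrelid = %s::regclass
      AND con.contype IN ('p', 'u')
    GROUP BY con.conname, con.contype;
"""

with connection.cursor() as cur:
    cur.execute(CONSTRAINT_SQL, ('public.ads_metrics',))
    for name, contype, columns in cur.fetchall():
        print(name, 'PRIMARY KEY' if contype == 'p' else 'UNIQUE', columns)
```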
## 🎯 Perfect For
- **API Data Ingestion**: Facebook Ads, Google Ads, LinkedIn Ads APIs
- **Serverless ETL**: AWS Lambda, Google Cloud Functions, Azure Functions
- **Data Warehousing**: Loading data into analytics databases
- **Real-time Sync**: Keeping databases in sync with external sources
- **Batch Processing**: Traditional CSV and file-based workflows
## 📊 Automatic Conflict Resolution
The library automatically chooses the best upsert strategy:
1. **Primary Key**: Uses table's primary key if available in data
2. **Unique Constraints**: Combines all unique constraints for conflict detection
3. **Insert Only**: Falls back to simple insert if no conflicts possible
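Conceptually, the chosen strategy ends up as an `INSERT ... ON CONFLICT` statement. The sketch below shows the shape of such a statement composed with `psycopg2.sql`; the conflict key (`account_id`, `campaign_id`, `date_start`) and update columns are hypothetical examples, and the library builds the real statement for you from the detected constraints.

```python
from psycopg2 import sql

# Hypothetical conflict and update columns -- in practice these are derived
# from the table's constraints and the columns present in your data.
conflict_cols = ["account_id", "campaign_id", "date_start"]
update_cols = ["impressions", "clicks", "spend"]
all_cols = conflict_cols + update_cols

stmt = sql.SQL(
    "INSERT INTO {table} ({cols}) VALUES ({vals}) "
    "ON CONFLICT ({conflict}) DO UPDATE SET {updates}"
).format(
    table=sql.Identifier("ads_metrics"),
    cols=sql.SQL(", ").join(map(sql.Identifier, all_cols)),
    vals=sql.SQL(", ").join(sql.Placeholder() * len(all_cols)),
    conflict=sql.SQL(", ").join(map(sql.Identifier, conflict_cols)),
    updates=sql.SQL(", ").join(
        sql.SQL("{col} = EXCLUDED.{col}").format(col=sql.Identifier(col))
        for col in update_cols
    ),
)

# Execute with one row's values, e.g.:
# cur.execute(stmt, [row[c] for c in all_cols])
```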
## 🔍 Advanced Usage
### Data Processing Before Upsert
```python
from pgsql_upserter import UpsertResult
# Read and process CSV data
csv_data = UpsertResult.read_csv_to_dict_list('data.csv')

# Filter or transform data
filtered_data = [row for row in csv_data if float(row.get('spend', 0)) > 10.0]

# Upsert processed data
result = execute_upsert_workflow(
    connection=connection,
    data=filtered_data,
    target_table='ads_metrics'
)
```
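Because `execute_upsert_workflow` accepts a list of dicts, the same preprocessing can also be done with the standard library's `csv.DictReader` if you prefer not to go through the library's CSV helper:

```python
import csv

# Standard-library alternative: load the CSV into the list-of-dicts shape
# that execute_upsert_workflow expects.
with open('data.csv', newline='') as f:
    csv_data = list(csv.DictReader(f))

filtered_data = [row for row in csv_data if float(row.get('spend') or 0) > 10.0]
```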
### Custom Connection
```python
import psycopg2
from pgsql_upserter import execute_upsert_workflow
connection = psycopg2.connect(
    host="localhost",
    database="mydb",
    user="user",
    password="password"
)

result = execute_upsert_workflow(
    connection=connection,
    data=your_data,
    target_table='your_table',
    schema_name='public'  # optional, defaults to 'public'
)
```
## 🛡️ Error Handling
The library provides comprehensive error handling and validation:
```python
from pgsql_upserter import execute_upsert_workflow, PgsqlUpserterError
try:
    result = execute_upsert_workflow(connection, data, 'my_table')
    print(f"Success: {result.total_affected} rows processed")
except PgsqlUpserterError as e:
    print(f"Upsert failed: {e}")
```
## 📋 Requirements
- Python 3.11+
- PostgreSQL 12+
- psycopg2-binary
## 🤝 Contributing
Issues and pull requests are welcome! Please see our contributing guidelines.
## 📄 License
MIT License - see LICENSE file for details.
## 🔗 Links
- [PyPI Package](https://pypi.org/project/pgsql-upserter/)
- [Source Code](https://github.com/machado000/pgsql-upserter)
- [Issues](https://github.com/machado000/pgsql-upserter/issues)