<div align="center">
<img src="logo.png" alt="HydroAnomaly Logo" width="200"/>
</div>
# HydroAnomaly
A Python package for anomaly detection in water-body data. It retrieves **USGS water data** and **Sentinel-2 bands** and feeds them to ML models that check the quality of the data collected by USGS gages.
<div align="center">

[PyPI version](https://badge.fury.io/py/hydroanomaly)
[Downloads](https://pepy.tech/project/hydroanomaly)
[Python](https://python.org)
[License: MIT](https://opensource.org/licenses/MIT)
[GitHub stars](https://github.com/Ehsankahrizi/HydroAnomaly/stargazers)

</div>
## Installation
From a terminal:
```bash
pip install hydroanomaly
```
From a Jupyter notebook cell:
```bash
!pip install hydroanomaly
```
To upgrade an existing installation (drop the leading `!` outside Jupyter):
```bash
!pip install hydroanomaly --upgrade
```
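To confirm that the install or upgrade took effect, the installed version can be read from the package metadata. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8 or newer); the helper name `installed_version` is hypothetical and not part of HydroAnomaly.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints None when hydroanomaly is not installed in the active environment.
print("hydroanomaly version:", installed_version("hydroanomaly"))
```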
---
## USGS Data Retrieval
Retrieve real-time and historical turbidity records from USGS Water Services:
```python
import ee
import geemap
import hydroanomaly
# ------------------------
# User-defined settings: Example USGS site and date range (change to a site with turbidity data)
# ------------------------
site_number = "294643095035200" # USGS site number
start_date = "2020-01-01"
end_date = "2024-12-30"
# ------------------------
# Data Extraction from USGS
# ------------------------
USGSdata, (lat, lon) = hydroanomaly.get_turbidity(site_number, start_date, end_date)
print("=" * 70)
print("Latitude:", lat)
print("Longitude:", lon)
print("=" * 70)
print(USGSdata.head())
```
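Gage readings are typically sub-daily, so it can help to aggregate them before comparing against satellite overpasses. The sketch below assumes the returned frame is a pandas DataFrame with a datetime index (the later examples call `USGSdata.index.duplicated()`, which suggests this). The column name `turbidity` and the 15-minute spacing are assumptions, and the data is synthetic so the snippet runs without network access.

```python
import pandas as pd

# Synthetic stand-in for the frame returned by get_turbidity(); the column
# name "turbidity" and the 15-minute spacing are assumptions for illustration.
index = pd.date_range("2020-01-01", periods=192, freq="15min")
usgs_demo = pd.DataFrame({"turbidity": range(192)}, index=index)

# Aggregate the sub-daily readings to one mean value per calendar day.
daily_mean = usgs_demo["turbidity"].resample("D").mean()
print(daily_mean)
```

With real data, replace `usgs_demo` with `USGSdata` and the column name with whatever `USGSdata.columns` reports.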
---
## Sentinel-2 Data Retrieval
Retrieve Sentinel-2 data from Google Earth Engine.
### Authenticating with the Google Earth Engine API
```python
ee.Authenticate()
ee.Initialize(project='XXXXXX-XXXX-XXXXXX-XX')  # Replace with your own Google Cloud project ID
```
### Defining the location, bands, and masks:
```python
latitude = 29.7785861
longitude = -95.0644278
bands = ['B2','B3','B4','B5','B6','B7','B8','B8A','B9','B11','B12', 'SCL']
buffer_meters = 20
cloudy_pixel_percentage = 20
masks_to_apply = [
    "water",
    "no_cloud_shadow",
    "no_clouds",
    "no_snow_ice",
    "no_saturated"
]
```
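To get a feel for how small `buffer_meters = 20` is, it helps to convert it to degrees: Sentinel-2 bands have a ground resolution of 10, 20, or 60 m, so a 20 m buffer covers only a handful of pixels. The sketch below uses a spherical-Earth approximation. The helper `buffer_to_degrees` is hypothetical, and how HydroAnomaly itself builds the buffer is not stated in this README.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def buffer_to_degrees(latitude, buffer_meters):
    """Convert a buffer radius in meters to approximate degree offsets."""
    dlat = math.degrees(buffer_meters / EARTH_RADIUS_M)
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = math.degrees(
        buffer_meters / (EARTH_RADIUS_M * math.cos(math.radians(latitude)))
    )
    return dlat, dlon


dlat, dlon = buffer_to_degrees(29.7785861, 20)
print(f"20 m is about {dlat:.6f} deg latitude and {dlon:.6f} deg longitude")
```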
### Sentinel-2 data retrieval:
```python
from hydroanomaly import get_sentinel_bands
df = get_sentinel_bands(
    latitude=latitude,
    longitude=longitude,
    start_date=start_date,
    end_date=end_date,
    bands=bands,
    buffer_meters=buffer_meters,
    cloudy_pixel_percentage=cloudy_pixel_percentage,
    masks_to_apply=masks_to_apply
)
print(df.head())
print("=" * 70)
print(f"Retrieved {len(df)} rows")
```
### Visualizing the map:
```python
from hydroanomaly import show_sentinel_ndwi_map
Map = show_sentinel_ndwi_map(
    latitude, longitude, start_date, end_date,
    buffer_meters=buffer_meters,
    cloudy_pixel_percentage=cloudy_pixel_percentage,
    zoom=15,
)
Map  # Display the interactive map in a Jupyter notebook
```
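The map shows NDWI, and the same index can be computed row by row from the retrieved bands. The sketch below uses the common McFeeters definition, (green - NIR) / (green + NIR), with Sentinel-2 bands B3 and B8. Whether `show_sentinel_ndwi_map` uses exactly this formula is an assumption, and the reflectance values are made up so the snippet runs on its own.

```python
import pandas as pd

# Synthetic reflectance rows standing in for the Sentinel-2 frame; values are
# made up for illustration only.
s2_demo = pd.DataFrame(
    {"B3": [0.10, 0.08, 0.06], "B8": [0.02, 0.08, 0.18]},
    index=pd.to_datetime(["2020-01-05", "2020-01-10", "2020-01-15"]),
)

# NDWI (McFeeters): (green - NIR) / (green + NIR); positive values suggest water.
s2_demo["NDWI"] = (s2_demo["B3"] - s2_demo["B8"]) / (s2_demo["B3"] + s2_demo["B8"])
print(s2_demo)
```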
---
## Time Series Plotting
Create visualizations of your water data:
```python
from hydroanomaly.visualize import plot_timeseries
# For USGS data
plot_timeseries(USGSdata)
# For Sentinel data
plot_timeseries(df)
```
```python
from hydroanomaly.visualize import plot_turbidity
# For USGS data
plot_turbidity(USGSdata)
```
```python
from hydroanomaly.visualize import plot_sentinel
# For Sentinel data
plot_sentinel(df)
```
```python
from hydroanomaly.visualize import plot_comparison
plot_comparison(USGSdata, df[['B6']], label1="Turbidity", label2="Sentinel-2 B6", title="Comparison: Turbidity vs Band 6")
```
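Comparing gage and satellite series requires pairing observations taken at different times. One way to do this in plain pandas is `pd.merge_asof`, which matches each satellite overpass to the nearest gage reading within a tolerance. This is a sketch under assumptions: the frames are synthetic, the column names are illustrative, and it is not necessarily how `plot_comparison` aligns the two inputs.

```python
import pandas as pd

# Synthetic stand-ins; column names and values are illustrative assumptions.
usgs_demo = pd.DataFrame(
    {"turbidity": [5.0, 6.0, 7.0, 8.0]},
    index=pd.to_datetime(
        ["2020-01-05 16:30", "2020-01-05 17:00", "2020-01-10 16:45", "2020-01-20 17:00"]
    ),
)
s2_demo = pd.DataFrame(
    {"B6": [0.03, 0.04, 0.05]},
    index=pd.to_datetime(["2020-01-05 16:52", "2020-01-10 16:52", "2020-01-15 16:52"]),
)

# For each satellite overpass, take the nearest gage reading within one hour.
# Both frames must be sorted by their index for merge_asof.
paired = pd.merge_asof(
    s2_demo.sort_index(),
    usgs_demo.sort_index(),
    left_index=True,
    right_index=True,
    direction="nearest",
    tolerance=pd.Timedelta("1h"),
)
print(paired)
```

Overpasses with no gage reading inside the tolerance get `NaN` in the `turbidity` column and can be dropped with `paired.dropna()`.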
```python
from hydroanomaly import plot
# For Sentinel data
plot(df)
```
```python
from hydroanomaly import visualize
# For USGS data
visualize(USGSdata)
```
#### NDVI:
```python
import matplotlib.pyplot as plt
# Check available columns
print(df.columns)
print("=" * 70)
# Calculate NDVI if bands are available
if {'B4', 'B8'}.issubset(df.columns):
    df['NDVI'] = (df['B8'] - df['B4']) / (df['B8'] + df['B4'])
    df['NDVI'].plot(marker='o')
    plt.title("NDVI Time Series")
    plt.xlabel("Date")
    plt.ylabel("NDVI")
    plt.grid()
    plt.show()
else:
    print("NDVI bands (B4, B8) not found. Plotting the bands that are available:")
    # Select only the bands present in df to avoid a KeyError
    available = [b for b in ['B2', 'B3', 'B4', 'B8'] if b in df.columns]
    df[available].plot()
    plt.title("Sentinel-2 Reflectance (selected bands)")
    plt.ylabel("Reflectance")
    plt.xlabel("Date")
    plt.grid()
    plt.show()
```
---
## Machine Learning for Anomaly Detection of USGS Data
```python
from IPython.display import display  # Built in to notebooks; import needed elsewhere
print(df.columns)
print("=" * 70)
display(df.head(2))
print("=" * 70)
print(USGSdata.columns)
print("=" * 70)
display(USGSdata.head(2))
```
```python
USGSdata = USGSdata[~USGSdata.index.duplicated(keep='first')]
print(df.index.duplicated().sum()) # Number of duplicate datetimes in df
print(USGSdata.index.duplicated().sum()) # Number of duplicate datetimes in USGSdata
```
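The effect of `keep='first'` is easiest to see on a tiny example. The sketch below uses a synthetic frame with one repeated timestamp; the column name `turbidity` is an assumption.

```python
import pandas as pd

# Synthetic series with one repeated timestamp (illustrative values).
demo = pd.DataFrame(
    {"turbidity": [5.0, 5.5, 6.0]},
    index=pd.to_datetime(["2020-01-01 00:00", "2020-01-01 00:00", "2020-01-01 00:15"]),
)

print(demo.index.duplicated().sum())  # counts the repeated timestamps

# keep='first' marks later repeats as duplicates, so the first reading survives.
deduplicated = demo[~demo.index.duplicated(keep="first")]
print(deduplicated)
```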
### OneClassSVM
```python
from hydroanomaly.ml import run_oneclass_svm
df_out, params, f1 = run_oneclass_svm(df, USGSdata)
```
```python
# F1 Score
print(f"F1: {f1:.3f}")
```
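For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it for binary labels in plain Python, assuming 1 marks an anomaly and 0 a normal point. How `run_oneclass_svm` derives its "true" anomaly labels is not described in this README, so the label vectors here are purely illustrative, and `f1_score_binary` is a hypothetical helper.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for binary labels where 1 marks an anomaly and 0 a normal point."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: F1 is taken as 0 by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(f"F1: {f1_score_binary(y_true, y_pred):.3f}")
```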
### Isolation Forest
```python
from hydroanomaly.ml import run_isolation_forest
df_out_if, params_if, f1_if = run_isolation_forest(df, USGSdata)
```
```python
print(f"F1: {f1_if:.3f}")
```
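The internals of `run_isolation_forest` are not shown in this README. As a rough illustration of the underlying technique, the sketch below applies scikit-learn's `IsolationForest` directly to synthetic two-feature data. The data, the `contamination` value, and the assumption that the package wraps this estimator are all illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic two-feature data: 95 ordinary points plus 5 far-away points.
normal = rng.normal(loc=0.0, scale=1.0, size=(95, 2))
outliers = np.array(
    [[8.0, 8.0], [-9.0, 7.0], [10.0, -8.0], [-8.0, -9.0], [9.0, 0.0]]
)
features = np.vstack([normal, outliers])

# contamination sets the expected share of anomalies in the training data.
model = IsolationForest(contamination=0.05, random_state=0)
labels = model.fit_predict(features)  # -1 = anomaly, 1 = normal

print("Flagged as anomalies:", int((labels == -1).sum()))
```

With real data, `features` would hold the Sentinel-2 band values for each matched observation, and the flagged rows would be compared against the gage record.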
---
## Features
* **USGS & Sentinel-2 Data Retrieval**
  * Download real-time and historical water data from USGS Water Services (any site, any parameter)
  * Retrieve Sentinel-2 satellite bands using Google Earth Engine for any location and time range
  * Automatic data cleaning, validation, and alignment between ground (USGS) and satellite (Sentinel) data
  * Synthetic data generation fallback for testing
  * Convenient CSV export functionality

* **Time Series & Satellite Visualization**
  * Quick plotting for single or multiple water quality parameters
  * Multi-parameter and multi-site comparison plots
  * Satellite band and index visualization (NDVI, NDWI, etc.)
  * Statistical analysis plots (histograms, box plots, trendlines)
  * High-quality plot export (PNG, PDF, SVG) with auto legends and formatting

* **Machine Learning & Anomaly Detection**
  * Built-in anomaly detection using One-Class SVM and Isolation Forest models
  * Visual comparison of predicted vs. true anomalies in time series data
  * Feature engineering for satellite and in-situ sensor data
  * Easy integration with Pandas workflows

* **Powerful Data Analysis Tools**
  * Mathematical operations and filtering for hydrologic data
  * Statistical summaries, validation, and automated quality checks
  * Utilities for matching, joining, and synchronizing time series

* **Easy to Use**
  * Simple, Pythonic API for rapid data exploration and analysis
  * One-liner data retrieval and plotting functions
  * Comprehensive error handling
  * Well-documented with step-by-step examples and tutorials

Find USGS site numbers at: https://waterdata.usgs.gov/nwis
---
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
---
**HydroAnomaly** - Making water data analysis simple and beautiful!
Raw data
{
"_id": null,
"home_page": null,
"name": "hydroanomaly",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "python, package, hydrology, anomaly detection, remote sensing",
"author": null,
"author_email": "Ehsan Kahrizi <ehsan.kahrizi@usu.edu>",
"download_url": "https://files.pythonhosted.org/packages/2e/31/9f7b2b1736ad584501ab3ad5691ab15cac3fcedaaaa364dc7ec6e0e7e675/hydroanomaly-1.2.9.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A Python package for hydro anomaly detection with simple USGS data retrieval",
"version": "1.2.9",
"project_urls": {
"Bug Reports": "https://github.com/Ehsankahrizi/HydroAnomaly/issues",
"Homepage": "https://github.com/Ehsankahrizi/HydroAnomaly",
"Source": "https://github.com/Ehsankahrizi/HydroAnomaly"
},
"split_keywords": [
"python",
" package",
" hydrology",
" anomaly detection",
" remote sensing"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "99e601d974157feed4dcbac608713bb08317afbb5a0d2e14bde55a7969e0c4be",
"md5": "3479f810088790e79ad5d195522f474d",
"sha256": "2829cf5b812808ed0b397b0b15ffdf8ddeafe66fff953962eae7ba28506d1805"
},
"downloads": -1,
"filename": "hydroanomaly-1.2.9-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3479f810088790e79ad5d195522f474d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 12771,
"upload_time": "2025-07-31T19:39:20",
"upload_time_iso_8601": "2025-07-31T19:39:20.896022Z",
"url": "https://files.pythonhosted.org/packages/99/e6/01d974157feed4dcbac608713bb08317afbb5a0d2e14bde55a7969e0c4be/hydroanomaly-1.2.9-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "2e319f7b2b1736ad584501ab3ad5691ab15cac3fcedaaaa364dc7ec6e0e7e675",
"md5": "01590d68b810b39a129d0c6fcf46a513",
"sha256": "d81d41d8a52569094eb2f74012a2ea200b17ab4e9c6b82f7ed440e547282cbed"
},
"downloads": -1,
"filename": "hydroanomaly-1.2.9.tar.gz",
"has_sig": false,
"md5_digest": "01590d68b810b39a129d0c6fcf46a513",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 16288,
"upload_time": "2025-07-31T19:39:21",
"upload_time_iso_8601": "2025-07-31T19:39:21.807888Z",
"url": "https://files.pythonhosted.org/packages/2e/31/9f7b2b1736ad584501ab3ad5691ab15cac3fcedaaaa364dc7ec6e0e7e675/hydroanomaly-1.2.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-31 19:39:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Ehsankahrizi",
"github_project": "HydroAnomaly",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "hydroanomaly"
}