hydutils


| Field | Value |
| --- | --- |
| Name | hydutils |
| Version | 2.0.0 |
| Summary | HydUtils is a Python utility library designed for data handling and validation, especially for time series and hydrological datasets. |
| Author | Duy Nguyen |
| Requires Python | <4.0,>=3.10 |
| License | MIT |
| Upload time | 2025-01-06 16:23:14 |

# HydUtils

![PyPI - Version](https://img.shields.io/pypi/v/hydutils)

**HydUtils** is a Python utility library designed for data handling and validation, especially for time series and
hydrological datasets. It provides several useful functions for working with time series data, including validation,
filtering, error metrics, and more, making it easier to handle and analyze hydrological and weather-related datasets.

## Installation

```bash
pip install hydutils
```

## Usage

### 1. Validate Columns for Nulls

The `validate_columns_for_nulls` function checks a DataFrame for columns that contain null values and raises an error
if it finds any. Pass `columns` to restrict the check to specific columns.

```python
from hydutils.df_helper import validate_columns_for_nulls
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, None], "c": [7, 8, 9]})

# Check every column; column "b" contains a null, so this call raises an error
validate_columns_for_nulls(df)

# Check only the listed columns; "b" still contains a null, so this raises too
validate_columns_for_nulls(df, columns=["b"])

# Request a column that does not exist; this raises an error because "d" is missing
validate_columns_for_nulls(df, columns=["d"])
```
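To make the behaviour described above concrete, here is a minimal pandas-only sketch of the same kind of check. It is not the HydUtils implementation: the helper name `columns_with_nulls`, the choice to return names rather than raise on nulls, and the use of `KeyError` for missing columns are all assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, None], "c": [7, 8, 9]})


def columns_with_nulls(frame, columns=None):
    """Hypothetical helper: return the checked columns that contain a null."""
    checked = list(frame.columns) if columns is None else list(columns)

    # Fail early if a requested column does not exist in the frame
    missing = [c for c in checked if c not in frame.columns]
    if missing:
        raise KeyError(f"Columns not found: {missing}")

    # isnull().any() yields one boolean per column
    has_null = frame[checked].isnull().any()
    return [c for c in checked if has_null[c]]


print(columns_with_nulls(df))  # ['b']
```

A validator in the style of `validate_columns_for_nulls` would raise when this list is non-empty instead of returning it.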

### 2. Validate Time Series Interval

The `validate_interval` function checks that the time intervals between rows in the time series are consistent.

```python
from hydutils.df_helper import validate_interval
import pandas as pd

df = pd.DataFrame({
    "time": pd.date_range(start="2023-01-01", periods=5, freq="h")
})

# Check if the time intervals are consistent
validate_interval(df, interval=1)
```
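As a rough illustration of what a consistent interval means, the sketch below compares every consecutive gap in the time column against an expected spacing using plain pandas. It is not the HydUtils implementation; the helper name `has_constant_interval`, its parameters, and the one-hour default are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.date_range(start="2023-01-01", periods=5, freq="h")
})


def has_constant_interval(frame, column="time", expected=pd.Timedelta(hours=1)):
    """Hypothetical helper: True when every consecutive gap equals `expected`."""
    # diff() gives the gap to the previous row; the first row has no gap
    gaps = frame[column].diff().dropna()
    return bool((gaps == expected).all())


print(has_constant_interval(df))  # True

# Dropping one row leaves a two-hour gap, so the check fails
irregular = df.drop(index=2)
print(has_constant_interval(irregular))  # False
```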

### 3. Filter Time Series

The `filter_timeseries` function allows you to filter your time series DataFrame based on a start and/or end date.

```python
from hydutils.df_helper import filter_timeseries
import pandas as pd
from datetime import datetime

df = pd.DataFrame({
    "time": pd.date_range(start="2023-01-01", periods=5, freq="h")
})

# Filter data between a start and end date
start = datetime(2023, 1, 1, 1)
end = datetime(2023, 1, 1, 3)
filtered_data = filter_timeseries(df, start=start, end=end)
```
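For comparison, filtering by a start and/or end date can be sketched with a boolean mask in plain pandas. This is not the HydUtils implementation: the helper name `filter_by_time` is hypothetical, and treating both bounds as inclusive is an assumption, since the description above does not say how `filter_timeseries` handles the endpoints.

```python
from datetime import datetime

import pandas as pd

df = pd.DataFrame({
    "time": pd.date_range(start="2023-01-01", periods=5, freq="h")
})


def filter_by_time(frame, start=None, end=None, column="time"):
    """Hypothetical helper: keep rows with start <= time <= end (inclusive)."""
    mask = pd.Series(True, index=frame.index)
    if start is not None:
        mask &= frame[column] >= start
    if end is not None:
        mask &= frame[column] <= end
    return frame[mask]


subset = filter_by_time(df, start=datetime(2023, 1, 1, 1), end=datetime(2023, 1, 1, 3))
print(len(subset))  # 3 rows: 01:00, 02:00 and 03:00
```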

### 4. Error Metrics

The `hydutils.statistical_metrics` module provides several commonly used metrics for evaluating model performance:
MSE, RMSE, NSE, R², PBIAS, and FBIAS.

#### 4.1 Mean Squared Error (MSE)

The `mse` function calculates the Mean Squared Error between two arrays.

```python
from hydutils.statistical_metrics import mse
import numpy as np

simulated = np.array([3.0, 4.0, 5.0])
observed = np.array([2.9, 4.1, 5.0])

mse_value = mse(simulated, observed)
```

#### 4.2 Root Mean Squared Error (RMSE)

The `rmse` function calculates the Root Mean Squared Error.

```python
from hydutils.statistical_metrics import rmse

# `simulated` and `observed` are the arrays defined in section 4.1
rmse_value = rmse(simulated, observed)
```
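The textbook definitions of MSE and RMSE can be written directly in NumPy, which is handy for sanity-checking results. The functions below are a sketch of those standard formulas, not the HydUtils source; the `_reference` names are hypothetical.

```python
import numpy as np

simulated = np.array([3.0, 4.0, 5.0])
observed = np.array([2.9, 4.1, 5.0])


def mse_reference(sim, obs):
    """Mean of the squared differences between simulated and observed values."""
    return float(np.mean((sim - obs) ** 2))


def rmse_reference(sim, obs):
    """Square root of the MSE, expressed in the same units as the data."""
    return float(np.sqrt(mse_reference(sim, obs)))


print(mse_reference(simulated, observed))   # approximately 0.00667
print(rmse_reference(simulated, observed))  # approximately 0.0816
```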

#### 4.3 Nash-Sutcliffe Efficiency (NSE)

The `nse` function calculates the Nash-Sutcliffe Efficiency coefficient.

```python
from hydutils.statistical_metrics import nse

# `simulated` and `observed` are the arrays defined in section 4.1
nse_value = nse(simulated, observed)
```
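The Nash-Sutcliffe Efficiency is conventionally defined as one minus the ratio of the squared model error to the variance of the observations around their mean. The NumPy sketch below follows that standard formula; it is not taken from the HydUtils source, and `nse_reference` is a hypothetical name.

```python
import numpy as np

simulated = np.array([3.0, 4.0, 5.0])
observed = np.array([2.9, 4.1, 5.0])


def nse_reference(sim, obs):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    residual = np.sum((obs - sim) ** 2)
    spread = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - residual / spread)


print(nse_reference(simulated, observed))  # approximately 0.991
```

A value of 1 indicates a perfect match, while a value of 0 means the model is no better than predicting the mean of the observations.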

#### 4.4 R² (Coefficient of Determination)

The `r2` function calculates the coefficient of determination, R².

```python
from hydutils.statistical_metrics import r2

# `simulated` and `observed` are the arrays defined in section 4.1
r2_value = r2(simulated, observed)
```
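R² has more than one definition in the literature. The sketch below uses the squared Pearson correlation, which is common in hydrology; whether `r2` in HydUtils uses this definition is an assumption, and `r2_reference` is a hypothetical name.

```python
import numpy as np

simulated = np.array([3.0, 4.0, 5.0])
observed = np.array([2.9, 4.1, 5.0])


def r2_reference(sim, obs):
    """Squared Pearson correlation between simulated and observed values."""
    # corrcoef returns a 2x2 correlation matrix; [0, 1] is the cross term
    r = np.corrcoef(sim, obs)[0, 1]
    return float(r ** 2)


print(r2_reference(simulated, observed))  # approximately 0.993
```

Under this definition R² measures linear association only, so a model with a constant offset can still score 1.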

#### 4.5 Percentage Bias (PBIAS)

The `pbias` function calculates the Percentage Bias between observed and simulated values.

```python
from hydutils.statistical_metrics import pbias

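# `simulated` and `observed` are the arrays defined in section 4.1.
# Note that `observed` is passed first here, unlike the calls above.
pbias_value = pbias(observed, simulated)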
```
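Percentage bias expresses the total deviation between simulated and observed values as a percentage of the total observed value. Sign conventions differ between sources, so the sketch below fixes one convention (positive means overestimation) purely as an assumption; it is not the HydUtils source, and `pbias_reference` and the second data set are illustrative.

```python
import numpy as np

observed = np.array([2.9, 4.1, 5.0])
simulated = np.array([3.0, 4.0, 5.0])


def pbias_reference(obs, sim):
    """100 * sum(sim - obs) / sum(obs); positive means the model overestimates."""
    return float(100.0 * np.sum(sim - obs) / np.sum(obs))


# The over- and underestimates cancel out, so the bias is essentially zero
print(pbias_reference(observed, simulated))

# Illustrative data: a model that overestimates every value by 10 %
inflated = observed * 1.1
print(pbias_reference(observed, inflated))  # approximately 10
```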

#### 4.6 Fractional Bias (FBIAS)

The `fbias` function calculates the Fractional Bias between observed and simulated values.

```python
from hydutils.statistical_metrics import fbias

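# `simulated` and `observed` are the arrays defined in section 4.1.
# Note that `observed` is passed first here, as in the `pbias` call.
fbias_value = fbias(observed, simulated)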
```

## License

This library is released under the MIT License.

            
