# tailestim

- Version: 0.0.5
- Upload time: 2025-02-28 23:56:01
- Requires Python: >=3.6
- Keywords: complex-network, heavy-tail, network-science, power-law

A Python package for estimating tail parameters of heavy-tailed distributions, which is useful for analyzing power-law behavior in complex networks.

> [!NOTE]
> The original estimation implementations are from [ivanvoitalov/tail-estimation](https://github.com/ivanvoitalov/tail-estimation). This is a wrapper package that provides a more convenient, modern interface with logging, and can be installed with `pip` or `conda`.

## Features
- Multiple estimation methods including Hill, Moments, Kernel, Pickands, and Smooth Hill estimators
- Double-bootstrap procedure for optimal threshold selection
- Built-in dataset loader for example networks
- Support for custom network data analysis
- Comprehensive parameter estimation and diagnostics

## Installation
```bash
pip install tailestim
```

## Quick Start

### Using Built-in Datasets
```python
from tailestim.datasets import TailData
from tailestim.estimator import TailEstimator

# Load a sample dataset
data = TailData(name='CAIDA_KONECT').data

# Initialize and fit the estimator
estimator = TailEstimator(method='hill')
estimator.fit(data)

# Get the estimated parameters
result = estimator.get_parameters()
gamma = result['gamma']

# Print full results
print(estimator)
```

### Using the Degree Sequence of a networkx Graph
```python
import networkx as nx
from tailestim.estimator import TailEstimator

# Create or load your network
G = nx.barabasi_albert_graph(10000, 2)
degree = list(dict(G.degree()).values()) # Degree sequence

# Initialize and fit the estimator
estimator = TailEstimator(method='hill')
estimator.fit(degree)

# Get the estimated parameters
result = estimator.get_parameters()
gamma = result['gamma']

# Print full results
print(estimator)
```
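
If your network is stored as a plain edge list rather than a networkx graph, the degree sequence can be built with the standard library alone. The sketch below assumes an undirected graph given as `(u, v)` pairs; `degree_sequence` is a hypothetical helper for illustration, not part of `tailestim`.

```python
from collections import Counter


def degree_sequence(edges):
    """Return the list of node degrees for an undirected edge list.

    Hypothetical helper for illustration; not part of the tailestim API.
    """
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1  # a self-loop (u == v) therefore contributes 2
    return list(counts.values())


# Toy edge list; replace with the edges read from your own data
edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
degree = degree_sequence(edges)
print(sorted(degree, reverse=True))
```

The resulting list can be passed to `estimator.fit()` in the same way as the networkx degree sequence. Note that nodes without any edges do not appear in an edge list, so they are not counted here.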

## Available Methods
The package provides several methods for tail estimation. For details on the parameters each method accepts, refer to the original repository [ivanvoitalov/tail-estimation](https://github.com/ivanvoitalov/tail-estimation), the [original paper](https://doi.org/10.1103/PhysRevResearch.1.033034), or the [source code](https://github.com/mu373/tailestim/blob/main/src/tailestim/tail_methods.py).

1. **Hill Estimator** (`method='hill'`)
   - Classical Hill estimator with double-bootstrap for optimal threshold selection
   - Default method, generally recommended for power law analysis
2. **Moments Estimator** (`method='moments'`)
   - Moments-based estimation with double-bootstrap
   - More robust to certain types of deviations from pure power law
3. **Kernel-type Estimator** (`method='kernel'`)
   - Kernel-based estimation with double-bootstrap and bandwidth selection
   - Additional parameters: `hsteps` (int, default=200), `alpha` (float, default=0.6)
4. **Pickands Estimator** (`method='pickands'`)
   - Pickands-based estimation (no bootstrap)
   - Provides arrays of estimates across different thresholds
5. **Smooth Hill Estimator** (`method='smooth_hill'`)
   - Smoothed version of the Hill estimator (no bootstrap)
   - Additional parameter: `r_smooth` (int, default=2)

## Results
Call `estimator.get_parameters()` to get the results as a dictionary, which includes:
- `gamma`: Power law exponent (γ = 1 + 1/ξ)
- `xi_star`: Tail index (ξ)
- `k_star`: Optimal order statistic
- Bootstrap results (when applicable):
  - First and second bootstrap AMSE values
  - Optimal bandwidths or minimum AMSE fractions

## Example Output
Calling `print(estimator)` after fitting produces output similar to the following:
```
==================================================
Tail Estimation Results (Hill Method)
==================================================

Parameters:
--------------------
Optimal order statistic (k*): 6873
Tail index (ξ): 0.6191
Gamma (powerlaw exponent) (γ): 2.6151

Bootstrap Results:
--------------------
First bootstrap minimum AMSE fraction: 0.6899
Second bootstrap minimum AMSE fraction: 0.6901
```

## Built-in Datasets

The package includes several example datasets:
- `CAIDA_KONECT`
- `Libimseti_in_KONECT`
- `Pareto`

Load any example dataset using:
```python
from tailestim.datasets import TailData
data = TailData(name='dataset_name').data
```

## References
- I. Voitalov, P. van der Hoorn, R. van der Hofstad, and D. Krioukov. Scale-free networks well done. *Phys. Rev. Res.*, Oct. 2019, doi: [10.1103/PhysRevResearch.1.033034](https://doi.org/10.1103/PhysRevResearch.1.033034).
- I. Voitalov. `ivanvoitalov/tail-estimation`, GitHub. Mar. 2018. [https://github.com/ivanvoitalov/tail-estimation](https://github.com/ivanvoitalov/tail-estimation).


## License
`tailestim` is distributed under the terms of the [MIT license](https://github.com/mu373/tailestim/blob/main/LICENSE.txt).

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "tailestim",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "complex-network, heavy-tail, network-science, power-law",
    "author": null,
    "author_email": "Minami Ueda <minami.ueda@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/56/4a/35c0d6796eb92a3a1b076f7ec058661951b024121c5a84ac157e50526f53/tailestim-0.0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python package for estimating tail parameters of heavy-tailed distributions, which is useful for analyzing power-law behavior in complex networks.",
    "version": "0.0.5",
    "project_urls": {
        "Documentation": "https://github.com/mu373/tailestim#readme",
        "Issues": "https://github.com/mu373/tailestim/issues",
        "Source": "https://github.com/mu373/tailestim"
    },
    "split_keywords": [
        "complex-network",
        " heavy-tail",
        " network-science",
        " power-law"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b368a0d42ba2450f4125c240ebb2af916fa9b07b3e19f47a03857b3fe36b6683",
                "md5": "e9c55034131a0f22be1488b0f72c6a6c",
                "sha256": "4ba266ddb5e3ca88420c4a71d97abd17289d4c83e28705f04e399ac4980dc0b3"
            },
            "downloads": -1,
            "filename": "tailestim-0.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e9c55034131a0f22be1488b0f72c6a6c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 65257,
            "upload_time": "2025-02-28T23:56:00",
            "upload_time_iso_8601": "2025-02-28T23:56:00.261481Z",
            "url": "https://files.pythonhosted.org/packages/b3/68/a0d42ba2450f4125c240ebb2af916fa9b07b3e19f47a03857b3fe36b6683/tailestim-0.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "564a35c0d6796eb92a3a1b076f7ec058661951b024121c5a84ac157e50526f53",
                "md5": "8f1c334a2614a41235411dcf7ec44449",
                "sha256": "d20b315b91cac9d023aee6a1a804a675111a097f895e38395a293ffe902820ab"
            },
            "downloads": -1,
            "filename": "tailestim-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "8f1c334a2614a41235411dcf7ec44449",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 67100,
            "upload_time": "2025-02-28T23:56:01",
            "upload_time_iso_8601": "2025-02-28T23:56:01.375851Z",
            "url": "https://files.pythonhosted.org/packages/56/4a/35c0d6796eb92a3a1b076f7ec058661951b024121c5a84ac157e50526f53/tailestim-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-28 23:56:01",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mu373",
    "github_project": "tailestim#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "tailestim"
}
        