:Name: powerlaw
:Version: 1.5
:Home page: http://www.github.com/jeffalstott/powerlaw
:Summary: Toolbox for testing if a probability distribution fits a power law
:Upload time: 2021-08-18 01:21:04
:Docs URL: https://pythonhosted.org/powerlaw/
:Author: Jeff Alstott
:License: MIT
:Requirements: No requirements were recorded.

powerlaw: A Python Package for Analysis of Heavy-Tailed Distributions
=====================================================================

``powerlaw`` is a toolbox using the statistical methods developed in
`Clauset et al. 2007 <http://arxiv.org/abs/0706.1062>`_ and `Klaus et al. 2011 <http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0019779>`_ to determine if a
probability distribution fits a power law. Academics, please cite as:

Jeff Alstott, Ed Bullmore, Dietmar Plenz. (2014). powerlaw: a Python package
for analysis of heavy-tailed distributions. `PLoS ONE 9(1): e85777 <http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0085777>`_

Also available at `arXiv:1305.0215 [physics.data-an] <http://arxiv.org/abs/1305.0215>`_


Basic Usage
------------
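For the simplest, typical use cases, this tells you everything you need to
know::

    import numpy as np
    import powerlaw

    # Placeholder: "..." stands for the rest of your positive-valued data.
    data = np.array([1.7, 3.2 ...]) # data can be list or numpy array
    results = powerlaw.Fit(data)
    print(results.power_law.alpha)  # fitted scaling exponent
    print(results.power_law.xmin)   # lower bound of the fitted power-law regime
    # R > 0 favors the first distribution; p is the significance of that sign
    R, p = results.distribution_compare('power_law', 'lognormal')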

For more explanation, understanding, and figures, see the paper,
which illustrates all of ``powerlaw``'s features. For details of the math, 
see Clauset et al. 2007, which developed these methods.
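
As a rough illustration of what the fitted ``alpha`` means, the sketch below applies the continuous maximum-likelihood estimator described in Clauset et al. 2007, ``alpha = 1 + n / sum(ln(x_i / xmin))``, to synthetic data. It assumes ``xmin`` is already known, and it is not ``powerlaw``'s own code. The function names ``sample_power_law`` and ``estimate_alpha`` are made up for this example.

```python
import math
import random


def sample_power_law(alpha, xmin, n, rng):
    """Draw n values from a continuous power law p(x) ~ x**(-alpha), x >= xmin,
    using inverse-transform sampling."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def estimate_alpha(data, xmin):
    """Continuous maximum-likelihood estimate of alpha for a fixed, known xmin."""
    tail = [x for x in data if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = sample_power_law(alpha=2.5, xmin=1.0, n=10000, rng=rng)
alpha_hat = estimate_alpha(data, xmin=1.0)
print(alpha_hat)  # should land close to the true value of 2.5
```

In real data ``xmin`` is usually unknown, which is why ``results.power_law.xmin`` is reported alongside ``alpha`` in the usage above.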

Quick Links
------------
`Paper illustrating all of powerlaw's features, with figures <http://arxiv.org/abs/1305.0215>`__

`Code examples from manuscript, as an IPython Notebook <http://nbviewer.ipython.org/github/jeffalstott/powerlaw/blob/master/manuscript/Manuscript_Code.ipynb>`__
Note: Some results involving lognormals will now be different from the
manuscript, as the lognormal fitting has been improved to allow for
greater numerical precision.

`Documentation <http://pythonhosted.org/powerlaw/>`__

This code was developed and tested for Python 2.x with the 
`Enthought Python Distribution <http://www.enthought.com/products/epd.php>`__,  and later amended to be
compatible with 3.x. The full version of Enthought is 
`available for free for academic use <http://www.enthought.com/products/edudownload.php>`__.


Installation
------------
``powerlaw`` is hosted on `PyPI <https://pypi.python.org/pypi/powerlaw>`__, so installation is straightforward. The easiest way to install is to type this at the command line (Linux, Mac, or Windows)::

    easy_install powerlaw

or, better yet::

    pip install powerlaw

``easy_install`` or ``pip`` just needs to be on your PATH, which is probably already the case on Linux or Mac.

``pip`` should install all dependencies automagically. These other dependencies are ``numpy``, ``scipy``, and ``matplotlib``. These are all present in Enthought, Anaconda, and most other scientific Python stacks. To fit truncated power laws or gamma distributions, ``mpmath`` is also required, which is less common and is installable with::

    pip install mpmath

The requirement of ``mpmath`` will be dropped if/when the ``scipy`` functions ``gamma``, ``gammainc`` and ``gammaincc`` are updated to have sufficient numerical accuracy for negative numbers.
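
Because ``mpmath`` is optional, a script can check for it before attempting a truncated power law or gamma fit. The following is a minimal sketch using only the standard library; the helper name ``has_module`` is hypothetical and not part of ``powerlaw``.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


if has_module("mpmath"):
    print("mpmath found: truncated power law and gamma fits should be available")
else:
    print("mpmath missing: install it with 'pip install mpmath'")
```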

You can also build from source using the code on GitHub, though it may be a development version slightly ahead of the PyPI version.


Update Notifications and Mailing List
-------------------------------------
Get notified of updates by joining the Google Group `here <https://groups.google.com/forum/?fromgroups#!forum/powerlaw-updates>`__.

Questions, discussions, and requests for help go on the Google Group `here <https://groups.google.com/forum/?fromgroups#!forum/powerlaw-general>`__. This group also receives update announcements.

Further Development
-------------------
The original author of ``powerlaw``, Jeff Alstott, is now only writing minor tweaks, but ``powerlaw`` remains open for further development by the community. If there's a feature you'd like to see in ``powerlaw``, you can `submit an issue <https://github.com/jeffalstott/powerlaw/issues>`_, but pull requests are even better. Offers for expansion or inclusion in other projects are welcome and encouraged.


Acknowledgements
-----------------
Many thanks to Andreas Klaus, Mika Rubinov and Shan Yu for helpful
discussions. Thanks also to `Andreas Klaus <http://neuroscience.nih.gov/Fellows/Fellow.asp?People_ID=2709>`_,
`Aaron Clauset, Cosma Shalizi <http://tuvalu.santafe.edu/~aaronc/powerlaws/>`_,
and `Adam Ginsburg <http://code.google.com/p/agpy/wiki/PowerLaw>`_ for making 
their code available. Their implementations were a critical starting point for
making ``powerlaw``.


Power Laws vs. Lognormals and powerlaw's 'lognormal_positive' option
--------------------------------------------------------------------
When fitting a power law to a data set, one should compare the goodness of fit to that of a `lognormal distribution <https://en.wikipedia.org/wiki/Lognormal_distribution>`__. This is done because the lognormal is another heavy-tailed distribution, but it can be generated by a very simple process: multiplying random positive variables together. The lognormal is thus much like the normal distribution, which can be created by adding random variables together; in fact, the log of a lognormal distribution is a normal distribution (hence the name), and the exponential of a normal distribution is the lognormal (which maybe would be better called an expnormal). In contrast, creating a power law generally requires fancy or exotic generative mechanisms (this is probably why you're looking for a power law to begin with; they're sexy). So, even though the power law has only one parameter (``alpha``: the slope) and the lognormal has two (``mu``: the mean of the underlying normal distribution and ``sigma``: the standard deviation of the underlying normal distribution), we typically consider the lognormal to be a simpler explanation for observed data, as long as it fits the data just as well. For most data sets, a power law is actually a worse fit than a lognormal distribution, or perhaps equally good, but rarely better. This fact was one of the central empirical results of the paper `Clauset et al. 2007 <http://arxiv.org/abs/0706.1062>`__, which developed the statistical methods that ``powerlaw`` implements.
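
To make the "multiplying random positive variables" mechanism concrete, here is a small standard-library simulation. It is only a sketch: the choice of 50 uniform factors on ``[0.5, 1.5]`` is arbitrary, and ``product_of_factors`` is a name invented for this example.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable


def product_of_factors(n_factors, rng):
    """Multiply n_factors independent positive random variables together."""
    product = 1.0
    for _ in range(n_factors):
        product *= rng.uniform(0.5, 1.5)  # arbitrary positive factor
    return product


samples = [product_of_factors(50, rng) for _ in range(5000)]
logs = [math.log(x) for x in samples]

# The logs are sums of 50 independent terms, so they should look roughly normal:
# the mean and median nearly coincide, while the raw products are right-skewed.
print("log scale:", statistics.mean(logs), statistics.median(logs))
print("raw scale:", statistics.mean(samples), statistics.median(samples))
```

On the raw scale the mean sits well above the median, which is the heavy right tail that can be mistaken for a power law.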

However, for many data sets, the superior lognormal fit is only possible if one allows the fitted parameter ``mu`` to go negative. Whether or not this is sensible depends on your theory of what's generating the data. If the data is thought to be generated by multiplying random positive variables, ``mu`` is just the log of the distribution's median; a negative ``mu`` just indicates those variables' products are typically below 1. However, if the data is thought to be generated by exponentiating a normal distribution, then ``mu`` is interpreted as the median of the underlying normal data. In that case, the normal data is likely generated by summing random variables (positive and negative), and ``mu`` is those sums' median (and mean). A negative ``mu``, then, indicates that the random variables are typically negative. For some physical systems, this is perfectly possible. For the data you're studying, though, it may be a weird assumption. For starters, all of the data points you're fitting to are positive by definition, since power laws must have positive values (indeed, ``powerlaw`` throws out 0s or negative values). Why would those data be generated by a process that sums and exponentiates *negative* variables?
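
The relationship between ``mu`` and the median can be checked numerically. The sketch below uses Python's ``random.lognormvariate`` with illustrative parameter values (``mu = -0.7``, ``sigma = 1.0``, chosen only for the example) to show that a negative ``mu`` still produces strictly positive data whose median is below 1.

```python
import math
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is repeatable
mu, sigma = -0.7, 1.0   # illustrative parameter values with a negative mu

# random.lognormvariate draws exp(N(mu, sigma)), so every sample is positive.
samples = [rng.lognormvariate(mu, sigma) for _ in range(20000)]

median = statistics.median(samples)
print(median, math.exp(mu))   # sample median vs. exp(mu), both below 1
print(math.log(median), mu)   # log of the median recovers mu approximately
```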

If you think that your physical system could be modeled by summing and exponentiating random variables, but you think that those random variables should be positive, one possible hack is ``powerlaw``'s ``lognormal_positive``. This is just a regular lognormal distribution, except ``mu`` must be positive. Note that this does not force the underlying normal distribution to be the sum of only positive variables; it only forces the sums' *average* to be positive, but it's a start. You can compare a power law to this distribution in the normal way shown above::

    R, p = results.distribution_compare('power_law', 'lognormal_positive')
    
You may find that a lognormal where ``mu`` must be positive gives a much worse fit to your data, and that leaves the power law looking like the best explanation of the data. Before concluding that the data is in fact power law distributed, consider carefully whether a more likely explanation is that the data was generated by multiplying positive random variables, or even by summing and exponentiating random variables; either one would allow for a lognormal with an intelligible negative value of ``mu``.
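
When comparing several candidates, it can help to spell out how the pair returned by ``distribution_compare`` is read: ``R`` is the loglikelihood ratio, positive when the first named distribution is the more likely one, and ``p`` is the significance of that sign. The helper below is a hypothetical convenience function, not part of ``powerlaw``, and the 0.05 threshold is an assumption you should adjust to your own standards.

```python
def describe_comparison(R, p, first, second, threshold=0.05):
    """Summarize a loglikelihood-ratio comparison in words.

    R > 0 favors `first`, R < 0 favors `second`; p is the significance of
    the sign of R. The default threshold is an assumed convention.
    """
    if p >= threshold:
        return "neither %s nor %s is significantly favored" % (first, second)
    winner = first if R > 0 else second
    return "%s is favored (R=%.2f, p=%.3f)" % (winner, R, p)


# Made-up numbers for illustration only, not results from real data.
print(describe_comparison(-3.2, 0.01, "power_law", "lognormal"))
print(describe_comparison(0.4, 0.60, "power_law", "lognormal_positive"))
```

A "neither is favored" outcome is not evidence for the power law; it only means the data cannot distinguish the two candidates.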



            
