opda

Name: opda
Version: 0.6.1
Summary: Design and analyze optimal deep learning models.
Upload time: 2024-04-04 03:41:54
Requires Python: >=3.8
License: Apache-2.0
Keywords: opda, optimal design analysis, hyperparameter tuning, machine learning, ml, deep learning, dl, artificial intelligence, ai
=============================
opda: optimal design analysis
=============================
`Docs <https://nicholaslourie.github.io/opda>`_
| `Source <https://github.com/nicholaslourie/opda>`_
| `Issues <https://github.com/nicholaslourie/opda/issues>`_
| `Changelog <https://nicholaslourie.github.io/opda/changelog.html>`_

..
  The content below is included into the docs.

*Design and analyze optimal deep learning models.*

**Optimal design analysis** (OPDA) combines an empirical theory of
deep learning with statistical analyses to answer questions such as:

1. Does a change actually improve performance when you account for
   hyperparameter tuning?
2. What aspects of the data or existing hyperparameters does a new
   hyperparameter interact with?
3. What is the best possible score a model can achieve with perfectly
   tuned hyperparameters?

This toolkit provides everything you need to get started with optimal
design analysis. Jump to the section most relevant to you:

- `Installation`_
- `Quickstart`_
- `Resources`_
- `Citation`_
- `Contact`_


Installation
============
Install opda via ``pip``:

.. code-block:: console

   $ pip install opda

See the `Setup
<https://nicholaslourie.github.io/opda/tutorial/setup.html>`_
documentation for information on optional dependencies and development
setups.
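
Once installed, you can sanity-check the install from Python using the
standard library's ``importlib.metadata`` (nothing opda-specific is
assumed here; the version shown corresponds to this release):

.. code-block:: python

   >>> from importlib.metadata import version
   >>> version("opda")
   '0.6.1'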


Quickstart
==========
Let's evaluate a model while accounting for hyperparameter tuning
effort.

A key concept in opda is the tuning curve. Given a model and a
hyperparameter search space, its *tuning curve* plots model
performance as a function of the number of rounds of random
search. Thus, tuning curves capture the cost-benefit trade-off of
tuning the model's hyperparameters.
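
To make the definition concrete, here is a minimal, self-contained
sketch (illustrative only, not opda's API) that estimates a median
tuning curve by resampling: for each budget ``n``, repeatedly draw
``n`` scores from a batch of random-search results and keep the best.

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(0)
   scores = rng.uniform(0.75, 0.95, size=32)  # stand-in search results

   def median_tuning_curve(scores, n, n_trials=10_000):
       """Median best score among n random draws (a naive estimate)."""
       best = rng.choice(scores, size=(n_trials, n), replace=True).max(axis=1)
       return float(np.median(best))

   curve = [median_tuning_curve(scores, n) for n in range(1, 6)]

In practice, prefer the opda estimators below: they compute the same
kind of curve and add proper confidence bands.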

We can compute tuning curves using the
``opda.nonparametric.EmpiricalDistribution`` class. First, run
several rounds of random search, then instantiate
``EmpiricalDistribution`` with the results:

.. code-block:: python

   >>> from opda.nonparametric import EmpiricalDistribution
   >>>
   >>> ys = [  # accuracy results from random search
   ...   0.8420, 0.9292, 0.8172, 0.8264, 0.8851, 0.8765, 0.8824, 0.9221,
   ...   0.9456, 0.7533, 0.8141, 0.9061, 0.8986, 0.8287, 0.8645, 0.8495,
   ...   0.8134, 0.8456, 0.9034, 0.7861, 0.8336, 0.9036, 0.7796, 0.9449,
   ...   0.8216, 0.7520, 0.9089, 0.7890, 0.9198, 0.9428, 0.8140, 0.7734,
   ... ]
   >>> dist_lo, dist_pt, dist_hi = EmpiricalDistribution.confidence_bands(
   ...   ys=ys,            # accuracy results from random search
   ...   confidence=0.80,  # confidence level
   ...   a=0.,             # (optional) lower bound on accuracy
   ...   b=1.,             # (optional) upper bound on accuracy
   ... )

Beyond point estimates, opda offers powerful, nonparametric confidence
bands. The code above yields 80% confidence bands for the probability
distribution of model performance. You can use the point estimate,
``dist_pt``, to evaluate points along the tuning curve:

.. code-block:: python

   >>> n_search_iterations = [1, 2, 3, 4, 5]
   >>> dist_pt.quantile_tuning_curve(n_search_iterations)
   array([0.8456, 0.9034, 0.9089, 0.9198, 0.9221])
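
The band endpoints bound the tuning curve as well. Assuming
``dist_lo`` and ``dist_hi`` are the lower and upper bands on the CDF
(as their use in the plotting example below suggests), note the
inversion: an upper band on the CDF yields a *lower* bound on the
quantiles, and vice versa.

.. code-block:: python

   >>> # Note the inversion: the upper CDF band bounds quantiles from below.
   >>> curve_lower = dist_hi.quantile_tuning_curve(n_search_iterations)
   >>> curve_upper = dist_lo.quantile_tuning_curve(n_search_iterations)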

Or, better still, you can plot the entire tuning curve with confidence
bands, and compare it to a baseline:

.. code-block:: python

   >>> from matplotlib import pyplot as plt
   >>> import numpy as np
   >>>
   >>> ys_old = [  # random search results from the baseline
   ...   0.7440, 0.7710, 0.8774, 0.8924, 0.8074, 0.7173, 0.7890, 0.7449,
   ...   0.8278, 0.7951, 0.7216, 0.8069, 0.7849, 0.8332, 0.7702, 0.7364,
   ...   0.7306, 0.8272, 0.8555, 0.8801, 0.8046, 0.7496, 0.7950, 0.7012,
   ...   0.7097, 0.7017, 0.8720, 0.7758, 0.7038, 0.8567, 0.7086, 0.7487,
   ... ]
   >>> ys_new = [  # random search results from the new model
   ...   0.8420, 0.9292, 0.8172, 0.8264, 0.8851, 0.8765, 0.8824, 0.9221,
   ...   0.9456, 0.7533, 0.8141, 0.9061, 0.8986, 0.8287, 0.8645, 0.8495,
   ...   0.8134, 0.8456, 0.9034, 0.7861, 0.8336, 0.9036, 0.7796, 0.9449,
   ...   0.8216, 0.7520, 0.9089, 0.7890, 0.9198, 0.9428, 0.8140, 0.7734,
   ... ]
   >>>
   >>> ns = np.linspace(1, 5, num=1_000)
   >>> for name, ys in [("baseline", ys_old), ("model", ys_new)]:
   ...   dist_lo, dist_pt, dist_hi = EmpiricalDistribution.confidence_bands(
   ...     ys=ys,            # accuracy results from random search
   ...     confidence=0.80,  # confidence level
   ...     a=0.,             # (optional) lower bound on accuracy
   ...     b=1.,             # (optional) upper bound on accuracy
   ...   )
   ...   plt.plot(ns, dist_pt.quantile_tuning_curve(ns), label=name)
   ...   plt.fill_between(
   ...     ns,
   ...     dist_hi.quantile_tuning_curve(ns),
   ...     dist_lo.quantile_tuning_curve(ns),
   ...     alpha=0.275,
   ...     label="80% confidence",
   ...   )
   [...
   >>> plt.xlabel("search iterations")
   Text(...)
   >>> plt.ylabel("accuracy")
   Text(...)
   >>> plt.legend(loc="lower right")
   <matplotlib.legend.Legend object at ...>
   >>> # plt.show() or plt.savefig(...)

.. image:: https://nicholaslourie.github.io/opda/_static/readme_tuning-curve-comparison.png
   :alt: A simulated comparison of tuning curves with confidence bands.
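
For a quick numeric check to accompany the plot, you can ask whether
the new model's tuning-curve band clears the baseline's at a given
budget. This is a conservative sketch under the same CDF-band reading
as above; the variable names are illustrative.

.. code-block:: python

   >>> old_lo, old_pt, old_hi = EmpiricalDistribution.confidence_bands(
   ...   ys=ys_old, confidence=0.80, a=0., b=1.,
   ... )
   >>> new_lo, new_pt, new_hi = EmpiricalDistribution.confidence_bands(
   ...   ys=ys_new, confidence=0.80, a=0., b=1.,
   ... )
   >>> # New model's lower bound vs. the baseline's upper bound at n = 5.
   >>> separated = (
   ...   new_hi.quantile_tuning_curve([5]) > old_lo.quantile_tuning_curve([5])
   ... )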

See the `Usage <https://nicholaslourie.github.io/opda/tutorial/usage.html>`_,
`Examples <https://nicholaslourie.github.io/opda/tutorial/examples.html>`_, or
`Reference <https://nicholaslourie.github.io/opda/reference/opda.html>`_
documentation for a deeper dive into opda.


Resources
=========
For more information on OPDA, check out our paper: `Show Your Work with
Confidence: Confidence Bands for Tuning Curves
<https://arxiv.org/abs/2311.09480>`_.


Citation
========
If you use the code, data, or other work presented in this repository,
please cite:

.. code-block:: none

    @misc{lourie2023work,
        title={Show Your Work with Confidence: Confidence Bands for Tuning Curves},
        author={Nicholas Lourie and Kyunghyun Cho and He He},
        year={2023},
        eprint={2311.09480},
        archivePrefix={arXiv},
        primaryClass={cs.CL}
    }


Contact
=======
For more information, see the code
repository, `opda <https://github.com/nicholaslourie/opda>`_. Questions
and comments may be addressed to Nicholas Lourie.
