Name | skll |
Version | 5.1.0 |
home_page | None |
Summary | SciKit-Learn Laboratory makes it easier to run machine learning experiments with scikit-learn. |
upload_time | 2024-12-27 16:29:34 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.10 |
license | New BSD License Copyright (c) 2012–2022 Educational Testing Service All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: a. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. b. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. c. Neither the name of Educational Testing Service nor the names of the contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
keywords | learning, scikit-learn |
requirements |
beautifulsoup4
joblib
numpy
pandas
ruamel.yaml
scikit-learn
scipy
seaborn
tabulate
typing_extensions
wandb
|
SciKit-Learn Laboratory
-----------------------

.. image:: https://gitlab.com/EducationalTestingService/skll/badges/main/pipeline.svg
   :target: https://gitlab.com/EducationalTestingService/skll/-/pipelines
   :alt: Gitlab CI status

.. image:: https://dev.azure.com/EducationalTestingService/SKLL/_apis/build/status/EducationalTestingService.skll
   :target: https://dev.azure.com/EducationalTestingService/SKLL/_build?view=runs
   :alt: Azure Pipelines status

.. image:: https://codecov.io/gh/EducationalTestingService/skll/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/EducationalTestingService/skll

.. image:: https://img.shields.io/pypi/v/skll.svg
   :target: https://pypi.org/project/skll/
   :alt: Latest version on PyPI

.. image:: https://img.shields.io/pypi/l/skll.svg
   :alt: License

.. image:: https://img.shields.io/conda/v/ets/skll.svg
   :target: https://anaconda.org/ets/skll
   :alt: Conda package for SKLL

.. image:: https://img.shields.io/pypi/pyversions/skll.svg
   :target: https://pypi.org/project/skll/
   :alt: Supported python versions for SKLL

.. image:: https://img.shields.io/badge/DOI-10.5281%2Fzenodo.12825-blue.svg
   :target: http://dx.doi.org/10.5281/zenodo.12825
   :alt: DOI for citing SKLL 1.0.0

.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/EducationalTestingService/skll/main?filepath=examples%2FTutorial.ipynb

This Python package provides command-line utilities to make it easier to run
machine learning experiments with scikit-learn. One of the primary goals of
our project is to let you run scikit-learn experiments without writing any
code other than what you used to generate/extract the features.
Installation
~~~~~~~~~~~~
You can install using either ``pip`` or ``conda``. See details `here <https://skll.readthedocs.io/en/latest/getting_started.html>`__.
Requirements
~~~~~~~~~~~~
- Python 3.10, 3.11, or 3.12.
- `beautifulsoup4 <http://www.crummy.com/software/BeautifulSoup/>`__
- `gridmap <https://pypi.org/project/gridmap/>`__ (only required if you plan
to run things in parallel on a DRMAA-compatible cluster)
- `joblib <https://pypi.org/project/joblib/>`__
- `pandas <http://pandas.pydata.org>`__
- `ruamel.yaml <http://yaml.readthedocs.io/en/latest/overview.html>`__
- `scikit-learn <http://scikit-learn.org/stable/>`__
- `seaborn <http://seaborn.pydata.org>`__
- `tabulate <https://pypi.org/project/tabulate/>`__
Command-line Interface
~~~~~~~~~~~~~~~~~~~~~~
The main utility we provide is called ``run_experiment``. It runs a series of
learners on datasets specified in a configuration file like:

.. code:: ini

    [General]
    experiment_name = Titanic_Evaluate_Tuned
    # valid tasks: cross_validate, evaluate, predict, train
    task = evaluate

    [Input]
    # these directories could also be absolute paths
    # (and must be if you're not running things in local mode)
    train_directory = train
    test_directory = dev
    # Can specify multiple sets of feature files that are merged together automatically
    featuresets = [["family.csv", "misc.csv", "socioeconomic.csv", "vitals.csv"]]
    # List of scikit-learn learners to use
    learners = ["RandomForestClassifier", "DecisionTreeClassifier", "SVC", "MultinomialNB"]
    # Column in CSV containing labels to predict
    label_col = Survived
    # Column in CSV containing instance IDs (if any)
    id_col = PassengerId

    [Tuning]
    # Should we tune parameters of all learners by searching provided parameter grids?
    grid_search = true
    # Function to maximize when performing grid search
    objectives = ['accuracy']

    [Output]
    # Also compute the area under the ROC curve as an additional metric
    metrics = ['roc_auc']
    # The following can also be absolute paths
    logs = output
    results = output
    predictions = output
    probability = true
    models = output

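
To get a feel for how such a file is structured, the sketch below reads an abridged copy of the configuration above with Python's standard ``configparser`` module and checks the task against the four tasks named in the comment. This is only an illustration: it is not SKLL's own parser, and the names ``CONFIG_TEXT``, ``VALID_TASKS`` and ``summarize_config`` are made up for this example.

```python
import ast
import configparser

# Abridged copy of the example configuration shown above.
CONFIG_TEXT = """
[General]
experiment_name = Titanic_Evaluate_Tuned
task = evaluate

[Input]
train_directory = train
test_directory = dev
featuresets = [["family.csv", "misc.csv", "socioeconomic.csv", "vitals.csv"]]
learners = ["RandomForestClassifier", "DecisionTreeClassifier", "SVC", "MultinomialNB"]
label_col = Survived
id_col = PassengerId

[Tuning]
grid_search = true
objectives = ['accuracy']
"""

# Tasks listed in the comment of the example configuration.
VALID_TASKS = {"cross_validate", "evaluate", "predict", "train"}


def summarize_config(text):
    """Return a small summary of a run_experiment-style configuration.

    Hypothetical helper for illustration; SKLL does its own parsing
    and validation when you call run_experiment.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)

    task = parser.get("General", "task")
    if task not in VALID_TASKS:
        raise ValueError(f"unknown task: {task!r}")

    # List-valued fields are written as Python/JSON-style literals.
    learners = ast.literal_eval(parser.get("Input", "learners"))
    featuresets = ast.literal_eval(parser.get("Input", "featuresets"))

    return {
        "task": task,
        "learners": learners,
        "n_featuresets": len(featuresets),
        "grid_search": parser.getboolean("Tuning", "grid_search"),
    }


if __name__ == "__main__":
    print(summarize_config(CONFIG_TEXT))
```

Each inner list in ``featuresets`` is one set of feature files that gets merged, so this example defines a single featureset evaluated with four learners.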
For more information about getting started with ``run_experiment``, please check
out `our tutorial <https://skll.readthedocs.org/en/latest/tutorial.html>`__, or
`our config file specs <https://skll.readthedocs.org/en/latest/run_experiment.html>`__.
You can also follow this `interactive Jupyter tutorial <https://mybinder.org/v2/gh/AVajpayeeJr/skll/feature/448-interactive-binder?filepath=examples>`__.
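
If you would rather launch an experiment from a script than from a shell, one possible approach is to call ``run_experiment`` through ``subprocess``, passing the configuration file as its argument. The sketch below assumes SKLL is installed so that ``run_experiment`` is on your ``PATH``; the helper names and the file name ``evaluate_tuned.cfg`` are hypothetical.

```python
import shutil
import subprocess


def build_run_experiment_command(config_path):
    """Build the argument list for running one configuration file."""
    return ["run_experiment", str(config_path)]


def run_if_available(config_path):
    """Run the experiment, or return None if run_experiment is not installed."""
    command = build_run_experiment_command(config_path)
    if shutil.which(command[0]) is None:
        return None
    # check=True raises CalledProcessError if the experiment fails.
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical path to a configuration file like the one above.
    print(build_run_experiment_command("evaluate_tuned.cfg"))
```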
We also provide utilities for:
- `converting between machine learning toolkit formats <https://skll.readthedocs.org/en/latest/utilities.html#skll-convert>`__
(e.g., ARFF, CSV)
- `filtering feature files <https://skll.readthedocs.org/en/latest/utilities.html#filter-features>`__
- `joining feature files <https://skll.readthedocs.org/en/latest/utilities.html#join-features>`__
- `other common tasks <https://skll.readthedocs.org/en/latest/utilities.html>`__
Python API
~~~~~~~~~~
If you just want to avoid writing a lot of boilerplate learning code, you can
use our simple Python API, which also supports pandas DataFrames.
The main entry points to the API are
the ``Learner`` and ``Reader`` classes. For more details on our API, see
`the documentation <https://skll.readthedocs.org/en/latest/api.html>`__.
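
As a rough sketch of what working with ``Reader`` and ``Learner`` might look like, the function below reads a training and a test file, trains a learner with grid search, and evaluates it. It assumes SKLL is installed and that both files are CSVs with ``Survived`` and ``PassengerId`` columns, as in the configuration example; the function name and file paths are hypothetical, and the API documentation remains the authority on exact signatures.

```python
def train_and_evaluate(train_path, test_path, learner_name="RandomForestClassifier"):
    """Train one learner on train_path and evaluate it on test_path.

    Hypothetical helper; column names follow the Titanic example.
    """
    # Imported here so this module loads even where SKLL is not installed.
    from skll.data import Reader
    from skll.learner import Learner

    reader_kwargs = {"label_col": "Survived", "id_col": "PassengerId"}
    train_fs = Reader.for_path(train_path, **reader_kwargs).read()
    test_fs = Reader.for_path(test_path, **reader_kwargs).read()

    learner = Learner(learner_name)
    # Tune hyperparameters by grid search, maximizing accuracy.
    learner.train(train_fs, grid_search=True, grid_objective="accuracy")
    return learner.evaluate(test_fs)


if __name__ == "__main__":
    # Hypothetical feature files; replace with your own data.
    print(train_and_evaluate("train/family.csv", "dev/family.csv"))
```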
While our API can be broadly useful, the command-line utilities are intended
as the primary way of using SKLL. The API is just a nice side effect of our
developing the utilities.
A Note on Pronunciation
~~~~~~~~~~~~~~~~~~~~~~~

.. image:: doc/skll.png
   :alt: SKLL logo
   :align: right

.. container:: clear

   .. image:: doc/spacer.png

SciKit-Learn Laboratory (SKLL) is pronounced "skull": that's where the learning
happens.
Talks
~~~~~
- *Simpler Machine Learning with SKLL 1.0*, Dan Blanchard, PyData NYC 2014 (`video <https://www.youtube.com/watch?v=VEo2shBuOrc&feature=youtu.be&t=1s>`__ | `slides <http://www.slideshare.net/DanielBlanchard2/py-data-nyc-2014>`__)
- *Simpler Machine Learning with SKLL*, Dan Blanchard, PyData NYC 2013 (`video <http://vimeo.com/79511496>`__ | `slides <http://www.slideshare.net/DanielBlanchard2/simple-machine-learning-with-skll>`__)
Citing
~~~~~~
If you are using SKLL in your work, you can cite it as follows: "We used scikit-learn (Pedregosa et al., 2011) via the SKLL toolkit (https://github.com/EducationalTestingService/skll)."
Books
~~~~~
SKLL is featured in `Data Science at the Command Line <http://datascienceatthecommandline.com>`__
by `Jeroen Janssens <http://jeroenjanssens.com>`__.
Changelog
~~~~~~~~~
See `GitHub releases <https://github.com/EducationalTestingService/skll/releases>`__.
Contribute
~~~~~~~~~~
Thank you for your interest in contributing to SKLL! See `CONTRIBUTING.md <https://github.com/EducationalTestingService/skll/blob/main/CONTRIBUTING.md>`__ for instructions on how to get started.
Raw data
{
"_id": null,
"home_page": null,
"name": "skll",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": "Nitin Madnani <nmadnani@gmail.com>, Tamar Lavee <tamarlv@hotmail.com>",
"keywords": "learning scikit-learn",
"author": null,
"author_email": "Nitin Madnani <nmadnani@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/b5/2d/7101e5137c26cffc3b43586247893baae13c0901d106c1fbb819aa8688a9/skll-5.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "New BSD License Copyright (c) 2012\u20132022 Educational Testing Service All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: a. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. b. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. c. Neither the name of Educational Testing Service nor the names of the contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ",
"summary": "SciKit-Learn Laboratory makes it easier to run machine learning experiments with scikit-learn.",
"version": "5.1.0",
"project_urls": {
"Documentation": "https://skll.readthedocs.org",
"Repository": "http://github.com/EducationalTestingService/skll"
},
"split_keywords": [
"learning",
"scikit-learn"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9aa0061cd3f05724fd0cb806f1f2cb36b216420a88009ce750f34788b881bc0a",
"md5": "56749c3b88b17b66887f078eb6923a0b",
"sha256": "04ca1af09d95304a12bf8e2968a82b5c105f80b406408832361c461a109b51da"
},
"downloads": -1,
"filename": "skll-5.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "56749c3b88b17b66887f078eb6923a0b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 149130,
"upload_time": "2024-12-27T16:29:32",
"upload_time_iso_8601": "2024-12-27T16:29:32.213683Z",
"url": "https://files.pythonhosted.org/packages/9a/a0/061cd3f05724fd0cb806f1f2cb36b216420a88009ce750f34788b881bc0a/skll-5.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b52d7101e5137c26cffc3b43586247893baae13c0901d106c1fbb819aa8688a9",
"md5": "4506a58c86603c0a5b45b3fd045875aa",
"sha256": "cfd690177a9ca2e7a8aa2c060ea86b14aded6ca045dc381609d0f9ed3c1b216d"
},
"downloads": -1,
"filename": "skll-5.1.0.tar.gz",
"has_sig": false,
"md5_digest": "4506a58c86603c0a5b45b3fd045875aa",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 132611,
"upload_time": "2024-12-27T16:29:34",
"upload_time_iso_8601": "2024-12-27T16:29:34.342152Z",
"url": "https://files.pythonhosted.org/packages/b5/2d/7101e5137c26cffc3b43586247893baae13c0901d106c1fbb819aa8688a9/skll-5.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-27 16:29:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "EducationalTestingService",
"github_project": "skll",
"travis_ci": false,
"coveralls": true,
"github_actions": false,
"requirements": [
{
"name": "beautifulsoup4",
"specs": []
},
{
"name": "joblib",
"specs": []
},
{
"name": "numpy",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "ruamel.yaml",
"specs": []
},
{
"name": "scikit-learn",
"specs": [
[
"<",
"1.6.0"
],
[
">=",
"1.5.0"
]
]
},
{
"name": "scipy",
"specs": []
},
{
"name": "seaborn",
"specs": []
},
{
"name": "tabulate",
"specs": []
},
{
"name": "typing_extensions",
"specs": []
},
{
"name": "wandb",
"specs": []
}
],
"lcname": "skll"
}
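
The ``requirements`` entries in the raw data above store each version constraint as an ``[operator, version]`` pair. As a small illustration, the sketch below turns entries of that shape into pip-style requirement strings. The function name is hypothetical and the sample data is a two-entry excerpt of the list above.

```python
def to_requirement_strings(requirements):
    """Convert {"name": ..., "specs": [[op, version], ...]} entries to strings."""
    lines = []
    for requirement in requirements:
        # Join constraints in the order they are listed, e.g. "<1.6.0,>=1.5.0".
        constraints = ",".join(f"{op}{version}" for op, version in requirement["specs"])
        lines.append(requirement["name"] + constraints)
    return lines


# Excerpt of the "requirements" list from the raw data.
SAMPLE_REQUIREMENTS = [
    {"name": "joblib", "specs": []},
    {"name": "scikit-learn", "specs": [["<", "1.6.0"], [">=", "1.5.0"]]},
]

if __name__ == "__main__":
    for line in to_requirement_strings(SAMPLE_REQUIREMENTS):
        print(line)
```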