:Name: survivors
:Version: 1.7.0
:Upload time: 2024-05-13 05:26:53
:Author: Iulii Vasilev
:Requires Python: >=3.10
:License: BSD 3-Clause License
:Keywords: survival analysis, time-to-event, event data, machine learning

.. -*- mode: rst -*-

|Price| |License| |PyPi|_ |DOI|_

.. |Price| image:: https://img.shields.io/badge/price-FREE-0098f7.svg
   :target: https://github.com/iuliivasilev/dev-survivors/blob/master/LICENSE

.. |PyPi| image:: https://img.shields.io/pypi/v/survivors
.. _PyPi: https://pypi.org/project/survivors/

.. |License| image:: https://img.shields.io/badge/license-BSD%203--Clause-blue.svg
   :target: https://github.com/iuliivasilev/dev-survivors/blob/master/LICENSE

.. |DOI| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.10649986.svg
.. _DOI: https://zenodo.org/doi/10.5281/zenodo.10649777

=========
survivors
=========

Event analysis has many applications: healthcare, hardware, social science, bioinformatics, and more.
Survival analysis allows you to predict not only the time and probability of an event but also how the probability of that event changes over time.
In particular, there are three functions: the survival function *S(t)*, the density function *f(t)*, and the hazard function *h(t)*:

.. math::
    S(t)=P(T>t), \quad f(t)=P(T=t), \quad h(t)=P(T=t \mid T \geq t)
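
As a minimal illustration of these definitions (independent of the library), the three functions can be computed for a small discrete distribution. The helper name ``discrete_survival_functions`` and the toy probabilities are assumptions made for this sketch.

```python
import numpy as np


def discrete_survival_functions(pmf):
    """Return S(t), f(t), h(t) for a discrete event time T on t = 0, 1, ..., len(pmf) - 1.

    pmf[t] is P(T = t); the values are expected to sum to 1.
    """
    f = np.asarray(pmf, dtype=float)
    # P(T >= t): the mass at t plus the mass at all later times.
    at_or_after = np.cumsum(f[::-1])[::-1]
    # S(t) = P(T > t) = P(T >= t) - P(T = t)
    S = at_or_after - f
    # h(t) = P(T = t | T >= t) = f(t) / P(T >= t); defined as 0 where P(T >= t) = 0
    h = np.divide(f, at_or_after, out=np.zeros_like(f), where=at_or_after > 0)
    return S, f, h


S, f, h = discrete_survival_functions([0.1, 0.2, 0.3, 0.4])
print("S(t):", S)
print("h(t):", h)
```

In this toy case the hazard at the last time point equals 1, because all remaining probability mass falls there.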

The open-source **survivors** library aims to fit accurate data-sensitive tree-based models. 
These models handle categorical features and missing values, overcoming the limitations of existing `lifelines <https://github.com/lifelines/lifelines>`_, `scikit-survival <https://github.com/sebp/scikit-survival>`_, and `pycox <https://github.com/havakv/pycox>`_ models.
Survivors is a platform for conducting experimental research. The experiment module is compatible with scikit-survival and lifelines models (non-parametric and parametric models have already been embedded into the library).

The developed models have been published in scientific articles [1]_, [2]_, [3]_ and outperformed existing models in accuracy on open medical data. We invite survival analysis researchers to join the development of survivors by using the library in their projects, reporting problems, and creating new solutions.
Documentation is available at https://iuliivasilev.github.io/dev-survivors/

Principles
-----------

Built-in **survivors** models were developed as part of a PhD thesis at Lomonosov Moscow State University. The goal of the library is to analyze existing methods of survival analysis and develop new techniques that overcome their limitations.

Existing methods are poorly suited to real-world data: discrete methods use a fixed time scale, statistical methods rely on strong assumptions, and tree-based methods use the log-rank statistic, which has low sensitivity.
**Survivors** has the following features:

1. Continuous Predictions. The timeline does not have to be fixed in advance; it only needs to be set at the prediction stage.
2. Modified Quality Metrics. Existing metrics are excessively sensitive to data features such as class imbalance, event distribution, and timeline; modified versions are provided to reduce this sensitivity.
3. Weighted Survival Tree. For the first time, CRAID uses weighted log-rank criteria. Wilcoxon, Peto, and Tarone-Ware weights increase the significance of events within a certain time interval.
4. Speed. The models are developed from scratch and use parallelization, vectorization, and JIT compilation. A new histogram-based method identifies splits in censored data, which reduces memory usage and runs fast.
5. Ease of Use. CRAID and the ensembles work out of the box. Categorical and missing data are processed within the models.
6. A Platform for Experiments. The experiments module provides a flexible interface for working with built-in and external survival models, their hyperparameters, and experiment strategies such as hold-out, cross-validation, grid search with cross-validation and sampling, and time-aware cross-validation.
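
To make the weighted log-rank idea from item 3 concrete, here is a minimal two-sample sketch in NumPy. It is not the CRAID implementation: the function name ``weighted_logrank``, its arguments, and the toy data are assumptions for illustration, and the Peto weight uses one common definition (a modified Kaplan-Meier estimate with ``n + 1`` in the denominator).

```python
import numpy as np


def weighted_logrank(time, event, group, weights="logrank"):
    """Two-sample weighted log-rank statistic (chi-square form, 1 degree of freedom).

    time   : observed times
    event  : 1 if the event occurred, 0 if the observation is censored
    group  : 1 for the first sample, 0 for the second
    weights: "logrank", "wilcoxon", "tarone-ware" or "peto"
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)

    numerator, variance = 0.0, 0.0
    pooled_surv = 1.0  # running modified Kaplan-Meier estimate for the Peto weight
    for t in np.unique(time[event]):
        at_risk = time >= t
        n = at_risk.sum()                # at risk in both samples
        n1 = (at_risk & group).sum()     # at risk in the first sample
        occurred = (time == t) & event
        d = occurred.sum()               # events at time t in both samples
        d1 = (occurred & group).sum()    # events at time t in the first sample
        pooled_surv *= 1.0 - d / (n + 1.0)

        if weights == "logrank":
            w = 1.0
        elif weights == "wilcoxon":
            w = float(n)
        elif weights == "tarone-ware":
            w = float(np.sqrt(n))
        elif weights == "peto":
            w = pooled_surv
        else:
            raise ValueError(f"unknown weights: {weights}")

        # Observed minus expected events in the first sample.
        numerator += w * (d1 - d * n1 / n)
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            variance += w ** 2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return numerator ** 2 / variance if variance > 0 else 0.0


time = [1, 2, 3, 4]
event = [1, 1, 1, 1]
group = [1, 1, 0, 0]
for name in ("logrank", "wilcoxon", "tarone-ware", "peto"):
    print(name, weighted_logrank(time, event, group, weights=name))
```

The Wilcoxon and Tarone-Ware weights grow with the number of observations at risk, so they emphasize early events, while the plain log-rank statistic weights all event times equally.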

Installation
------------

Requirements
~~~~~~~~~~~~

- Python (>= 3.10)
- NumPy (>= 1.22)
- Pandas (>= 0.25)
- Numba (>= 0.58.0)
- matplotlib (>= 3.5.0)
- seaborn
- graphviz (>= 2.50.0)
- joblib
- scikit-learn (>= 1.0.2)
- openpyxl

Optional for comprehensive experiments:

- lifelines (>= 0.27.8)
- scikit-survival (>= 0.17.2)

User Installation
~~~~~~~~~~~~~~~~~

The most convenient and fastest way to install the package is to download it directly from the Python Package Index (PyPI).
The version published there is up to date and consistent with the GitHub repository::

  pip install survivors
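
One way to confirm that the installation succeeded is to query the installed version with the standard library. This is a small sketch; the helper name ``installed_version`` is an assumption and not part of **survivors**.

```python
from importlib import metadata


def installed_version(package="survivors"):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version is not None else "survivors is not installed")
```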

An alternative installation method uses the source files.
The first step is to download the source files from GitHub::

  git clone https://github.com/iuliivasilev/dev-survivors.git

Alternatively, download and unpack the archive of `the latest published version <https://github.com/iuliivasilev/dev-survivors/releases/>`_. Next, use the command line to go to the **dev-survivors** directory. Finally, complete the manual installation by running the following command::

  python setup.py install


Examples
------------

The user guides in the *doc* and *demonstration* directories provide detailed information on the key concepts of **survivors**.
They also include hands-on examples in the form of `Jupyter notebooks <https://jupyter.org/>`_.
In particular, the library supports the following scenarios:

1. Loading and preparing 9 open medical datasets: GBSG, PBC, SMARTO, SUPPORT2, WUHAN, ACTG, FLCHAIN, ROTT2, FRAMINGHAM.
2. Fitting survival analysis models. The following models are available: a Decision Tree (CRAID), a Bootstrap Ensemble (BootstrapCRAID), and an Adaptive Boosting Ensemble (BoostingCRAID). Each model has a wide range of hyperparameters, which makes it flexible.
3. Predicting the probability and timing of the event. Forecasts can help users classify or rank new patients by the expected severity of their disease.
4. Predicting the individual survival functions and cumulative hazards of patients. Forecasts can be used to support medical decisions and adjust treatments.
5. Visualizing and interpreting dependencies in data.
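
For readers new to the quantities in item 4, the sketch below estimates a survival function (Kaplan-Meier) and a cumulative hazard (Nelson-Aalen) for a single group of censored observations using only NumPy. It deliberately does not use the **survivors** API, which is covered in the notebooks; the function name ``kaplan_meier_nelson_aalen`` and the toy data are assumptions for illustration.

```python
import numpy as np


def kaplan_meier_nelson_aalen(time, event):
    """Estimate S(t) and the cumulative hazard H(t) at each distinct event time.

    time  : observed times
    event : 1 if the event occurred, 0 if the observation is censored
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    event_times = np.unique(time[event])
    surv, cum_hazard = [], []
    s, H = 1.0, 0.0
    for t in event_times:
        n = np.sum(time >= t)             # observations still at risk just before t
        d = np.sum((time == t) & event)   # events observed exactly at t
        s *= 1.0 - d / n                  # Kaplan-Meier product step
        H += d / n                        # Nelson-Aalen sum step
        surv.append(s)
        cum_hazard.append(H)
    return event_times, np.array(surv), np.array(cum_hazard)


times, surv, cum_hazard = kaplan_meier_nelson_aalen(
    time=[2, 3, 3, 5, 8], event=[1, 1, 0, 1, 0]
)
for t, s, H in zip(times, surv, cum_hazard):
    print(f"t={t:g}  S(t)={s:.3f}  H(t)={H:.3f}")
```

Censored observations never trigger a step of their own, but they remain in the at-risk count until their censoring time, which is how both estimators account for incomplete follow-up.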

Help and Support
----------------

Communication
~~~~~~~~~~~~~

- Email: iuliivasilev@gmail.com
- LinkedIn: https://www.linkedin.com/in/iulii-vasilev


Citation
~~~~~~~~~~

If you use **survivors** in a scientific publication, we would appreciate citations:

.. [1] Vasilev, Iulii, Mikhail Petrovskiy, and Igor Mashechkin. "Survival Analysis Algorithms Based on Decision Trees with Weighted Log-rank Criteria." 2022.

.. [2] Vasilev, Iulii, Mikhail Petrovskiy, and Igor Mashechkin. "Sensitivity of Survival Analysis Metrics." Mathematics 11.20 (2023): 4246.

.. [3] Vasilev, Iulii, Mikhail Petrovskiy, and Igor Mashechkin. "Adaptive Sampling for Weighted Log-Rank Survival Trees Boosting." International Conference on Pattern Recognition Applications and Methods. Cham: Springer International Publishing, 2021.

.. _survival analysis: https://en.wikipedia.org/wiki/Survival_analysis



            
