DecisionTreeClassifier


- Name: DecisionTreeClassifier
- Version: 0.0.7
- Home page: https://github.com/mlouii/Decision-Tree-Practicum
- Summary: A Decision Tree Classifier.
- Upload time: 2023-04-28 03:56:35
- Authors: Mark Lou, Jobin Joyson
- License: MIT
- Keywords: decision tree
- Requirements: No requirements were recorded.

*First iteration of a decision tree classifier that can handle string variables.*

## Overview

- This is an implementation of a decision tree classifier for both numerical and categorical features. It consists of several classes:

    - DecisionNodeNumerical: Represents a numerical decision node, which holds a feature name, threshold, left and right children, information gain, and null direction.
    - DecisionNodeCategorical: Represents a categorical decision node, which holds a feature name, categories, children, information gain, and null category.
    - LeafNode: Represents a leaf node in the decision tree, which holds the final class value, the number of samples, entropy, and Gini impurity.
    - DecisionTreeClassifier: Main class that implements the decision tree classifier. It has methods to fit the data, predict the class of unseen samples, calculate information gain, and split the data based on the best feature and threshold.

- Within DecisionTreeClassifier, the fit method builds the tree by recursively finding the best split for each node and partitioning the data accordingly. The predict method traverses the tree for each input sample and returns the class label of the leaf node it reaches. The get_best_split method chooses the feature and threshold for each node by maximizing information gain. The class also provides methods for calculating information gain, entropy, and Gini impurity.

- The tree can be built with a specified maximum depth and a minimum number of samples per leaf. The classifier also handles missing values in the input data by routing them to a specified null direction (numerical nodes) or null category (categorical nodes).
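The split criterion described above can be illustrated with a small standalone sketch. This is not the package's code: the function names `entropy`, `gini`, `information_gain`, and `best_numeric_threshold` are hypothetical, and the choice of candidate thresholds (midpoints between sorted distinct values) is an assumption made for the example. It only shows how information gain and a minimum leaf size could interact when picking a numerical threshold.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(parent, left, right, impurity=entropy):
    """Parent impurity minus the size-weighted impurity of the two children."""
    w_left = len(left) / len(parent)
    w_right = len(right) / len(parent)
    return impurity(parent) - (w_left * impurity(left) + w_right * impurity(right))


def best_numeric_threshold(values, labels, min_sample_leaf=1):
    """Return (threshold, gain) for the best split of one numerical feature.

    Candidate thresholds are midpoints between sorted distinct values
    (an assumption of this sketch). Returns (None, 0.0) when no split
    leaves at least min_sample_leaf rows on each side.
    """
    best_threshold, best_gain = None, 0.0
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if len(left) < min_sample_leaf or len(right) < min_sample_leaf:
            continue  # this split would create a leaf that is too small
        gain = information_gain(labels, left, right)
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain


# Two perfectly separable classes: the best cut falls between 2.0 and 3.0.
print(best_numeric_threshold([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]))  # (2.5, 1.0)
```

A full tree builder would repeat this search over every feature at each node and recurse on the two partitions until the maximum depth is reached or no split improves the gain.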

## Example Usage

- Load the categorical data set you want to test into a dataframe, then create a decision tree classifier:
    - `classifier = DecisionTreeClassifier(max_depth, min_sample_leaf)`

- Fit the tree with the training data:
    - `classifier.fit(training_dataframe, target_column_name)`

- See a visual of the tree:
    - `classifier.show_tree()`

- Predict the target values for the testing dataframe, with the target column excluded:
    - `classifier.predict(testing_dataframe_without_target)`
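To make the predict step more concrete, here is a minimal sketch of how a traversal over the three node types from the Overview might look. The class names match the README, but the fields are a reduced subset, and the attribute names, the `predict_one` helper, the `"left"`/`"right"` encoding of the null direction, and the fallback of unseen categories to the null category are all assumptions made for illustration, not the package's actual implementation.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class LeafNode:
    value: Any  # final class label


@dataclass
class DecisionNodeNumerical:
    feature_name: str
    threshold: float
    left: "Node"   # taken when the sample's value <= threshold
    right: "Node"  # taken when the sample's value > threshold
    null_direction: str = "left"  # assumed encoding: "left" or "right"


@dataclass
class DecisionNodeCategorical:
    feature_name: str
    children: Dict[Any, "Node"]  # one child per known category
    null_category: Any = None    # category used when the value is missing


Node = Union[LeafNode, DecisionNodeNumerical, DecisionNodeCategorical]


def predict_one(node, sample):
    """Walk from the root to a leaf for one sample (a dict of feature -> value)."""
    while not isinstance(node, LeafNode):
        value = sample.get(node.feature_name)
        missing = value is None or (isinstance(value, float) and math.isnan(value))
        if isinstance(node, DecisionNodeNumerical):
            if missing:
                go_left = node.null_direction == "left"
            else:
                go_left = value <= node.threshold
            node = node.left if go_left else node.right
        else:
            # Assumption: unseen categories are treated like missing values.
            use_null = missing or value not in node.children
            node = node.children[node.null_category if use_null else value]
    return node.value


# A tiny hand-built tree: split on "color", then on "speed" for green samples.
tree = DecisionNodeCategorical(
    feature_name="color",
    children={
        "red": LeafNode("stop"),
        "green": DecisionNodeNumerical(
            "speed", 30.0, LeafNode("go"), LeafNode("slow"), null_direction="right"
        ),
    },
    null_category="red",
)

print(predict_one(tree, {"color": "green", "speed": 20.0}))  # go
print(predict_one(tree, {"color": "green"}))                 # slow (speed missing)
print(predict_one(tree, {"color": None, "speed": 10.0}))     # stop (color missing)
```

Predicting for a whole dataframe would amount to applying such a walk to each row.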

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/mlouii/Decision-Tree-Practicum",
    "name": "DecisionTreeClassifier",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "decision tree",
    "author": "Mark Lou, Jobin Joyson",
    "author_email": "mlou@hawk.iit.edu, jjoyson1@hawk.iit.edu",
    "download_url": "https://files.pythonhosted.org/packages/68/a2/7b0f39f567331a14962d55d22a5bf98755e7819e9fad03817acab9ee2411/DecisionTreeClassifier-0.0.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Decision Tree Classifier.",
    "version": "0.0.7",
    "split_keywords": [
        "decision",
        "tree"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "588d577be51b9b3fcd4e479a21ac2e1d08cbcb2ac1204d635d7bef324fcdbab7",
                "md5": "464c32eb4cf15d6e1ca3eb05526ae73a",
                "sha256": "92c13272b64fc0c9fdd89b90e0226728cfadf018951f168df45d7c664cf7ae2d"
            },
            "downloads": -1,
            "filename": "DecisionTreeClassifier-0.0.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "464c32eb4cf15d6e1ca3eb05526ae73a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 12422,
            "upload_time": "2023-04-28T03:56:33",
            "upload_time_iso_8601": "2023-04-28T03:56:33.962346Z",
            "url": "https://files.pythonhosted.org/packages/58/8d/577be51b9b3fcd4e479a21ac2e1d08cbcb2ac1204d635d7bef324fcdbab7/DecisionTreeClassifier-0.0.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "68a27b0f39f567331a14962d55d22a5bf98755e7819e9fad03817acab9ee2411",
                "md5": "cc5145b07604ec45cc86ce3cae42b83d",
                "sha256": "93d469ff3cbb3876059ac36fee8c2e051f95570f84c1ab12cb6a10ed3b3a91bc"
            },
            "downloads": -1,
            "filename": "DecisionTreeClassifier-0.0.7.tar.gz",
            "has_sig": false,
            "md5_digest": "cc5145b07604ec45cc86ce3cae42b83d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 12097,
            "upload_time": "2023-04-28T03:56:35",
            "upload_time_iso_8601": "2023-04-28T03:56:35.869164Z",
            "url": "https://files.pythonhosted.org/packages/68/a2/7b0f39f567331a14962d55d22a5bf98755e7819e9fad03817acab9ee2411/DecisionTreeClassifier-0.0.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-28 03:56:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "mlouii",
    "github_project": "Decision-Tree-Practicum",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "decisiontreeclassifier"
}
        