vishuml


- Name: vishuml
- Version: 0.1.6
- Home page: https://github.com/vishuRizz/vishuml-pip-library
- Summary: A machine learning library implementing algorithms from scratch
- Upload time: 2025-08-09 18:32:21
- Author: Vishu pratap
- Requires Python: >=3.7
- Keywords: machine learning, algorithms, classification, regression, clustering, data science
# VishuML

A comprehensive machine learning library implementing fundamental algorithms from scratch in Python. This library provides educational implementations of popular ML algorithms without relying on external ML frameworks like scikit-learn.

## Features

**🎯 sklearn-compatible API** - Works seamlessly with pandas DataFrames and CSV data!

VishuML implements the following machine learning algorithms:

### Supervised Learning

- **Linear Regression** - For continuous target prediction
- **Logistic Regression** - For binary classification
- **K-Nearest Neighbors (KNN)** - For classification and regression
- **Support Vector Machine (SVM)** - For binary classification with linear and RBF kernels
- **Decision Tree** - For classification using CART algorithm
- **Naive Bayes** - Gaussian Naive Bayes for classification
- **Perceptron** - Linear binary classifier

### Unsupervised Learning

- **K-Means Clustering** - For data clustering

### Utilities

- Data splitting (train/test split)
- Evaluation metrics (accuracy, R², MSE)
- Distance functions
- Data normalization
- Confusion matrix
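
To make the evaluation metrics listed above concrete, here is a minimal NumPy sketch of accuracy, MSE, and R². The function names (`accuracy_sketch`, `mse_sketch`, `r2_sketch`) are hypothetical and chosen for illustration; they are not VishuML's own functions and may differ from the library's implementation.

```python
import numpy as np


def accuracy_sketch(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def mse_sketch(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r2_sketch(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot (assumes y_true is not constant)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


print(accuracy_sketch([1, 0, 1, 1], [1, 0, 0, 1]))
print(mse_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(r2_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```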

## Installation

### From PyPI

```bash
pip install vishuml
```

### From Source

```bash
git clone https://github.com/vishuRizz/vishuml-pip-library.git
cd vishuml-pip-library
pip install -e .
```

## Quick Start

### 🚀 Works with pandas DataFrames (Just like sklearn!)

```python
import pandas as pd
from vishuml import LinearRegression, LogisticRegression
from vishuml.utils import train_test_split, r2_score, accuracy_score

# Load your CSV data with pandas
df = pd.read_csv('your_data.csv')
X = df[['feature1', 'feature2', 'feature3']]  # Select features
y = df['target']                               # Select target

# Train-test split (works with DataFrames!)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model (accepts DataFrames!)
model = LinearRegression()
model.fit(X_train, y_train)  # DataFrame input!

# Make predictions (works with DataFrames!)
predictions = model.predict(X_test)
score = model.score(X_test, y_test)
print(f"R² Score: {score:.4f}")

# Classification Example with real data
from vishuml import datasets as ds
X, y = ds.load_iris()

# Convert to DataFrame for realistic workflow
iris_df = pd.DataFrame(X, columns=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'])
iris_df['species'] = y

# sklearn-like feature selection
features = iris_df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
target = (iris_df['species'] == 0).astype(int)  # Binary classification

X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.3)

classifier = LogisticRegression()
classifier.fit(X_train, y_train)  # DataFrame input!
accuracy = classifier.score(X_test, y_test)
print(f"Accuracy: {accuracy:.4f}")
```

### Traditional NumPy Arrays

```python
import numpy as np
from vishuml import LinearRegression, KMeans

# NumPy arrays also work (backward compatibility)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 6, 8, 10])

model = LinearRegression()
model.fit(X, y)
predictions = model.predict([[6], [7]])
print(f"Predictions: {predictions}")  # Should be close to [12, 14]

# Clustering Example
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(k=2, random_state=42)
clusters = kmeans.fit_predict(X)
print(f"Cluster labels: {clusters}")
```

## Algorithm Documentation

### Linear Regression

```python
from vishuml import LinearRegression

# Create and train model
model = LinearRegression(fit_intercept=True)
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

# Get R² score
score = model.score(X_test, y_test)
```
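
For readers curious about what a from-scratch linear regression might do internally, the sketch below fits ordinary least squares with NumPy. It is an illustration under assumptions, not VishuML's actual code: the helper names `fit_linear_sketch` and `predict_linear_sketch` are hypothetical, and the library may use a different solver (for example gradient descent).

```python
import numpy as np


def _design_matrix(X, fit_intercept):
    """Convert X to a float array and optionally prepend a column of ones."""
    X = np.asarray(X, dtype=float)
    if fit_intercept:
        X = np.column_stack([np.ones(len(X)), X])
    return X


def fit_linear_sketch(X, y, fit_intercept=True):
    """Solve min ||Xw - y||^2 with a least-squares solver."""
    A = _design_matrix(X, fit_intercept)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef


def predict_linear_sketch(coef, X, fit_intercept=True):
    """Apply the fitted coefficients to new samples."""
    return _design_matrix(X, fit_intercept) @ coef


coef = fit_linear_sketch([[1], [2], [3], [4], [5]], [2, 4, 6, 8, 10])
linear_preds = predict_linear_sketch(coef, [[6], [7]])
print(linear_preds)  # close to [12, 14] for this perfectly linear data
```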

### Logistic Regression

```python
from vishuml import LogisticRegression

# Create and train model
model = LogisticRegression(learning_rate=0.01, max_iterations=1000)
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)
probabilities = model.predict_proba(X_test)

# Get accuracy
accuracy = model.score(X_test, y_test)
```
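
The `learning_rate` and `max_iterations` parameters suggest an iterative optimizer. As a hedged illustration, the sketch below trains binary logistic regression with batch gradient descent on the log-loss. The function names are hypothetical and the update scheme is an assumption; VishuML's implementation may differ in initialization, stopping criteria, or regularization.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_sketch(X, y, learning_rate=0.01, max_iterations=1000):
    """Batch gradient descent on the mean log-loss (labels must be 0 or 1)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(max_iterations):
        error = sigmoid(X @ weights + bias) - y  # gradient of loss w.r.t. scores
        weights -= learning_rate * (X.T @ error) / n_samples
        bias -= learning_rate * error.mean()
    return weights, bias


X_demo = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y_demo = np.array([0, 0, 1, 1])
w, b = fit_logistic_sketch(X_demo, y_demo, learning_rate=0.1)
probs = sigmoid(X_demo @ w + b)          # analogous to predict_proba
logistic_preds = (probs >= 0.5).astype(int)  # analogous to predict
print(logistic_preds)
```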

### K-Nearest Neighbors

```python
from vishuml import KNearestNeighbors

# For classification
knn_clf = KNearestNeighbors(k=3, task_type='classification')
knn_clf.fit(X_train, y_train)
predictions = knn_clf.predict(X_test)

# For regression
knn_reg = KNearestNeighbors(k=5, task_type='regression')
knn_reg.fit(X_train, y_train)
predictions = knn_reg.predict(X_test)
```
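
The two `task_type` modes differ only in how the neighbors' targets are combined. The sketch below shows one plausible reading, assuming Euclidean distance, majority vote for classification, and the mean for regression. `knn_predict_sketch` is a hypothetical helper, not part of VishuML.

```python
from collections import Counter

import numpy as np


def knn_predict_sketch(X_train, y_train, x_query, k=3, task_type="classification"):
    """Predict one query point from its k nearest training samples."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    distances = np.linalg.norm(X_train - x_query, axis=1)  # Euclidean distance
    nearest = np.argsort(distances)[:k]
    neighbor_targets = [y_train[i] for i in nearest]
    if task_type == "classification":
        # Majority vote among the k neighbors
        return Counter(neighbor_targets).most_common(1)[0][0]
    # Regression: average the neighbors' target values
    return float(np.mean(neighbor_targets))


X_knn = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
print(knn_predict_sketch(X_knn, [0, 0, 0, 1, 1, 1], [0.2, 0.2], k=3))
print(knn_predict_sketch(X_knn, [1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.2, 0.2],
                         k=3, task_type="regression"))
```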

### Support Vector Machine

```python
from vishuml import SupportVectorMachine

# Linear SVM
svm_linear = SupportVectorMachine(C=1.0, kernel='linear')
svm_linear.fit(X_train, y_train)

# RBF SVM
svm_rbf = SupportVectorMachine(C=1.0, kernel='rbf', gamma=1.0)
svm_rbf.fit(X_train, y_train)

predictions = svm_rbf.predict(X_test)
decision_scores = svm_rbf.decision_function(X_test)
```
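
The `kernel` and `gamma` arguments control how similarity between samples is measured. As a sketch, the standard linear kernel is the dot product and the standard RBF kernel is `exp(-gamma * ||x - z||^2)`. The helper names below are hypothetical, and whether VishuML uses exactly this parameterization of `gamma` is an assumption.

```python
import numpy as np


def linear_kernel_sketch(A, B):
    """Gram matrix of dot products between rows of A and rows of B."""
    return A @ B.T


def rbf_kernel_sketch(A, B, gamma=1.0):
    """Gram matrix of exp(-gamma * squared Euclidean distance)."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * (A @ B.T)
    )
    # Clip tiny negative values caused by floating-point round-off
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


A = np.array([[0.0, 0.0], [1.0, 0.0]])
K_rbf = rbf_kernel_sketch(A, A, gamma=1.0)
K_lin = linear_kernel_sketch(A, A)
print(K_rbf)
print(K_lin)
```

A larger `gamma` makes the RBF similarity fall off faster with distance, which tends to produce a more flexible decision boundary.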

### Decision Tree

```python
from vishuml import DecisionTree

# Create and train model
tree = DecisionTree(max_depth=5, min_samples_split=2, min_samples_leaf=1)
tree.fit(X_train, y_train)

# Make predictions
predictions = tree.predict(X_test)
accuracy = tree.score(X_test, y_test)
```
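
CART-style classification trees typically choose splits that minimize Gini impurity. The sketch below computes Gini impurity and searches for the best threshold on a single feature. It assumes the Gini criterion and midpoint thresholds; `gini_sketch` and `best_threshold_sketch` are hypothetical helpers, and VishuML's tree may select splits differently.

```python
import numpy as np


def gini_sketch(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    proportions = counts / counts.sum()
    return float(1.0 - np.sum(proportions ** 2))


def best_threshold_sketch(feature, labels):
    """Return (threshold, weighted impurity) of the best split on one feature."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    best = (None, float("inf"))
    values = np.unique(feature)  # sorted unique feature values
    for low, high in zip(values[:-1], values[1:]):
        threshold = (low + high) / 2.0  # midpoint between neighboring values
        left = labels[feature <= threshold]
        right = labels[feature > threshold]
        weighted = (len(left) * gini_sketch(left)
                    + len(right) * gini_sketch(right)) / len(labels)
        if weighted < best[1]:
            best = (float(threshold), weighted)
    return best


print(gini_sketch([0, 0, 1, 1]))
print(best_threshold_sketch([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1]))
```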

### Naive Bayes

```python
from vishuml import NaiveBayes

# Create and train model
nb = NaiveBayes()
nb.fit(X_train, y_train)

# Make predictions
predictions = nb.predict(X_test)
probabilities = nb.predict_proba(X_test)
```
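
Gaussian Naive Bayes models each feature within each class as an independent normal distribution. The sketch below shows how class probabilities can be derived from per-class means and variances. The function names and the small variance-smoothing constant are assumptions for illustration, not VishuML's internals.

```python
import numpy as np


def fit_gaussian_nb_sketch(X, y):
    """Estimate class priors plus per-class feature means and variances."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small constant avoids division by zero for constant features (assumption)
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances


def predict_proba_gaussian_nb_sketch(params, X):
    """Posterior class probabilities, computed in log space for stability."""
    _, priors, means, variances = params
    X = np.asarray(X, dtype=float)
    diff_sq = (X[:, None, :] - means[None, :, :]) ** 2
    log_lik = -0.5 * (np.log(2.0 * np.pi * variances)[None, :, :]
                      + diff_sq / variances[None, :, :]).sum(axis=2)
    log_joint = np.log(priors)[None, :] + log_lik
    log_joint -= log_joint.max(axis=1, keepdims=True)  # guard against underflow
    probs = np.exp(log_joint)
    return probs / probs.sum(axis=1, keepdims=True)


X_nb = [[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]]
y_nb = [0, 0, 0, 1, 1, 1]
nb_params = fit_gaussian_nb_sketch(X_nb, y_nb)
nb_probs = predict_proba_gaussian_nb_sketch(nb_params, [[1.0], [5.0]])
nb_preds = nb_params[0][nb_probs.argmax(axis=1)]
print(nb_preds)
```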

### Perceptron

```python
from vishuml import Perceptron

# Create and train model
perceptron = Perceptron(learning_rate=0.01, max_iterations=1000)
perceptron.fit(X_train, y_train)

# Make predictions
predictions = perceptron.predict(X_test)
decision_scores = perceptron.decision_function(X_test)
```
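
The classic perceptron learning rule updates the weights only when a sample is misclassified. The following sketch illustrates that rule, assuming 0/1 input labels that are mapped to -1/+1 internally and early stopping once an epoch makes no mistakes. `fit_perceptron_sketch` is a hypothetical name; VishuML's class may handle labels and stopping differently.

```python
import numpy as np


def fit_perceptron_sketch(X, y, learning_rate=0.01, max_iterations=1000):
    """Train a perceptron with the mistake-driven update rule."""
    X = np.asarray(X, dtype=float)
    targets = np.where(np.asarray(y) > 0, 1, -1)  # map {0, 1} to {-1, +1}
    weights = np.zeros(X.shape[1])
    bias = 0.0
    for _ in range(max_iterations):
        mistakes = 0
        for x_i, t_i in zip(X, targets):
            # A non-positive margin means the sample is misclassified
            if t_i * (x_i @ weights + bias) <= 0:
                weights += learning_rate * t_i * x_i
                bias += learning_rate * t_i
                mistakes += 1
        if mistakes == 0:  # converged on linearly separable data
            break
    return weights, bias


X_perc = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, -1.0], [-2.0, -3.0]])
y_perc = np.array([1, 1, 0, 0])
w_perc, b_perc = fit_perceptron_sketch(X_perc, y_perc)
scores = X_perc @ w_perc + b_perc           # analogous to decision_function
perc_preds = (scores >= 0).astype(int)      # analogous to predict
print(perc_preds)
```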

### K-Means Clustering

```python
from vishuml import KMeans

# Create and train model
kmeans = KMeans(k=3, init='k-means++', random_state=42)
kmeans.fit(X)

# Get cluster labels
labels = kmeans.labels
# Or predict for new data
new_labels = kmeans.predict(X_new)

# Transform to distance space
distances = kmeans.transform(X)
```
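
K-Means is usually implemented with Lloyd's algorithm: assign each point to its nearest centroid, recompute centroids as cluster means, and repeat until they stop moving. The sketch below shows those steps and a distance matrix comparable to what a `transform` method might return. It takes explicit initial centroids to stay deterministic; the function names are hypothetical and this is not VishuML's code (which, per the example above, also offers `k-means++` initialization).

```python
import numpy as np


def centroid_distances_sketch(X, centroids):
    """Euclidean distance from every sample to every centroid."""
    return np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)


def kmeans_lloyd_sketch(X, init_centroids, max_iterations=100):
    """Lloyd's algorithm starting from the given centroids."""
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(max_iterations):
        labels = centroid_distances_sketch(X, centroids).argmin(axis=1)
        new_centroids = np.array([
            # Keep the old centroid if a cluster ends up empty
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    labels = centroid_distances_sketch(X, centroids).argmin(axis=1)
    return centroids, labels


X_km = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=float)
km_centroids, km_labels = kmeans_lloyd_sketch(X_km, init_centroids=X_km[[0, 3]])
km_distances = centroid_distances_sketch(X_km, km_centroids)
print(km_labels)
```

The result depends on the starting centroids: a poor initialization can converge to a worse clustering, which is the problem that `k-means++` style initialization is meant to reduce.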

## Utility Functions

```python
from vishuml.utils import (
    train_test_split, accuracy_score, r2_score,
    mean_squared_error, euclidean_distance,
    normalize, confusion_matrix
)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Evaluate predictions
accuracy = accuracy_score(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)

# Normalize features
X_normalized = normalize(X)

# Confusion matrix
cm = confusion_matrix(y_true, y_pred)
```
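
As a rough illustration of two of these utilities, the sketch below shuffles indices for a train/test split and counts a confusion matrix. The names `split_indices_sketch` and `confusion_counts_sketch` are hypothetical. The row/column orientation (rows are true labels, columns are predictions) follows the scikit-learn convention and is an assumption about VishuML's output.

```python
import numpy as np


def split_indices_sketch(n_samples, test_size=0.2, random_state=None):
    """Return (train_indices, test_indices) after a seeded shuffle."""
    rng = np.random.default_rng(random_state)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_size))
    return order[n_test:], order[:n_test]


def confusion_counts_sketch(y_true, y_pred):
    """Count matrix with rows = true labels, columns = predicted labels."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = np.zeros((len(labels), len(labels)), dtype=int)
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth], index[pred]] += 1
    return matrix


train_idx, test_idx = split_indices_sketch(10, test_size=0.2, random_state=42)
cm_demo = confusion_counts_sketch([0, 0, 1, 1, 1], [0, 1, 1, 1, 0])
print(len(train_idx), len(test_idx))
print(cm_demo)
```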

## Sample Datasets

The library includes sample datasets in CSV format:

- `datasets/iris.csv` - Classic iris flower classification dataset
- `datasets/housing.csv` - Housing price regression dataset
- `datasets/wine.csv` - Wine quality classification dataset

```python
import pandas as pd

# Load sample datasets
iris_data = pd.read_csv('datasets/iris.csv')
housing_data = pd.read_csv('datasets/housing.csv')
wine_data = pd.read_csv('datasets/wine.csv')
```

## Examples

Check out the `examples/` directory for Jupyter notebook tutorials demonstrating each algorithm:

- `examples/linear_regression_example.ipynb`
- `examples/logistic_regression_example.ipynb`
- `examples/knn_example.ipynb`
- `examples/svm_example.ipynb`
- `examples/decision_tree_example.ipynb`
- `examples/naive_bayes_example.ipynb`
- `examples/perceptron_example.ipynb`
- `examples/kmeans_example.ipynb`

## Development

### Setup Development Environment

```bash
git clone https://github.com/vishuRizz/vishuml-pip-library.git
cd vishuml-pip-library
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest tests/ -v --cov=vishuml
```

### Code Formatting

```bash
black vishuml/
flake8 vishuml/
```

## Requirements

- Python >= 3.7
- NumPy >= 1.19.0

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## Educational Purpose

This library is designed for educational purposes to help understand how machine learning algorithms work under the hood. For production use, consider using mature libraries like scikit-learn, which are more optimized and feature-complete.

## Author

**Vishu** - [GitHub Profile](https://github.com/vishuRizz)

## Acknowledgments

- Inspired by scikit-learn's API design
- Algorithms implemented based on standard textbook descriptions
- Built for educational and learning purposes


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/vishuRizz/vishuml-pip-library",
    "name": "vishuml",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "machine learning, algorithms, classification, regression, clustering, data science",
    "author": "Vishu pratap",
    "author_email": "vishurizz0@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/59/0d/d5a4499b3f659e82d01f9fe1434e4f3d4cc14e3b09fa8d260dd70bfb23da/vishuml-0.1.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A machine learning library implementing algorithms from scratch",
    "version": "0.1.6",
    "project_urls": {
        "Bug Reports": "https://github.com/vishuRizz/vishuml-pip-library/issues",
        "Documentation": "https://github.com/vishuRizz/vishuml-pip-library#readme",
        "Homepage": "https://github.com/vishuRizz/vishuml-pip-library",
        "Source": "https://github.com/vishuRizz/vishuml-pip-library"
    },
    "split_keywords": [
        "machine learning",
        " algorithms",
        " classification",
        " regression",
        " clustering",
        " data science"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6e2628bbd279e9ff55ecc4c82bcc6fd40c6ae58b31bfaeb8410edeff5e07a80c",
                "md5": "7522e3844effabed36145bad3341c14e",
                "sha256": "1632b13882565717a3b38c280ab182d3c486d984d99f2cab6a4d137265b35fdf"
            },
            "downloads": -1,
            "filename": "vishuml-0.1.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7522e3844effabed36145bad3341c14e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 32059,
            "upload_time": "2025-08-09T18:32:20",
            "upload_time_iso_8601": "2025-08-09T18:32:20.446208Z",
            "url": "https://files.pythonhosted.org/packages/6e/26/28bbd279e9ff55ecc4c82bcc6fd40c6ae58b31bfaeb8410edeff5e07a80c/vishuml-0.1.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "590dd5a4499b3f659e82d01f9fe1434e4f3d4cc14e3b09fa8d260dd70bfb23da",
                "md5": "9c3f4cfa9dd0edc9c0da7345bebe0e32",
                "sha256": "1ebe7032af485f2f30afdb8195b4bd25380620c68ed0ca5384be1f1068c53fb5"
            },
            "downloads": -1,
            "filename": "vishuml-0.1.6.tar.gz",
            "has_sig": false,
            "md5_digest": "9c3f4cfa9dd0edc9c0da7345bebe0e32",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 27381,
            "upload_time": "2025-08-09T18:32:21",
            "upload_time_iso_8601": "2025-08-09T18:32:21.281543Z",
            "url": "https://files.pythonhosted.org/packages/59/0d/d5a4499b3f659e82d01f9fe1434e4f3d4cc14e3b09fa8d260dd70bfb23da/vishuml-0.1.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-09 18:32:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "vishuRizz",
    "github_project": "vishuml-pip-library",
    "github_not_found": true,
    "lcname": "vishuml"
}