adaptivepca


Name: adaptivepca
Version: 1.0.0
Home page: https://github.com/nqmn/adaptivepca
Summary: Adaptive PCA with parallel scaling and dimensionality reduction
Upload time: 2024-10-26 18:13:46
Author: Mohd Adil
Requires Python: >=3.6
License: none recorded
Requirements: none recorded
AdaptivePCA
AdaptivePCA is a flexible, scalable Python package that enables dimensionality reduction with PCA, automatically selecting the best scaler and the optimal number of components to meet a specified variance threshold. Built for efficiency, AdaptivePCA includes parallel processing capabilities to speed up large-scale data transformations, making it ideal for data scientists and machine learning practitioners working with high-dimensional datasets.

Features
- Automatic Component Selection: selects the smallest number of principal components that meets a specified variance threshold.
- Scaler Selection: compares multiple scalers (StandardScaler and MinMaxScaler) and keeps the one that fits the data best.
- Parallel Processing: optional concurrent scaling for faster computation.
- Easy Integration: built on widely used libraries such as scikit-learn and numpy.
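To make the selection logic above concrete, here is a minimal sketch in plain scikit-learn of how one might compare scalers and pick the fewest components that reach a variance threshold. This illustrates the idea only; it is not AdaptivePCA's actual implementation, and the helper name is hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler

def select_scaler_and_components(X, variance_threshold=0.95, max_components=10):
    """Try each candidate scaler; keep the one whose PCA reaches the
    variance threshold with the fewest components (hypothetical helper)."""
    best = None
    for name, scaler in [("StandardScaler", StandardScaler()),
                         ("MinMaxScaler", MinMaxScaler())]:
        X_scaled = scaler.fit_transform(X)
        n_max = min(max_components, min(X_scaled.shape))
        pca = PCA(n_components=n_max).fit(X_scaled)
        cumvar = np.cumsum(pca.explained_variance_ratio_)
        # First component count whose cumulative variance meets the threshold
        # (clamped to n_max if the threshold is never reached)
        reached = min(int(np.searchsorted(cumvar, variance_threshold)) + 1, n_max)
        if best is None or reached < best[1]:
            best = (name, reached, float(cumvar[reached - 1]))
    return best  # (scaler name, n_components, cumulative explained variance)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
print(select_scaler_and_components(X))
```

The tie-break here simply keeps the first scaler tried; a real implementation could instead prefer the higher explained variance at equal component counts.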
Installation
You can install AdaptivePCA via pip:

```bash
pip install adaptivepca
```
Usage
Import and Initialize
```python
from adaptivepca import AdaptivePCA
import pandas as pd

# Load your dataset as a pandas DataFrame
X = pd.read_csv("your_data.csv")
```
Basic Usage
Initialize AdaptivePCA and fit it to your data:

```python
# Initialize AdaptivePCA with the desired variance threshold and maximum components
adaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=10)

# Fit and transform the data
X_transformed = adaptive_pca.fit_transform(X)
```
Parallel Processing
For larger datasets, enable parallel processing to speed up computations:

```python
# Fit AdaptivePCA with parallel processing
adaptive_pca.fit(X, parallel=True)
```
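The package only documents the parallel=True flag, so as a rough illustration of what "concurrent scaling" can mean (not AdaptivePCA's actual code), the candidate scalers can be fitted in separate threads with the standard library:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.random.default_rng(1).normal(size=(1000, 20))
scalers = [StandardScaler(), MinMaxScaler()]

# Fit-transform each candidate scaler in its own thread
with ThreadPoolExecutor(max_workers=len(scalers)) as pool:
    scaled = list(pool.map(lambda s: s.fit_transform(X), scalers))

for s, Xs in zip(scalers, scaled):
    print(type(s).__name__, Xs.shape)
```

For scaler fitting the work is NumPy-bound, so threads are a reasonable sketch; a process pool would be the usual choice for heavier, GIL-bound work.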
Accessing Best Parameters
After fitting, you can retrieve the best scaler, number of components, and explained variance score:

python
Copy code
print(f"Best Scaler: {adaptive_pca.best_scaler}")
print(f"Optimal Components: {adaptive_pca.best_n_components}")
print(f"Explained Variance Score: {adaptive_pca.best_explained_variance}")
Parameters
- variance_threshold (float): cumulative explained-variance threshold used for component selection. Default: 0.95.
- max_components (int): maximum number of PCA components to consider. Default: 10.
Methods
- fit(X, parallel=False): fits AdaptivePCA to the dataset X; set parallel=True to enable parallel processing.
- transform(X): transforms X using the previously fitted scaler and component count.
- fit_transform(X): runs fit and transform in one call.
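The fit/transform split follows the familiar scikit-learn estimator convention: fit once on training data, then reuse the fitted configuration on unseen data. A sketch of that pattern with a plain scikit-learn pipeline (an analogy for the workflow, not AdaptivePCA itself):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X_train = rng.normal(size=(80, 5))
X_new = rng.normal(size=(20, 5))

# Fit the scaler and PCA on training data only
pipe = make_pipeline(StandardScaler(), PCA(n_components=2))
pipe.fit(X_train)

# Reuse the fitted configuration on unseen data
Z = pipe.transform(X_new)
print(Z.shape)  # (20, 2)
```

Transforming new data with the already-fitted configuration, rather than refitting, is what keeps train and inference preprocessing consistent.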
Example
```python
from adaptivepca import AdaptivePCA
import pandas as pd

# Example dataset
X = pd.DataFrame({
    'feature1': [1, 2, 3, 4, 5],
    'feature2': [10, 9, 8, 7, 6],
    'feature3': [2, 4, 6, 8, 10]
})

adaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=2)
X_transformed = adaptive_pca.fit_transform(X)

# Retrieve the best configuration details
print(f"Best Scaler: {adaptive_pca.best_scaler}")
print(f"Optimal Components: {adaptive_pca.best_n_components}")
print(f"Explained Variance Score: {adaptive_pca.best_explained_variance}")
```
Dependencies
- scikit-learn>=0.24
- numpy>=1.19
- pandas>=1.1
License
This project is licensed under the MIT License. See the LICENSE file for details.

            
