| Name | adaptivepca |
| Version | 1.0.0 |
| download | |
| home_page | https://github.com/nqmn/adaptivepca |
| Summary | Adaptive PCA with parallel scaling and dimensionality reduction |
| upload_time | 2024-10-26 18:13:46 |
| maintainer | None |
| docs_url | None |
| author | Mohd Adil |
| requires_python | >=3.6 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
AdaptivePCA
AdaptivePCA is a flexible, scalable Python package that enables dimensionality reduction with PCA, automatically selecting the best scaler and the optimal number of components to meet a specified variance threshold. Built for efficiency, AdaptivePCA includes parallel processing capabilities to speed up large-scale data transformations, making it ideal for data scientists and machine learning practitioners working with high-dimensional datasets.
Features
- Automatic Component Selection: Selects the smallest number of principal components needed to reach a specified variance threshold (a conceptual sketch follows this list).
- Scaler Selection: Compares multiple scalers (StandardScaler and MinMaxScaler) to find the best fit for the data.
- Parallel Processing: Optional concurrent scaling for faster computation.
- Easy Integration: Built on top of widely used libraries such as scikit-learn and numpy.
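To make the component-selection idea concrete, here is a minimal sketch of how a variance-threshold search is commonly done with plain scikit-learn, using the cumulative explained_variance_ratio_. It illustrates the concept only; it is not AdaptivePCA's internal implementation, and the helper name is made up for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def smallest_n_components(X, variance_threshold=0.95, max_components=10):
    """Smallest number of components whose cumulative explained variance
    meets the threshold, or None if the threshold is never reached.
    (Concept sketch only, not AdaptivePCA's actual code.)"""
    X_scaled = StandardScaler().fit_transform(X)
    # PCA cannot return more components than samples or features
    n_max = min(max_components, X_scaled.shape[0], X_scaled.shape[1])
    pca = PCA(n_components=n_max).fit(X_scaled)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    hits = np.where(cumulative >= variance_threshold)[0]
    return int(hits[0]) + 1 if hits.size else None
```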
Installation
You can install AdaptivePCA via pip:
```bash
pip install adaptivepca
```
Usage
Import and Initialize
```python
from adaptivepca import AdaptivePCA
import pandas as pd

# Load your dataset as a pandas DataFrame
X = pd.read_csv("your_data.csv")
```
Basic Usage
Initialize AdaptivePCA and fit it to your data:
```python
# Initialize AdaptivePCA with the desired variance threshold and maximum components
adaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=10)

# Fit and transform the data
X_transformed = adaptive_pca.fit_transform(X)
```
Parallel Processing
For larger datasets, enable parallel processing to speed up computations:
```python
# Fit AdaptivePCA with parallel processing enabled
adaptive_pca.fit(X, parallel=True)
```
Accessing Best Parameters
After fitting, you can retrieve the best scaler, number of components, and explained variance score:
```python
print(f"Best Scaler: {adaptive_pca.best_scaler}")
print(f"Optimal Components: {adaptive_pca.best_n_components}")
print(f"Explained Variance Score: {adaptive_pca.best_explained_variance}")
```
Parameters
- variance_threshold (float): Desired variance threshold for component selection. Default is 0.95.
- max_components (int): Maximum number of PCA components to consider. Default is 10.
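For example, to retain more variance you might tighten the threshold and widen the search. This is only an illustration of the two constructor parameters, with values chosen arbitrarily:

```python
# Keep at least 99% of the variance, searching up to 50 components
# (PCA never returns more components than features, so max_components
# is effectively capped by the width of your dataset)
adaptive_pca = AdaptivePCA(variance_threshold=0.99, max_components=50)
```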
Methods
- fit(X, parallel=False): Fits AdaptivePCA to the dataset X. Set parallel=True to enable parallel processing.
- transform(X): Transforms the dataset X using the previously fitted configuration (a fit-then-transform sketch follows this list).
- fit_transform(X): Combines the fit and transform steps in a single call.
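A common pattern is to learn the scaler and component count on training data with fit, then reuse that configuration on held-out data with transform. The sketch below assumes the methods behave as described above; train_test_split comes from scikit-learn (already a dependency), and the file name is a placeholder.

```python
from sklearn.model_selection import train_test_split
from adaptivepca import AdaptivePCA
import pandas as pd

X = pd.read_csv("your_data.csv")  # placeholder file name
X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

adaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=10)
adaptive_pca.fit(X_train)                          # learn scaler + component count on the training split
X_train_reduced = adaptive_pca.transform(X_train)  # apply the fitted configuration
X_test_reduced = adaptive_pca.transform(X_test)    # reuse it on unseen data
```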
Example
```python
from adaptivepca import AdaptivePCA
import pandas as pd

# Example dataset
X = pd.DataFrame({
    'feature1': [1, 2, 3, 4, 5],
    'feature2': [10, 9, 8, 7, 6],
    'feature3': [2, 4, 6, 8, 10]
})

adaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=2)
X_transformed = adaptive_pca.fit_transform(X)

# Retrieve the best configuration details
print(f"Best Scaler: {adaptive_pca.best_scaler}")
print(f"Optimal Components: {adaptive_pca.best_n_components}")
print(f"Explained Variance Score: {adaptive_pca.best_explained_variance}")
```
Dependencies
- scikit-learn>=0.24
- numpy>=1.19
- pandas>=1.1
License
This project is licensed under the MIT License. See the LICENSE file for details.
Raw data
{
"_id": null,
"home_page": "https://github.com/nqmn/adaptivepca",
"name": "adaptivepca",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": null,
"author": "Mohd Adil",
"author_email": "mohdadil@live.com",
"download_url": "https://files.pythonhosted.org/packages/64/9c/557c3706c1978df838ab5ba8d06dc6954a28a74f2905a9c9ca411463cd0d/adaptivepca-1.0.0.tar.gz",
"platform": null,
"description": "AdaptivePCA\r\nAdaptivePCA is a flexible, scalable Python package that enables dimensionality reduction with PCA, automatically selecting the best scaler and the optimal number of components to meet a specified variance threshold. Built for efficiency, AdaptivePCA includes parallel processing capabilities to speed up large-scale data transformations, making it ideal for data scientists and machine learning practitioners working with high-dimensional datasets.\r\n\r\nFeatures\r\nAutomatic Component Selection: Automatically selects the optimal number of principal components based on a specified variance threshold.\r\nScaler Selection: Compares multiple scalers (StandardScaler and MinMaxScaler) to find the best fit for the data.\r\nParallel Processing: Option to use concurrent scaling for faster computations.\r\nEasy Integration: Built on top of widely-used libraries like scikit-learn and numpy.\r\nInstallation\r\nYou can install AdaptivePCA via pip:\r\n\r\nbash\r\nCopy code\r\npip install adaptivepca\r\nUsage\r\nImport and Initialize\r\npython\r\nCopy code\r\nfrom adaptivepca import AdaptivePCA\r\nimport pandas as pd\r\n\r\n# Load your dataset\r\nX = pd.read_csv(\"your_data.csv\") # Ensure your dataset is loaded as a Pandas DataFrame\r\nBasic Usage\r\nInitialize AdaptivePCA and fit it to your data:\r\n\r\npython\r\nCopy code\r\n# Initialize AdaptivePCA with desired variance threshold and maximum components\r\nadaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=10)\r\n\r\n# Fit and transform data\r\nX_transformed = adaptive_pca.fit_transform(X)\r\nParallel Processing\r\nFor larger datasets, enable parallel processing to speed up computations:\r\n\r\npython\r\nCopy code\r\n# Fit AdaptivePCA with parallel processing\r\nadaptive_pca.fit(X, parallel=True)\r\nAccessing Best Parameters\r\nAfter fitting, you can retrieve the best scaler, number of components, and explained variance score:\r\n\r\npython\r\nCopy code\r\nprint(f\"Best Scaler: {adaptive_pca.best_scaler}\")\r\nprint(f\"Optimal Components: {adaptive_pca.best_n_components}\")\r\nprint(f\"Explained Variance Score: {adaptive_pca.best_explained_variance}\")\r\nParameters\r\nvariance_threshold (float): Desired variance threshold for component selection. Default is 0.95.\r\nmax_components (int): Maximum number of PCA components to consider. Default is 10.\r\nMethods\r\nfit(X, parallel=False): Fits AdaptivePCA to the dataset X. Use parallel=True to enable parallel processing.\r\ntransform(X): Transforms the dataset X using the previously fitted configuration.\r\nfit_transform(X): Combines fit and transform steps in one call.\r\nExample\r\npython\r\nCopy code\r\nfrom adaptivepca import AdaptivePCA\r\nimport pandas as pd\r\n\r\n# Example dataset\r\nX = pd.DataFrame({\r\n 'feature1': [1, 2, 3, 4, 5],\r\n 'feature2': [10, 9, 8, 7, 6],\r\n 'feature3': [2, 4, 6, 8, 10]\r\n})\r\n\r\nadaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=2)\r\nX_transformed = adaptive_pca.fit_transform(X)\r\n\r\n# Retrieve best configuration details\r\nprint(f\"Best Scaler: {adaptive_pca.best_scaler}\")\r\nprint(f\"Optimal Components: {adaptive_pca.best_n_components}\")\r\nprint(f\"Explained Variance Score: {adaptive_pca.best_explained_variance}\")\r\nDependencies\r\nscikit-learn>=0.24\r\nnumpy>=1.19\r\npandas>=1.1\r\nLicense\r\nThis project is licensed under the MIT License. See the LICENSE file for details.\r\n",
"bugtrack_url": null,
"license": null,
"summary": "Adaptive PCA with parallel scaling and dimensionality reduction",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/nqmn/adaptivepca"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "67cd315289b64e2f638ded3d2836f86dbb6af1b1cf12fd77c090b1c32e64b293",
"md5": "70da688e7e6112ee07d196462ac31ab4",
"sha256": "27aa9874b0298934e933f8d187f27bb03fc31bdc0ca3d958490b6c6ef675e55e"
},
"downloads": -1,
"filename": "adaptivepca-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "70da688e7e6112ee07d196462ac31ab4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 4828,
"upload_time": "2024-10-26T18:13:44",
"upload_time_iso_8601": "2024-10-26T18:13:44.926786Z",
"url": "https://files.pythonhosted.org/packages/67/cd/315289b64e2f638ded3d2836f86dbb6af1b1cf12fd77c090b1c32e64b293/adaptivepca-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "649c557c3706c1978df838ab5ba8d06dc6954a28a74f2905a9c9ca411463cd0d",
"md5": "c646106d55db95bff8d4a4a9924f326a",
"sha256": "dfceba8f71f43db78c40aeb82d1e06acb668b55a75ad2e9ae096f08bb858865c"
},
"downloads": -1,
"filename": "adaptivepca-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "c646106d55db95bff8d4a4a9924f326a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 4471,
"upload_time": "2024-10-26T18:13:46",
"upload_time_iso_8601": "2024-10-26T18:13:46.638868Z",
"url": "https://files.pythonhosted.org/packages/64/9c/557c3706c1978df838ab5ba8d06dc6954a28a74f2905a9c9ca411463cd0d/adaptivepca-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-26 18:13:46",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nqmn",
"github_project": "adaptivepca",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "adaptivepca"
}