camel-learn

:Name: camel-learn
:Version: 1.1.2
:Home page: https://github.com/ymlasu/CAMEL
:Summary: The official implementation for CAMEL: Curvature Augmented Manifold Embedding and Learning
:Upload time: 2024-06-19 16:41:23
:Author: Yongming Liu
:Requires Python: >=3.9
:Keywords: camel, dimension reduction, umap, tsne, trimap, pacmap, largevis, smallvis, manifold

.. -*- mode: rst -*-

.. image:: docs/Camel_logo.png
  :width: 600
  :alt: CAMELlogo
  :align: center

|pypi_version|_ 

.. |pypi_version| image:: https://img.shields.io/pypi/v/camel-learn.svg
.. _pypi_version: https://pypi.python.org/pypi/camel-learn/

#################################################################
Curvature Augmented Manifold Embedding and Learning -- CAMEL
#################################################################

CAMEL is a Python tool for dimension reduction and data visualization. It can perform unsupervised, supervised, semi-supervised, metric, and inverse learning.

---------------------------
Theory and Reference
---------------------------
Detailed derivations and examples can be found in the arXiv paper:
https://arxiv.org/abs/2403.14813

Detailed documentation and examples can be found at https://camel-learn.readthedocs.io/en/latest/

-----------
Installing
-----------

CAMEL Requirements:

* Python 3.9 or greater
* numpy
* scikit-learn
* numba
* annoy
* pandas

Recommended packages:

* For plotting
   * matplotlib
* For metrics evaluation
   * gap statistics, coranking, optics

**Install Options**

.. code:: bash

     pip install camel-learn

If pip has difficulty pulling the dependencies, install them manually with Anaconda
first. The author has tested Anaconda on macOS 14 with M1 and M2 CPUs.




-----------------
How to use CAMEL
-----------------

The camel package is inspired by and built on several dimension reduction packages, such as UMAP, TriMAP, and PaCMAP, which follow the conventions of sklearn classes. The CAMEL API therefore uses a similar calling format.

1. There is only one class, CAMEL().
2. fit(X, y) and fit_transform(X, y) train the embedding and construct a "model". X is the input feature data, and y is the input label data. y is optional and may contain missing/NaN entries. These methods are mainly used for unsupervised, supervised, and semi-supervised learning.
3. transform(Xnew, basis) embeds new test data Xnew using a model constructed from the basis dataset. The basis data is optional. This method is mainly used for metric learning, where the metric model has already been learned from training data, whether by supervised, unsupervised, or semi-supervised learning.
4. inverse_transform(ynews, X, y) is used for inverse embedding and dimension augmentation from low to high dimensions. It assumes that you have a forward embedding constructed from training data X (basis features) and y (the embedding of the basis features). The process can then be reversed by constructing a feature-space vector from a new, unseen point in the low-dimensional space. This is analogous to a generative model that samples from a latent space in ML.
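
The calls listed above can be strung together as in the following minimal sketch. It is not taken from the CAMEL sources: the method names (in particular ``inverse_transform``), the argument order, and the use of ``random_state`` as a constructor keyword are assumptions read off this README and have not been checked against the installed release. ``mask_labels`` and ``camel_workflow`` are hypothetical helper names introduced only for illustration.

.. code:: python

    import math
    import random


    def mask_labels(labels, keep_fraction, seed=0):
        """Return a copy of `labels` in which only `keep_fraction` of the
        entries are kept; all other entries are replaced by NaN."""
        n = len(labels)
        n_keep = round(keep_fraction * n)
        rng = random.Random(seed)
        keep = set(rng.sample(range(n), n_keep))
        return [float(v) if i in keep else math.nan for i, v in enumerate(labels)]


    def camel_workflow(X_train, y_partial, X_new, random_state=0):
        """Hypothetical end-to-end use of fit_transform, transform, and
        inverse_transform (signatures assumed from the list above)."""
        # Imported lazily so that mask_labels works without these packages.
        import numpy as np
        from camel import CAMEL

        reducer = CAMEL(random_state=random_state)
        # Semi-supervised fit: NaN entries in y mark unlabeled samples.
        Y_train = reducer.fit_transform(X_train, np.asarray(y_partial, dtype=float))
        # Embed unseen samples with the learned model (basis left at its default).
        Y_new = reducer.transform(X_new)
        # Map low-dimensional points back to feature space, using the training
        # features and their embedding as the basis.
        X_back = reducer.inverse_transform(Y_new, X_train, Y_train)
        return Y_train, Y_new, X_back

For example, ``mask_labels(y, keep_fraction=0.1)`` keeps roughly 10% of the labels and hides the rest, which is one simple way to simulate a semi-supervised setting before calling ``camel_workflow``.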

CAMEL is easy to get started with. A basic unsupervised learning job, including plotting, takes only a few lines of code:

.. code:: python

    import matplotlib.pyplot as plt
    from camel import CAMEL
    from sklearn import datasets

    X, y = datasets.make_swiss_roll(n_samples=50000, random_state=None)

    reducer = CAMEL()

    X_embedding = reducer.fit_transform(X)

    y = y.astype(int)  # convert to integer categories for easier visualization

    # Visualization

    plt.figure(1)
    plt.scatter(X_embedding[:, 0], X_embedding[:, 1], c=y, cmap='jet', s=0.2)
    plt.title('CAMEL Embedding')
    plt.tight_layout()
    plt.show()


Once done, you will see the 2D embedding of the 3D Swiss Roll.

.. image:: docs/swiss_roll_unsupervised.png
  :width: 600
  :alt: swiss_roll_unsupervised
  :align: center

Simple code examples are available in the test folder of the repository (more coming).

=====
API
=====
Several parameters control CAMEL's results and performance. Default values are provided so that you can start quickly. The main parameters for fine-tuning CAMEL are described below.

- ``n_components``: int, default=2
        Dimension of the embedded space. Typical values are 2 or 3, but any positive integer can be used.

- ``n_neighbors``: int, default=10
        Number of nearest neighbors used to form neighbor pairs for local structure preservation.

- ``FP_number``: float, default=20
        Number of far points per node (e.g., 20 far pairs per node).
        Far pairs are used for both local and global structure preservation.

- ``tail_coe``: float, default=0.05
        Controls the attractive force of neighbors, ``1/(1+tail_coe*dij)**2``. Smaller values give a flatter tail. Changing it is not recommended.

- ``w_neighbors``: float, default=1.0
        Weight coefficient for the attractive force of neighbors. Larger values give a stronger force at the same distance.

- ``w_curv``: float, default=0.001
        Weight coefficient for the attractive/repulsive force due to local curvature. Larger values give a stronger force at the same distance.

- ``w_FP``: float, default=20
        Weight coefficient for the repulsive force of far points. Larger values give a stronger force at the same distance.

- ``lr``: float, default=1.0
        Learning rate of the Adam optimizer used for the embedding. Changing it is not recommended.

- ``num_iters``: int, default=400
        Number of iterations for optimizing the embedding. 200 is observed to be sufficient in most cases; 400 is used as a safe default.

- ``target_weight``: float, default=0.5
        Weight factor for the target/label in supervised learning. 0 means no weight, which reduces to unsupervised learning;
        1 means infinite weight (set to a large value in practice).

- ``random_state``: int, optional
        Random state for the CAMEL instance.
        Setting a random state is useful for repeatability.
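
To get a feel for ``tail_coe``, the kernel quoted in its description can be evaluated directly. The sketch below only reproduces the formula ``1/(1+tail_coe*dij)**2`` as written in this README; ``neighbor_kernel`` is a hypothetical helper name and is not part of the CAMEL API, and how the library combines this kernel with the weights is not shown here.

.. code:: python

    def neighbor_kernel(d_ij, tail_coe=0.05):
        """Evaluate 1 / (1 + tail_coe * d_ij)**2 for a pairwise distance d_ij."""
        return 1.0 / (1.0 + tail_coe * d_ij) ** 2


    # Compare how quickly the kernel decays for a few tail_coe values.
    for tail_coe in (0.01, 0.05, 0.2):
        values = [neighbor_kernel(d, tail_coe) for d in (0.0, 5.0, 20.0)]
        print(tail_coe, [round(v, 3) for v in values])

At zero distance the kernel equals 1 for every ``tail_coe``. At a fixed positive distance, a smaller ``tail_coe`` leaves the value closer to 1, which is the flatter tail mentioned in the parameter description.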



Other settings can be found in the source code and will be covered in future documentation.
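
The documented defaults can be collected in one place and overridden selectively, as in the sketch below. It assumes that ``CAMEL()`` accepts the parameters listed in this section as keyword arguments, in the usual sklearn style; this has not been verified against the source code. ``DOCUMENTED_DEFAULTS``, ``make_params``, and ``build_reducer`` are hypothetical names introduced for illustration.

.. code:: python

    # Defaults exactly as documented in the parameter list of this README.
    DOCUMENTED_DEFAULTS = {
        "n_components": 2,
        "n_neighbors": 10,
        "FP_number": 20,
        "tail_coe": 0.05,
        "w_neighbors": 1.0,
        "w_curv": 0.001,
        "w_FP": 20,
        "lr": 1.0,
        "num_iters": 400,
        "target_weight": 0.5,
    }


    def make_params(**overrides):
        """Merge overrides into the documented defaults, rejecting unknown names."""
        unknown = set(overrides) - set(DOCUMENTED_DEFAULTS) - {"random_state"}
        if unknown:
            raise ValueError(f"undocumented parameter(s): {sorted(unknown)}")
        return {**DOCUMENTED_DEFAULTS, **overrides}


    def build_reducer(**overrides):
        """Create a CAMEL instance (assumes keyword-argument construction)."""
        from camel import CAMEL  # imported lazily; requires camel-learn

        return CAMEL(**make_params(**overrides))

For instance, ``build_reducer(n_components=3, num_iters=200, random_state=42)`` would request a 3D embedding with the shorter iteration count mentioned above and a fixed seed for repeatability.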

            
