text2mapdata


- Name: text2mapdata
- Version: 0.1.1
- Home page: https://github.com/papasega/text2embeddingview
- Summary: A python package to map your own csv files data using Atlas from NOMIC
- Upload time: 2023-04-13 04:33:11
- Author: Papa Séga WADE
- License: MIT
- Keywords: embedding, visualization, map, text, csv, search keywords, dynamic
- Requirements: No requirements were recorded.

This is a very simple way to map your text data with [Atlas from Nomic](https://docs.nomic.ai/index.html), using the `click` library for the command-line interface.

You need to create a Nomic account to get an API key.
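
One common way to keep the API key out of scripts and shell history is to read it from an environment variable. The sketch below assumes a variable named `NOMIC_API_KEY`; that name and the helper `read_api_key` are hypothetical and are not something text2mapdata is known to look for.

```python
import os


def read_api_key(env=None, var_name="NOMIC_API_KEY"):
    """Return the API key stored in an environment variable.

    NOMIC_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} to your Nomic API key first.")
    return key
```

Failing early with a clear message is usually easier to debug than an authentication error raised later, in the middle of an upload.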

Atlas enables you to:

- Store, update and organize multi-million point datasets of unstructured text, images and embeddings.
- Visually interact with your datasets from a web browser.
- Run semantic search and vector operations over your datasets.

Use Atlas to:

- Visualize, interact with, collaborate on and share large datasets of text and embeddings.
- Collaboratively clean, tag and label your datasets.
- Build high-availability apps powered by semantic search.
- Understand and debug the latent space your AI model trains on.

# How to use
## Installation

To install the package in a fresh virtual environment, run the following commands:

```bash
python -m venv mymapenv 
source mymapenv/bin/activate
pip install --upgrade pip 
pip install text2mapdata
```

## Supported Transformer Models from Hugging Face 

This project supports a variety of transformer models, including models from the Hugging Face Model Hub and sentence-transformers. Some examples:

- Hugging Face model: `prajjwal1/bert-mini`
- Hugging Face model: `Sahajtomar/french_semantic` (a French model for semantic search embeddings)
- Sentence-Transformers model: `sentence-transformers/all-MiniLM-L6-v2`

Make sure the model you choose is compatible with the project requirements, and set the `--transformer-model-name` option accordingly.

## To map your text/CSV files

```bash
python main.py --transformer-model-name MODEL_NAME --cache_dir CACHE_DIR --batch-size BATCH_SIZE --file-path FILE_PATH
```
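
To make the `FILE_PATH` and `BATCH_SIZE` arguments more concrete, here is a minimal sketch of how a CSV file could be read into records and split into batches before embedding. It is an illustration under assumptions, not the package's actual code: the helper names `load_rows` and `batched` are hypothetical, and the embedding step itself is left out.

```python
import csv
from itertools import islice


def load_rows(file_path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(file_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def batched(items, batch_size):
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

With a batch size of 2, five rows would be processed as batches of 2, 2 and 1 rows. A smaller batch size generally lowers peak memory use at the cost of more model calls.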
Note on `CACHE_DIR`: you can set up the cache location like this:

```bash
export TRANSFORMERS_CACHE=/path_to_your/transformers_cache
```
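
The same variable can also be set from Python, provided this happens before the `transformers` library is imported, since the library reads its cache location when it is first loaded. The sketch below uses a hypothetical helper named `use_cache_dir`; the path you pass in is up to you.

```python
import os
from pathlib import Path


def use_cache_dir(path, env=None):
    """Create `path` if needed and point TRANSFORMERS_CACHE at it."""
    env = os.environ if env is None else env
    cache_dir = Path(path).expanduser().resolve()
    cache_dir.mkdir(parents=True, exist_ok=True)
    env["TRANSFORMERS_CACHE"] = str(cache_dir)
    return cache_dir
```

Call `use_cache_dir(...)` at the top of a script, before any `import transformers` line, so that the setting is already in place when the library loads.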

Feedback is welcome.

            

        