- Name: ragdata
- Version: 0.1.0
- Home page: https://github.com/neuml/ragdata
- Summary: Build knowledge bases for RAG
- Upload time: 2024-12-18 13:25:19
- Author: NeuML
- Requires Python: >=3.9
- License: Apache 2.0: http://www.apache.org/licenses/LICENSE-2.0
- Keywords: search, embedding, machine-learning, nlp

# ragdata: Build knowledge bases for RAG

<p align="center">
    <a href="https://github.com/neuml/ragdata/releases">
        <img src="https://img.shields.io/github/release/neuml/ragdata.svg?style=flat&color=success" alt="Version"/>
    </a>
    <a href="https://github.com/neuml/ragdata/releases">
        <img src="https://img.shields.io/github/release-date/neuml/ragdata.svg?style=flat&color=blue" alt="GitHub Release Date"/>
    </a>
    <a href="https://github.com/neuml/ragdata/issues">
        <img src="https://img.shields.io/github/issues/neuml/ragdata.svg?style=flat&color=success" alt="GitHub issues"/>
    </a>
    <a href="https://github.com/neuml/ragdata">
        <img src="https://img.shields.io/github/last-commit/neuml/ragdata.svg?style=flat&color=blue" alt="GitHub last commit"/>
    </a>
</p>

`ragdata` builds knowledge bases for Retrieval Augmented Generation (RAG).

This project provides processes for building [txtai](https://github.com/neuml/txtai) embeddings databases from common datasets.

The currently supported datasets are:

- [ArXiv](https://huggingface.co/NeuML/txtai-arxiv)
- [Wikipedia](https://huggingface.co/NeuML/txtai-wikipedia)

Each link above has full instructions for building that dataset, including how to use this project.
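
The resulting databases are regular txtai embeddings indexes. As a minimal sketch, assuming `txtai` is installed and following the loading pattern shown on the linked model cards, a prebuilt index can be pulled from the Hugging Face Hub. The helper name `load_index` is illustrative and not part of `ragdata`.

```python
def load_index(container="neuml/txtai-wikipedia"):
    """Load a prebuilt txtai embeddings database from the Hugging Face Hub."""
    # Imported lazily so this helper can be defined without txtai installed
    from txtai.embeddings import Embeddings

    embeddings = Embeddings()
    # Fetches the index files on first use, which can be a large download
    embeddings.load(provider="huggingface-hub", container=container)
    return embeddings
```

With the index loaded, a call such as `load_index().search("Roman Empire", 3)` would run a query against it.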

## Installation
The easiest way to install is via pip and PyPI:

```
pip install ragdata
```

Python 3.9+ is supported. Using a Python [virtual environment](https://docs.python.org/3/library/venv.html) is recommended.

`ragdata` can also be installed directly from GitHub to access the latest, unreleased features.

```
pip install git+https://github.com/neuml/ragdata
```
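
After either install, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical and not part of `ragdata`.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# ragdata requires Python 3.9 or later
python_ok = sys.version_info >= (3, 9)
print("Python 3.9+:", python_ok)
print("ragdata version:", installed_version("ragdata"))
```

A result of `None` for the version suggests the install went into a different environment than the one running the script.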

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/neuml/ragdata",
    "name": "ragdata",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "search embedding machine-learning nlp",
    "author": "NeuML",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/f1/a8/0b39b17362e5fac98d28caa2f499264f4a47c54c46de9a00c804e51fed7f/ragdata-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache 2.0: http://www.apache.org/licenses/LICENSE-2.0",
    "summary": "Build knowledge bases for RAG",
    "version": "0.1.0",
    "project_urls": {
        "Documentation": "https://github.com/neuml/ragdata",
        "Homepage": "https://github.com/neuml/ragdata",
        "Issue Tracker": "https://github.com/neuml/ragdata/issues",
        "Source Code": "https://github.com/neuml/ragdata"
    },
    "split_keywords": [
        "search",
        "embedding",
        "machine-learning",
        "nlp"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1cfabfd461b487b55e75d9064a2feab69cc60be2b54f8cecc11e4070c199ac2b",
                "md5": "51bf33d9470d7052cbe64d09bcdd565b",
                "sha256": "0f20c393d1ac95c33c424ae89d90aa75050e96c0a980f12695b132f9daee079d"
            },
            "downloads": -1,
            "filename": "ragdata-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "51bf33d9470d7052cbe64d09bcdd565b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 12842,
            "upload_time": "2024-12-18T13:25:18",
            "upload_time_iso_8601": "2024-12-18T13:25:18.176004Z",
            "url": "https://files.pythonhosted.org/packages/1c/fa/bfd461b487b55e75d9064a2feab69cc60be2b54f8cecc11e4070c199ac2b/ragdata-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f1a80b39b17362e5fac98d28caa2f499264f4a47c54c46de9a00c804e51fed7f",
                "md5": "a492f9aa82e4c1595e34d97e63fd55d4",
                "sha256": "8e73ede9d2f235821fed8d27bbcaa2627fcbf3992b08273251f3dfa6ce10dd71"
            },
            "downloads": -1,
            "filename": "ragdata-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a492f9aa82e4c1595e34d97e63fd55d4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 11863,
            "upload_time": "2024-12-18T13:25:19",
            "upload_time_iso_8601": "2024-12-18T13:25:19.542764Z",
            "url": "https://files.pythonhosted.org/packages/f1/a8/0b39b17362e5fac98d28caa2f499264f4a47c54c46de9a00c804e51fed7f/ragdata-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-18 13:25:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "neuml",
    "github_project": "ragdata",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "ragdata"
}
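
The `digests` entries in the record above can be used to check a manually downloaded file. The following is a minimal sketch using the standard library. The helper names `sha256_of` and `matches_digest` are hypothetical and are not provided by `ragdata` or PyPI.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected):
    """Compare a file's SHA-256 digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected.strip().lower()
```

For the source distribution, `expected` would be the `sha256` value listed for `ragdata-0.1.0.tar.gz` in the `urls` array.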
        