mdatasets


| Field | Value |
|---|---|
| Name | mdatasets |
| Version | 0.1.4 |
| home_page | https://github.com/sleepingcat4/Mdataset |
| Summary | A one-stop Python library for dataset compilation and processing. |
| upload_time | 2023-12-07 18:23:32 |
| maintainer | |
| docs_url | None |
| author | TAWSIF AHMED |
| requires_python | |
| license | Apache 2.0 |
| keywords | dataset, compilation, processing, python, library |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

Introducing Mdataset (/em-dataset/): a library for researchers and students who want a streamlined way to compile and process high-quality datasets.

### **Why is it necessary?**

Compiling and processing datasets can be a formidable challenge in fields that rely on big data. The difficulty lies either in navigating unfamiliar data banks or in working out how to download a specific dataset. Mdataset addresses these hurdles by providing the tools and methods needed to download existing datasets from well-known sources such as Kaggle and Hugging Face. The only limiting factor is your computational power.

### **What do we offer?**

Mdataset provides a set of wrappers and functions, either wrapping existing tools developed by researchers or providing our own solutions. With about three lines of code, users can download and compile datasets and run various processing tasks. Our offerings include:

- Downloading datasets from high-quality sources
- Video transcription
- Text-to-audio conversion
- User-friendly web scraping tools
- Secure local-to-internet file transfer on demand
- Scraping popular image boards
- Synthetic generation of tabular and text data
- Secure one-on-one interview method for data extraction by researchers
- YouTube video and audio download
- Powerful Optical Character Recognition for extracting data from PDFs
- Unrestricted on-terminal search engine
- Tor route circuits for secure communication

Choose Mdataset for a comprehensive and efficient solution to your dataset compilation and processing needs.

### Quickstart

`pip install mdatasets`

```Python
from mdataset import scrape_data

# Page to scrape
url = "www.example.com"
# Sample text(s) to look for on the page
wanted_list = ["What's the data ca?"]

scrape_data(url, wanted_list)
```

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/sleepingcat4/Mdataset",
    "name": "mdatasets",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "dataset compilation processing python library",
    "author": "TAWSIF AHMED",
    "author_email": "sleeping4cat@outlook.com",
    "download_url": "https://files.pythonhosted.org/packages/be/c8/1db0c6475b3928c07b2606f1a13ae6236b15bf12b7e16cb9eba2b2185916/mdatasets-0.1.4.tar.gz",
    "platform": null,
    "description": "Introducing Mdataset (/em-dataset/): a comprehensive solution tailored for researchers and students seeking a streamlined approach to compiling high-quality datasets and processing. \r\n\r\n### **Why is it necessary?**\r\n\r\nDataset compilation and processing can be a formidable challenge in various fields utilizing big data. The complexities lie in either navigating unfamiliar data banks or grappling with the intricacies of downloading specific datasets. Mdataset addresses these hurdles, providing an all-encompassing solution that equips users with the essential tools and methods needed to download existing datasets from renowned sources like Kaggle and Hugging Face. The only limiting factor is your computational power.\r\n\r\n### **What do we offer?**\r\n\r\nMdataset delivers a set of wrappers and functions, either wrapping existing tools developed by researchers or providing our solutions. With a simple three-line command, users can effortlessly download and compile datasets while also performing various processing tasks. Our offerings include:\r\n\r\n- Downloading datasets from high-quality sources\r\n- Video transcription\r\n- Text-to-audio conversion\r\n- User-friendly web scraping tools\r\n- Secure local-to-internet file transfer on demand\r\n- Scraping popular image boards\r\n- Synthetic generation of tabular and text data\r\n- Secure one-on-one interview method for data extraction by researchers\r\n- YouTube video and audio download\r\n- Powerful Optical Character Recognition for extracting data from PDFs\r\n- Unrestricted on-terminal search engine\r\n- Tor route circuits for secure communication\r\n\r\nChoose Mdataset for a comprehensive and efficient solution to your dataset compilation and processing needs.\r\n\r\n### Quickstart\r\n\r\n`pip install mdataset`\r\n\r\n```Python\r\nfrom mdataset import scrape_data\r\nurl = \"www.example.com'\r\nwanted_list = ['What's the data ca?']\r\nscrape_data(url, wanted_list)\r\n```\r\n",
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "An one-stop Python library for dataset compilation and processing.",
    "version": "0.1.4",
    "project_urls": {
        "Homepage": "https://github.com/sleepingcat4/Mdataset",
        "Source": "https://github.com/sleepingcat4/Mdataset"
    },
    "split_keywords": [
        "dataset",
        "compilation",
        "processing",
        "python",
        "library"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a92b70462f29eb93875de41f4700417d70ebb4e899e66b3a5713537ab8c5747e",
                "md5": "e33150d8c051366a4f269b2edc294f67",
                "sha256": "1804fc7852c4d722016c301d7007ef17b8961a656b293155276fe0821423a657"
            },
            "downloads": -1,
            "filename": "mdatasets-0.1.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e33150d8c051366a4f269b2edc294f67",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 8988,
            "upload_time": "2023-12-07T18:23:26",
            "upload_time_iso_8601": "2023-12-07T18:23:26.368009Z",
            "url": "https://files.pythonhosted.org/packages/a9/2b/70462f29eb93875de41f4700417d70ebb4e899e66b3a5713537ab8c5747e/mdatasets-0.1.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bec81db0c6475b3928c07b2606f1a13ae6236b15bf12b7e16cb9eba2b2185916",
                "md5": "ea74f056d63b5e450cc5cad70e2e9136",
                "sha256": "683919dd3ed58aa09859a2d34dc29d9a01fb15595d6d3042ab42159b9f7d4b56"
            },
            "downloads": -1,
            "filename": "mdatasets-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "ea74f056d63b5e450cc5cad70e2e9136",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 7916,
            "upload_time": "2023-12-07T18:23:32",
            "upload_time_iso_8601": "2023-12-07T18:23:32.380193Z",
            "url": "https://files.pythonhosted.org/packages/be/c8/1db0c6475b3928c07b2606f1a13ae6236b15bf12b7e16cb9eba2b2185916/mdatasets-0.1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-07 18:23:32",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "sleepingcat4",
    "github_project": "Mdataset",
    "github_not_found": true,
    "lcname": "mdatasets"
}
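
The `digests` entries above can be used to check a downloaded file before installing it. The following is a minimal sketch, assuming the source archive `mdatasets-0.1.4.tar.gz` has already been downloaded to the current directory. The helper names `sha256_of_file` and `matches_expected` are hypothetical and are not part of Mdataset. Only Python's standard `hashlib` module is used.

```python
import hashlib
from pathlib import Path

# SHA-256 digest copied from the "digests" entry for mdatasets-0.1.4.tar.gz above.
EXPECTED_SHA256 = "683919dd3ed58aa09859a2d34dc29d9a01fb15595d6d3042ab42159b9f7d4b56"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA-256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("mdatasets-0.1.4.tar.gz")
    if archive.exists():
        print("digest OK" if matches_expected(archive) else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for an 8 KB archive but makes the helper reusable for larger downloads.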
        