giza-datasets


- Name: giza-datasets
- Version: 0.2.4
- Upload time: 2024-05-10 14:20:41
- Author: Fran Algaba
- Requires Python: <4.0,>=3.11
- License: MIT

# Giza Datasets

Welcome to the Giza Datasets repository. It provides a collection of datasets ready to use for blockchain ML use cases, which you can load as dataframes through the `DatasetsLoader` class.

For detailed information about each dataset provided by Giza, see our [documentation](https://datasets.gizatech.xyz/welcome/giza-datasets). It includes usage examples for each dataset, each schema with a description of every field, the relationships between the datasets, potential use cases, and more.

## Enhanced Features

Explore the robust capabilities of the Giza Datasets repository:

- **Streamlined Dataset Access**: Instantly connect to a curated collection of blockchain datasets, ready for machine learning applications, with no configuration needed.
- **Effortless Data Loading**: Utilize the `DatasetsLoader` class to easily load Parquet files, streamlining your data workflow.
- **Optimized Data Handling**: Leverage the integration with the [polars library](https://www.pola.rs/), designed for efficient manipulation of large datasets. For detailed guidance on using polars for dataset operations, refer to the [polars documentation](https://docs.pola.rs/py-polars/).

## Quick Start

To get started with Giza Datasets, follow the steps below:

1. Install the `giza-datasets` package if you haven't already:
   ```
   pip install giza-datasets
   ```
2. Import the `DatasetsLoader` class and initialize it:
   ```python
   from giza_datasets import DatasetsLoader
   loader = DatasetsLoader()
   ```
3. Optional: Depending on your device's configuration, you may need to provide SSL certificates so that HTTPS connections can be verified. To point Python at the CA bundle shipped with `certifi`, run the following lines:
   ```python
   import certifi
   import os
   os.environ['SSL_CERT_FILE'] = certifi.where()
   ```

4. Load a dataset using the `load` method. For example, to load `tvl-fee-per-protocol`:
   ```python
   df = loader.load('tvl-fee-per-protocol')
   ```
5. To view the loaded dataset, simply print the dataframe:
   ```python
   print(df)
   ```
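
The optional SSL step above can be wrapped in a small helper so that it does not overwrite a certificate path you have already configured. This is a minimal sketch, not part of `giza-datasets`: the function name `configure_ssl_cert_file` and its `environ` parameter are hypothetical, and it assumes `certifi` may or may not be installed.

```python
import importlib.util
import os


def configure_ssl_cert_file(environ=None):
    """Point SSL_CERT_FILE at certifi's CA bundle unless it is already set.

    Returns the path in effect, or None if certifi is not installed.
    `environ` defaults to os.environ; pass a dict to try the helper safely.
    """
    env = os.environ if environ is None else environ

    # Respect an existing setting instead of overwriting it.
    if env.get("SSL_CERT_FILE"):
        return env["SSL_CERT_FILE"]

    # Only import certifi if it is actually available.
    if importlib.util.find_spec("certifi") is None:
        return None

    import certifi

    env["SSL_CERT_FILE"] = certifi.where()
    return env["SSL_CERT_FILE"]
```

Calling `configure_ssl_cert_file()` once before loading data has the same effect as the three lines in step 3 when no certificate path is set yet.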

Start exploring the datasets and building your machine learning models with ease!
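
If you need several datasets at once, the `load` call from step 4 can be repeated in a loop. The sketch below is an illustration under stated assumptions: `load_many` is a hypothetical helper, not part of the library, and it relies only on the `load(name)` method shown above. Because this README does not list the exceptions `load` can raise, the helper catches `Exception` broadly and reports failures per dataset.

```python
def load_many(loader, names):
    """Load several datasets, collecting failures instead of stopping early.

    `loader` is any object with a `load(name)` method, such as a
    `DatasetsLoader`. Returns (frames, errors), two dicts keyed by name.
    """
    frames = {}
    errors = {}
    for name in names:
        try:
            frames[name] = loader.load(name)
        except Exception as exc:  # broad on purpose: exception types are not documented here
            errors[name] = exc
    return frames, errors
```

For example, `frames, errors = load_many(loader, ['tvl-fee-per-protocol'])` returns the loaded dataframe under `frames['tvl-fee-per-protocol']`, or the raised exception under `errors` if loading failed.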

## Datasets Hub

The `DatasetsHub` class provides methods to browse and access datasets:

- `show()`: Prints a table of all datasets in the hub.
- `list()`: Returns a list of all datasets in the hub.
- `get(dataset_name)`: Returns a Dataset object with the given name.
- `describe(dataset_name)`: Prints a table of details for the given dataset.
- `list_tags()`: Prints a list of all tags in the hub.
- `get_by_tag(tag)`: Returns a list of Dataset objects with the given tag.

To get started with the `DatasetsHub` class, follow the steps below:

1. Import the `DatasetsHub` class and initialize it:
   ```python
   from giza_datasets import DatasetsHub
   hub = DatasetsHub()
   ```
2. Use the `show` method to print a table of all datasets in the hub:
   ```python
   hub.show()
   ```
3. Use the `list` method to get a list of all datasets in the hub:
   ```python
   datasets = hub.list()
   print(datasets)
   ```
4. Use the `get` method to get a Dataset object with a given name:
   ```python
   dataset = hub.get('tvl-fee-per-protocol')
   print(dataset)
   ```
5. Use the `describe` method to print a table of details for a given dataset:
   ```python
   hub.describe('tvl-fee-per-protocol')
   ```
6. Use the `list_tags` method to print a list of all tags in the hub:
   ```python
   hub.list_tags()
   ```
7. Use the `get_by_tag` method to get a list of Dataset objects with the given tag:
   ```python
   hub.get_by_tag('Liquidity')
   ```
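
The hub and the loader can be combined, for instance to load every dataset that carries a given tag. The following is a rough sketch rather than a documented workflow: `load_by_tag` is a hypothetical helper, and it assumes each Dataset object exposes its identifier through a `name` attribute, which this README does not confirm. Print a Dataset object first to check which attribute holds the name, and pass it as `name_attr` if it differs.

```python
def load_by_tag(hub, loader, tag, name_attr="name"):
    """Load all datasets returned by hub.get_by_tag(tag).

    Assumes each Dataset object has an attribute (default: `name`,
    an assumption) holding the identifier accepted by loader.load().
    Datasets without that attribute are skipped.
    """
    loaded = {}
    for dataset in hub.get_by_tag(tag):
        name = getattr(dataset, name_attr, None)
        if name is None:
            continue  # attribute missing: cannot map this entry to a loader call
        loaded[name] = loader.load(name)
    return loaded
```

With the objects from the steps above, `load_by_tag(hub, loader, 'Liquidity')` would return a dict mapping dataset names to loaded dataframes.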


## Contributing

We welcome contributions to the Giza Datasets repository. If you have suggestions for improvements or new features, feel free to open an issue or submit a pull request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "giza-datasets",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.11",
    "maintainer_email": null,
    "keywords": null,
    "author": "Fran Algaba",
    "author_email": "f.algaba@outlook.es",
    "download_url": "https://files.pythonhosted.org/packages/23/98/710c157e814676cc2d269b0333cfc2a1f0682bc324ba3bf6c9b783647d9b/giza_datasets-0.2.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": null,
    "version": "0.2.4",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "27af904d804958cf80cb110c12716566fed52e8bfa74094277c06d4851c4ede3",
                "md5": "b279c9ff2fb8b58f027ad01f2e5e7a64",
                "sha256": "88ab28776beae4a61c45058745576d135bc6551518ed05ed39e014805a03de1b"
            },
            "downloads": -1,
            "filename": "giza_datasets-0.2.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b279c9ff2fb8b58f027ad01f2e5e7a64",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.11",
            "size": 18522,
            "upload_time": "2024-05-10T14:20:40",
            "upload_time_iso_8601": "2024-05-10T14:20:40.709126Z",
            "url": "https://files.pythonhosted.org/packages/27/af/904d804958cf80cb110c12716566fed52e8bfa74094277c06d4851c4ede3/giza_datasets-0.2.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2398710c157e814676cc2d269b0333cfc2a1f0682bc324ba3bf6c9b783647d9b",
                "md5": "8dc8fabb2167fb2d282dc8073ac38b02",
                "sha256": "53d8f0ec4a84696dd7c3eb9346f97ba84222ae43f59cba0e98b96f38762e510e"
            },
            "downloads": -1,
            "filename": "giza_datasets-0.2.4.tar.gz",
            "has_sig": false,
            "md5_digest": "8dc8fabb2167fb2d282dc8073ac38b02",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.11",
            "size": 18291,
            "upload_time": "2024-05-10T14:20:41",
            "upload_time_iso_8601": "2024-05-10T14:20:41.855463Z",
            "url": "https://files.pythonhosted.org/packages/23/98/710c157e814676cc2d269b0333cfc2a1f0682bc324ba3bf6c9b783647d9b/giza_datasets-0.2.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-10 14:20:41",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "giza-datasets"
}
        