# vega_datasets
A Python package for offline access to [vega datasets](https://github.com/vega/vega-datasets).
This package has several goals:
- Provide straightforward access in Python to the datasets made available at [vega-datasets](https://github.com/vega/vega-datasets).
- Return the results as a pandas DataFrame.
- Wherever dataset size and license constraints allow, bundle the dataset with the package so that it can be loaded without a web connection.

Currently the package bundles a subset of the datasets and falls back to HTTP requests for the others.
## Installation
``vega_datasets`` is compatible with Python 3.5 or newer. Install with:
```
$ pip install vega_datasets
```
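
To confirm that the install succeeded without importing the package, one option is to ask Python's import machinery whether the module can be found. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of ``vega_datasets``.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Prints True when vega_datasets is importable in the current environment.
print(is_installed("vega_datasets"))
```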
## Usage
The main object in this library is ``data``:
```python
>>> from vega_datasets import data
```
It exposes each available dataset as an attribute, loading the bundled local copy
when one exists. For example, here is the well-known iris dataset:
```python
>>> df = data.iris()
>>> df.head()
   petalLength  petalWidth  sepalLength  sepalWidth species
0          1.4         0.2          5.1         3.5  setosa
1          1.4         0.2          4.9         3.0  setosa
2          1.3         0.2          4.7         3.2  setosa
3          1.5         0.2          4.6         3.1  setosa
4          1.4         0.2          5.0         3.6  setosa
```
If you're curious about the source data, you can access the URL for any of the available datasets:
```python
>>> data.iris.url
'https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/iris.json'
```
For datasets bundled with the package, you can also find their location on disk:
```python
>>> data.iris.filepath
'/lib/python3.6/site-packages/vega_datasets/data/iris.json'
```
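
With both a file path and a URL in hand, the raw JSON can also be read directly, preferring the local copy and falling back to an HTTP request. The following is only a sketch of that local-first idea, not the package's own implementation; the function name `load_raw_json` and its parameters are hypothetical.

```python
import json
import os
from urllib.request import urlopen


def load_raw_json(filepath=None, url=None):
    """Load JSON from a local file if it exists, otherwise from a URL."""
    # Prefer the local copy so that no network access is needed.
    if filepath is not None and os.path.exists(filepath):
        with open(filepath, encoding="utf-8") as f:
            return json.load(f)
    # No usable local file: a URL is required for the fallback.
    if url is None:
        raise ValueError("no local file found and no url given")
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

For a bundled dataset such as iris, `load_raw_json(data.iris.filepath, data.iris.url)` would read the file on disk and never touch the network.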
## Available Datasets
To list all the available datasets, use ``list_datasets``:
```python
>>> data.list_datasets()
['7zip', 'airports', 'anscombe', 'barley', 'birdstrikes', 'budget', 'budgets', 'burtin', 'cars', 'climate', 'co2-concentration', 'countries', 'crimea', 'disasters', 'driving', 'earthquakes', 'ffox', 'flare', 'flare-dependencies', 'flights-10k', 'flights-200k', 'flights-20k', 'flights-2k', 'flights-3m', 'flights-5k', 'flights-airport', 'gapminder', 'gapminder-health-income', 'gimp', 'github', 'graticule', 'income', 'iris', 'jobs', 'londonBoroughs', 'londonCentroids', 'londonTubeLines', 'lookup_groups', 'lookup_people', 'miserables', 'monarchs', 'movies', 'normal-2d', 'obesity', 'points', 'population', 'population_engineers_hurricanes', 'seattle-temps', 'seattle-weather', 'sf-temps', 'sp500', 'stocks', 'udistrict', 'unemployment', 'unemployment-across-industries', 'us-10m', 'us-employment', 'us-state-capitals', 'weather', 'weball26', 'wheat', 'world-110m', 'zipcodes']
```
To list local datasets (i.e. those that are bundled with the package and can be used without a web connection), use the ``local_data`` object instead:
```python
>>> from vega_datasets import local_data
>>> local_data.list_datasets()
['airports', 'anscombe', 'barley', 'burtin', 'cars', 'crimea', 'driving', 'iowa-electricity', 'iris', 'seattle-temps', 'seattle-weather', 'sf-temps', 'stocks', 'us-employment', 'wheat']
```
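
Comparing the two listings shows which datasets still need a web connection. The sketch below uses short excerpts of the outputs above as sample input; the helper name `remote_only` is hypothetical and not part of ``vega_datasets``.

```python
def remote_only(all_names, local_names):
    """Return the names in all_names that are not bundled locally, in order."""
    local = set(local_names)
    return [name for name in all_names if name not in local]


# Short excerpts of the two listings shown above, used as sample input.
all_names = ['airports', 'birdstrikes', 'cars', 'flights-2k', 'iris', 'movies']
local_names = ['airports', 'cars', 'iris']

print(remote_only(all_names, local_names))
# ['birdstrikes', 'flights-2k', 'movies']
```

With the package installed, `remote_only(data.list_datasets(), local_data.list_datasets())` would apply the same comparison to the full listings.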
We plan to add more local datasets in the future, subject to size and licensing constraints. See the [local datasets issue](https://github.com/altair-viz/vega_datasets/issues/1) if you would like to help with this.
## Dataset Information
If you want more information about any dataset, you can use the ``description`` property:
```python
>>> data.iris.description
'This classic dataset contains lengths and widths of petals and sepals for 150 iris flowers, drawn from three species. It was introduced by R.A. Fisher in 1936 [1]_.'
```
This information is also part of the ``data.iris`` docstring.
Descriptions are not yet included for every dataset in the package; we hope to add more in the future.
Raw data
{
"_id": null,
"home_page": "http://github.com/altair-viz/vega_datasets",
"name": "vega-datasets",
"maintainer": "Jake VanderPlas",
"docs_url": null,
"requires_python": ">=3.5",
"maintainer_email": "jakevdp@gmail.com",
"keywords": "",
"author": "Jake VanderPlas",
"author_email": "jakevdp@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/8f/a0/ce608d9a5b82fce2ebaa2311136b1e1d1dc2807f501bbdfa56bd174fff76/vega_datasets-0.9.0.tar.gz",
"platform": "",
"bugtrack_url": null,
"license": "MIT",
"summary": "A Python package for offline access to Vega datasets",
"version": "0.9.0",
"project_urls": {
"Bug Reports": "https://github.com/altair-viz/vega_datasets/issues",
"Download": "http://github.com/altair-viz/vega_datasets",
"Homepage": "http://github.com/altair-viz/vega_datasets",
"Source": "https://github.com/altair-viz/vega_datasets"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e69fca52771fe972e0dcc5167fedb609940e01516066938ff2ee28b273ae4f29",
"md5": "f7752c8afa2243230549d7b8c8d2e6b0",
"sha256": "3d7c63917be6ca9b154b565f4779a31fedce57b01b5b9d99d8a34a7608062a1d"
},
"downloads": -1,
"filename": "vega_datasets-0.9.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f7752c8afa2243230549d7b8c8d2e6b0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.5",
"size": 210822,
"upload_time": "2020-11-26T13:56:57",
"upload_time_iso_8601": "2020-11-26T13:56:57.776671Z",
"url": "https://files.pythonhosted.org/packages/e6/9f/ca52771fe972e0dcc5167fedb609940e01516066938ff2ee28b273ae4f29/vega_datasets-0.9.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "8fa0ce608d9a5b82fce2ebaa2311136b1e1d1dc2807f501bbdfa56bd174fff76",
"md5": "5a17b42f507880037f9b7040b75d2e19",
"sha256": "9dbe9834208e8ec32ab44970df315de9102861e4cda13d8e143aab7a80d93fc0"
},
"downloads": -1,
"filename": "vega_datasets-0.9.0.tar.gz",
"has_sig": false,
"md5_digest": "5a17b42f507880037f9b7040b75d2e19",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.5",
"size": 215013,
"upload_time": "2020-11-26T13:56:59",
"upload_time_iso_8601": "2020-11-26T13:56:59.421583Z",
"url": "https://files.pythonhosted.org/packages/8f/a0/ce608d9a5b82fce2ebaa2311136b1e1d1dc2807f501bbdfa56bd174fff76/vega_datasets-0.9.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2020-11-26 13:56:59",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "altair-viz",
"github_project": "vega_datasets",
"travis_ci": true,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "vega-datasets"
}