<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
<img src="https://i.ibb.co/6bHMXK7/logo.png"/>
`istatapi` is a Python interface to discover and retrieve data from
the ISTAT API (ISTAT is the Italian National Institute of Statistics).
The library is designed to explore ISTAT metadata and to retrieve data
in different formats. `istatapi` is built on top of the [ISTAT SDMX RESTful
API](https://developers.italia.it/it/api/istat-sdmx-rest).
Whether you are an existing organization, a curious individual or an
academic researcher, `istatapi` aims to allow you to easily access ISTAT
databases with just a few lines of code. The library implements
functions to:
- Explore all available ISTAT datasets (dataflows in SDMX terminology)
- Search available datasets by keywords
- Retrieve information on a specific dataset, such as the ID of the
  dataflow, the names and available values of its dimensions, and the
  available filters.
- Get the data of an available dataset as a pandas DataFrame, or in CSV
  or JSON format.
## Install
You can easily install the library by using the pip command:
`pip install istatapi`
## Tutorial
First, let’s simply import the modules we need:
``` python
from istatapi import discovery, retrieval
import matplotlib.pyplot as plt
```
With `istatapi` we can list all the available datasets by calling the
following function:
``` python
discovery.all_available()
```
<div>
| | df_id | version | df_description | df_structure_id |
|-----|----------|---------|------------------------------------------------------------|----------------------|
| 0 | 101_1015 | 1.3 | Crops | DCSP_COLTIVAZIONI |
| 1 | 101_1030 | 1.0 | PDO, PGI and TSG quality products | DCSP_DOPIGP |
| 2 | 101_1033 | 1.0 | slaughtering | DCSP_MACELLAZIONI |
| 3 | 101_1039 | 1.2 | Agritourism - municipalities | DCSP_AGRITURISMO_COM |
| 4 | 101_1077 | 1.0 | PDO, PGI and TSG products: operators - municipalities data | DCSP_DOPIGP_COM |
</div>
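Since the result shown above is a DataFrame, ordinary pandas filtering should apply to it. The sketch below does not call the API: it rebuilds the five rows from the output above as a stand-in (the column dtypes are an assumption) and keeps only the datasets whose description mentions municipalities.

``` python
import pandas as pd

# Stand-in for the DataFrame returned by discovery.all_available();
# the rows are copied from the output shown above.
datasets = pd.DataFrame(
    {
        "df_id": ["101_1015", "101_1030", "101_1033", "101_1039", "101_1077"],
        "version": ["1.3", "1.0", "1.0", "1.2", "1.0"],
        "df_description": [
            "Crops",
            "PDO, PGI and TSG quality products",
            "slaughtering",
            "Agritourism - municipalities",
            "PDO, PGI and TSG products: operators - municipalities data",
        ],
        "df_structure_id": [
            "DCSP_COLTIVAZIONI",
            "DCSP_DOPIGP",
            "DCSP_MACELLAZIONI",
            "DCSP_AGRITURISMO_COM",
            "DCSP_DOPIGP_COM",
        ],
    }
)

# Keep only the datasets whose description mentions municipalities
is_municipal = datasets["df_description"].str.contains("municipalities", case=False)
municipal = datasets[is_municipal]
print(municipal[["df_id", "df_description"]])
```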
You can also search for datasets by keyword (in this example, datasets
on imports):
``` python
discovery.search_dataset("import")
```
<div>
| | df_id | version | df_description | df_structure_id |
|-----|---------|---------|------------------------------------------------------|-------------------|
| 10 | 101_962 | 1.0 | Livestock import export | DCSP_LIVESTIMPEXP |
| 47 | 139_176 | 1.0 | Import and export by country and commodity Nace 2007 | DCSP_COEIMPEX1 |
| 49 | 143_222 | 1.0 | Import price index - monthly data | DCSC_PREIMPIND |
</div>
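Rather than copying an ID by hand, it can be read out of the search result with pandas. This is a minimal sketch on a stand-in DataFrame built from the three rows above (including their original index labels); it assumes the search result has exactly the columns shown.

``` python
import pandas as pd

# Stand-in for the result of discovery.search_dataset("import"),
# copied from the output shown above.
hits = pd.DataFrame(
    {
        "df_id": ["101_962", "139_176", "143_222"],
        "version": ["1.0", "1.0", "1.0"],
        "df_description": [
            "Livestock import export",
            "Import and export by country and commodity Nace 2007",
            "Import price index - monthly data",
        ],
        "df_structure_id": ["DCSP_LIVESTIMPEXP", "DCSP_COEIMPEX1", "DCSC_PREIMPIND"],
    },
    index=[10, 47, 49],
)

# Select the trade dataset by its description and extract its ID
is_trade = hits["df_description"].str.startswith("Import and export")
trade_id = hits.loc[is_trade, "df_id"].iloc[0]
print(trade_id)
```

The resulting string is the kind of identifier used for `dataflow_identifier` in the next step.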
To retrieve data from a specific dataset, we first need to create an
instance of the
[`DataSet`](https://Attol8.github.io/istatapi/discovery.html#dataset)
class. We can use `df_id`, `df_description` or `df_structure_id` from
the above DataFrame to tell the
[`DataSet`](https://Attol8.github.io/istatapi/discovery.html#dataset)
class which dataset we want to retrieve. Here, we are going to use the
`df_id` value. This may take a few seconds to load.
``` python
# initialize the dataset and get its dimensions
ds = discovery.DataSet(dataflow_identifier="139_176")
```
We now want to see which variables are included in the dataset that we
are analysing. With `istatapi` we can easily print its variables
(“dimensions” in ISTAT terminology) and their descriptions.
``` python
ds.dimensions_info()
```
<div>
| | dimension | dimension_ID | description |
|-----|------------------|---------------------|---------------------|
| 0 | FREQ | CL_FREQ | Frequency |
| 1 | MERCE_ATECO_2007 | CL_ATECO_2007_MERCE | Commodity Nace 2007 |
| 2 | PAESE_PARTNER | CL_ISO | Geopolitics |
| 3 | ITTER107 | CL_ITTER107 | Territory |
| 4 | TIPO_DATO | CL_TIPO_DATO12 | Data type 12 |
</div>
Now, each dimension can have a few possible values. `istatapi` provides
a quick method to analyze these values and print their English
descriptions.
``` python
dimension = "TIPO_DATO" #use "dimension" column from above
ds.get_dimension_values(dimension)
```
<div>
| | values_ids | values_description |
|-----|------------|---------------------------------------------------------------------------------|
| 0 | EV | export - value (euro) |
| 1 | TBV | trade balance - value (euro) |
| 2 | ISAV | import - seasonally adjusted value - world based model (millions of euro) |
| 3 | ESAV | export - seasonally adjusted value - world based model (millions of euro) |
| 4 | TBSAV | trade balance - seasonally adjusted value -world based model (millions of euro) |
| 5 | IV | import - value (euro) |
</div>
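The codes and descriptions above can be turned into a lookup table, which is handy for choosing filter values or labelling plots later. The sketch below uses a stand-in DataFrame copied from the output above rather than a live call, so the exact dtypes are an assumption.

``` python
import pandas as pd

# Stand-in for ds.get_dimension_values("TIPO_DATO"), copied from the output above
values = pd.DataFrame(
    {
        "values_ids": ["EV", "TBV", "ISAV", "ESAV", "TBSAV", "IV"],
        "values_description": [
            "export - value (euro)",
            "trade balance - value (euro)",
            "import - seasonally adjusted value - world based model (millions of euro)",
            "export - seasonally adjusted value - world based model (millions of euro)",
            "trade balance - seasonally adjusted value -world based model (millions of euro)",
            "import - value (euro)",
        ],
    }
)

# Map each code to its English description
labels = dict(zip(values["values_ids"], values["values_description"]))

# Collect the codes of the seasonally adjusted series
seasonally_adjusted = [
    code for code, text in labels.items() if "seasonally adjusted" in text
]
print(seasonally_adjusted)
```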
If we do not filter any of our variables, the data will include all
the possible values in the dataset. This could result in too much data,
which would slow down our code and make it difficult to analyze. Thus, we
need to filter our dataset. To do so, we can simply use the `values_ids`
that we found using the function `get_dimension_values` in the cell above.
**Note**: Make sure to pass the names of the dimensions in lower case
letters as arguments of the `set_filters` function. If you want to filter
for multiple values, simply pass them as lists.
``` python
freq = "M" #monthly frequency
tipo_dato = ["ISAV", "ESAV"] #imports and exports seasonally adjusted data
paese_partner = "WORLD" #trade with all countries
ds.set_filters(freq = freq, tipo_dato = tipo_dato, paese_partner = paese_partner)
```
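For intuition, SDMX REST data queries identify a slice with a key in which dimension values are separated by `.`, several values for one dimension are joined with `+`, and an empty position means "all values". The sketch below shows how the filters above could map onto such a key. `build_sdmx_key` is a hypothetical helper written for this illustration, and whether `istatapi` builds its request exactly this way is an assumption.

``` python
from typing import Dict, List, Sequence, Union

FilterValue = Union[str, Sequence[str]]


def build_sdmx_key(dimensions: Sequence[str], filters: Dict[str, FilterValue]) -> str:
    """Hypothetical helper: build an SDMX key from lower-case filter names."""
    parts: List[str] = []
    for name in dimensions:
        # An unfiltered dimension becomes an empty position (all values)
        value = filters.get(name.lower(), "")
        if isinstance(value, str):
            parts.append(value)
        else:
            # Several values for one dimension are joined with "+"
            parts.append("+".join(value))
    return ".".join(parts)


# Dimension order taken from the dimensions_info() output above
dimensions = ["FREQ", "MERCE_ATECO_2007", "PAESE_PARTNER", "ITTER107", "TIPO_DATO"]
key = build_sdmx_key(
    dimensions,
    {"freq": "M", "tipo_dato": ["ISAV", "ESAV"], "paese_partner": "WORLD"},
)
print(key)  # M..WORLD..ISAV+ESAV
```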
Having set our filters, we can now retrieve the data by simply
passing our
[`DataSet`](https://Attol8.github.io/istatapi/discovery.html#dataset)
instance to the function
[`get_data`](https://Attol8.github.io/istatapi/retrieval.html#get_data).
It will return a pandas DataFrame with all the data that we requested.
The data is already sorted by date.
``` python
trade_df = retrieval.get_data(ds)
trade_df.head()
```
<div>
| | DATAFLOW | FREQ | MERCE_ATECO_2007 | PAESE_PARTNER | ITTER107 | TIPO_DATO | TIME_PERIOD | OBS_VALUE | BREAK | CONF_STATUS | OBS_PRE_BREAK | OBS_STATUS | BASE_PER | UNIT_MEAS | UNIT_MULT | METADATA_EN | METADATA_IT |
|-----|------------------|------|------------------|---------------|----------|-----------|-------------|-----------|-------|-------------|---------------|------------|----------|-----------|-----------|-------------|-------------|
| 0 | IT1:139_176(1.0) | M | 10 | WORLD | ITTOT | ESAV | 1993-01-01 | 10767 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 368 | IT1:139_176(1.0) | M | 10 | WORLD | ITTOT | ISAV | 1993-01-01 | 9226 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 372 | IT1:139_176(1.0) | M | 10 | WORLD | ITTOT | ISAV | 1993-02-01 | 10015 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | IT1:139_176(1.0) | M | 10 | WORLD | ITTOT | ESAV | 1993-02-01 | 10681 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 373 | IT1:139_176(1.0) | M | 10 | WORLD | ITTOT | ISAV | 1993-03-01 | 9954 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
</div>
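The result is in long format, with one row per series and period. For a quick comparison of imports and exports it can help to pivot it to one column per `TIPO_DATO` value. The sketch below works on a stand-in built from the five rows shown above (only the three columns it needs), so it runs without calling the API; the `BALANCE` column is a name introduced here for illustration.

``` python
import pandas as pd

# Stand-in for trade_df.head(), using the values shown above
sample = pd.DataFrame(
    {
        "TIPO_DATO": ["ESAV", "ISAV", "ISAV", "ESAV", "ISAV"],
        "TIME_PERIOD": pd.to_datetime(
            ["1993-01-01", "1993-01-01", "1993-02-01", "1993-02-01", "1993-03-01"]
        ),
        "OBS_VALUE": [10767, 9226, 10015, 10681, 9954],
    }
)

# One row per month, one column per series
wide = sample.pivot(index="TIME_PERIOD", columns="TIPO_DATO", values="OBS_VALUE")

# Exports minus imports, in millions of euro; NaN where a series is missing
wide["BALANCE"] = wide["ESAV"] - wide["ISAV"]
print(wide)
```

March 1993 has no export row in this five-row sample, so its balance is `NaN`.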
Now that we have our data, we can do whatever we want with it. For
example, we can plot the data after cleaning it up a bit. You are
free to make your own analysis!
``` python
# set matplotlib themes
plt.style.use('fivethirtyeight')
plt.rcParams['figure.figsize'] = [16, 5]
#fiveThirtyEight palette
colors = ['#30a2da', '#fc4f30', '#e5ae38', '#6d904f', '#8b8b8b']
# calculate moving averages for the plot
trade_df["MA_3"] = trade_df.groupby("TIPO_DATO")["OBS_VALUE"].transform(
    lambda x: x.rolling(window=3).mean()
)
#replace the "TIPO_DATO" column values with more meaningful labels
trade_df["TIPO_DATO"] = trade_df["TIPO_DATO"].replace(
    {"ISAV": "Imports", "ESAV": "Exports"}
)
# Plot the data
after_2013 = trade_df["TIME_PERIOD"] >= "2013"
is_ESAV = trade_df["TIPO_DATO"] == "Exports"
is_ISAV = trade_df["TIPO_DATO"] == "Imports"
exports = trade_df[is_ESAV & after_2013].rename(columns={"OBS_VALUE": "Exports", "MA_3": "Exports - three months moving average"})
imports = trade_df[is_ISAV & after_2013].rename(columns={"OBS_VALUE": "Imports", "MA_3": "Imports - three months moving average"})
plt.plot(
    "TIME_PERIOD",
    "Exports",
    data=exports,
    marker="",
    linestyle="dashed",
    color=colors[0],
    linewidth=1
)
plt.plot(
    "TIME_PERIOD",
    "Imports",
    data=imports,
    marker="",
    linestyle="dashed",
    color=colors[1],
    linewidth=1
)
plt.plot(
    "TIME_PERIOD",
    "Exports - three months moving average",
    data=exports,
    color=colors[0],
    linewidth=2
)
plt.plot(
    "TIME_PERIOD",
    "Imports - three months moving average",
    data=imports,
    marker="",
    color=colors[1],
    linewidth=2
)
# add a title
plt.title("Italy's trade with the world")
# add a label to the x axis
plt.xlabel("Year")
# convert the y scale from millions to billions (divide by 1000), and add a label
plt.ylabel("Value in billions of euros")
plt.gca().yaxis.set_major_formatter(plt.FuncFormatter(lambda x, loc: "{:,}".format(int(x/1000))))
plt.legend()
```
![](index_files/figure-commonmark/cell-16-output-1.png)
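The moving-average step in the plotting code can be checked in isolation. The sketch below applies the same `groupby(...).transform(...)` call to a stand-in made of the five observations shown earlier. With a window of 3, a value appears only once a series has three observations, so in this sample only the last import row gets one.

``` python
import pandas as pd

# Stand-in with the five observations shown in the trade_df.head() output
df = pd.DataFrame(
    {
        "TIPO_DATO": ["ESAV", "ISAV", "ISAV", "ESAV", "ISAV"],
        "OBS_VALUE": [10767, 9226, 10015, 10681, 9954],
    }
)

# Three-period moving average, computed separately for each series
df["MA_3"] = df.groupby("TIPO_DATO")["OBS_VALUE"].transform(
    lambda x: x.rolling(window=3).mean()
)
print(df)
```

The last row holds the mean of the three import values, (9226 + 10015 + 9954) / 3; every other row is `NaN` because fewer than three observations precede it within its series.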
With just a few lines of code, we have been able to retrieve data from
ISTAT and make a simple plot. This is just a simple example of what you
can do with `istatapi`. You can find more examples in the `_examples`
folder. Enjoy!
Raw data
{
"_id": null,
"home_page": "https://github.com/Attol8/istatapi",
"name": "istatapi",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "istat API python development datascience",
"author": "Jacopo Attolini",
"author_email": "jacopoattolini@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/ff/f7/b2d34155501076ab6b38f4f745936c3a3065c540df66009bba6173493262/istatapi-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache Software License 2.0",
"summary": "Python API for ISTAT (The Italian National Institute of Statistics)",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/Attol8/istatapi"
},
"split_keywords": [
"istat",
"api",
"python",
"development",
"datascience"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8590509d129ae845c411d9668f83e56f3942433da77dd58dded7bcb3b81b6a89",
"md5": "435c618e7a8da0257f225ee2a84ae3e0",
"sha256": "0b8299b50c34e7e29338c220c6b8721b816339883de5a4ff003af3e0c73f4f68"
},
"downloads": -1,
"filename": "istatapi-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "435c618e7a8da0257f225ee2a84ae3e0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 15317,
"upload_time": "2024-06-04T15:28:24",
"upload_time_iso_8601": "2024-06-04T15:28:24.323304Z",
"url": "https://files.pythonhosted.org/packages/85/90/509d129ae845c411d9668f83e56f3942433da77dd58dded7bcb3b81b6a89/istatapi-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "fff7b2d34155501076ab6b38f4f745936c3a3065c540df66009bba6173493262",
"md5": "0a72174c8625af3ce6c2d61fe09f01e8",
"sha256": "6901529f0c8764c3fb463c14dc980ca54399565d2de4c38dd196de21e51fcbc5"
},
"downloads": -1,
"filename": "istatapi-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "0a72174c8625af3ce6c2d61fe09f01e8",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 19728,
"upload_time": "2024-06-04T15:28:26",
"upload_time_iso_8601": "2024-06-04T15:28:26.511094Z",
"url": "https://files.pythonhosted.org/packages/ff/f7/b2d34155501076ab6b38f4f745936c3a3065c540df66009bba6173493262/istatapi-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-04 15:28:26",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Attol8",
"github_project": "istatapi",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "istatapi"
}