imfp

- Name: imfp
- Version: 1.1.1
- Home page: https://github.com/chriscarrollsmith/imfp
- Summary: Python package for downloading economic data from the International Monetary Fund JSON RESTful API endpoint.
- Upload time: 2024-04-23 19:53:00
- Author: Christopher C. Smith
- Requires Python: <4.0,>=3.8
- License: MIT
- Keywords: economics, finance, imf, api

# imfp

[![Tests](https://github.com/chriscarrollsmith/imfp/actions/workflows/actions.yml/badge.svg)](https://github.com/chriscarrollsmith/imfp/actions/workflows/actions.yml)
[![PyPI Version](https://img.shields.io/pypi/v/imfp.svg)](https://pypi.python.org/pypi/imfp)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

`imfp`, by Christopher C. Smith, is a Python package for downloading data from the [International Monetary
Fund's](http://data.imf.org/) [RESTful JSON
API](http://datahelp.imf.org/knowledgebase/articles/667681-using-json-restful-web-service).

## Installation

To install the stable version of `imfp` from PyPI, use `pip`:


```bash
pip install -q --upgrade imfp
```

To load the library, use `import`:


```python
import imfp
```


## Usage

### Suggested packages

`imfp` outputs data in a `pandas` data frame, so you will want to use the `pandas` package for its functions for viewing and manipulating this object type. I also recommend `matplotlib` or `seaborn` for making plots, and `numpy` for computation. These packages can be installed using `pip` and loaded using `import`:


```python
import seaborn
```


### Fetching an Index of Databases with the `imf_databases` Function

The `imfp` package introduces four core functions: `imfp.imf_databases`, `imfp.imf_parameters`, `imfp.imf_parameter_defs`, and `imfp.imf_dataset`. The function for downloading datasets is `imfp.imf_dataset`, but you will need the other functions to determine what arguments to supply to `imfp.imf_dataset`. For instance, all calls to `imfp.imf_dataset` require a `database_id`. This is because the IMF serves many different databases through its API, and the API needs to know which of these many databases you're requesting data from. To obtain a list of databases, use `imfp.imf_databases`, like so:


```python
# Fetch the list of databases available through the IMF API
databases = imfp.imf_databases()
databases.head()
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>database_id</th>
      <th>description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>BOP_2017M06</td>
      <td>Balance of Payments (BOP), 2017 M06</td>
    </tr>
    <tr>
      <th>1</th>
      <td>BOP_2020M3</td>
      <td>Balance of Payments (BOP), 2020 M03</td>
    </tr>
    <tr>
      <th>2</th>
      <td>BOP_2017M11</td>
      <td>Balance of Payments (BOP), 2017 M11</td>
    </tr>
    <tr>
      <th>3</th>
      <td>DOT_2020Q1</td>
      <td>Direction of Trade Statistics (DOTS), 2020 Q1</td>
    </tr>
    <tr>
      <th>4</th>
      <td>GFSMAB2016</td>
      <td>Government Finance Statistics Yearbook (GFSY 2...</td>
    </tr>
  </tbody>
</table>
</div>



This function returns the IMF’s listing of 259 databases available through the API. (In reality, 8 of the listed databases are defunct and not actually available: FAS_2015, GFS01, FM202010, APDREO202010, AFRREO202010, WHDREO202010, BOPAGG_2020, DOT_2020Q1.)

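To explore the database list, you can view subsets of the data frame by index label with `databases.loc`. Because the data frame uses the default integer index, the labels match the row numbers, and a `.loc` slice includes both endpoints: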


```python
# View a subset consisting of rows 5 through 9
databases.loc[5:9]
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>database_id</th>
      <th>description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>5</th>
      <td>BOP_2019M12</td>
      <td>Balance of Payments (BOP), 2019 M12</td>
    </tr>
    <tr>
      <th>6</th>
      <td>GFSYFALCS2014</td>
      <td>Government Finance Statistics Yearbook (GFSY 2...</td>
    </tr>
    <tr>
      <th>7</th>
      <td>GFSE2016</td>
      <td>Government Finance Statistics Yearbook (GFSY 2...</td>
    </tr>
    <tr>
      <th>8</th>
      <td>FM201510</td>
      <td>Fiscal Monitor (FM) October 2015</td>
    </tr>
    <tr>
      <th>9</th>
      <td>GFSIBS2016</td>
      <td>Government Finance Statistics Yearbook (GFSY 2...</td>
    </tr>
  </tbody>
</table>
</div>




Or, if you already know which database you want, you can fetch the corresponding code by searching for a string match using `str.contains` and subsetting the data frame for matching rows. For instance, here’s how to search for the Primary Commodity Price System:


```python
databases[databases['description'].str.contains("Commodity")]
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>database_id</th>
      <th>description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>237</th>
      <td>PCTOT</td>
      <td>Commodity Terms of Trade</td>
    </tr>
    <tr>
      <th>239</th>
      <td>PCPS</td>
      <td>Primary Commodity Price System (PCPS)</td>
    </tr>
  </tbody>
</table>
</div>
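
Note that `str.contains` is case-sensitive by default and treats its pattern as a regular expression. The sketch below shows a case-insensitive, literal match instead. It runs on a small stand-in data frame built from rows shown above rather than on a live API response, and `find_databases` is a hypothetical helper name, not part of `imfp`:

```python
import pandas as pd

# Stand-in for the output of imfp.imf_databases(), using rows shown above
databases = pd.DataFrame(
    {
        "database_id": ["BOP_2017M06", "PCTOT", "PCPS"],
        "description": [
            "Balance of Payments (BOP), 2017 M06",
            "Commodity Terms of Trade",
            "Primary Commodity Price System (PCPS)",
        ],
    }
)


def find_databases(databases, text):
    """Return rows whose description contains `text`, ignoring case."""
    # regex=False treats `text` literally; na=False skips missing descriptions
    mask = databases["description"].str.contains(
        text, case=False, regex=False, na=False
    )
    return databases[mask]


matches = find_databases(databases, "commodity")
print(matches["database_id"].tolist())
```

Matching literally avoids surprises when the search text contains characters such as parentheses, which have a special meaning in regular expressions.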



### Fetching a List of Parameters and Input Codes with `imf_parameters` and `imf_parameter_defs`

Once you have a `database_id`, it’s possible to make a call to `imfp.imf_dataset` to fetch the entire database: `imfp.imf_dataset(database_id)`. However, while this will succeed for a few small databases, it will fail for all of the larger ones. And even in the rare case when it succeeds, fetching an entire database can take a long time. You’re much better off supplying additional filter parameters to reduce the size of your request.

Requests to databases available through the IMF API are complicated by the fact that each database uses a different set of parameters when making a request. (At last count, there were 43 unique parameters used in making API requests from the various databases!) You also have to have the list of valid input codes for each parameter. The `imfp.imf_parameters` function solves this problem. Use the function to obtain the full list of parameters and valid input codes for a given database:


```python
# Fetch list of valid parameters and input codes for commodity price database
params = imfp.imf_parameters("PCPS")
```

The `imfp.imf_parameters` function returns a dictionary of data frames. Each dictionary key name corresponds to a parameter used in making requests from the database:


```python
# Get key names from the params object
params.keys()
```




    dict_keys(['freq', 'ref_area', 'commodity', 'unit_measure'])



In the event that a parameter name is not self-explanatory, the `imfp.imf_parameter_defs` function can be used to fetch short text descriptions of each parameter:


```python
# Fetch and display parameter text descriptions for the commodity price database
imfp.imf_parameter_defs("PCPS")
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>parameter</th>
      <th>description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>freq</td>
      <td>Frequency</td>
    </tr>
    <tr>
      <th>1</th>
      <td>ref_area</td>
      <td>Geographical Areas</td>
    </tr>
    <tr>
      <th>2</th>
      <td>commodity</td>
      <td>Indicator</td>
    </tr>
    <tr>
      <th>3</th>
      <td>unit_measure</td>
      <td>Unit</td>
    </tr>
  </tbody>
</table>
</div>



Each value in the dictionary is a data frame with two columns: `input_code`, which holds the valid input codes for the named parameter, and `description`, which holds a text description of what each code represents.

To access the data frame containing valid values for each parameter, subset the `params` dict by the parameter name:


```python
# View the data frame of valid input codes for the frequency parameter
params['freq']
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>input_code</th>
      <th>description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>A</td>
      <td>Annual</td>
    </tr>
    <tr>
      <th>1</th>
      <td>M</td>
      <td>Monthly</td>
    </tr>
    <tr>
      <th>2</th>
      <td>Q</td>
      <td>Quarterly</td>
    </tr>
  </tbody>
</table>
</div>
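
Because the parameters object is an ordinary dict of data frames, it can be summarized with a loop or comprehension. The sketch below counts the valid input codes for each parameter. The `params` dict here is a hand-built stand-in with a few illustrative rows, not the real output of `imfp.imf_parameters`:

```python
import pandas as pd

# Stand-in for imfp.imf_parameters("PCPS"); only a few illustrative rows
params = {
    "freq": pd.DataFrame(
        {"input_code": ["A", "M", "Q"],
         "description": ["Annual", "Monthly", "Quarterly"]}
    ),
    "unit_measure": pd.DataFrame(
        {"input_code": ["IX"], "description": ["Index"]}
    ),
}

# Count the valid input codes available for each parameter
code_counts = {name: len(codes) for name, codes in params.items()}
print(code_counts)
```

A summary like this is a quick way to see which parameters have long code lists that are worth searching rather than reading in full.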



### Viewing Data Frames

Note that `pandas` data frames in Python can be a little difficult to work with, because Python doesn't have a built-in variable explorer. If you're doing data science, I recommend using an IDE like RStudio or Spyder that has a built-in variable explorer. However, if you don't have a variable explorer, you can prevent Python from truncating data frames using the `options` in `pandas`. For instance, to increase the maximum allowed column width to 100 characters, we can use `pandas.options.display.max_colwidth = 100`.
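
If you would rather not change the display setting globally, `pandas.option_context` applies it only inside a `with` block. The sketch below uses a made-up one-column data frame to show the difference:

```python
import pandas as pd

# Stand-in data frame with one description longer than the default 50 characters
long_text = ("long description " * 4).strip()
df = pd.DataFrame({"description": [long_text]})

# Widen columns only inside the with-block; the previous setting is restored afterwards
with pd.option_context("display.max_colwidth", 100):
    wide = repr(df)

# Outside the block, the previous (default) column width applies again
narrow = repr(df)

print(wide)
print(narrow)
```

With the default settings, the second printout truncates the description with an ellipsis, while the first shows it in full.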

Alternatively, it's possible to open the data frame in a new window to view it in full:


```python
import pathlib
import tempfile
import webbrowser

import imfp

# Define a simple function to view a data frame in a browser window
def View(df):
    html = df.to_html()
    # delete=False keeps the file on disk so the browser can still read it
    with tempfile.NamedTemporaryFile(
        'w', delete=False, suffix='.html', encoding='utf-8'
    ) as f:
        f.write(html)
    # as_uri() builds a valid file:// URL on Windows as well as on POSIX systems
    webbrowser.open(pathlib.Path(f.name).as_uri())

# Open data frame in a new browser window using the function
df = imfp.imf_databases()
View(df)
```

### Supplying Parameter Arguments to `imf_dataset`: A Tale of Two Workflows

There are two ways to supply parameters to `imfp.imf_dataset`: by supplying list arguments or by supplying a modified parameters dict. The list arguments workflow will be more intuitive for most users, but the dict argument workflow requires a little less code.

#### The List Arguments Workflow

To supply list arguments, just find the codes you want and supply them to `imfp.imf_dataset` using the parameter name as the argument name. The example below shows how to request 2000–2015 annual coal prices from the Primary Commodity Price System database:


```python
# Fetch the 'freq' input code for annual frequency
selected_freq = list(
    params['freq']['input_code'][params['freq']['description'].str.contains("Annual")]
)

# Fetch the 'commodity' input code for coal
selected_commodity = list(
    params['commodity']['input_code'][params['commodity']['description'].str.contains("Coal")]
)

# Fetch the 'unit_measure' input code for index
selected_unit_measure = list(
    params['unit_measure']['input_code'][params['unit_measure']['description'].str.contains("Index")]
)

# Request data from the API
df = imfp.imf_dataset(database_id = "PCPS",
         freq = selected_freq, commodity = selected_commodity,
         unit_measure = selected_unit_measure,
         start_year = 2000, end_year = 2015)

# Display the first few entries in the retrieved data frame
df.head()
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>freq</th>
      <th>ref_area</th>
      <th>commodity</th>
      <th>unit_measure</th>
      <th>unit_mult</th>
      <th>time_format</th>
      <th>time_period</th>
      <th>obs_value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2000</td>
      <td>39.3510230293202</td>
    </tr>
    <tr>
      <th>1</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2001</td>
      <td>49.3378587284039</td>
    </tr>
    <tr>
      <th>2</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2002</td>
      <td>39.4949091648006</td>
    </tr>
    <tr>
      <th>3</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2003</td>
      <td>43.2878876950788</td>
    </tr>
    <tr>
      <th>4</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2004</td>
      <td>82.9185858052862</td>
    </tr>
  </tbody>
</table>
</div>
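
The three lookups in the list arguments workflow follow the same pattern, so they can be wrapped in a small helper. The sketch below is one way to do that. `codes_matching` is a hypothetical name, not an `imfp` function, and `params` is a hand-built stand-in rather than a live API response:

```python
import pandas as pd

# Stand-in for part of imfp.imf_parameters("PCPS"); rows are illustrative
params = {
    "freq": pd.DataFrame(
        {"input_code": ["A", "M", "Q"],
         "description": ["Annual", "Monthly", "Quarterly"]}
    ),
}


def codes_matching(params, parameter, text):
    """Return the input codes whose description contains `text`."""
    codes = params[parameter]
    # regex=False matches `text` literally; na=False skips missing descriptions
    mask = codes["description"].str.contains(text, regex=False, na=False)
    return codes.loc[mask, "input_code"].tolist()


selected_freq = codes_matching(params, "freq", "Annual")
print(selected_freq)
```

The returned list can be passed to `imfp.imf_dataset` in the same way as `selected_freq` in the example above.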



#### The Parameters Argument Workflow

To supply a parameters dict, modify each data frame in the `params` dict to retain only the rows you want, and then supply the modified dict to `imfp.imf_dataset` as its `parameters` argument. Here is how to make the same request for annual coal price data using a parameters dict:


```python
# Fetch the 'freq' input code for annual frequency
params['freq'] = params['freq'][params['freq']['description'].str.contains("Annual")]

# Fetch the 'commodity' input code(s) for coal
params['commodity'] = params['commodity'][params['commodity']['description'].str.contains("Coal")]

# Fetch the 'unit_measure' input code for index
params['unit_measure'] = params['unit_measure'][params['unit_measure']['description'].str.contains("Index")]

# Request data from the API
df = imfp.imf_dataset(database_id = "PCPS",
         parameters = params,
         start_year = 2000, end_year = 2015)

# Display the first few entries in the retrieved data frame
df.head()
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>freq</th>
      <th>ref_area</th>
      <th>commodity</th>
      <th>unit_measure</th>
      <th>unit_mult</th>
      <th>time_format</th>
      <th>time_period</th>
      <th>obs_value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2000</td>
      <td>39.3510230293202</td>
    </tr>
    <tr>
      <th>1</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2001</td>
      <td>49.3378587284039</td>
    </tr>
    <tr>
      <th>2</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2002</td>
      <td>39.4949091648006</td>
    </tr>
    <tr>
      <th>3</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2003</td>
      <td>43.2878876950788</td>
    </tr>
    <tr>
      <th>4</th>
      <td>A</td>
      <td>W00</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2004</td>
      <td>82.9185858052862</td>
    </tr>
  </tbody>
</table>
</div>
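
Filtering `params` in place discards the full code lists, which can be inconvenient if you want to build a second request later. One alternative, sketched below, is to build a filtered copy and leave the original dict untouched. The `wanted` mapping and the stand-in `params` rows are assumptions made for illustration:

```python
import pandas as pd

# Stand-in for imfp.imf_parameters("PCPS"); rows are illustrative
params = {
    "freq": pd.DataFrame(
        {"input_code": ["A", "M", "Q"],
         "description": ["Annual", "Monthly", "Quarterly"]}
    ),
    "ref_area": pd.DataFrame(
        {"input_code": ["W00"],
         "description": ["All Countries, excluding the IO"]}
    ),
}

# Text to search for in each parameter's descriptions (hypothetical selection)
wanted = {"freq": "Annual"}

# Build a new dict; parameters without an entry in `wanted` are kept whole
filtered = {
    name: (
        codes[codes["description"].str.contains(wanted[name], regex=False)]
        if name in wanted
        else codes
    )
    for name, codes in params.items()
}

print(filtered["freq"]["input_code"].tolist())
print(len(params["freq"]))
```

The `filtered` dict would then be supplied as the `parameters` argument, while `params` still holds every valid code.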



## Working with the Returned Data Frame

Note that all columns in the returned data frame contain strings, so to plot the series we first need to convert `time_period` and `obs_value` to numeric types. Using `seaborn` with `hue`, we can plot different indicators in different colors:


```python
# Convert obs_value to numeric and time_period to integer year
df = df.astype({"time_period" : int, "obs_value" : float})

# Plot prices of different commodities in different colors with seaborn
seaborn.lineplot(data=df, x='time_period', y='obs_value', hue='commodity');
```


    
![png](README_files/plot.png)
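
`astype` raises an error if any entry cannot be parsed. If you expect placeholder strings in `obs_value`, `pandas.to_numeric` with `errors="coerce"` is a more forgiving option. The sketch below uses a stand-in data frame, and the `"n/a"` entry is a hypothetical placeholder, not something the API is known to return:

```python
import pandas as pd

# Stand-in for a returned data frame; every column arrives as strings
df = pd.DataFrame(
    {"time_period": ["2000", "2001", "2002"],
     "obs_value": ["39.3510230293202", "49.3378587284039", "n/a"]}
)

# errors="coerce" turns unparseable entries into NaN instead of raising
df["obs_value"] = pd.to_numeric(df["obs_value"], errors="coerce")
df["time_period"] = df["time_period"].astype(int)

print(df.dtypes)
print(df["obs_value"].isna().sum())
```

Counting the resulting `NaN` values afterwards shows how many observations could not be converted.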
    



Also note that the returned data frame has mysterious-looking codes as values in some columns.

Codes in the `time_format` column are ISO 8601 duration codes. In this case, “P1Y” means “periods of 1 year.” The `unit_mult` column represents the number of zeroes you should add to the value column. For instance, if value is in millions, then the unit multiplier will be 6. If in billions, then the unit multiplier will be 9.
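
In other words, the full value is `obs_value * 10 ** unit_mult`. The sketch below applies that rule to a stand-in data frame. The numbers are made up for illustration, and `scaled_value` is a hypothetical column name:

```python
import pandas as pd

# Stand-in rows; both columns arrive as strings, as in the returned data frame
df = pd.DataFrame(
    {"obs_value": ["1.5", "2.0"],
     "unit_mult": ["6", "9"]}
)

# scaled value = obs_value * 10 ** unit_mult
multiplier = 10.0 ** df["unit_mult"].astype(int)
df["scaled_value"] = df["obs_value"].astype(float) * multiplier

print(df["scaled_value"].tolist())
```

Here the first row is in millions and the second in billions, so the two scaled values become directly comparable.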

The meanings of the other codes are stored in our `params` object and can be fetched with a join. For instance, to fetch the meaning of the `ref_area` code “W00”, we can perform a left join with the `params['ref_area']` data frame, then drop the code columns and rename `description` to `ref_area`:


```python
# Join df with params['ref_area'] to fetch code description
df = df.merge(params['ref_area'], left_on='ref_area',right_on='input_code',how='left')

# Drop redundant columns and rename description column
df = df.drop(columns=['ref_area','input_code']).rename(columns={"description":"ref_area"})

# View first few rows of the modified data frame
df.head()
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>freq</th>
      <th>commodity</th>
      <th>unit_measure</th>
      <th>unit_mult</th>
      <th>time_format</th>
      <th>time_period</th>
      <th>obs_value</th>
      <th>ref_area</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>A</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2000</td>
      <td>39.351023</td>
      <td>All Countries, excluding the IO</td>
    </tr>
    <tr>
      <th>1</th>
      <td>A</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2001</td>
      <td>49.337859</td>
      <td>All Countries, excluding the IO</td>
    </tr>
    <tr>
      <th>2</th>
      <td>A</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2002</td>
      <td>39.494909</td>
      <td>All Countries, excluding the IO</td>
    </tr>
    <tr>
      <th>3</th>
      <td>A</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2003</td>
      <td>43.287888</td>
      <td>All Countries, excluding the IO</td>
    </tr>
    <tr>
      <th>4</th>
      <td>A</td>
      <td>PCOAL</td>
      <td>IX</td>
      <td>0</td>
      <td>P1Y</td>
      <td>2004</td>
      <td>82.918586</td>
      <td>All Countries, excluding the IO</td>
    </tr>
  </tbody>
</table>
</div>
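
When only one code column needs translating, a lookup with `Series.map` is a compact alternative to the join and keeps the column order unchanged. The sketch below uses stand-in data frames in place of the API response and `params['ref_area']`:

```python
import pandas as pd

# Stand-ins for the returned data frame and params['ref_area']
df = pd.DataFrame({"ref_area": ["W00", "W00"], "obs_value": [39.35, 49.34]})
ref_area_codes = pd.DataFrame(
    {"input_code": ["W00"],
     "description": ["All Countries, excluding the IO"]}
)

# Build a code -> description lookup and apply it to the column
lookup = ref_area_codes.set_index("input_code")["description"]
df["ref_area"] = df["ref_area"].map(lookup)

print(df["ref_area"].unique().tolist())
```

Codes that are missing from the lookup become `NaN`, so it is worth checking for missing values after mapping.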



## Rate and Bandwidth Limit Management

### Setting a Unique Application Name with `set_imf_app_name`

`imfp.set_imf_app_name()` allows users to set a custom application name to be used when making API calls to the IMF API. The IMF API has an application-based rate limit of 50 requests per second, with the application identified by the "user_agent" variable in the request header.

This could prove problematic if the `imfp` library became too popular and too many users tried to make simultaneous API requests using the default app name. By setting a custom application name, users can avoid hitting rate limits and being blocked by the API. `imfp.set_imf_app_name()` sets the application name by changing the `IMF_APP_NAME` variable in the environment. If this variable doesn't exist, `imfp.set_imf_app_name()` will create it.

To set a custom application name, simply call the `imfp.set_imf_app_name()` function with your desired application name as an argument:


```python
# Set custom app name as an environment variable
imfp.set_imf_app_name("my_custom_app_name")
```



The function will raise an error if the provided name is missing, `None`, NaN, not a string, or longer than 255 characters. If the provided name is "imfr" (the default) or an empty string, the function will issue a warning recommending the use of a unique app name to avoid hitting rate limits.
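
Since the name is stored in the `IMF_APP_NAME` environment variable, you can check it with the standard library. In the sketch below the variable is set by hand so that the example runs without `imfp`. In normal use, `imfp.set_imf_app_name` would set it for you:

```python
import os

# Assumption for illustration: set the variable directly. In normal use,
# imfp.set_imf_app_name("my_custom_app_name") does this for you.
os.environ["IMF_APP_NAME"] = "my_custom_app_name"

# Read the value back; .get returns None if the variable is not set
app_name = os.environ.get("IMF_APP_NAME")
print(app_name)
```

Environment variables set this way last only for the current Python process, so the name needs to be set again in each new session.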

### Changing the enforced wait time between API calls with `set_imf_wait_time`

By default, `imfp` enforces a mandatory 1.5-second wait time between API calls to prevent repeated or recursive calls from exceeding the API's bandwidth/rate limit. This wait time should be sufficient for most applications. However, if you are running parallel processes using `imfp` (e.g. during cross-platform testing), this wait time may be insufficient to prevent you from running up against the API's rate and bandwidth limits. You can change this wait time by calling the `set_imf_wait_time` function with a numeric value, in seconds. For instance, to enforce a five-second wait time between API calls, use `set_imf_wait_time(5)`.

Also note that by default, `imfp` functions will retry any API call rejected for bandwidth or rate limit reasons. The number of times `imfp` will attempt the call is set by the `times` argument, with a default value of 3. (With this value, requests will be retried twice after an initial failure.) Note that `imfp` enforces an exponentially increasing wait time between function calls, with a base wait time of 5 seconds on the first retry, so it is not recommended to set a high value for `times`.
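
To get a feel for how quickly the retry waits add up, the sketch below computes a backoff schedule. It is not `imfp`'s implementation: `retry_waits` is a hypothetical helper, and the doubling `factor` is an assumption, since the text above only says that the wait grows exponentially from a 5-second base:

```python
def retry_waits(times=3, base_wait=5.0, factor=2.0):
    """Return the wait (in seconds) before each retry.

    `times` is the total number of attempts, so there are `times - 1` retries.
    `factor` is an assumed growth rate; imfp's actual multiplier may differ.
    """
    return [base_wait * factor ** retry for retry in range(times - 1)]


waits = retry_waits(times=3)
print(waits)
print(sum(waits))
```

Under these assumptions, the default of three attempts waits 15 seconds in total, while six attempts would wait 155 seconds, which is why a high value for `times` is not recommended.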

## Planned features

- Implement automatic build/render of readthedocs documentation with Sphinx
- Implement automatic build/release/publish of package updates
- Move response mocking functionality from `_download_parse` to `_imf_get`
- Investigate and implement different and more appropriate exception types, as we're currently handling too many different cases with `ValueError`
- More fully investigate the types of metadata available through the API and the most appropriate way to return them when a user calls `include_metadata`
- Implement optional response caching for `imf_databases` and `imf_parameters`
- Simplify and modularize some of the code, particularly in `imf_dataset`

## Contributing

I would love to have your help in improving `imfp`. If you encounter a bug while using the library, please open an issue. Alternatively, fix the bug and open a pull request. Thanks in advance for your help!

            

   <th>3</th>\n      <td>A</td>\n      <td>W00</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2003</td>\n      <td>43.2878876950788</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>A</td>\n      <td>W00</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2004</td>\n      <td>82.9185858052862</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\n#### The Parameters Argument Workflow\n\nTo supply a parameters dict, modify each data frame in the `params` dict to retain only the rows you want, and then supply the modified dict to `imfp.imf_dataset` as its `parameters` argument. Here is how to make the same request for annual coal price data using a parameters dict:\n\n\n```python\n# Keep only the 'freq' row for annual frequency\nparams['freq'] = params['freq'][params['freq']['description'].str.contains(\"Annual\")]\n\n# Keep only the 'commodity' row(s) for coal\nparams['commodity'] = params['commodity'][params['commodity']['description'].str.contains(\"Coal\")]\n\n# Keep only the 'unit_measure' row(s) for index\nparams['unit_measure'] = params['unit_measure'][params['unit_measure']['description'].str.contains(\"Index\")]\n\n# Request data from the API\ndf = imfp.imf_dataset(database_id = \"PCPS\",\n         parameters = params,\n         start_year = 2000, end_year = 2015)\n\n# Display the first few entries in the retrieved data frame\ndf.head()\n```\n\n\n\n\n<div>\n\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>freq</th>\n      <th>ref_area</th>\n      <th>commodity</th>\n      <th>unit_measure</th>\n      <th>unit_mult</th>\n      <th>time_format</th>\n      <th>time_period</th>\n      <th>obs_value</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>A</td>\n      <td>W00</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      
<td>2000</td>\n      <td>39.3510230293202</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>A</td>\n      <td>W00</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2001</td>\n      <td>49.3378587284039</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>A</td>\n      <td>W00</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2002</td>\n      <td>39.4949091648006</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>A</td>\n      <td>W00</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2003</td>\n      <td>43.2878876950788</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>A</td>\n      <td>W00</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2004</td>\n      <td>82.9185858052862</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\n## Working with the Returned Data Frame\n\nNote that all columns in the returned data frame are stored as strings, so to plot the series we first need to convert them to valid numeric or date formats. Using `seaborn` with `hue`, we can plot different indicators in different colors:\n\n\n```python\n# Convert obs_value to float and time_period to integer year\ndf = df.astype({\"time_period\": int, \"obs_value\": float})\n\n# Plot prices of different commodities in different colors with seaborn\nseaborn.lineplot(data=df, x='time_period', y='obs_value', hue='commodity');\n```\n\n\n    \n![png](README_files/plot.png)\n    \n\n\n\nAlso note that the returned data frame has mysterious-looking codes as values in some columns.\n\nCodes in the `time_format` column are ISO 8601 duration codes. In this case, \u201cP1Y\u201d means \u201cperiods of 1 year.\u201d The `unit_mult` column represents the number of zeroes you should add to each value in the `obs_value` column. For instance, if values are in millions, then the unit multiplier will be 6. 
If in billions, then the unit multiplier will be 9.\n\nThe meanings of the other codes are stored in our `params` object and can be fetched with a join. For instance, to fetch the meaning of the `ref_area` code \u201cW00\u201d, we can perform a left join with the `params['ref_area']` data frame, then drop the redundant code columns and rename the description column to `ref_area`:\n\n\n```python\n# Join df with params['ref_area'] to fetch code description\ndf = df.merge(params['ref_area'], left_on='ref_area', right_on='input_code', how='left')\n\n# Drop redundant columns and rename description column\ndf = df.drop(columns=['ref_area', 'input_code']).rename(columns={\"description\": \"ref_area\"})\n\n# View first few rows in the modified data frame\ndf.head()\n```\n\n\n\n\n<div>\n\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>freq</th>\n      <th>commodity</th>\n      <th>unit_measure</th>\n      <th>unit_mult</th>\n      <th>time_format</th>\n      <th>time_period</th>\n      <th>obs_value</th>\n      <th>ref_area</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>A</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2000</td>\n      <td>39.351023</td>\n      <td>All Countries, excluding the IO</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>A</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2001</td>\n      <td>49.337859</td>\n      <td>All Countries, excluding the IO</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>A</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2002</td>\n      <td>39.494909</td>\n      <td>All Countries, excluding the IO</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>A</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2003</td>\n      <td>43.287888</td>\n      
<td>All Countries, excluding the IO</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>A</td>\n      <td>PCOAL</td>\n      <td>IX</td>\n      <td>0</td>\n      <td>P1Y</td>\n      <td>2004</td>\n      <td>82.918586</td>\n      <td>All Countries, excluding the IO</td>\n    </tr>\n  </tbody>\n</table>\n</div>\n\n\n\n## Rate and Bandwidth Limit Management\n\n### Setting a Unique Application Name with `set_imf_app_name`\n\n`imfp.set_imf_app_name()` allows users to set a custom application name to be used when making API calls to the IMF API. The IMF API has an application-based rate limit of 50 requests per second, with the application identified by the \"user_agent\" variable in the request header.\n\nThis could prove problematic if the `imfp` library became too popular and too many users tried to make simultaneous API requests using the default app name. By setting a custom application name, users can avoid hitting rate limits and being blocked by the API. `imfp.set_imf_app_name()` sets the application name by changing the `IMF_APP_NAME` variable in the environment. If this variable doesn't exist, `imfp.set_imf_app_name()` will create it.\n\nTo set a custom application name, simply call the `imfp.set_imf_app_name()` function with your desired application name as an argument:\n\n\n```python\n# Set custom app name as an environment variable\nimfp.set_imf_app_name(\"my_custom_app_name\")\n```\n\n\n\nThe function will raise an error if the provided name is `None`, not a string, or longer than 255 characters. If the provided name is \"imfr\" (the default) or an empty string, the function will issue a warning recommending the use of a unique app name to avoid hitting rate limits.\n\n### Changing the enforced wait time between API calls with `set_imf_wait_time`\n\nBy default, `imfp` enforces a mandatory 1.5-second wait time between API calls to prevent repeated or recursive calls from exceeding the API's bandwidth/rate limit. 
This wait time should be sufficient for most applications. However, if you are running parallel processes using `imfp` (e.g. during cross-platform testing), this wait time may be insufficient to prevent you from running up against the API's rate and bandwidth limits. You can change this wait time by calling the `set_imf_wait_time` function with a numeric value, in seconds. For instance, to enforce a five-second wait time between API calls, use `set_imf_wait_time(5)`.\n\nAlso note that by default, `imfp` functions will retry any API call rejected for bandwidth or rate limit reasons. The number of times `imfp` will attempt the call is set by the `times` argument, with a default value of 3. (With this value, requests will be retried twice after an initial failure.) Note that `imfp` enforces an exponentially increasing wait time between retry attempts, with a base wait time of 5 seconds on the first retry, so it is not recommended to set a high value for `times`.\n\n## Planned features\n\n- Implement automatic build/render of readthedocs documentation with Sphinx\n- Implement automatic build/release/publish of package updates\n- Move response mocking functionality from `_download_parse` to `_imf_get`\n- Investigate and implement different and more appropriate exception types, as we're currently handling too many different cases with `ValueError`\n- More fully investigate the types of metadata available through the API and the most appropriate way to return them when a user calls `include_metadata`\n- Implement optional response caching for `imf_databases` and `imf_parameters`\n- Simplify and modularize some of the code, particularly in `imf_dataset`\n\n## Contributing\n\nI would love to have your help in improving `imfp`. If you encounter a bug while using the library, please open an issue. Alternatively, fix the bug and open a pull request. Thanks in advance for your help!\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Python package for downloading economic data from the International Monetary Fund JSON RESTful API endpoint.",
    "version": "1.1.1",
    "project_urls": {
        "Homepage": "https://github.com/chriscarrollsmith/imfp",
        "Repository": "https://github.com/chriscarrollsmith/imfp"
    },
    "split_keywords": [
        "economics",
        " finance",
        " imf",
        " api"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "71a3e1b1b8ba31d089faf15412b9cf1fe512d940fc4c98287a15f36ffccc26ae",
                "md5": "e8f37dfb99ff566697e7a64358f466c1",
                "sha256": "aa1562845844ddb7e8f6a3bf35e64e7955631d9d1392749e2fc438e0599733fd"
            },
            "downloads": -1,
            "filename": "imfp-1.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e8f37dfb99ff566697e7a64358f466c1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 19009,
            "upload_time": "2024-04-23T19:52:47",
            "upload_time_iso_8601": "2024-04-23T19:52:47.347384Z",
            "url": "https://files.pythonhosted.org/packages/71/a3/e1b1b8ba31d089faf15412b9cf1fe512d940fc4c98287a15f36ffccc26ae/imfp-1.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4bf445f032e82d30aeb21cc0516e3e75ee66c76dd714e21780ca4683735981a8",
                "md5": "e9eac5955d08202b04c444c1edad08af",
                "sha256": "1fde3c5010e3008e6606a170544947ce0500c14e696b98752062980d54b987b6"
            },
            "downloads": -1,
            "filename": "imfp-1.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "e9eac5955d08202b04c444c1edad08af",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 22419,
            "upload_time": "2024-04-23T19:53:00",
            "upload_time_iso_8601": "2024-04-23T19:53:00.643787Z",
            "url": "https://files.pythonhosted.org/packages/4b/f4/45f032e82d30aeb21cc0516e3e75ee66c76dd714e21780ca4683735981a8/imfp-1.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-23 19:53:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "chriscarrollsmith",
    "github_project": "imfp",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "imfp"
}
        