finsets

- Name: finsets
- Version: 0.0.4
- Home page: https://github.com/ionmihai/finsets
- Summary: Download and process datasets commonly used in finance research
- Upload time: 2023-11-19 23:00:56
- Docs URL: None
- Author: Mihai Ion
- Requires Python: >=3.7
- License: Apache Software License 2.0
- Keywords: nbdev, jupyter, notebook, python
- Requirements: No requirements were recorded.
- Travis-CI: No Travis.
- Coveralls test coverage: No coveralls.

# finsets

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

> Download and process datasets commonly used in finance research

Each module handles a different data source. Almost all submodules
(other than utility ones) have a
[`get_raw_data`](https://ionmihai.github.io/finsets/01_wrds/ratios.html#get_raw_data)
function that downloads the raw data and a
[`process_raw_data`](https://ionmihai.github.io/finsets/01_wrds/ratios.html#process_raw_data)
function that processes the data into a `pandas.DataFrame` having, as
index, either:

- A `pandas.Period` date reflecting the frequency of the data (for
  time-series datasets), or
- A `pandas.MultiIndex` with a panel identifier in the first dimension
  and a `pandas.Period` date in the second dimension (for panel
  datasets).

The period date in the index will be named following the pattern `Xdate`
where X is the string literal representing the frequency of the data
(e.g. `Mdate` for monthly data, `Qdate` for quarterly data, `Ydate` for
annual data).
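
As a rough illustration of this index convention, the sketch below builds two toy frames directly with pandas. It does not call `finsets`; the `ret` column, the `permno` identifier, and all values are made up for the example.

``` python
import pandas as pd

# Time-series dataset: a monthly PeriodIndex named "Mdate".
ts = pd.DataFrame(
    {"ret": [0.01, -0.02, 0.03]},
    index=pd.period_range("2020-01", periods=3, freq="M", name="Mdate"),
)

# Panel dataset: identifier in the first level, period date in the second.
panel_index = pd.MultiIndex.from_product(
    [[10001, 10002], pd.period_range("2020-01", periods=2, freq="M")],
    names=["permno", "Mdate"],  # "permno" is a hypothetical panel identifier
)
panel = pd.DataFrame({"ret": [0.01, 0.02, -0.01, 0.00]}, index=panel_index)

print(ts.index.name)
print(list(panel.index.names))
```

Naming the date level after its frequency makes it easier to tell, when merging datasets, whether two frames are aligned on the same kind of period.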

[Documentation site](https://ionmihai.github.io/finsets/).

[GitHub page](https://github.com/ionmihai/finsets).

## Install

``` sh
pip install finsets
```

## How to use

``` python
import finsets as fds
```

or

``` python
from finsets import fred, wrds, papers
```

Below, we very briefly describe each submodule. For more details, please
see the documentation of each submodule (they provide a lot more
functionality than presented here).

## WRDS

> Downloads and processes datasets from Wharton Research Data Services
> [WRDS](https://wrds-www.wharton.upenn.edu/).

Each WRDS module handles a different library in WRDS (e.g. `compa`
module for the Compustat Annual CCM file, `crspm` for the CRSP Monthly
Stock file, etc.).

Before you use any of the `wrds` modules, you need to create a `pgpass`
file with your WRDS credentials. To do that, run:

``` python
from finsets.wrds import wrds_api
```

``` python
db = wrds_api.Connection()
```

This will prompt you for your WRDS username and password. After you
enter your credentials, if you don’t have a `pgpass` file already set
up, it will ask you if you want to do that. Hit `y` and it will be
automatically created for you. After this, you will not have to enter
your WRDS password again.

You will still have to supply your WRDS username to functions that
retrieve data from WRDS (all of them have a `wrds_username` parameter).
If you don’t want to be prompted for the username for every download,
save it under a `WRDS_USERNAME` environment variable:

- On Windows, in a Command Prompt:
  - `setx WRDS_USERNAME "your_wrds_username_here"`
- On Linux, in a terminal:
  - `echo 'export WRDS_USERNAME="your_wrds_username_here"' >> ~/.bashrc && source ~/.bashrc`
- On macOS, since macOS Catalina:
  - `echo 'export WRDS_USERNAME="your_wrds_username_here"' >> ~/.zshrc && source ~/.zshrc`
- On macOS, prior to macOS Catalina:
  - `echo 'export WRDS_USERNAME="your_wrds_username_here"' >> ~/.bash_profile && source ~/.bash_profile`
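
To confirm from Python that the variable is visible to your session, a check along the following lines may help. Note that `setx` on Windows only affects newly opened sessions, so you may need a new terminal or a restarted notebook kernel first. `resolve_wrds_username` is a hypothetical helper written for this example, not part of `finsets`; it only sketches the lookup order one might expect (explicit argument first, then the environment).

``` python
import os


def resolve_wrds_username(wrds_username=None, env=os.environ):
    """Return the explicit username if given, else WRDS_USERNAME from env, else None.

    Hypothetical helper for illustration only; not a finsets function.
    """
    if wrds_username:
        return wrds_username
    return env.get("WRDS_USERNAME")


username = resolve_wrds_username()
if username is None:
    print("WRDS_USERNAME is not set in this session")
else:
    print("WRDS username found in the environment")
```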

The functions in the `wrds` modules will close database connections to
WRDS automatically. However, if you open a connection manually, as above
(with `wrds_api.Connection()`), make sure you close that
connection. In our example above:

``` python
db.close()
```
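
One way to make sure a manually opened connection is closed even if a query raises an error is the standard library's `contextlib.closing`. The sketch below uses a stand-in `FakeConnection` class (made up for this example) so that it runs without WRDS credentials. With a real connection you would pass the connection object instead, assuming it exposes `close()` as in the example above.

``` python
from contextlib import closing


class FakeConnection:
    """Stand-in for a WRDS connection; it only mimics close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


conn = FakeConnection()
try:
    with closing(conn) as db:
        # Queries would go here; we simulate one that fails.
        raise RuntimeError("query failed")
except RuntimeError:
    pass

# close() was called on exit from the with-block, despite the error.
print(conn.closed)  # True
```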

Check the `wrds_utils` module for an introduction to some of the main
utilities that come with the `wrds` package.

## FRED

> Downloads and processes datasets from the St. Louis
> [FRED](https://fred.stlouisfed.org/).

To use the functions in the `fred` module, you’ll need an API key from
the St. Louis FRED.

Get one [here](https://fred.stlouisfed.org/docs/api/api_key.html) and
store it in your environment variables under the name `FRED_API_KEY`.

Alternatively, you can supply the API key directly as the `api_key`
parameter in each function in the `fred` module.

``` python
gdp = fred.fred.get_raw_data(['GDP'])
```

``` python
gdp['info']
```

<div>


|     | id  | realtime_start | realtime_end | title                  | observation_start | observation_end | frequency | frequency_short | units               | units_short | seasonal_adjustment             | seasonal_adjustment_short | last_updated           | popularity | notes                                            |
|-----|-----|----------------|--------------|------------------------|-------------------|-----------------|-----------|-----------------|---------------------|-------------|---------------------------------|---------------------------|------------------------|------------|--------------------------------------------------|
| 0   | GDP | 2023-11-15     | 2023-11-15   | Gross Domestic Product | 1947-01-01        | 2023-07-01      | Quarterly | Q               | Billions of Dollars | Bil. of \$  | Seasonally Adjusted Annual Rate | SAAR                      | 2023-10-26 07:55:01-05 | 92         | BEA Account Code: A191RC Gross domestic produ... |

</div>

``` python
gdp['Q']
```

<div>


|            | GDP       |
|------------|-----------|
| 1947-01-01 | 243.164   |
| 1947-04-01 | 245.968   |
| 1947-07-01 | 249.585   |
| 1947-10-01 | 259.745   |
| 1948-01-01 | 265.742   |
| ...        | ...       |
| 2022-07-01 | 25994.639 |
| 2022-10-01 | 26408.405 |
| 2023-01-01 | 26813.601 |
| 2023-04-01 | 27063.012 |
| 2023-07-01 | 27623.543 |

<p>307 rows × 1 columns</p>
</div>
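
The raw series above is indexed by calendar dates. As a rough sketch of how such a series could be moved to the `Qdate` period index described at the top of this page, the snippet below uses plain pandas on the first three observations from the table. It assumes the dates are timestamps and is not necessarily how the `fred` module's processing function does it.

``` python
import pandas as pd

# First three GDP observations from the table above, indexed by timestamp.
raw = pd.DataFrame(
    {"GDP": [243.164, 245.968, 249.585]},
    index=pd.to_datetime(["1947-01-01", "1947-04-01", "1947-07-01"]),
)

# Convert the timestamps to quarterly periods and name the index "Qdate".
quarterly = raw.copy()
quarterly.index = quarterly.index.to_period("Q").rename("Qdate")

print(quarterly)
```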

## PAPERS

> Downloads and processes datasets made available by the authors of
> academic papers.

Each `papers` module handles a different paper. By convention, a
module’s name is made up of the last names of the authors and the
publication year, separated by underscores. If a paper has more than
two authors, the names of all but the first author are replaced by
‘etal’. For example, the module for the paper “Firm-Level Political Risk:
Measurement and Effects” (2019) by Tarek A. Hassan, Stephan Hollander,
Laurence van Lent, Ahmed Tahoun is named `hassan_etal_2019`.
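
The naming rule can be sketched as a small function. `paper_module_name` is a hypothetical helper, not part of `finsets`. Lowercasing the names and dropping spaces are assumptions based on the single example above, and the two-author call uses made-up names rather than an actual module.

``` python
def paper_module_name(last_names, year):
    """Build a module name from author last names and the publication year.

    Hypothetical helper illustrating the naming convention only.
    """
    names = [name.lower().replace(" ", "") for name in last_names]
    if len(names) > 2:
        # More than two authors: keep the first, replace the rest by "etal".
        names = [names[0], "etal"]
    return "_".join(names + [str(year)])


print(paper_module_name(["Hassan", "Hollander", "van Lent", "Tahoun"], 2019))
print(paper_module_name(["Smith", "Jones"], 2020))  # made-up two-author case
```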

``` python
papers.hassan_etal_2019.list_all_vars().head()
```

<div>


|     | name   |
|-----|--------|
| 0   | gvkey  |
| 1   | date   |
| 2   | PRisk  |
| 3   | NPRisk |
| 4   | Risk   |

</div>

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ionmihai/finsets",
    "name": "finsets",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "nbdev jupyter notebook python",
    "author": "Mihai Ion",
    "author_email": "mihaiion@email.arizona.edu",
    "download_url": "https://files.pythonhosted.org/packages/58/0e/6504128cf955fcb2bf6620423bb8c249447b9c238208f25663376e0cfe00/finsets-0.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License 2.0",
    "summary": "Download and process datasets commonly used in finance research",
    "version": "0.0.4",
    "project_urls": {
        "Homepage": "https://github.com/ionmihai/finsets"
    },
    "split_keywords": [
        "nbdev",
        "jupyter",
        "notebook",
        "python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d050ddd92b2063d18ecc5e0f2fc2f8101d357148b1237d7cd98c56779a787337",
                "md5": "49ce700ea6aac76cd6d854087a152c7d",
                "sha256": "0fa61283a7841d98396fa34f21d917bcb03cb397f9cff843a8cee8f905f9ad52"
            },
            "downloads": -1,
            "filename": "finsets-0.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "49ce700ea6aac76cd6d854087a152c7d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 54287,
            "upload_time": "2023-11-19T23:00:54",
            "upload_time_iso_8601": "2023-11-19T23:00:54.809739Z",
            "url": "https://files.pythonhosted.org/packages/d0/50/ddd92b2063d18ecc5e0f2fc2f8101d357148b1237d7cd98c56779a787337/finsets-0.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "580e6504128cf955fcb2bf6620423bb8c249447b9c238208f25663376e0cfe00",
                "md5": "54dc6a926ec4623f3b3c2b7b7165dcca",
                "sha256": "75f53c7f4c6e0b41458eb6ee39e0d02cbc1c1b2e0a183463857c1ccee85b0b11"
            },
            "downloads": -1,
            "filename": "finsets-0.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "54dc6a926ec4623f3b3c2b7b7165dcca",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 96799,
            "upload_time": "2023-11-19T23:00:56",
            "upload_time_iso_8601": "2023-11-19T23:00:56.803687Z",
            "url": "https://files.pythonhosted.org/packages/58/0e/6504128cf955fcb2bf6620423bb8c249447b9c238208f25663376e0cfe00/finsets-0.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-19 23:00:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ionmihai",
    "github_project": "finsets",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "finsets"
}
        