Name | AnomalyLab |
Version | 0.3.8 |
home_page | None |
Summary | A Python package for empirical asset pricing analysis. |
upload_time | 2025-02-18 08:52:30 |
maintainer | None |
docs_url | None |
author | FinPhd |
requires_python | >=3.10 |
license | None |
requirements | No requirements were recorded. |
# AnomalyLab
## Authors
Chen Haiwei, Deng Haotian
## Overview
This Python package implements various empirical methods from the book *Empirical Asset Pricing: The Cross Section of Stock Returns* by Turan G. Bali, Robert F. Engle, and Scott Murray. The package includes functionality for:
- Summary statistics
- Correlation analysis
- Persistence analysis
- Portfolio analysis
- Fama-MacBeth regression (FM regression)
Additionally, we have added several extra features, such as:
- Missing value imputation
- Data normalization
- Leading and lagging variables
- Winsorization/truncation
- Transition matrix calculation
- Formatting output tables
## Installation
The package can be installed via:
```bash
pip install anomalylab
```
## Usage
This package provides a comprehensive suite of tools for empirical asset pricing analysis. Below are key functions with explanations and example usage to help you get started.
### Importing Data
```python
import pandas as pd
from pandas import DataFrame

from anomalylab import Panel, TimeSeries, pp
from anomalylab.datasets import DataSet

# Load the example panel and time-series data
df: DataFrame = DataSet.get_panel_data()
ts: DataFrame = DataSet.get_time_series_data()

# Specifying Factor Models:
Models: dict[str, list[str]] = {
    "CAPM": ["MKT(3F)"],  # Capital Asset Pricing Model with Market Factor
    "FF3": ["MKT(3F)", "SMB(3F)", "HML(3F)"],  # Fama-French 3 Factor Model
    "FF5": ["MKT(5F)", "SMB(5F)", "HML(5F)", "RMW(5F)", "CMA(5F)"],  # Fama-French 5 Factor Model
}

# Creating Panel and Time Series Objects:
panel = Panel(
    df,
    name="Stocks",
    id="permno",
    time="date",
    frequency="M",
    ret="return",
    classifications="industry",
    drop_all_chars_missing=True,
    is_copy=False,
)
time_series: TimeSeries = TimeSeries(
    df=ts, name="Factor Series", time="date", frequency="M", is_copy=False
)
pp(panel)
```
### Preprocessing Data
Several preprocessing functions are available for handling missing values, normalizing data, shifting variables, and winsorizing data.
```python
# Filling Data:
# Filling Group Columns
panel.fill_group_column(group_column="industry", value="Other")
# Filling Missing Values
panel.fillna(method="mean", group_columns="date")
# Normalizing Data:
# panel.normalize(method="zscore", group_columns="date")
# Shifting Data:
# panel.shift(periods=1, drop_original=False)
# Winsorizing Data:
panel.winsorize(method="winsorize")
pp(panel)
```
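For intuition about the difference between winsorizing and truncating, here is a minimal standalone sketch in plain Python. It is not AnomalyLab's implementation: the helper names `percentile`, `winsorize_values`, and `truncate_values`, the linear-interpolation percentile rule, and the 1%/99% default cut-offs are all assumptions chosen for illustration.

```python
def percentile(ordered: list[float], q: float) -> float:
    """Linearly interpolated percentile of pre-sorted values, with q in [0, 1]."""
    position = q * (len(ordered) - 1)
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (position - low) * (ordered[high] - ordered[low])


def winsorize_values(
    values: list[float], lower: float = 0.01, upper: float = 0.99
) -> list[float]:
    """Clip values to the [lower, upper] percentile range; sample size is unchanged."""
    ordered = sorted(values)
    low_cut, high_cut = percentile(ordered, lower), percentile(ordered, upper)
    return [min(max(value, low_cut), high_cut) for value in values]


def truncate_values(
    values: list[float], lower: float = 0.01, upper: float = 0.99
) -> list[float]:
    """Drop values outside the [lower, upper] percentile range."""
    ordered = sorted(values)
    low_cut, high_cut = percentile(ordered, lower), percentile(ordered, upper)
    return [value for value in values if low_cut <= value <= high_cut]


# Hypothetical sample: 0.0, 1.0, ..., 10.0
sample = [float(i) for i in range(11)]
print(winsorize_values(sample, 0.1, 0.9))  # extremes are pulled in to the cut-offs
print(truncate_values(sample, 0.1, 0.9))   # extremes are removed
```

Winsorizing keeps every observation and only caps the extremes, while truncating shrinks the sample. In a panel setting such cut-offs are typically computed within each period rather than over the pooled data.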
### Summary statistics
Compute summary statistics for the panel with the `summary()` method:
```python
summary = panel.summary()
pp(summary)
```
### Correlation analysis
The `correlation()` method computes the correlations between the variables in the panel data:
```python
correlation = panel.correlation()
pp(correlation)
```
### Persistence analysis
Persistence analysis helps you understand the stability of certain variables over time.
The persistence() function computes persistence for a given set of periods to analyze the stability of a variable.
The transition_matrix() function calculates the transition matrix to evaluate how a variable moves between different states (e.g., deciles) over time.
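```python
# Persistence at horizons of 1, 3, 6, 12, 36, and 60 periods
persistence = panel.persistence(periods=[1, 3, 6, 12, 36, 60])
pp(persistence)

# Transition matrix of MktCap deciles over a 12-period lag
pp(
    panel.transition_matrix(
        var="MktCap",
        group=10,
        lag=12,
        draw=False,
        # path="...",
        decimal=2,
    )
)
```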
### Portfolio analysis
You can group data and perform univariate and bivariate portfolio analyses based on factors.
```python
# Grouping
group_result = panel.group("return", "MktCap", "Illiq", 10)

# Univariate portfolio analysis
uni_ew, uni_vw = panel.univariate_analysis(
    "return", "MktCap", "Illiq", 10, Models, time_series, factor_return=False
)
pp(uni_ew)
pp(uni_vw)

# Bivariate portfolio analysis
bi_ew, bi_vw = panel.bivariate_analysis(
    "return",
    "MktCap",
    "Illiq",
    "IdioVol",
    5,
    5,
    Models,
    time_series,
    True,
    False,
    "dependent",
    factor_return=False,
)
pp(bi_ew)
pp(bi_vw)
```
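To make the idea behind a univariate sort concrete, the following plain-Python sketch sorts stocks on one variable each period, splits them into groups, and averages equal-weighted and value-weighted group returns over time. It is only an illustration under stated assumptions: the function name `univariate_sort`, the tuple layout of the observations, and the rank-based group assignment are hypothetical and may differ from how AnomalyLab computes breakpoints.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical layout: (period, sort variable, weight such as market cap, return)
Observation = tuple[int, float, float, float]


def univariate_sort(
    observations: list[Observation], n_groups: int
) -> tuple[list[float], list[float]]:
    """Return time-series average equal- and value-weighted returns per group.

    Assumes every period has at least `n_groups` observations.
    """
    by_period: dict[int, list[tuple[float, float, float]]] = defaultdict(list)
    for period, sort_value, weight, ret in observations:
        by_period[period].append((sort_value, weight, ret))

    ew_returns: dict[int, list[float]] = defaultdict(list)
    vw_returns: dict[int, list[float]] = defaultdict(list)
    for rows in by_period.values():
        rows.sort(key=lambda row: row[0])  # ascending in the sort variable
        groups: dict[int, list[tuple[float, float]]] = defaultdict(list)
        for rank, (_, weight, ret) in enumerate(rows):
            groups[rank * n_groups // len(rows)].append((weight, ret))
        for group, members in groups.items():
            ew_returns[group].append(mean([ret for _, ret in members]))
            total_weight = sum(weight for weight, _ in members)
            vw_returns[group].append(
                sum(weight * ret for weight, ret in members) / total_weight
            )

    ew_average = [mean(ew_returns[group]) for group in range(n_groups)]
    vw_average = [mean(vw_returns[group]) for group in range(n_groups)]
    return ew_average, vw_average


# Hypothetical toy data: four stocks in each of two periods
toy = [
    (period, sort_value, weight, ret)
    for period in (1, 2)
    for sort_value, weight, ret in [
        (1.0, 1.0, 0.01), (2.0, 1.0, 0.03), (3.0, 1.0, 0.05), (4.0, 3.0, 0.09)
    ]
]
ew, vw = univariate_sort(toy, n_groups=2)
print("high minus low (EW):", ew[-1] - ew[0])
print("high minus low (VW):", vw[-1] - vw[0])
```

The difference between the top and bottom groups is the usual long-short spread, whose average return can then be tested against zero or adjusted with a factor model.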
### Fama-MacBeth regression
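You can run Fama-MacBeth regressions for several specifications at once; each inner list in `regs` names the dependent variable first, followed by the independent variables: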
```python
fm_result = panel.fm_reg(
    regs=[
        ["return", "MktCap"],
        ["return", "Illiq"],
        ["return", "IdioVol"],
        ["return", "MktCap", "Illiq", "IdioVol"],
    ],
    exog_order=["MktCap", "Illiq", "IdioVol"],
    weight="MktCap",
    industry="industry",
    industry_weighed_method="value",
    is_winsorize=False,
    is_normalize=True,
)
pp(fm_result)
```
### Formatting results
Finally, you can save and format the results to an Excel file:
```python
output_file_path = "..."

# Write each result table to its own sheet
with pd.ExcelWriter(output_file_path) as writer:
    summary.to_excel(writer, sheet_name="summary")
    correlation.to_excel(writer, sheet_name="correlation")
    persistence.to_excel(writer, sheet_name="persistence")
    uni_ew.to_excel(writer, sheet_name="uni_ew")
    uni_vw.to_excel(writer, sheet_name="uni_vw")
    bi_ew.to_excel(writer, sheet_name="bi_ew")
    bi_vw.to_excel(writer, sheet_name="bi_vw")
    fm_result.to_excel(writer, sheet_name="fm_result")

# Format the saved workbook once the writer has closed the file
panel.format_excel(
    output_file_path,
    align=True,
    line=True,
    convert_brackets=False,
    adjust_col_widths=True,
)
```
Raw data
{
"_id": null,
"home_page": null,
"name": "AnomalyLab",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": "FinPhd",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/33/e7/418c5543db32c08dcabedd0af17fb66aafe16779604087e1b0b3d8ab677e/anomalylab-0.3.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A Python package for empirical asset pricing analysis.",
"version": "0.3.8",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "33e7418c5543db32c08dcabedd0af17fb66aafe16779604087e1b0b3d8ab677e",
"md5": "195edab0202e7c67703b006666702a24",
"sha256": "e6148856c45b3884afafe9628a9ce35248912c39351e4bf6eb273eed7a22660e"
},
"downloads": -1,
"filename": "anomalylab-0.3.8.tar.gz",
"has_sig": false,
"md5_digest": "195edab0202e7c67703b006666702a24",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 15960684,
"upload_time": "2025-02-18T08:52:30",
"upload_time_iso_8601": "2025-02-18T08:52:30.821344Z",
"url": "https://files.pythonhosted.org/packages/33/e7/418c5543db32c08dcabedd0af17fb66aafe16779604087e1b0b3d8ab677e/anomalylab-0.3.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-18 08:52:30",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "anomalylab"
}