# AnomalyLab

- Name: AnomalyLab
- Version: 0.3.8
- Summary: A Python package for empirical asset pricing analysis.
- Upload time: 2025-02-18 08:52:30
- Author: FinPhd
- Requires Python: >=3.10

## Authors

Chen Haiwei, Deng Haotian

## Overview

This Python package implements various empirical methods from the book *Empirical Asset Pricing: The Cross Section of Stock Returns* by Turan G. Bali, Robert F. Engle, and Scott Murray. The package includes functionality for:

- Summary statistics
- Correlation analysis
- Persistence analysis
- Portfolio analysis
- Fama-MacBeth regression (FM regression)

Additionally, we have added several extra features, such as:

- Missing value imputation
- Data normalization
- Leading and lagging variables
- Winsorization/truncation
- Transition matrix calculation
- Formatting output tables

## Installation

The package can be installed via:

```bash
pip install anomalylab
```

## Usage

This package provides a comprehensive suite of tools for empirical asset pricing analysis. Below are key functions with explanations and example usage to help you get started.

### Importing Data

```python
import pandas as pd
from pandas import DataFrame

from anomalylab import Panel, TimeSeries, pp
from anomalylab.datasets import DataSet

df: DataFrame = DataSet.get_panel_data()
ts: DataFrame = DataSet.get_time_series_data()

# Specifying Factor Models:
Models: dict[str, list[str]] = {
    "CAPM": ["MKT(3F)"],  # Capital Asset Pricing Model with Market Factor
    "FF3": ["MKT(3F)", "SMB(3F)", "HML(3F)"],  # Fama-French 3 Factor Model
    "FF5": ["MKT(5F)", "SMB(5F)", "HML(5F)", "RMW(5F)", "CMA(5F)"],  # Fama-French 5 Factor Model
}

# Creating Panel and Time Series Objects:
panel = Panel(
    df,
    name="Stocks",
    id="permno",
    time="date",
    frequency="M",
    ret="return",
    classifications="industry",
    drop_all_chars_missing=True,
    is_copy=False,
)
time_series: TimeSeries = TimeSeries(
    df=ts, name="Factor Series", time="date", frequency="M", is_copy=False
)
pp(panel)
```

### Preprocessing Data

Several preprocessing functions are available for handling missing values, normalizing data, shifting variables, and winsorizing data.

```python
# Filling Data:
# Filling Group Columns
panel.fill_group_column(group_column="industry", value="Other")
# Filling Missing Values
panel.fillna(method="mean", group_columns="date")

# Normalizing Data:
# panel.normalize(method="zscore", group_columns="date")

# Shifting Data:
# panel.shift(periods=1, drop_original=False)

# Winsorizing Data:
panel.winsorize(method="winsorize")
pp(panel)
```
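To make the preprocessing steps concrete, here is a minimal standard-library sketch of winsorizing and z-score normalizing a single cross-section. The helpers `toy_winsorize` and `toy_zscore`, the nearest-rank percentile rule, the cutoffs, and the use of the population standard deviation are all illustrative assumptions; they are not AnomalyLab's implementation or its defaults.

```python
import statistics


def toy_winsorize(values, lower=0.01, upper=0.99):
    """Clip values to the [lower, upper] empirical percentiles (nearest rank).

    Hypothetical helper for illustration only.
    """
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[round(lower * (n - 1))]
    hi = ordered[round(upper * (n - 1))]
    return [min(max(v, lo), hi) for v in values]


def toy_zscore(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# One cross-section (e.g., one month) with an extreme observation.
cross_section = [1.0, 2.0, 3.0, 4.0, 100.0]
clipped = toy_winsorize(cross_section, lower=0.25, upper=0.75)
standardized = toy_zscore(clipped)
print(clipped)
print(standardized)
```

In a panel setting these steps would typically be applied period by period, which is what grouping by `"date"` in the calls above suggests.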

### Summary statistics

Compute summary statistics for the variables in the panel with the `summary()` method:

```python
summary = panel.summary()
pp(summary)
```
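In the framework of the book, summary statistics are usually computed cross-sectionally in each period and then averaged over time. The sketch below illustrates that two-step idea on a toy panel. The data, the helper name `average_cross_sectional_stats`, and the choice of statistics are assumptions for illustration; whether `summary()` follows exactly this procedure is not confirmed here.

```python
import statistics

# Hypothetical toy panel: period -> cross-section of one variable.
cross_sections = {
    "2020-01": [1.0, 2.0, 3.0],
    "2020-02": [2.0, 4.0, 6.0],
}


def average_cross_sectional_stats(panel):
    """Compute the mean and sample std per period, then average over periods."""
    means = [statistics.fmean(xs) for xs in panel.values()]
    stds = [statistics.stdev(xs) for xs in panel.values()]
    return {"mean": statistics.fmean(means), "std": statistics.fmean(stds)}


stats = average_cross_sectional_stats(cross_sections)
print(stats)
```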

### Correlation analysis

The `correlation()` method computes pairwise correlations between the variables in the panel data:

```python
correlation = panel.correlation()
pp(correlation)
```
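As a rough illustration of correlation analysis on panel data, the following sketch computes a Pearson correlation within each period and averages the results over time. The helper names and toy data are hypothetical, and the sketch assumes the period-by-period averaging approach described in the book rather than documenting what `correlation()` does internally.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def average_correlation(periods):
    """periods: list of (xs, ys) cross-sections, one pair per time period."""
    rhos = [pearson(xs, ys) for xs, ys in periods]
    return sum(rhos) / len(rhos)


toy_periods = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),  # perfectly positively correlated
    ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),  # perfectly negatively correlated
]
avg_rho = average_correlation(toy_periods)
print(avg_rho)
```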

### Persistence analysis

Persistence analysis measures how stable a variable is over time.
The `persistence()` method computes persistence at each of the given horizons (in periods).
The `transition_matrix()` method tabulates how entities move between groups of a variable (e.g., deciles) after a given lag.

```python
persistence = panel.persistence(periods=[1, 3, 6, 12, 36, 60])
pp(persistence)
pp(
    panel.transition_matrix(
        var="MktCap",
        group=10,
        lag=12,
        draw=False,
        # path="...",
        decimal=2,
    )
)
```
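The idea behind a transition matrix can be sketched with plain counting: for each entity, record its group now and its group after the lag, then normalize each row so it sums to one. The function `toy_transition_matrix` and the group labels below are hypothetical and only illustrate the concept; they do not reproduce the package's `transition_matrix()` output format.

```python
from collections import Counter


def toy_transition_matrix(start_groups, end_groups, n_groups):
    """Row i, column j: share of entities starting in group i that end in group j."""
    counts = Counter(zip(start_groups, end_groups))
    matrix = []
    for i in range(n_groups):
        row_total = sum(counts[(i, j)] for j in range(n_groups))
        matrix.append(
            [counts[(i, j)] / row_total if row_total else 0.0 for j in range(n_groups)]
        )
    return matrix


# Group of each entity at time t and at time t + lag (toy data, 3 groups).
start = [0, 0, 1, 1, 1, 2]
end = [0, 1, 1, 1, 2, 2]
tm = toy_transition_matrix(start, end, n_groups=3)
for row in tm:
    print([round(p, 2) for p in row])
```

Large values on the diagonal indicate that entities tend to stay in the same group, i.e. that the variable is persistent.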

### Portfolio analysis

You can sort data into groups and run univariate and bivariate portfolio analyses, with optional adjustment for factor models.

```python
# Grouping
group_result = panel.group("return", "MktCap", "Illiq", 10)

# Univariate portfolio analysis
uni_ew, uni_vw = panel.univariate_analysis(
    "return", "MktCap", "Illiq", 10, Models, time_series, factor_return=False
)
pp(uni_ew)
pp(uni_vw)

# Bivariate portfolio analysis
bi_ew, bi_vw = panel.bivariate_analysis(
    "return",
    "MktCap",
    "Illiq",
    "IdioVol",
    5,
    5,
    Models,
    time_series,
    True,
    False,
    "dependent",
    factor_return=False,
)
pp(bi_ew)
pp(bi_vw)
```
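For intuition, a univariate portfolio sort within a single period can be sketched as follows: rank entities by the sort variable, split them into equally sized groups, and compute equal-weighted and value-weighted average returns per group. The function `toy_univariate_sort`, its argument order, and the data are hypothetical; the sketch assumes at least `n_groups` entities and does not reflect how `univariate_analysis()` handles breakpoints or factor adjustment.

```python
def toy_univariate_sort(returns, sort_values, weights, n_groups):
    """Return (equal-weighted, value-weighted) average returns per sorted group."""
    order = sorted(range(len(returns)), key=lambda i: sort_values[i])
    size = len(order) / n_groups
    groups = [[] for _ in range(n_groups)]
    for rank, i in enumerate(order):
        # Map the rank to a group index; the last group absorbs any remainder.
        groups[min(int(rank / size), n_groups - 1)].append(i)
    ew = [sum(returns[i] for i in g) / len(g) for g in groups]
    vw = [
        sum(returns[i] * weights[i] for i in g) / sum(weights[i] for i in g)
        for g in groups
    ]
    return ew, vw


# Toy cross-section: four stocks, two portfolios.
rets = [0.01, 0.02, 0.03, 0.04]
sort_var = [10.0, 20.0, 30.0, 40.0]
mkt_cap = [1.0, 3.0, 1.0, 1.0]
ew, vw = toy_univariate_sort(rets, sort_var, mkt_cap, n_groups=2)
long_short = ew[-1] - ew[0]  # high-minus-low portfolio return
print(ew, vw, long_short)
```

Repeating this every period yields a time series of portfolio returns, whose averages and factor-model alphas are the quantities typically reported.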

### Fama-MacBeth regression

You can run Fama-MacBeth regressions with multiple independent variables:

```python
fm_result = panel.fm_reg(
    regs=[
        ["return", "MktCap"],
        ["return", "Illiq"],
        ["return", "IdioVol"],
        ["return", "MktCap", "Illiq", "IdioVol"],
    ],
    exog_order=["MktCap", "Illiq", "IdioVol"],
    weight="MktCap",
    industry="industry",
    industry_weighed_method="value",
    is_winsorize=False,
    is_normalize=True,
)
pp(fm_result)
```
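The two-step logic of a Fama-MacBeth regression can be illustrated with a single regressor: estimate a cross-sectional OLS slope in each period, then take the time-series mean of the slopes and its t-statistic. The helpers below are hypothetical and deliberately simplified. They use a plain standard error with no Newey-West adjustment, no weighting, and no industry controls, so they are not equivalent to `fm_reg()`.

```python
import math
import statistics


def ols_slope(xs, ys):
    """Slope of a one-variable OLS regression of ys on xs (with intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def toy_fama_macbeth(periods):
    """periods: list of (xs, ys) cross-sections. Returns (mean slope, t-stat)."""
    slopes = [ols_slope(xs, ys) for xs, ys in periods]
    mean_slope = statistics.fmean(slopes)
    se = statistics.stdev(slopes) / math.sqrt(len(slopes))
    return mean_slope, mean_slope / se


# Three periods whose cross-sectional slopes are 1, 2, and 3.
x = [1.0, 2.0, 3.0]
toy_periods = [(x, [1.0, 2.0, 3.0]), (x, [2.0, 4.0, 6.0]), (x, [3.0, 6.0, 9.0])]
mean_slope, t_stat = toy_fama_macbeth(toy_periods)
print(mean_slope, t_stat)
```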

### Formatting results

Finally, you can write the results to an Excel file and then format the workbook:

```python
output_file_path = "..."
with pd.ExcelWriter(output_file_path) as writer:
    summary.to_excel(writer, sheet_name="summary")
    correlation.to_excel(writer, sheet_name="correlation")
    persistence.to_excel(writer, sheet_name="persistence")
    uni_ew.to_excel(writer, sheet_name="uni_ew")
    uni_vw.to_excel(writer, sheet_name="uni_vw")
    bi_ew.to_excel(writer, sheet_name="bi_ew")
    bi_vw.to_excel(writer, sheet_name="bi_vw")
    fm_result.to_excel(writer, sheet_name="fm_result")

panel.format_excel(
    output_file_path,
    align=True,
    line=True,
    convert_brackets=False,
    adjust_col_widths=True,
)
```

            
