reghdfe


- Name: reghdfe
- Version: 0.1.1
- Summary: Python implementation of Stata's reghdfe for high-dimensional fixed effects regression
- Upload time: 2025-08-01 17:07:27
- Maintainer: RegHDFE Contributors
- Author: RegHDFE Contributors
- Requires Python: >=3.9
- Keywords: econometrics, fixed-effects, regression, hdfe, panel-data
- Requirements: numpy, scipy, pandas, pyhdfe, tabulate

# RegHDFE

**Note: This package continues to be maintained. `reghdfe` functionality is also integrated into [StatsPAI](https://github.com/brycewang-stanford/StatsPAI/) for users who prefer a unified ecosystem.**

---

[![Python Version](https://img.shields.io/pypi/pyversions/reghdfe)](https://pypi.org/project/reghdfe/)
[![PyPI Version](https://img.shields.io/pypi/v/reghdfe)](https://pypi.org/project/reghdfe/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://img.shields.io/pypi/dm/reghdfe)](https://pypi.org/project/reghdfe/)

Python implementation of Stata's reghdfe for high-dimensional fixed effects regression.

## Installation

```bash
pip install reghdfe
```
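
To confirm which version ended up in the active environment, one option is to query the installed distribution metadata with the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper name, not part of `reghdfe`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string such as "0.1.1" when installed, otherwise None.
print(installed_version("reghdfe"))
```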

## 📖 Quick Start

### Basic Example

```python
import pandas as pd
import numpy as np
from reghdfe import reghdfe

# Create sample data
np.random.seed(42)
n = 1000
data = pd.DataFrame({
    'wage': np.random.normal(10, 2, n),
    'experience': np.random.normal(5, 2, n),
    'education': np.random.normal(12, 3, n),
    'firm_id': np.random.choice(range(100), n),
    'year': np.random.choice(range(2010, 2020), n)
})

# Run regression with firm fixed effects
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=['firm_id']
)

# Display results
print(result.summary())
```
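
For intuition about what `fe=['firm_id']` absorbs: with a single fixed effect, the slope estimates coincide with OLS on group-demeaned data (the within transformation). The sketch below illustrates that idea with one regressor and only the standard library. `demean_by_group` and `within_slope` are hypothetical helpers written for this illustration; they are not the package's implementation.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def within_slope(y, x, groups):
    """OLS slope of y on x after absorbing one group fixed effect."""
    y_tilde = demean_by_group(y, groups)
    x_tilde = demean_by_group(x, groups)
    sxx = sum(a * a for a in x_tilde)
    sxy = sum(a * b for a, b in zip(x_tilde, y_tilde))
    return sxy / sxx


# Toy data: y = 2 * x + firm effect (firm A: +10, firm B: -5), no noise.
firm = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
effect = {"A": 10.0, "B": -5.0}
y = [2.0 * xi + effect[f] for xi, f in zip(x, firm)]

slope = within_slope(y, x, firm)
print(round(slope, 6))  # 2.0: the firm effects are removed by demeaning
```

A pooled regression of `y` on `x` without the demeaning step would mix the firm effects into the slope, which is the bias that fixed effects are meant to remove.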

### Advanced Usage Examples

#### 1. Multiple Fixed Effects

```python
# Regression with firm and year fixed effects
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=['firm_id', 'year']  # Multiple dimensions
)
print(result.summary())
```

#### 2. Cluster-Robust Standard Errors

```python
# One-way clustering
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=['firm_id'],
    cluster=['firm_id']  # Cluster by firm
)

# Two-way clustering
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=['firm_id'],
    cluster=['firm_id', 'year']  # Cluster by firm and year
)
```
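
As background on what clustering changes: a one-way cluster-robust variance sums the score `x * residual` within each cluster before squaring, so correlated errors inside a cluster no longer cancel out of the variance. The sketch below shows this for a single regressor without an intercept and without any finite-sample correction. `cluster_robust_se` is a hypothetical helper; the adjustments `reghdfe` applies (degrees of freedom, absorbed fixed effects) are not described in this README and may give different numbers.

```python
from collections import defaultdict
from math import sqrt


def cluster_robust_se(y, x, clusters):
    """Slope and one-way cluster-robust SE for y = b * x + e.

    Simplifications: no intercept, no finite-sample correction.
    """
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    # Sum the scores x_i * e_i within each cluster.
    scores = defaultdict(float)
    for xi, yi, g in zip(x, y, clusters):
        scores[g] += xi * (yi - beta * xi)
    meat = sum(s * s for s in scores.values())
    return beta, sqrt(meat) / sxx


x = [1.0, 2.0, 1.0, 2.0]
y = [1.0, 2.0, 2.0, 3.0]
beta, se = cluster_robust_se(y, x, clusters=["a", "a", "b", "b"])
print(beta, se)
```

Passing a distinct cluster label for every observation reduces this to the heteroskedasticity-robust (HC0) variance. A common construction for two-way clustering adds the two one-way variances and subtracts the variance clustered on their intersection; whether the package follows exactly that construction is an assumption here.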

#### 3. Weighted Regression

```python
# Add weights to your data
data['weight'] = np.random.uniform(0.5, 2.0, len(data))

# Run weighted regression
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=['firm_id'],
    weights='weight'
)
```

#### 4. No Fixed Effects (OLS)

```python
# Simple OLS regression
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=None  # No fixed effects
)
```

## Working with Results

### Accessing Coefficients and Statistics

```python
result = reghdfe(data=data, y='wage', x=['experience', 'education'], fe=['firm_id'])

# Get coefficients
coefficients = result.coef
print("Coefficients:", coefficients)

# Get standard errors
std_errors = result.se
print("Standard Errors:", std_errors)

# Get t-statistics
t_stats = result.tstat
print("T-statistics:", t_stats)

# Get p-values
p_values = result.pvalue
print("P-values:", p_values)

# Get confidence intervals
conf_int = result.conf_int()
print("95% Confidence Intervals:", conf_int)

# Get R-squared
print(f"R-squared: {result.rsquared:.4f}")
print(f"Adjusted R-squared: {result.rsquared_adj:.4f}")
```
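
These quantities are linked by standard formulas: the t-statistic is the coefficient divided by its standard error, and a confidence interval is the coefficient plus or minus a critical value times the standard error. The sketch below uses the normal critical value from the standard library. It is an approximation under that assumption; `conf_int()` may instead use a t distribution with residual or cluster degrees of freedom, which would give slightly wider intervals. `normal_conf_int` is a hypothetical helper.

```python
from statistics import NormalDist


def normal_conf_int(coef, se, level=0.95):
    """Two-sided confidence interval using the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    return coef - z * se, coef + z * se


coef, se = 0.5, 0.1
t_stat = coef / se
low, high = normal_conf_int(coef, se)
print(f"t = {t_stat:.2f}, 95% CI = [{low:.3f}, {high:.3f}]")
# t = 5.00, 95% CI = [0.304, 0.696]
```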

### Summary Statistics

```python
# Full regression summary
print(result.summary())

# Detailed summary with additional statistics
print(result.summary(show_dof=True))
```

## 🔧 Advanced Configuration

### Custom Absorption Options

```python
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=['firm_id'],
    absorb_tolerance=1e-10,  # Higher precision
    drop_singletons=True,    # Drop singleton groups
    absorb_method='lsmr'     # Alternative solver
)
```
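
To give a sense of what a tolerance controls: with two or more fixed effects, one classical approach is alternating projections, which demeans by each fixed effect in turn and stops once a full sweep changes the values by less than the tolerance. The sketch below illustrates that idea only. The package lists `pyhdfe` as a dependency, and the algorithm it actually runs (including what `absorb_method='lsmr'` selects) may differ; `absorb` and `demean` here are hypothetical helpers.

```python
from collections import defaultdict


def demean(values, groups):
    """Subtract each group's mean from its members."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def absorb(values, fe_columns, tolerance=1e-10, max_iter=10_000):
    """Residualize `values` on several fixed effects by alternating projections.

    Returns the residualized values and the number of sweeps used.
    """
    current = list(values)
    for sweep in range(1, max_iter + 1):
        previous = current
        for groups in fe_columns:
            current = demean(current, groups)
        change = max(abs(a - b) for a, b in zip(current, previous))
        if change < tolerance:
            return current, sweep
    raise RuntimeError("absorption did not converge")


# Unbalanced toy panel: values are exactly firm effect + year effect,
# so the residual after absorbing both should be numerically zero.
firm = [0, 0, 1, 1, 2]
year = [0, 1, 0, 1, 1]
values = [1.0, 11.0, 3.0, 13.0, 15.0]

residual, sweeps = absorb(values, [firm, year])
print(sweeps, max(abs(r) for r in residual))
```

A tighter tolerance costs more sweeps but leaves less of the fixed effects in the residualized variables. In a balanced panel a single sweep is already exact, which is why the toy panel above is deliberately unbalanced.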

### Different Covariance Types

```python
# Robust standard errors (default)
result = reghdfe(data=data, y='wage', x=['experience'], fe=['firm_id'], 
                cov_type='robust')

# Clustered standard errors
result = reghdfe(data=data, y='wage', x=['experience'], fe=['firm_id'], 
                cov_type='cluster', cluster=['firm_id'])
```

## Comparison with Stata

This package aims to replicate Stata's `reghdfe` command. Here's how the syntax translates:

**Stata:**
```stata
reghdfe wage experience education, absorb(firm_id year) cluster(firm_id)
```

**Python (reghdfe):**
```python
result = reghdfe(
    data=data,
    y='wage',
    x=['experience', 'education'],
    fe=['firm_id', 'year'],
    cluster=['firm_id']
)
```
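
The mapping between the two syntaxes can be spelled out mechanically. The sketch below builds the Stata command string from the same arguments passed to the Python function. `to_stata_command` is a hypothetical helper for illustration and is not provided by the package; it covers only the options shown above, and it assumes Stata's `noabsorb` option is the counterpart of `fe=None`.

```python
def to_stata_command(y, x, fe=None, cluster=None):
    """Build the Stata reghdfe command corresponding to the Python arguments."""
    parts = [f"reghdfe {y} {' '.join(x)},"]
    # Stata's reghdfe expects either absorb(...) or the noabsorb option.
    parts.append(f"absorb({' '.join(fe)})" if fe else "noabsorb")
    if cluster:
        parts.append(f"cluster({' '.join(cluster)})")
    return " ".join(parts)


command = to_stata_command(
    y="wage",
    x=["experience", "education"],
    fe=["firm_id", "year"],
    cluster=["firm_id"],
)
print(command)
# reghdfe wage experience education, absorb(firm_id year) cluster(firm_id)
```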

## 📋 Key Features

- ✅ **High-dimensional fixed effects** - Efficiently absorb multiple fixed effect dimensions
- ✅ **Cluster-robust standard errors** - Support for one-way and two-way clustering  
- ✅ **Weighted regression** - Handle sampling weights and frequency weights
- ✅ **Singleton dropping** - Automatically handle singleton groups
- ✅ **Fast computation** - Optimized algorithms for large datasets
- ✅ **Stata compatibility** - Results match Stata's reghdfe command
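
To make the singleton feature concrete: a singleton is an observation that is the only member of one of its fixed-effect groups, so its fixed effect fits it perfectly and it carries no information about the slopes. Dropping one singleton can create another, so the usual approach repeats until nothing changes. The sketch below illustrates that general idea with plain dictionaries; `drop_singletons` here is a hypothetical standalone function, and whether the package's `drop_singletons=True` option iterates in the same way is an assumption.

```python
from collections import Counter


def drop_singletons(rows, fe_keys):
    """Iteratively drop rows whose group has one observation in any fixed effect."""
    kept = list(rows)
    while True:
        counts = {k: Counter(r[k] for r in kept) for k in fe_keys}
        survivors = [
            r for r in kept if all(counts[k][r[k]] > 1 for k in fe_keys)
        ]
        if len(survivors) == len(kept):
            return kept
        kept = survivors


rows = [
    {"firm_id": "A", "year": 2010},
    {"firm_id": "A", "year": 2011},
    {"firm_id": "B", "year": 2010},
    {"firm_id": "B", "year": 2011},
    {"firm_id": "C", "year": 2011},
    {"firm_id": "C", "year": 2012},  # only observation in 2012
]

kept = drop_singletons(rows, ["firm_id", "year"])
print(len(kept))  # 4
```

In this example the 2012 row is dropped first; that leaves firm C with a single observation, which is dropped on the next pass.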

## Integration Options

This package is **actively maintained** as a standalone library. For users who prefer a unified ecosystem with additional econometric and statistical tools, `reghdfe` functionality is also available through:

- **[StatsPAI](https://github.com/brycewang-stanford/StatsPAI/)** - Stats + Econometrics + ML + AI + LLMs

## Related Projects

- **[StatsPAI](https://github.com/brycewang-stanford/StatsPAI/)** - StatsPAI = Stats + Econometrics + ML + AI + LLMs  
- **[PyStataR](https://github.com/brycewang-stanford/PyStataR)** - Unified Stata-equivalent commands and R functions

## Documentation

For detailed API reference and additional examples, visit our [GitHub repository](https://github.com/brycewang-stanford/reghdfe).

## Contributing

We welcome contributions! Please feel free to:
- Report bugs or request features via [GitHub Issues](https://github.com/brycewang-stanford/reghdfe/issues)
- Submit pull requests for improvements
- Share your use cases and examples

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

**This package is actively maintained.** For questions, bug reports, or feature requests, please open an issue on [GitHub](https://github.com/brycewang-stanford/reghdfe/issues).

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "reghdfe",
    "maintainer": "RegHDFE Contributors",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "econometrics, fixed-effects, regression, hdfe, panel-data",
    "author": "RegHDFE Contributors",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/76/61/87a1d4ca25a9bb88188a6b57f75b7082f0aad8cdb7f2643534756bb03a19/reghdfe-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Python implementation of Stata's reghdfe for high-dimensional fixed effects regression",
    "version": "0.1.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/brycewang-stanford/reghdfe/issues",
        "Documentation": "https://github.com/brycewang-stanford/reghdfe#documentation",
        "Homepage": "https://github.com/brycewang-stanford/reghdfe",
        "Repository": "https://github.com/brycewang-stanford/reghdfe.git"
    },
    "split_keywords": [
        "econometrics",
        " fixed-effects",
        " regression",
        " hdfe",
        " panel-data"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "35cdf119e96947bc37a7ab110e31e3a96b16429cda910d116cad04d2f749ab3f",
                "md5": "df6d33416fb2d3c2e81b00e4354ebe90",
                "sha256": "4f8febc73df71bdf6ee2340b8e2cb7bac37cdf70f9f314dcdb6a443310165284"
            },
            "downloads": -1,
            "filename": "reghdfe-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "df6d33416fb2d3c2e81b00e4354ebe90",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 19656,
            "upload_time": "2025-08-01T17:07:26",
            "upload_time_iso_8601": "2025-08-01T17:07:26.654650Z",
            "url": "https://files.pythonhosted.org/packages/35/cd/f119e96947bc37a7ab110e31e3a96b16429cda910d116cad04d2f749ab3f/reghdfe-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "766187a1d4ca25a9bb88188a6b57f75b7082f0aad8cdb7f2643534756bb03a19",
                "md5": "10ccb795e92838f66c3ef425eb56ab8a",
                "sha256": "73a0cb1b34e6313215e8183922e20c04779796191249316a5f5dece47a7bac4e"
            },
            "downloads": -1,
            "filename": "reghdfe-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "10ccb795e92838f66c3ef425eb56ab8a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 22931,
            "upload_time": "2025-08-01T17:07:27",
            "upload_time_iso_8601": "2025-08-01T17:07:27.576585Z",
            "url": "https://files.pythonhosted.org/packages/76/61/87a1d4ca25a9bb88188a6b57f75b7082f0aad8cdb7f2643534756bb03a19/reghdfe-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-01 17:07:27",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "brycewang-stanford",
    "github_project": "reghdfe",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.20.0"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.7.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "pyhdfe",
            "specs": [
                [
                    ">=",
                    "0.1.0"
                ]
            ]
        },
        {
            "name": "tabulate",
            "specs": [
                [
                    ">=",
                    "0.8.0"
                ]
            ]
        }
    ],
    "lcname": "reghdfe"
}
        