pdtab

Name	pdtab
Version	0.1.1
home_page	https://github.com/pdtab/pdtab
Summary	A pandas-based library that replicates Stata's tabulate functionality
upload_time	2025-08-01 20:15:01
maintainer	None
docs_url	None
author	pdtab Development Team
requires_python	>=3.8
license	MIT
keywords	tabulation crosstab statistics stata pandas data analysis
requirements	No requirements were recorded.

# pdtab: Pandas-based Tabulation Library

[![PyPI version](https://badge.fury.io/py/pdtab.svg)](https://badge.fury.io/py/pdtab)
[![Python Versions](https://img.shields.io/pypi/pyversions/pdtab.svg)](https://pypi.org/project/pdtab/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![GitHub stars](https://img.shields.io/github/stars/brycewang-stanford/pdtab.svg?style=social&label=Star)](https://github.com/brycewang-stanford/pdtab)
[![PyPI downloads](https://img.shields.io/pypi/dm/pdtab.svg)](https://pypi.org/project/pdtab/)

**pdtab** is a Python library that replicates Stata's `tabulate` command, using pandas as the backend. It provides one-way, two-way, and summary tabulations, along with statistical tests and measures of association.

## Overview

Stata's `tabulate` command is one of the most widely used tools for creating frequency tables and cross-tabulations in statistical analysis. **pdtab** brings this functionality to Python, offering:

- **Complete Stata compatibility**: Replicates all major features of Stata's tabulate command
- **Statistical tests**: Chi-square tests, Fisher's exact test, likelihood-ratio tests
- **Association measures**: Cramér's V, Goodman and Kruskal's gamma, Kendall's τb
- **Flexible output**: Console tables, HTML, and visualization options
- **Weighted analysis**: Support for frequency, analytic, and importance weights
- **Missing value handling**: Comprehensive options for dealing with missing data

## Integration with Broader Ecosystem

**pdtab** is part of a comprehensive econometric and statistical analysis ecosystem:

### [PyStataR](https://github.com/brycewang-stanford/PyStataR)
The **pdtab** library will be integrated into **PyStataR**, a comprehensive Python package that bridges Stata and R functionality in Python. PyStataR aims to provide Stata users with familiar commands and workflows while leveraging Python's powerful data science ecosystem.

### [StasPAI](https://github.com/brycewang-stanford/StasPAI)
For users interested in AI-powered econometric analysis, **StasPAI** is a related project that combines statistical analysis with artificial intelligence methods. It provides econometric modeling capabilities enhanced by machine learning approaches.

These projects together form a unified toolkit for modern econometric analysis, combining the best of Stata's user-friendly interface, R's statistical capabilities, and Python's machine learning ecosystem.

## Installation

```bash
pip install pdtab
```

Or install from source:

```bash
git clone https://github.com/brycewang-stanford/pdtab.git
cd pdtab
pip install -e .
```

## Requirements

- Python 3.8+
- pandas >= 1.0.0
- numpy >= 1.18.0
- scipy >= 1.4.0
- matplotlib >= 3.0.0 (for plotting)
- seaborn >= 0.11.0 (for enhanced plotting)

## 🎯 Design Philosophy

**pdtab** is designed as a **pure Python library** focused exclusively on providing Stata's tabulate functionality through a clean, programmatic API. 

### Key Design Decisions:

- **No Command-Line Interface**: pdtab is intentionally designed as a library-only package to maintain simplicity and focus on the Python ecosystem
- **Jupyter-First Approach**: Optimized for data science workflows in Jupyter notebooks and Python scripts
- **Programmatic Access**: All functionality accessible through Python functions with comprehensive options
- **Integration Ready**: Designed to integrate seamlessly with pandas, matplotlib, and the broader PyData ecosystem

This design keeps pdtab lightweight, maintainable, and well suited to notebook- and script-based data science workflows.

## Quick Start

### Basic One-way Tabulation

```python
import pandas as pd
import pdtab

# Create sample data
data = pd.DataFrame({
    'gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Female'],
    'education': ['High School', 'College', 'Graduate', 'High School', 'College', 'Graduate'],
    'income': [35000, 45000, 75000, 40000, 55000, 80000]
})

# One-way frequency table
result = pdtab.tabulate('gender', data=data)
print(result)
```

```
gender      Freq    Percent    Cum
Male          3      50.00   50.00
Female        3      50.00  100.00
Total         6     100.00  100.00
```
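
To sanity-check the table above without pdtab, the same three columns can be rebuilt with plain pandas. This is only a cross-check sketch: it uses `value_counts`, not pdtab's internals, and pandas orders the categories alphabetically, so the row order may differ from pdtab's output.

```python
import pandas as pd

data = pd.DataFrame({
    'gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Female'],
})

# Frequency of each category, ordered by category label
freq = data['gender'].value_counts().sort_index()

# Share of the total, and the running (cumulative) share
percent = freq / freq.sum() * 100

table = pd.DataFrame({
    'Freq': freq,
    'Percent': percent.round(2),
    'Cum': percent.cumsum().round(2),
})
print(table)
```

The cumulative column always ends at 100 for the last category, which is a quick way to confirm that no rows were dropped.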

### Two-way Cross-tabulation with Statistics

```python
# Two-way table with chi-square and Fisher's exact tests
result = pdtab.tabulate('gender', 'education', data=data, chi2=True, exact=True)
print(result)
```
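
For reference, the counts, row percentages, and expected frequencies behind a two-way table can be reproduced with `pandas.crosstab`. The sketch below is an independent cross-check on the sample data, not pdtab's implementation; the variable names are chosen for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Female'],
    'education': ['High School', 'College', 'Graduate',
                  'High School', 'College', 'Graduate'],
})

# Observed counts, plus a copy with row and column totals
counts = pd.crosstab(data['gender'], data['education'])
observed = pd.crosstab(data['gender'], data['education'],
                       margins=True, margins_name='Total')

# Row percentages: each row sums to 100
row_pct = pd.crosstab(data['gender'], data['education'], normalize='index') * 100

# Expected counts under independence: row total * column total / n
n = counts.values.sum()
expected = pd.DataFrame(
    np.outer(counts.sum(axis=1), counts.sum(axis=0)) / n,
    index=counts.index, columns=counts.columns,
)
print(observed)
print(expected)
```

In this six-row sample every cell holds exactly one observation, so observed and expected counts coincide and the Pearson statistic is zero. The data is useful for checking layout, not for drawing conclusions.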

### Summary Tabulation

```python
# Summary statistics by group
result = pdtab.tabulate('gender', data=data, summarize='income')
print(result)
```

```
Summary of income by gender

gender     Mean     Std. Dev.   Freq
Male     55000.0    20000.0      3
Female   55000.0    21794.5      3
Total    55000.0    18708.3      6
```
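
The group means and standard deviations can be checked against pandas directly. The sketch assumes the sample standard deviation (divisor `n - 1`), which is what Stata reports and what pandas' `std` computes by default; it does not call pdtab.

```python
import pandas as pd

data = pd.DataFrame({
    'gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Female'],
    'income': [35000, 45000, 75000, 40000, 55000, 80000],
})

# Mean, sample standard deviation (ddof=1), and count per group
by_gender = data.groupby('gender')['income'].agg(['mean', 'std', 'count'])

# The same statistics over all rows, for the "Total" line
total = data['income'].agg(['mean', 'std', 'count'])

print(by_gender.round(1))
print(total.round(1))
```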

## Main Functions

### `tabulate(varname1, varname2=None, data=None, **options)`

Main tabulation function supporting:

**One-way options:**
- `missing=True`: Include missing values as a category
- `sort=True`: Sort by frequency (descending)
- `plot=True`: Create bar chart
- `nolabel=True`: Show numeric codes instead of labels
- `generate='prefix'`: Create indicator variables

**Two-way options:**
- `chi2=True`: Pearson's chi-square test
- `exact=True`: Fisher's exact test  
- `lrchi2=True`: Likelihood-ratio chi-square
- `V=True`: Cramér's V
- `gamma=True`: Goodman and Kruskal's gamma
- `taub=True`: Kendall's τb
- `row=True`: Row percentages
- `column=True`: Column percentages
- `cell=True`: Cell percentages
- `expected=True`: Expected frequencies

**Summary options:**
- `summarize='variable'`: Variable to summarize
- `means=False`: Suppress means
- `standard=False`: Suppress standard deviations
- `freq=False`: Suppress frequencies

### `tab1(varlist, data=None, **options)`

Create one-way tables for multiple variables:

```python
results = pdtab.tab1(['gender', 'education'], data=data)
for var, result in results.items():
    print(f"\n{var}:")
    print(result)
```

### `tab2(varlist, data=None, **options)`

Create all possible two-way tables:

```python
# Assumes `data` also contains a 'region' column (the Quick Start sample does not)
results = pdtab.tab2(['gender', 'education', 'region'], data=data, chi2=True)
for (var1, var2), result in results.items():
    print(f"\n{var1} × {var2}:")
    print(result)
```
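
"All possible two-way tables" means every unordered pair of variables, so a list of `k` variables yields `k * (k - 1) / 2` tables. The pairs can be enumerated with the standard library; whether pdtab returns them in exactly this order is an assumption.

```python
from itertools import combinations

varlist = ['gender', 'education', 'region']

# Every unordered pair of variables, in list order
pairs = list(combinations(varlist, 2))
print(pairs)
# [('gender', 'education'), ('gender', 'region'), ('education', 'region')]

# Number of tables grows quadratically with the number of variables
k = len(varlist)
print(len(pairs) == k * (k - 1) // 2)  # True
```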

### `tabi(table_data, **options)`

Immediate tabulation from supplied data:

```python
# From string (Stata format)
result = pdtab.tabi("30 18 \\ 38 14", exact=True)

# From list
result = pdtab.tabi([[30, 18], [38, 14]], chi2=True)
```
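
In the Stata format, a backslash separates rows and whitespace separates cells; in Python source the backslash is written as `"\\"`. The helper below is a hypothetical parser, not part of pdtab. It shows how such a string maps to the list form used in the second call.

```python
def parse_stata_table(text):
    """Turn a Stata-style immediate table into rows of integer counts.

    Hypothetical helper for illustration; pdtab's own parsing may differ.
    """
    rows = [chunk.split() for chunk in text.split('\\')]
    table = [[int(cell) for cell in row] for row in rows if row]
    # Every row must have the same number of cells
    if len({len(row) for row in table}) != 1:
        raise ValueError("rows have unequal lengths")
    return table


print(parse_stata_table("30 18 \\ 38 14"))
# [[30, 18], [38, 14]]
```

Because cells are split on any whitespace, a multi-line triple-quoted string parses the same way as the one-line form.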

## Visualization

Create plots directly from tabulation results:

```python
# Bar chart for one-way table
result = pdtab.tabulate('gender', data=data, plot=True)

# Heatmap for two-way table  
result = pdtab.tabulate('gender', 'education', data=data)
fig = pdtab.viz.create_tabulation_plots(result, plot_type='heatmap')
```

## Statistical Tests

### Supported Tests

1. **Pearson's Chi-square Test**: Tests independence in contingency tables
2. **Likelihood-ratio Chi-square**: Alternative to Pearson's chi-square
3. **Fisher's Exact Test**: Exact test for small samples (especially 2×2 tables)
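
The three tests can be written out by hand for a 2×2 table, which helps when checking results. The sketch below uses only the standard library and the table `30 18 \ 38 14` from the `tabi` example. The function names are illustrative and this is not pdtab's implementation; the p-value helper is valid for 1 degree of freedom only, and the Fisher test sums all tables no more probable than the observed one (the usual two-sided rule).

```python
import math


def expected_counts(table):
    # Expected count per cell: row total * column total / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi2(table):
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))


def lr_chi2(table):
    # Likelihood-ratio statistic: 2 * sum(O * ln(O / E)), skipping empty cells
    exp = expected_counts(table)
    return 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(table, exp)
                   for o, e in zip(obs_row, exp_row) if o > 0)


def chi2_pvalue_df1(stat):
    # Upper tail of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2))


def fisher_exact_2x2(table):
    (a, b), (c, d) = table
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell
        return math.comb(r1, k) * math.comb(r2, c1 - k) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


table = [[30, 18], [38, 14]]
chi2 = pearson_chi2(table)
print(f"Pearson chi2(1) = {chi2:.4f}, p = {chi2_pvalue_df1(chi2):.3f}")
print(f"LR chi2(1)      = {lr_chi2(table):.4f}")
print(f"Fisher's exact  = {fisher_exact_2x2(table):.3f}")
```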

### Association Measures

1. **Cramér's V**: Measure of association (0-1 scale)
2. **Goodman and Kruskal's Gamma**: For ordinal variables (-1 to 1)
3. **Kendall's τb**: Rank correlation with tie correction (-1 to 1)
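
The three measures follow from the cell counts alone. The sketch below computes them for the 2×2 table from the `tabi` example using the textbook formulas. It is illustrative rather than pdtab's code, and it reports the unsigned form of Cramér's V.

```python
import math


def concordant_discordant(table):
    # Count pairs of observations ordered the same way (concordant)
    # or the opposite way (discordant) on both variables
    conc = disc = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        conc += table[i][j] * table[k][l]
                    elif l < j:
                        disc += table[i][j] * table[k][l]
    return conc, disc


def gamma(table):
    conc, disc = concordant_discordant(table)
    return (conc - disc) / (conc + disc)


def kendall_tau_b(table):
    conc, disc = concordant_discordant(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    pairs = n * (n - 1) / 2
    tied_rows = sum(t * (t - 1) / 2 for t in row_totals)
    tied_cols = sum(t * (t - 1) / 2 for t in col_totals)
    return (conc - disc) / math.sqrt((pairs - tied_rows) * (pairs - tied_cols))


def cramers_v(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum((table[i][j] - r * c / n) ** 2 / (r * c / n)
               for i, r in enumerate(row_totals)
               for j, c in enumerate(col_totals))
    return math.sqrt(chi2 / (n * min(len(row_totals) - 1, len(col_totals) - 1)))


table = [[30, 18], [38, 14]]
print(f"gamma     = {gamma(table):.4f}")
print(f"tau-b     = {kendall_tau_b(table):.4f}")
print(f"Cramer's V = {cramers_v(table):.4f}")
```

For a 2×2 table, τb and the unsigned Cramér's V have the same magnitude, which makes a convenient consistency check.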

## Weighted Analysis

Support for different weight types:

```python
# Frequency weights (assumes `data` has a 'freq_weight' column)
result = pdtab.tabulate('gender', data=data, weights='freq_weight')

# Analytic weights (assumes `data` has an 'analytic_weight' column)
result = pdtab.tabulate('gender', data=data, weights='analytic_weight')
```
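
A frequency weight says how many identical observations a row stands for, so a weighted frequency table equals the unweighted table of the expanded data. The sketch below shows that equivalence in plain pandas; the small DataFrame and its `freq_weight` column are made up for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'gender': ['Male', 'Female', 'Male'],
    'freq_weight': [2, 3, 1],
})

# Weighted frequencies: sum the weights within each category
weighted = data.groupby('gender')['freq_weight'].sum()

# Same result by repeating each row `freq_weight` times and counting
expanded = (data['gender']
            .repeat(data['freq_weight'])
            .value_counts()
            .sort_index())

print(weighted)
print(expanded)
```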

## Missing Value Handling

Flexible options for missing data:

```python
# Exclude missing values (default)
result = pdtab.tabulate('gender', data=data)

# Include missing as category
result = pdtab.tabulate('gender', data=data, missing=True)

# Subpopulation analysis
result = pdtab.tabulate('gender', data=data, subpop='analysis_sample')
```
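
The difference between the first two calls mirrors pandas' `dropna` switch. The sketch below uses `value_counts` on a made-up series with two missing entries; it illustrates the behaviour and does not call pdtab.

```python
import pandas as pd

gender = pd.Series(['Male', 'Female', None, 'Male', None, 'Female'])

# Default: rows with a missing value are left out of the table
excluded = gender.value_counts()

# dropna=False: missing values are counted as their own category
included = gender.value_counts(dropna=False)

print(excluded.sum(), included.sum())
print(included)
```

Note that percentages change as well as counts, because including missing values enlarges the denominator.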

## Export Options

Export results in multiple formats:

```python
result = pdtab.tabulate('gender', 'education', data=data)

# Export to dictionary
data_dict = result.to_dict()

# Export to HTML
html_table = result.to_html()

# Save plot
fig = pdtab.viz.create_tabulation_plots(result)
pdtab.viz.save_plot(fig, 'crosstab.png')
```

## Advanced Examples

### Complex Two-way Analysis

```python
# Comprehensive two-way analysis
# (clinical_data: a DataFrame with 'treatment' and 'outcome' columns)
result = pdtab.tabulate(
    'treatment', 'outcome',
    data=clinical_data,
    chi2=True,           # Chi-square test
    exact=True,          # Fisher's exact test
    V=True,              # Cramér's V
    row=True,            # Row percentages
    expected=True,       # Expected frequencies
    missing=True         # Include missing values
)

print(result)
print(f"Chi-square: {result.statistics['chi2']['statistic']:.3f}")
print(f"p-value: {result.statistics['chi2']['p_value']:.3f}")
print(f"Cramér's V: {result.statistics['cramers_v']:.3f}")
```

### Summary Analysis by Multiple Groups

```python
# Income analysis by gender and education
result = pdtab.tabulate(
    'gender', 'education',
    data=data,
    summarize='income',
    means=True,
    standard=True,
    obs=True
)
```

### Immediate Analysis of Published Data

```python
# Analyze a 2×3 contingency table from literature
published_data = """
    45 55 60 \\
    30 40 35
"""

result = pdtab.tabi(published_data, chi2=True, exact=True, V=True)
print("Published data analysis:")
print(result)
```

## Stata Comparison

pdtab aims for 100% compatibility with Stata's tabulate command:

| Stata Command | pdtab Equivalent |
|---------------|------------------|
| `tabulate gender` | `pdtab.tabulate('gender', data=df)` |
| `tabulate gender education, chi2` | `pdtab.tabulate('gender', 'education', data=df, chi2=True)` |
| `tabulate gender, summarize(income)` | `pdtab.tabulate('gender', data=df, summarize='income')` |
| `tab1 gender education region` | `pdtab.tab1(['gender', 'education', 'region'], data=df)` |
| `tab2 gender education region` | `pdtab.tab2(['gender', 'education', 'region'], data=df)` |
| `tabi 30 18 \ 38 14, exact` | `pdtab.tabi("30 18 \\ 38 14", exact=True)` |

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

### Development Setup

```bash
git clone https://github.com/brycewang-stanford/pdtab.git
cd pdtab
pip install -e ".[dev]"
pytest
```

## 📄 License

MIT License - see [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- **StataCorp** for the original tabulate command design
- **Pandas Development Team** for the excellent data manipulation library
- **SciPy Community** for statistical computing tools

## Related Projects

**pdtab** is part of a broader ecosystem of econometric and statistical tools:

- **[PyStataR](https://github.com/brycewang-stanford/PyStataR)** - Comprehensive Python package bridging Stata and R functionality (pdtab will be integrated into this project)
- **[StasPAI](https://github.com/brycewang-stanford/StasPAI)** - AI-powered econometric analysis toolkit combining statistical methods with machine learning

## Support

- **Documentation**: [https://pdtab.readthedocs.io](https://pdtab.readthedocs.io)
- **Issues**: [GitHub Issues](https://github.com/brycewang-stanford/pdtab/issues)
- **Discussions**: [GitHub Discussions](https://github.com/brycewang-stanford/pdtab/discussions)

---

**pdtab** - Bringing Stata's tabulation power to the Python ecosystem! 🐍

Raw data

{
    "_id": null,
    "home_page": "https://github.com/pdtab/pdtab",
    "name": "pdtab",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "tabulation, crosstab, statistics, stata, pandas, data analysis",
    "author": "pdtab Development Team",
    "author_email": "Bryce Wang <bryce6m@stanford.edu>",
    "download_url": "https://files.pythonhosted.org/packages/34/63/4c779b3cb184c6762335484150833250c12fe10deeb4c0ae7d0644faf767/pdtab-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A pandas-based library that replicates Stata's tabulate functionality",
    "version": "0.1.1",
    "project_urls": {
        "Bug Reports": "https://github.com/brycewang-stanford/pdtab/issues",
        "Documentation": "https://pdtab.readthedocs.io/",
        "Homepage": "https://github.com/brycewang-stanford/pdtab",
        "Source": "https://github.com/brycewang-stanford/pdtab"
    },
    "split_keywords": [
        "tabulation",
        " crosstab",
        " statistics",
        " stata",
        " pandas",
        " data analysis"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4cab2808c556328865494a3ec428201897f04bc298d67a9330d53d98056e30f2",
                "md5": "096045d1711483c11ebc7bb7093c3454",
                "sha256": "d6722e1ac72f161272fcc2fcc2dabb4a800c929c79f247ca05db453046ce453d"
            },
            "downloads": -1,
            "filename": "pdtab-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "096045d1711483c11ebc7bb7093c3454",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 31004,
            "upload_time": "2025-08-01T20:15:00",
            "upload_time_iso_8601": "2025-08-01T20:15:00.410291Z",
            "url": "https://files.pythonhosted.org/packages/4c/ab/2808c556328865494a3ec428201897f04bc298d67a9330d53d98056e30f2/pdtab-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "34634c779b3cb184c6762335484150833250c12fe10deeb4c0ae7d0644faf767",
                "md5": "3a9d6f7cb6727462a04bce4ffb8c3f1d",
                "sha256": "0d0e37d549702b24250d636d80d6ac92b1bde4e97bbf2c46b18d039e4880a961"
            },
            "downloads": -1,
            "filename": "pdtab-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "3a9d6f7cb6727462a04bce4ffb8c3f1d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 40887,
            "upload_time": "2025-08-01T20:15:01",
            "upload_time_iso_8601": "2025-08-01T20:15:01.541238Z",
            "url": "https://files.pythonhosted.org/packages/34/63/4c779b3cb184c6762335484150833250c12fe10deeb4c0ae7d0644faf767/pdtab-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-01 20:15:01",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "pdtab",
    "github_project": "pdtab",
    "github_not_found": true,
    "lcname": "pdtab"
}
