# pandascompare

- Name: pandascompare
- Version: 0.1.3
- Home page: https://github.com/zteinck/pandascompare
- Summary: Suite of pandas utilities including a DataFrame comparison report builder.
- Upload time: 2024-08-12 05:52:55
- Author: Zachary Einck
- Requires Python: `<4.0,>=3.8`
- License: MIT

<div>

[![Package version](https://img.shields.io/pypi/v/pandascompare?color=%2334D058&label=pypi)](https://pypi.org/project/pandascompare/)
[![License](https://img.shields.io/github/license/zteinck/pandascompare)](https://github.com/zteinck/pandascompare/blob/master/LICENSE)

</div>

`pandascompare` is a Python package designed to compare `DataFrame` objects, enabling you to quickly identify the differences between two datasets. 

## Installation
```sh
pip install pandascompare
```

## Main Features
The `PandasCompare` class compares any two `DataFrame` objects along the following dimensions:
- `Rows` ➔ discrepancies with respect to the join key(s) specified via the `join_on` argument.
- `Columns` ➔ name differences or missing columns.
- `Values` ➔ data that differs in terms of value or type.
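
To make the three dimensions concrete, here is a minimal sketch in plain pandas. It is not the package's implementation: the helper `summarize_differences` and the layout of its return value are hypothetical, introduced only for illustration, and it assumes a single join key with unique values on each side.

```python
import pandas as pd


def summarize_differences(left, right, join_on):
    """Hypothetical helper (not part of pandascompare) showing the three dimensions."""
    # Columns: names that exist on one side only.
    left_only_cols = sorted(set(left.columns) - set(right.columns))
    right_only_cols = sorted(set(right.columns) - set(left.columns))

    # Rows: join keys that exist on one side only, found with an outer merge.
    keys = left[[join_on]].merge(
        right[[join_on]], on=join_on, how='outer', indicator=True
    )
    left_only_keys = keys.loc[keys['_merge'] == 'left_only', join_on].tolist()
    right_only_keys = keys.loc[keys['_merge'] == 'right_only', join_on].tolist()

    # Values: for matched keys and shared columns, collect keys whose cells differ.
    shared = [c for c in left.columns if c in right.columns and c != join_on]
    matched = left.merge(right, on=join_on, suffixes=('_left', '_right'))
    value_diffs = {}
    for col in shared:
        l, r = matched[f'{col}_left'], matched[f'{col}_right']
        # Treat a missing value on both sides as "no difference".
        differs = ~((l == r) | (l.isna() & r.isna()))
        value_diffs[col] = matched.loc[differs, join_on].tolist()

    return {
        'columns': {'left_only': left_only_cols, 'right_only': right_only_cols},
        'rows': {'left_only': left_only_keys, 'right_only': right_only_keys},
        'values': value_diffs,
    }


left = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
right = pd.DataFrame({
    'id': [1, 2, 9],
    'amount': [10.5, 6.0, 14.0],
    'last_name': ['Jones', 'Smith', 'Einck'],
})

result = summarize_differences(left, right, join_on='id')
print(result)
```

Here `id` 3 exists only on the left, `id` 9 only on the right, `last_name` only on the right, and `amount` differs for `id` 2. This sketch only checks values, whereas `PandasCompare` also reports type differences.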


## Example Usage
Please refer to the documentation within the code for more information.

### Imports
```python
from pandascompare import PandasCompare
```

### Create DataFrames
First, let's create two sample `DataFrame` objects to compare.
```python
import pandas as pd
import numpy as np

# February Data
left_df = pd.DataFrame({
    'id': [1, 2, 3],
    'date': [pd.to_datetime('2024-02-29')] * 3,
    'first_name': ['Alice', 'Mike', 'John'],
    'amount': [10.5, 5.3, 33.77],
    })

# January Data
right_df = pd.DataFrame({
    'id': [1, 2, 9],
    'date': [pd.to_datetime('2024-01-31')] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'amount': [11.1, np.nan, 14],
    })
```

### Compare DataFrames
Next, we will initialize a `PandasCompare` instance to perform the comparison. Please consult the in-code documentation for a comprehensive list of available arguments.
```python
pc = PandasCompare(
    left=left_df,
    right=right_df,
    left_label='feb',
    right_label='jan',
    join_on='id',
    left_ref=['first_name'],
    include_delta=True,
    verbose=True,
    )
```
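
The `include_delta` argument suggests that the report includes numeric differences between the two sides; the exact output is defined by the package, so check the in-code documentation. As a rough illustration only, the sketch below computes such a delta in plain pandas, assuming it means the left value minus the right value for rows matched on the join key. The column name `amount_delta` is hypothetical.

```python
import numpy as np
import pandas as pd

# Trimmed-down versions of the sample frames above.
feb = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
jan = pd.DataFrame({'id': [1, 2, 9], 'amount': [11.1, np.nan, 14]})

# Keep only ids present on both sides (inner join), then subtract.
matched = feb.merge(jan, on='id', suffixes=('_feb', '_jan'))
matched['amount_delta'] = matched['amount_feb'] - matched['amount_jan']  # assumed: left minus right

print(matched)
```

Only `id` 1 and `id` 2 survive the inner join. The delta for `id` 1 is roughly -0.6, and the delta for `id` 2 is `NaN` because the January amount is missing.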

### Export to Excel
Finally, let's export the comparison report to an Excel file to view the results.
```python
pc.export_to_excel()
```

            
