# pandascompare
<div>

[PyPI](https://pypi.org/project/pandascompare/)
[License](https://github.com/zteinck/pandascompare/blob/master/LICENSE)

</div>
`pandascompare` is a Python package designed to compare `DataFrame` objects, enabling you to quickly identify the differences between two datasets.
## Installation
```sh
pip install pandascompare
```
## Main Features
The `PandasCompare` class compares any two `DataFrame` objects along the following dimensions:
- `Rows` ➔ discrepancies with respect to the join key(s) specified via the `join_on` argument.
- `Columns` ➔ name differences or missing columns.
- `Values` ➔ data that differs in terms of value or type.
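
To make these three dimensions concrete, here is a minimal plain-pandas sketch. It is not how `PandasCompare` is implemented and does not use the package at all; `sketch_compare` and the keys of its return value are hypothetical names chosen for illustration, and the sketch assumes a single join key with unique values on each side.

```python
import pandas as pd


def sketch_compare(left, right, join_on):
    """Illustrative only: report row, column, and value differences."""
    # Columns: names that appear on one side only.
    left_only_cols = [c for c in left.columns if c not in right.columns]
    right_only_cols = [c for c in right.columns if c not in left.columns]

    # Rows: an outer merge on the key with indicator=True labels each key
    # as 'left_only', 'right_only', or 'both'.
    keys = left[[join_on]].merge(
        right[[join_on]], on=join_on, how='outer', indicator=True
    )
    left_only_keys = keys.loc[keys['_merge'] == 'left_only', join_on].tolist()
    right_only_keys = keys.loc[keys['_merge'] == 'right_only', join_on].tolist()

    # Values: for keys present on both sides, compare each shared column.
    shared_cols = [c for c in left.columns if c in right.columns and c != join_on]
    both = left.merge(right, on=join_on, how='inner', suffixes=('_left', '_right'))
    value_diffs = []
    for col in shared_cols:
        lhs, rhs = both[f'{col}_left'], both[f'{col}_right']
        # Treat two missing values as equal, since NaN != NaN under plain ==.
        differs = ~((lhs == rhs) | (lhs.isna() & rhs.isna()))
        value_diffs += [(key, col) for key in both.loc[differs, join_on].tolist()]

    return {
        'left_only_columns': left_only_cols,
        'right_only_columns': right_only_cols,
        'left_only_keys': left_only_keys,
        'right_only_keys': right_only_keys,
        'value_diffs': value_diffs,
    }


# Small made-up inputs, unrelated to the package's own examples.
left = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
right = pd.DataFrame({
    'id': [1, 2, 9],
    'amount': [10.5, 7.0, 14.0],
    'note': ['a', 'b', 'c'],
})
result = sketch_compare(left, right, join_on='id')
print(result)
```

The sketch only locates differences. Labeling, reference columns, deltas, and the Excel report are what the package adds on top.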
## Example Usage
The example below walks through a basic comparison from start to finish. Please refer to the documentation within the code for more information.
### Imports
```python
from pandascompare import PandasCompare
```
### Create DataFrames
First, let's create two sample `DataFrame` objects to compare.
```python
import pandas as pd
import numpy as np

# February Data
left_df = pd.DataFrame({
    'id': [1, 2, 3],
    'date': [pd.to_datetime('2024-02-29')] * 3,
    'first_name': ['Alice', 'Mike', 'John'],
    'amount': [10.5, 5.3, 33.77],
})

# January Data
right_df = pd.DataFrame({
    'id': [1, 2, 9],
    'date': [pd.to_datetime('2024-01-31')] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'amount': [11.1, np.nan, 14],
})
```
### Compare DataFrames
Next, let's initialize a `PandasCompare` instance to perform the comparison. The in-code documentation provides a comprehensive list of the available arguments.
```python
pc = PandasCompare(
    left=left_df,
    right=right_df,
    left_label='feb',
    right_label='jan',
    join_on='id',
    left_ref=['first_name'],
    include_delta=True,
    verbose=True,
)
```
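
The exact behavior of `include_delta` is defined in the package's in-code documentation. One plausible reading, stated here purely as an assumption, is a numeric left-minus-right difference for matched rows. The sketch below computes such a delta for the `amount` column with plain pandas; it does not call `PandasCompare`, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# The 'id' and 'amount' columns from the sample data above.
feb = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
jan = pd.DataFrame({'id': [1, 2, 9], 'amount': [11.1, np.nan, 14]})

# Keep only ids present in both months, then subtract right from left.
merged = feb.merge(jan, on='id', how='inner', suffixes=('_feb', '_jan'))
merged['delta'] = merged['amount_feb'] - merged['amount_jan']

# Index by id for easy lookup; a missing value on either side yields NaN.
delta_by_id = merged.set_index('id')['delta']
print(delta_by_id)
```

Only ids 1 and 2 appear in both frames, so ids 3 and 9 have no delta. The January amount for id 2 is missing, so its delta is `NaN` rather than a number.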
### Export to Excel
Finally, let's export the comparison report to an Excel file to view the results.
```python
pc.export_to_excel()
```
## Package Metadata
- Version: 0.1.3 (uploaded 2024-08-12)
- Summary: Suite of pandas utilities including a DataFrame comparison report builder.
- Author: Zachary Einck
- License: MIT
- Requires Python: >=3.8, <4.0
- Homepage: https://github.com/zteinck/pandascompare