comparesv

Name	comparesv
Version	0.17
home_page	None
Summary	CSV Comparison on steroids
upload_time	2025-08-27 15:42:41
maintainer	None
docs_url	None
author	None
requires_python	>=3.8
license	None
keywords	analysis csv compare comparison data
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# comparesv
### Python CSV Comparison on steroids

## Installation

```console
pip install comparesv
```

## Usage

```console
comparesv [-h] [-v] [--enc1 ENCODING] [--enc2 ENCODING] [-i]
              [-rm ROW_MATCH] [-cm COLUMN_MATCH] [-sm STRING_MATCH] [-ir]
              [-ic] [-is] [-s]
              [FILE1] [FILE2]

CSV files comparison

positional arguments:
  FILE1                 the first CSV file
  FILE2                 the second CSV file

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --enc1 ENCODING       encoding of the first file (default is to autodetect)
  --enc2 ENCODING       encoding of the second file (default is to autodetect)
  -i, --ignore-case     ignore case (default is case-sensitive)
  -rm ROW_MATCH, --row-match ROW_MATCH
                        Logic to be used to identify the rows. Possible
                        options 'order', 'fuzzy', 'deep' (default is order)
  -cm COLUMN_MATCH, --column-match COLUMN_MATCH
                        Logic to be used to identify the columns. Possible
                        options 'exact','fuzzy' (default is exact)
  -sm STRING_MATCH, --string-match STRING_MATCH
                        Logic to be used to identify the columns. Possible
                        options 'exact','fuzzy' (default is exact)
  -ir, --include-addnl-rows
                        Include added additional added rows from second file
                        (default is false)
  -ic, --include-addnl-columns
                        Include added additional columns from second file
                        (default is false)
  -is, --include-stats  Include stats (default is false)
  -s, --save-output     Save output to file
```

## Examples

### Scenario 1: Simple direct comparison

|id |first   |last    |age|
|---|--------|--------|---|
|432|Roy     |Aguilar |46 |
|914|Janie   |Bowman  |24 |
|021|Grace   |Copeland|53 |
|708|Louise  |Franklin|25 |
|850|Gertrude|Carr    |60 |

vs

|id |first   |last    |age|
|---|--------|--------|---|
|432|Roy     |Aguilar |46 |
|914|Janie   |Bowman  |24 |
|021|Grace   |Copeland|53 |
|708|Louise  |Franklin|25 |
|850|Gertrude|Carr    |60 |

```console
comparesv file1.csv file2.csv
```

Will provide:

|S.No|id      |first   |last|age |
|----|--------|--------|----|----|
|1   |True    |True    |True|True|
|2   |True    |True    |True|True|
|3   |True    |True    |True|True|
|4   |True    |True    |True|True|
|5   |True    |True    |True|True|

and the corresponding values comparison:

|S.No|id      |first   |last|age |
|----|--------|--------|----|----|
|1   |[432]:[432]|[Roy]:[Roy]|[Aguilar]:[Aguilar]|[46]:[46]|
|2   |[914]:[914]|[Janie]:[Janie]|[Bowman]:[Bowman]|[24]:[24]|
|3   |[021]:[021]|[Grace]:[Grace]|[Copeland]:[Copeland]|[53]:[53]|
|4   |[708]:[708]|[Louise]:[Louise]|[Franklin]:[Franklin]|[25]:[25]|
|5   |[850]:[850]|[Gertrude]:[Gertrude]|[Carr]:[Carr]|[60]:[60]|
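
As a rough illustration of the two outputs above, the following Python sketch compares two CSV texts row by row and builds both a True/False result and a `[value1]:[value2]` view. It is not comparesv's implementation; the function name `compare_by_position` and the sample data (with one surname changed on purpose) are assumptions made for this example.

```python
import csv
import io

# Hypothetical sample data; the second file has one altered surname.
FILE1 = """id,first,last,age
432,Roy,Aguilar,46
914,Janie,Bowman,24
"""
FILE2 = """id,first,last,age
432,Roy,Aguilar,46
914,Janie,Bowmen,24
"""


def compare_by_position(text1, text2):
    """Compare rows pairwise by position; return (results, values)."""
    rows1 = list(csv.DictReader(io.StringIO(text1)))
    rows2 = list(csv.DictReader(io.StringIO(text2)))
    results, values = [], []
    for r1, r2 in zip(rows1, rows2):
        # One boolean per column of the first (source) file.
        results.append({col: r1[col] == r2.get(col) for col in r1})
        # Side-by-side view in the "[file1]:[file2]" style shown above.
        values.append({col: f"[{r1[col]}]:[{r2.get(col)}]" for col in r1})
    return results, values


results, values = compare_by_position(FILE1, FILE2)
print(results[1])          # 'last' is False for the second row
print(values[1]["last"])   # [Bowman]:[Bowmen]
```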

---
### Scenario 2: Fuzzy column names

|id |first   |last    |age of student|
|---|--------|--------|--------------|
|432|Roy     |Aguilar |46            |
|914|Janie   |Bowman  |24            |

and 

|id |first   |last    |age|
|---|--------|--------|---|
|432|Roy     |Aguilar |46 |
|914|Janie   |Bowman  |24 |

```console
comparesv file1.csv file2.csv --column-match 'fuzzy'
```

will provide

|S.No|id      |first   |last|age |
|----|--------|--------|----|----|
|1   |True    |True    |True|True|
|2   |True    |True    |True|True|

---
### Scenario 3: Fuzzy row order - Differently ordered textual data

|id |first   |last    |age|
|---|--------|--------|---|
|432|Roy     |Aguilar |46 |
|914|Janie   |Bowman  |24 |
|021|Grace   |Copeland|53 |

and

|id |first   |last    |age of student|
|---|--------|--------|--------------|
|021|Grace   |Copeland|53            |
|432|Roy     |Aguilar |46            |
|914|Janie   |Bowman  |24            |

```console
comparesv file1.csv file2.csv --column-match 'fuzzy' --row-match 'fuzzy'
```
will provide

|S.No|id      |first   |last|age |
|----|--------|--------|----|----|
|1   |True    |True    |True|True|
|2   |True    |True    |True|True|
|3   |True    |True    |True|True|

---
### Scenario 4: Deep row order - Differently ordered numerical data

|year1|year2   |year3   |year|
|-----|--------|--------|----|
|751  |609     |590     |930 |
|417  |501     |441     |763 |
|691  |621     |941     |563 |
|179  |781     |335     |225 |
|961  |530     |433     |571 |

and

|year1|year2   |year3   |year|
|-----|--------|--------|----|
|961  |530     |433     |571 |
|751  |609     |590     |930 |
|691  |621     |941     |563 |
|179  |781     |335     |225 |
|417  |501     |441     |763 |

```console
comparesv file1.csv file2.csv --row-match 'deep'
```

|S.No|year1   |year2   |year3|year|
|----|--------|--------|-----|----|
|1   |True    |True    |True |True|
|2   |True    |True    |True |True|
|3   |True    |True    |True |True|
|4   |True    |True    |True |True|
|5   |True    |True    |True |True|

---
### Scenario n: Unlimited options. Please explore the options below
---
## Description

The first file is treated as the source file and is compared against the second file. Use the options below to fine-tune how the comparison works.

### Row Match (-rm)

This defines how rows in the two files are paired for comparison.

`order` - This is the default option. Rows are compared by their position in the files. Use it when the records in both files are in the same order.

`fuzzy` - This uses fuzzy logic to identify the matching row in the second file. Use it when the records are not in the same order and most of the data is **text**.

`deep` - This uses fuzzy logic to identify the matching row in the second file. Use it when the records are not in the same order and the data is **numeric**. Each row in file1 is checked against all the rows in file2 to find a potential match.
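
To make the idea of matching out-of-order rows concrete, here is a minimal Python sketch that scores every row of the first file against the unused rows of the second and pairs it with the most similar one. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; the algorithm comparesv actually uses is not described here, and the helper names `row_similarity` and `match_rows` are hypothetical.

```python
from difflib import SequenceMatcher


def row_similarity(row_a, row_b):
    """Similarity in [0, 1] between two rows, compared as joined text."""
    return SequenceMatcher(None, " ".join(row_a), " ".join(row_b)).ratio()


def match_rows(rows1, rows2):
    """Greedily pair each row of rows1 with its most similar unused row of rows2.

    Returns a list of (index_in_rows1, index_in_rows2) tuples.
    """
    unused = set(range(len(rows2)))
    pairs = []
    for i, row in enumerate(rows1):
        if not unused:
            break  # the second file has run out of rows
        best = max(unused, key=lambda j: row_similarity(row, rows2[j]))
        pairs.append((i, best))
        unused.remove(best)
    return pairs


rows1 = [
    ["432", "Roy", "Aguilar", "46"],
    ["914", "Janie", "Bowman", "24"],
    ["021", "Grace", "Copeland", "53"],
]
rows2 = [
    ["021", "Grace", "Copeland", "53"],
    ["432", "Roy", "Aguilar", "46"],
    ["914", "Janie", "Bowman", "24"],
]
print(match_rows(rows1, rows2))  # [(0, 1), (1, 2), (2, 0)]
```

A greedy pairing like this is simple but checks each source row against every remaining candidate, so its cost grows with the product of the two row counts.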

### Column Match (-cm)

This defines how columns in the two files are paired for comparison.

`exact` - This is the default option. Columns are paired only when their headers match exactly, e.g. the 'Age' and 'Age' columns across the files will be selected for comparison.

`fuzzy` - This uses fuzzy logic to identify the matching column in the second file. Use it when the column headers are not exactly the same but are close, e.g. the 'age' and 'age of student' columns may be selected for comparison.
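
The sketch below shows one way fuzzy header pairing could work, again using `difflib.SequenceMatcher` from the Python standard library. It is an illustration only: the `match_columns` function, the `threshold` parameter, and its value of 0.3 are assumptions for this example and do not reflect comparesv's internals.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity in [0, 1] between two header names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_columns(headers1, headers2, threshold=0.3):
    """Map each header of file 1 to its closest unused header of file 2.

    Headers whose best candidate scores below `threshold` are left unmapped.
    """
    available = list(headers2)
    mapping = {}
    for h1 in headers1:
        if not available:
            break
        best = max(available, key=lambda h2: similarity(h1, h2))
        if similarity(h1, best) >= threshold:
            mapping[h1] = best
            available.remove(best)
    return mapping


headers1 = ["id", "first", "last", "age of student"]
headers2 = ["id", "first", "last", "age"]
print(match_columns(headers1, headers2))
```

With these headers, 'age of student' is paired with 'age' because it is the only remaining candidate and shares enough characters to clear the assumed threshold.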

### String Match (-sm)

This defines how textual data is compared.

`exact` - This is the default option. Values match only when their text is exactly the same.

`fuzzy` - This uses fuzzy logic to decide whether two texts are close enough to count as a match.
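
The difference between the two modes can be sketched in a few lines of Python. This is a hedged illustration built on `difflib.SequenceMatcher`; the function `strings_match`, its `mode` and `ignore_case` parameters, and the 0.8 threshold are assumptions for the example, not comparesv's actual code or defaults.

```python
from difflib import SequenceMatcher


def strings_match(a, b, mode="exact", ignore_case=False, threshold=0.8):
    """Return True if two cell values count as a match under the given mode."""
    if ignore_case:
        a, b = a.lower(), b.lower()
    if mode == "exact":
        return a == b
    # Fuzzy mode: accept values whose similarity ratio clears the threshold.
    return SequenceMatcher(None, a, b).ratio() >= threshold


print(strings_match("Bowman", "Bowmen"))                  # False
print(strings_match("Bowman", "Bowmen", mode="fuzzy"))    # True
print(strings_match("ROY", "roy", ignore_case=True))      # True
```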

### Include Additional Rows (-ir)

If the second file contains more rows than the first file, this option will enable the comparison output to include the remaining rows (uncompared ones).

### Include Additional Columns (-ic)

If the second file contains more columns than the first file, this option will enable the comparison output to include the remaining columns.
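
As a small illustration of what "remaining" rows and columns mean under position-based matching, the Python sketch below lists the headers that exist only in the second file and the rows left over once the first file is exhausted. The helper names are hypothetical and the logic is an assumption for this example, not comparesv's implementation.

```python
def additional_columns(headers1, headers2):
    """Headers that appear only in the second file, in file order."""
    seen = set(headers1)
    return [h for h in headers2 if h not in seen]


def additional_rows(rows1, rows2):
    """Rows of the second file left over after position-based pairing."""
    return rows2[len(rows1):]


headers1 = ["id", "first", "last"]
headers2 = ["id", "first", "last", "age"]
rows1 = [["432", "Roy", "Aguilar"]]
rows2 = [["432", "Roy", "Aguilar", "46"], ["914", "Janie", "Bowman", "24"]]

print(additional_columns(headers1, headers2))  # ['age']
print(additional_rows(rows1, rows2))           # [['914', 'Janie', 'Bowman', '24']]
```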

### Ignore case (-i)

This option ignores case when comparing strings.

### Include Stats (-is)

This option is enabled by default and prints the comparison stats (as percentages) to the console.
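
Which statistics the tool reports is not spelled out here. As one plausible reading, the Python sketch below computes the percentage of matching cells per column from a list of True/False result rows; the function name `match_percentages` and the per-column breakdown are assumptions made for illustration.

```python
def match_percentages(results):
    """Percentage of matching cells per column.

    `results` is a list of dicts mapping column name -> bool,
    one dict per compared row.
    """
    if not results:
        return {}
    stats = {}
    for col in results[0]:
        matched = sum(1 for row in results if row[col])
        stats[col] = 100.0 * matched / len(results)
    return stats


results = [
    {"id": True, "last": True},
    {"id": True, "last": False},
    {"id": True, "last": True},
    {"id": True, "last": False},
]
print(match_percentages(results))  # {'id': 100.0, 'last': 50.0}
```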

### Save Output (-s)

This option saves the result and values comparisons to the current directory. This is enabled by default.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "comparesv",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "Analysis, CSV, Compare, Comparison, Data",
    "author": null,
    "author_email": "Kishore Kumar <ukisho@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/b0/c3/e83154810da3003cb9d33a3fbbe9cc6bb1476a7bc4b9b9cdf48e52fbd075/comparesv-0.17.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "CSV Comparison on steroids",
    "version": "0.17",
    "project_urls": {
        "Homepage": "https://github.com/kishorek/comparesv",
        "Issues": "https://github.com/kishorek/comparesv/issues",
        "Repository": "https://github.com/kishorek/comparesv"
    },
    "split_keywords": [
        "analysis",
        " csv",
        " compare",
        " comparison",
        " data"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5f6c09be40db958530d369764d0cd52f834138dd2ea07c52913d9cffe20d4fb9",
                "md5": "32c29aa92b70f4fcc60a10ddf63f64f4",
                "sha256": "30f74aacac0ce9130373c7fb82f3d9a1f587d16fa9ee604c11f18d4c38734a9b"
            },
            "downloads": -1,
            "filename": "comparesv-0.17-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "32c29aa92b70f4fcc60a10ddf63f64f4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 63435,
            "upload_time": "2025-08-27T15:42:40",
            "upload_time_iso_8601": "2025-08-27T15:42:40.011520Z",
            "url": "https://files.pythonhosted.org/packages/5f/6c/09be40db958530d369764d0cd52f834138dd2ea07c52913d9cffe20d4fb9/comparesv-0.17-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b0c3e83154810da3003cb9d33a3fbbe9cc6bb1476a7bc4b9b9cdf48e52fbd075",
                "md5": "79d97cf81665dad3332abebfd5c09fef",
                "sha256": "9916f9822f7c461997b1284292fd704ff0454999fdd0468c304ec08805638a52"
            },
            "downloads": -1,
            "filename": "comparesv-0.17.tar.gz",
            "has_sig": false,
            "md5_digest": "79d97cf81665dad3332abebfd5c09fef",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 56056,
            "upload_time": "2025-08-27T15:42:41",
            "upload_time_iso_8601": "2025-08-27T15:42:41.569794Z",
            "url": "https://files.pythonhosted.org/packages/b0/c3/e83154810da3003cb9d33a3fbbe9cc6bb1476a7bc4b9b9cdf48e52fbd075/comparesv-0.17.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-27 15:42:41",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kishorek",
    "github_project": "comparesv",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "comparesv"
}
