# CASParser
[![code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![GitHub](https://img.shields.io/github/license/codereverser/casparser)](https://github.com/codereverser/casparser/blob/main/LICENSE)
![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/codereverser/casparser/run-pytest.yml?branch=main)
[![codecov](https://codecov.io/gh/codereverser/casparser/branch/main/graph/badge.svg?token=DYZ7TXWRGI)](https://codecov.io/gh/codereverser/casparser)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/casparser)
Parse Consolidated Account Statement (CAS) PDF files generated from CAMS/KFINTECH.

`casparser` also includes a command-line tool with the following analysis tools:
- `summary` - print portfolio summary
- (**BETA**) `gains` - print capital gains report (summary and detailed)
  - with an option to generate csv files for ITR in schedule 112A format
## Installation
```bash
pip install -U casparser
```
### With faster PyMuPDF parser
```bash
pip install -U 'casparser[fast]'
```
**Note:** Enabling this dependency could result in licensing changes. Check the
[License](#license) section for more details.
## Usage
```python
import casparser
data = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password")
# Get data in json format
json_str = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password", output="json")
# Get transactions data in csv string format
csv_str = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password", output="csv")
```
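The `output="json"` form returns a string, so it can be loaded back with the standard `json` module. The following is a minimal sketch rather than part of `casparser` itself: `sample_json` is a hypothetical stand-in for the returned string, trimmed to a few keys from the structure documented under "Data structure", and `list_schemes` is an illustrative helper name.

```python
import json

# Hypothetical stand-in for the string returned by
# casparser.read_cas_pdf(..., output="json"); only a few documented keys are shown.
sample_json = """
{
  "cas_type": "DETAILED",
  "folios": [
    {
      "folio": "123/45",
      "amc": "Example AMC",
      "schemes": [
        {"scheme": "Example Fund - Growth", "close": 10.5}
      ]
    }
  ]
}
"""


def list_schemes(json_str):
    """Return (folio, scheme name) pairs from a CAS JSON string."""
    data = json.loads(json_str)
    return [
        (folio["folio"], scheme["scheme"])
        for folio in data.get("folios", [])
        for scheme in folio.get("schemes", [])
    ]


schemes = list_schemes(sample_json)
print(schemes)
```

With a real statement, `sample_json` would be replaced by the `json_str` value from the snippet above.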
### Data structure
```json
{
    "statement_period": {
        "from": "YYYY-MMM-DD",
        "to": "YYYY-MMM-DD"
    },
    "file_type": "CAMS/KARVY/UNKNOWN",
    "cas_type": "DETAILED/SUMMARY",
    "investor_info": {
        "email": "string",
        "name": "string",
        "mobile": "string",
        "address": "string"
    },
    "folios": [
        {
            "folio": "string",
            "amc": "string",
            "PAN": "string",
            "KYC": "OK/NOT OK",
            "PANKYC": "OK/NOT OK",
            "schemes": [
                {
                    "scheme": "string",
                    "isin": "string",
                    "amfi": "string",
                    "advisor": "string",
                    "rta_code": "string",
                    "rta": "string",
                    "open": "number",
                    "close": "number",
                    "close_calculated": "number",
                    "valuation": {
                        "date": "date",
                        "nav": "number",
                        "value": "number"
                    },
                    "transactions": [
                        {
                            "date": "YYYY-MM-DD",
                            "description": "string",
                            "amount": "number",
                            "units": "number",
                            "nav": "number",
                            "balance": "number",
                            "type": "string",
                            "dividend_rate": "number"
                        }
                    ]
                }
            ]
        }
    ]
}
```
Notes:
- Transaction `type` can be any one of the following values:
  - `PURCHASE`
  - `PURCHASE_SIP`
  - `REDEMPTION`
  - `SWITCH_IN`
  - `SWITCH_IN_MERGER`
  - `SWITCH_OUT`
  - `SWITCH_OUT_MERGER`
  - `DIVIDEND_PAYOUT`
  - `DIVIDEND_REINVESTMENT`
  - `SEGREGATION`
  - `STAMP_DUTY_TAX`
  - `TDS_TAX`
  - `STT_TAX`
  - `MISC`
- `dividend_rate` is applicable only for `DIVIDEND_PAYOUT` and
  `DIVIDEND_REINVESTMENT` transactions.
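
As an illustration of how the transaction `type` values might be used, the sketch below totals transaction amounts per type and collects `dividend_rate` only for the two dividend types. It is not part of `casparser`: the helper names and the `sample_cas` data are hypothetical, the input is assumed to be a plain dict shaped like the structure above, and amounts are assumed to arrive as numbers or numeric strings (hence the `Decimal(str(...))` conversion) and possibly as `null`.

```python
from collections import defaultdict
from decimal import Decimal

# Types for which `dividend_rate` is meaningful, per the notes above.
DIVIDEND_TYPES = {"DIVIDEND_PAYOUT", "DIVIDEND_REINVESTMENT"}


def iter_transactions(cas):
    """Yield every transaction dict across all folios and schemes."""
    for folio in cas.get("folios", []):
        for scheme in folio.get("schemes", []):
            yield from scheme.get("transactions", [])


def amounts_by_type(cas):
    """Sum the `amount` field per transaction `type`."""
    totals = defaultdict(Decimal)  # Decimal() is Decimal("0")
    for txn in iter_transactions(cas):
        amount = txn.get("amount")
        if amount is None:  # assumption: some rows may carry no amount
            continue
        totals[txn["type"]] += Decimal(str(amount))
    return dict(totals)


def dividend_rates(cas):
    """Collect (date, dividend_rate) pairs for dividend transactions only."""
    return [
        (txn["date"], txn.get("dividend_rate"))
        for txn in iter_transactions(cas)
        if txn["type"] in DIVIDEND_TYPES
    ]


# Hypothetical sample data, trimmed to the keys used here.
sample_cas = {
    "folios": [
        {
            "schemes": [
                {
                    "transactions": [
                        {"date": "2023-01-05", "type": "PURCHASE", "amount": 1000},
                        {"date": "2023-02-05", "type": "PURCHASE_SIP", "amount": 500},
                        {"date": "2023-03-05", "type": "PURCHASE", "amount": "250.5"},
                        {"date": "2023-03-20", "type": "DIVIDEND_PAYOUT",
                         "amount": 12, "dividend_rate": 0.4},
                        {"date": "2023-03-21", "type": "STAMP_DUTY_TAX", "amount": None},
                    ]
                }
            ]
        }
    ]
}

totals = amounts_by_type(sample_cas)
rates = dividend_rates(sample_cas)
```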
### CLI
`casparser` also comes with a command-line interface that prints a summary of the parsed
portfolio and can save the parsed data in several output formats.
```
Usage: casparser [-o output_file.json|output_file.csv] [-p password] [-s] [-a] CAS_PDF_FILE

  -o, --output FILE               Output file path. Saves the parsed data as json or csv
                                  depending on the file extension. For other extensions, the
                                  summary output is saved. [See note below]

  -s, --summary                   Print Summary of transactions parsed.
  -p PASSWORD                     CAS password
  -a, --include-all               Include schemes with zero valuation in the
                                  summary output
  -g, --gains                     Generate Capital Gains Report (BETA)
  --gains-112a ask|FY2020-21      Generate Capital Gains Report - 112A format for
                                  a given financial year - Use 'ask' for a prompt
                                  from available options (BETA)
  --force-pdfminer                Force PDFMiner parser even if MuPDF is
                                  detected

  --version                       Show the version and exit.
  -h, --help                      Show this message and exit.
```
#### CLI examples
```
# Print portfolio summary
casparser /path/to/cas.pdf -p password
# Print portfolio and capital gains summary
casparser /path/to/cas.pdf -p password -g
# Save parsed data as a json file
casparser /path/to/cas.pdf -p password -o pdf_parsed.json
# Save parsed data as a csv file
casparser /path/to/cas.pdf -p password -o pdf_parsed.csv
# Save capital gains transactions in csv files (pdf_parsed-gains-summary.csv and
# pdf_parsed-gains-detailed.csv)
casparser /path/to/cas.pdf -p password -g -o pdf_parsed.csv
```
**Note:** The `casparser` CLI treats two output file extensions specially [-o _file.json_ / _file.csv_]; any other extension falls back to the summary table.
1. `json` - the complete parsed data is exported in json format (including investor info).
2. `csv` - summary info is exported in csv format if the input file is a summary statement or if
   the summary flag (`-s/--summary`) is passed to the CLI. Otherwise, the full
   transaction history is included in the export.
   If the `-g` flag is present, two additional files, `{basename}-gains-summary.csv` and
   `{basename}-gains-detailed.csv`, are created with the capital-gains data.
3. any other extension - the summary table is saved in the file.
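
The naming of the two extra capital-gains files can be sketched as follows. This is only an illustration of the `{basename}-gains-*.csv` pattern described above, not the code `casparser` runs: `gains_output_paths` is a hypothetical helper, and it assumes `{basename}` means the output file name without its extension, which matches the `pdf_parsed.csv` example in the CLI examples.

```python
from pathlib import Path


def gains_output_paths(output):
    """Return the (summary, detailed) gains csv paths derived from an -o path.

    Assumption: {basename} is the output file name with its extension removed.
    """
    out = Path(output)
    summary = out.with_name(f"{out.stem}-gains-summary.csv")
    detailed = out.with_name(f"{out.stem}-gains-detailed.csv")
    return summary, detailed


summary_path, detailed_path = gains_output_paths("reports/pdf_parsed.csv")
print(summary_path.name)
print(detailed_path.name)
```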
#### Demo
![demo](https://raw.githubusercontent.com/codereverser/casparser/main/assets/demo.jpg)
## ISIN & AMFI code support
Since v0.4.3, `casparser` can identify the ISIN and AMFI codes of the parsed schemes
via the helper module [casparser-isin](https://github.com/codereverser/casparser-isin/). If the parser
fails to assign ISIN or AMFI codes to a scheme, try updating the local ISIN database by running:
```shell
casparser-isin --update
```
If it still fails, please raise an issue at [casparser-isin](https://github.com/codereverser/casparser-isin/issues/new) with the
failing scheme name(s).
## License
CASParser is distributed under the MIT license by default. However, enabling the optional dependency
`mupdf/fast` would imply the use of [PyMuPDF](https://github.com/pymupdf/PyMuPDF) /
[MuPDF](https://mupdf.com/license.html), and hence the GNU GPL v3 and GNU Affero GPL v3 licenses
would apply. Copies of all licenses have been included in this repository. - _IANAL_
## Resources
1. [CAS from CAMS](https://www.camsonline.com/Investors/Statements/Consolidated-Account-Statement)
2. [CAS from Karvy/Kfintech](https://mfs.kfintech.com/investor/General/ConsolidatedAccountStatement)