edgar-analyzer


Name: edgar-analyzer
Version: 0.0.1rc7
Home page: https://github.com/mgao6767/edgar-analyzer
Summary: Textual analysis on SEC filings from EDGAR
Upload time: 2023-09-08 07:33:04
Author: Mingze Gao
License: MIT
Requirements: none recorded

# `edgar-analyzer` - Textual Analysis with EDGAR filings

`edgar-analyzer` is a CLI tool to download SEC filings from EDGAR and perform textual analyses.

## Installation

```bash
pip install edgar-analyzer
```

## Workflow

### Setup

**Download index files**, which list the firm CIK, firm name, filing date, form type, and URL of each filing.

```bash
edgar-analyzer download_index --user_agent "MyCompany name@mycompany.com" --output "./index"
```
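To make the index fields concrete, here is a minimal sketch of parsing one index row. It assumes the pipe-delimited layout of EDGAR's `master.idx` files (`CIK|Company Name|Form Type|Date Filed|Filename`); which index variant `edgar-analyzer` actually downloads is not stated here. `IndexEntry` and `parse_index_line` are hypothetical names, and the sample row is made up.

```python
from dataclasses import dataclass
from datetime import date

# Filenames in EDGAR index files are relative to this archive root.
EDGAR_ARCHIVES = "https://www.sec.gov/Archives/"


@dataclass(frozen=True)
class IndexEntry:
    """One filing record from an index file (hypothetical container)."""
    cik: int
    firm_name: str
    form_type: str
    date_filed: date
    url: str


def parse_index_line(line: str) -> IndexEntry:
    """Parse a row of the form CIK|Company Name|Form Type|Date Filed|Filename."""
    cik, name, form_type, filed, filename = line.strip().split("|")
    return IndexEntry(
        cik=int(cik),
        firm_name=name,
        form_type=form_type,
        date_filed=date.fromisoformat(filed),
        url=EDGAR_ARCHIVES + filename,
    )


# Made-up row, for illustration only.
entry = parse_index_line(
    "1234567|EXAMPLE CORP|8-K|2022-01-05|edgar/data/1234567/0001234567-22-000001.txt"
)
print(entry.form_type, entry.date_filed, entry.url)
```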

**Build a database** of the previously downloaded index files for more efficient queries.

```bash
edgar-analyzer build_database --inputdir "./index" --database "edgar-idx.sqlite3"
```
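The benefit of a database over flat index files can be sketched with Python's built-in `sqlite3` module. The table name, columns, and index below are assumptions for illustration and may not match the schema that `build_database` creates; the rows are made up.

```python
import sqlite3

# Hypothetical schema; the layout used by edgar-analyzer may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS filings (
    cik        INTEGER,
    firm_name  TEXT,
    form_type  TEXT,
    date_filed TEXT,
    url        TEXT PRIMARY KEY
)
"""


def build_database(conn: sqlite3.Connection, rows) -> None:
    """Load index rows and add an index so queries by form type stay fast."""
    conn.execute(SCHEMA)
    # INSERT OR IGNORE makes re-running the build on the same rows harmless.
    conn.executemany("INSERT OR IGNORE INTO filings VALUES (?, ?, ?, ?, ?)", rows)
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_form_date ON filings (form_type, date_filed)"
    )
    conn.commit()


def urls_for(conn: sqlite3.Connection, form_type: str) -> list:
    """Return filing URLs of one form type, oldest first."""
    cur = conn.execute(
        "SELECT url FROM filings WHERE form_type = ? ORDER BY date_filed",
        (form_type,),
    )
    return [url for (url,) in cur]


# Made-up rows, for illustration only (the last one is a duplicate).
rows = [
    (1234567, "EXAMPLE CORP", "8-K", "2022-03-01", "edgar/data/1234567/b.txt"),
    (1234567, "EXAMPLE CORP", "10-K", "2022-02-15", "edgar/data/1234567/k.txt"),
    (7654321, "SAMPLE INC", "8-K", "2022-01-05", "edgar/data/7654321/a.txt"),
    (7654321, "SAMPLE INC", "8-K", "2022-01-05", "edgar/data/7654321/a.txt"),
]

conn = sqlite3.connect(":memory:")  # the CLI writes to a file instead
build_database(conn, rows)
print(urls_for(conn, "8-K"))
```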

**Download filings.** Only filings that are listed in the database and have not yet been downloaded are fetched. Download speed is throttled automatically as per the SEC's fair use policy.

```bash
edgar-analyzer download_filings --user_agent "MyCompany name@mycompany.com" --output "./output" --database "edgar-idx.sqlite3" --file_type "8-K" -t 4
```
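The two behaviours described above, skipping filings that already exist locally and throttling requests, can be sketched as follows. This is not the tool's implementation: `pending` and `Throttle` are hypothetical helpers, the flat output layout is an assumption, and the rate of 10 requests per second reflects the SEC's published guidance at the time of writing rather than anything stated in this README.

```python
import time
from pathlib import Path
from typing import Iterable, List, Optional


def pending(urls: Iterable[str], output_dir: Path) -> List[str]:
    """Return the URLs whose file is not yet present in output_dir.

    Assumes each filing is saved under its original file name.
    """
    return [u for u in urls if not (output_dir / Path(u).name).exists()]


class Throttle:
    """Space successive calls to wait() at least 1/max_per_second apart."""

    def __init__(self, max_per_second: float) -> None:
        self.min_interval = 1.0 / max_per_second
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


throttle = Throttle(max_per_second=10)  # assumed limit, see the note above
for url in pending(["edgar/data/1234567/a.txt"], Path("./output")):
    throttle.wait()
    print("would download", url)  # the network request is left out of this sketch
```

With several worker threads, a single `Throttle` would need to be shared and guarded by a lock so that the combined request rate stays under the limit.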

### Run specific jobs

These tasks can be executed once the database of filings is built.

#### Find event date

```bash
❯ edgar-analyzer find_event_date -h
usage: edgar-analyzer [OPTION]... find_event_date [-h] -d data_directory --file_type file_type [-db databsae] [-t threads]

Find event date from filings from header data

options:
  -h, --help            show this help message and exit
  -t threads, --threads threads
                        number of processes to use

required named arguments:
  -d data_directory, --data_dir data_directory
                        directory of filings
  --file_type file_type
                        type of filing
  -db databsae, --database databsae
                        sqlite database to store results
```
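As an illustration of what "header data" can provide, the sketch below reads two dates from the SEC header of a filing's full-submission text file. It assumes the event date corresponds to the `CONFORMED PERIOD OF REPORT` field and the filing date to `FILED AS OF DATE`; whether `find_event_date` uses exactly these fields is an assumption. The header excerpt is made up.

```python
import re
from datetime import date, datetime
from typing import Optional, Pattern

# Header fields hold dates as YYYYMMDD.
PERIOD_RE = re.compile(r"^[ \t]*CONFORMED PERIOD OF REPORT:[ \t]*(\d{8})", re.MULTILINE)
FILED_RE = re.compile(r"^[ \t]*FILED AS OF DATE:[ \t]*(\d{8})", re.MULTILINE)


def find_date(pattern: Pattern[str], header: str) -> Optional[date]:
    """Return the first date matched by pattern, or None if the field is absent."""
    match = pattern.search(header)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d").date()


# Made-up header excerpt, for illustration only.
sample_header = """\
CONFORMED SUBMISSION TYPE:\t8-K
CONFORMED PERIOD OF REPORT:\t20220103
FILED AS OF DATE:\t\t20220105
"""

event_date = find_date(PERIOD_RE, sample_header)
filing_date = find_date(FILED_RE, sample_header)
lag_days = (filing_date - event_date).days  # reporting lag in calendar days
print(event_date, filing_date, lag_days)
```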

#### Find reported items

```bash
❯ edgar-analyzer find_reported_items -h
usage: edgar-analyzer [OPTION]... find_reported_items [-h] -d data_directory --file_type file_type [-db databsae] [-t threads]

Find reported items from filings from header data

options:
  -h, --help            show this help message and exit
  -t threads, --threads threads
                        number of processes to use

required named arguments:
  -d data_directory, --data_dir data_directory
                        directory of filings
  --file_type file_type
                        type of filing
  -db databsae, --database databsae
                        sqlite database to store results
```
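Reported items can be read from the header in a similar way. The sketch below assumes that each reported item of an 8-K appears on its own `ITEM INFORMATION:` line of the SEC header; that `find_reported_items` relies on this field is an assumption, and `reported_items` is a hypothetical helper. The header excerpt is made up.

```python
import re
from typing import List

# Capture the text after "ITEM INFORMATION:" on each matching line.
ITEM_RE = re.compile(r"^[ \t]*ITEM INFORMATION:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def reported_items(header: str) -> List[str]:
    """Return the item descriptions listed in a filing header, in order."""
    return ITEM_RE.findall(header)


# Made-up header excerpt, for illustration only.
sample_header = """\
CONFORMED SUBMISSION TYPE:\t8-K
CONFORMED PERIOD OF REPORT:\t20220103
ITEM INFORMATION:\t\tOther Events
ITEM INFORMATION:\t\tFinancial Statements and Exhibits
FILED AS OF DATE:\t\t20220105
"""

items = reported_items(sample_header)
print(items)
```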

#### More jobs to be integrated

## Example

A simple example of the `find_event_date` job: based on 1,491,368 8-K filings (2004-2022), the table below shows the reporting lag (filing date minus event date) in calendar days.

The table shows that 98.57% of filings are filed on the same day as the reported event, and that over 99.99% are filed within 4 calendar days (the SEC requires filing within 4 business days).

| Filing lag (calendar days)   | Frequency | Percentage | Cumulative |
| ---------------------------- | --------- | ---------- | ---------- |
| 0                            | 1470089   | 98.57%     | 98.57%     |
| 1                            | 20761     | 1.39%      | 99.97%     |
| 2                            | 285       | 0.02%      | 99.98%     |
| 3                            | 89        | 0.01%      | 99.99%     |
| 4                            | 47        | 0.00%      | 99.99%     |
| 5                            | 26        | 0.00%      | 100.00%    |
| 6                            | 14        | 0.00%      | 100.00%    |
| 7                            | 6         | 0.00%      | 100.00%    |
| 8                            | 4         | 0.00%      | 100.00%    |
| 9                            | 3         | 0.00%      | 100.00%    |
| 10 or more                   | 44        | 0.00%      | 100.00%    |
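
The Percentage and Cumulative columns follow directly from the frequencies. A minimal sketch that recomputes them from the counts in the table (the variable names are illustrative):

```python
from itertools import accumulate

# Frequencies copied from the table above, keyed by filing lag in calendar days.
lag_counts = {
    "0": 1470089, "1": 20761, "2": 285, "3": 89, "4": 47, "5": 26,
    "6": 14, "7": 6, "8": 4, "9": 3, "10 or more": 44,
}

total = sum(lag_counts.values())
cumulative = list(accumulate(lag_counts.values()))

for (lag, count), running in zip(lag_counts.items(), cumulative):
    print(f"{lag:>10} {count:>9} {count / total:8.2%} {running / total:8.2%}")

# Share of filings with a lag of at most 4 calendar days.
share_within_4 = cumulative[4] / total
```

The counts sum to the 1,491,368 filings mentioned above, and `share_within_4` exceeds 0.9999, consistent with the "over 99.99%" statement.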

## Note

This tool is a work in progress and breaking changes may be expected.

## Contact

If you identify any issue, please feel free to contact me at [mingze.gao@sydney.edu.au](mailto:mingze.gao@sydney.edu.au).

            
