sec-downloader


- Name: sec-downloader
- Version: 0.11.1
- Home page: https://github.com/Elijas/sec-downloader
- Summary: Useful extensions for sec-edgar-downloader.
- Upload time: 2024-03-27 09:27:40
- Author: Elijas
- Requires Python: >=3.7
- License: MIT License
- Keywords: nbdev, jupyter, notebook, python
# sec-downloader

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

<a href="https://github.com/elijas/sec-downloader/actions/workflows/test.yaml"><img alt="GitHub Workflow Status" src="https://img.shields.io/github/actions/workflow/status/elijas/sec-downloader/test.yaml?label=build"></a>
<a href="https://pypi.org/project/sec-downloader/"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/sec-downloader"></a>
<a href="https://badge.fury.io/py/sec-downloader"><img src="https://badge.fury.io/py/sec-downloader.svg" alt="PyPI version" /></a>
<a href="LICENSE"><img src="https://img.shields.io/github/license/elijas/sec-downloader.svg" alt="Licence"></a>

A better version of `sec-edgar-downloader`. Includes an alternative
implementation (a wrapper instead of a fork), to keep compatibility with
new `sec-edgar-downloader` releases. This library partially uses
[nbdev](https://nbdev.fast.ai/).

# Features

Advantages over `sec-edgar-downloader`:

**Flexibility in Download Process**

- Tailored for choosing *what*, *where*, and *how* to download.
- Files stored in memory for faster operations and no unnecessary disk
  clutter.

**Separate Metadata and File Downloads**

- Easily skip unneeded files.
- Download metadata first, then selectively download files.
- Option to save metadata for better organization.

**More Input Options**

- Ticker or CIK (e.g., `AAPL`, `0000320193`) for latest filings.
- Accession Number (e.g., `0000320193-23-000077`). Not supported in
  `sec-edgar-downloader`.
- SEC EDGAR URL (e.g.,
  `https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm`).
  Not supported in `sec-edgar-downloader`.

# Install

``` sh
pip install sec-downloader
```

# How to use

## Download the metadata

> **Note** The company name and email address are used to form a
> user-agent string that complies with SEC EDGAR's fair access policy
> for programmatic downloading.
> [Source](https://www.sec.gov/os/webmaster-faq#code-support)

``` python
from sec_downloader import Downloader

dl = Downloader("MyCompanyName", "email@example.com")
```
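
To make the note above concrete, here is a minimal sketch of a user-agent header that combines a company name with a contact email, which is the shape the SEC asks automated clients to declare. How `sec-downloader` assembles the string internally is not shown in this README, so `build_sec_headers` is a hypothetical helper, not part of the library.

```python
def build_sec_headers(company_name: str, email: str) -> dict:
    """Return HTTP headers with a 'CompanyName contact@email' User-Agent.

    Hypothetical helper for illustration; not part of sec-downloader.
    """
    company_name = company_name.strip()
    email = email.strip()
    if not company_name or "@" not in email:
        raise ValueError("a company name and a contact email are both required")
    return {"User-Agent": f"{company_name} {email}"}


headers = build_sec_headers("MyCompanyName", "email@example.com")
```

Headers built this way could be passed to any HTTP client when requesting SEC EDGAR pages directly.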

Find a filing by its Accession Number:

``` python
metadatas = dl.get_filing_metadatas("AAPL/0000320193-23-000077")
print(metadatas)
```

    [FilingMetadata(accession_number='0000320193-23-000077',
                    form_type='10-Q',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm',
                    items='',
                    primary_doc_description='10-Q',
                    filing_date='2023-08-04',
                    report_date='2023-07-01',
                    cik='0000320193',
                    company_name='Apple Inc.',
                    tickers=[Ticker(symbol='AAPL', exchange='Nasdaq')])]

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("aapl/000032019323000077")
    metadatas = dl.get_filing_metadatas("320193/000032019323000077")
    metadatas = dl.get_filing_metadatas("320193/0000320193-23-000077")
    metadatas = dl.get_filing_metadatas("0000320193/0000320193-23-000077")
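    # Assumes CompanyAndAccessionNumber is importable from sec_downloader.types, like RequestedFilings
    from sec_downloader.types import CompanyAndAccessionNumber
    metadatas = dl.get_filing_metadatas(CompanyAndAccessionNumber(ticker_or_cik="320193", accession_number="0000320193-23-000077"))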
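
The forms above are interchangeable because an accession number is 18 digits (a 10-digit filer ID, a 2-digit year, and a 6-digit sequence number) with optional dashes, and a CIK can be zero-padded to 10 digits. The sketch below shows one way to normalize both; the function names are hypothetical and this is not the library's own code.

```python
import re


def normalize_accession_number(raw: str) -> str:
    """Return the dashed form, e.g. '0000320193-23-000077'."""
    digits = raw.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"not an accession number: {raw!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def pad_cik(cik: str) -> str:
    """Zero-pad a CIK to 10 digits, e.g. '320193' -> '0000320193'."""
    return str(int(cik)).zfill(10)


dashed = normalize_accession_number("000032019323000077")
padded = pad_cik("320193")
```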

Find the filing matching an SEC EDGAR filing URL. Only the CIK and
Accession Number are used from the URL:

``` python
metadatas = dl.get_filing_metadatas(
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
)
print(metadatas)
```

    [FilingMetadata(accession_number='0001193125-23-272204',
                    form_type='8-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm',
                    items='2.02,9.01',
                    primary_doc_description='8-K',
                    filing_date='2023-11-07',
                    report_date='2023-11-04',
                    cik='0001067983',
                    company_name='BERKSHIRE HATHAWAY INC',
                    tickers=[Ticker(symbol='BRK-B', exchange='NYSE'),
                             Ticker(symbol='BRK-A', exchange='NYSE')])]

Alternatively, you can also use URLs in other formats and get the same
answer:

    metadatas = dl.get_filing_metadatas("https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm")
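
Since only the CIK and Accession Number matter, both URL styles reduce to the same pair of values. The following sketch shows one possible way to extract them with the standard library; `parse_edgar_url` and its regular expression are assumptions made for illustration, not the parser that `sec-downloader` actually uses.

```python
import re
from urllib.parse import parse_qs, urlparse

# Matches ".../Archives/edgar/data/<cik>/<18-digit accession number>/..."
_ARCHIVE_RE = re.compile(r"/Archives/edgar/data/(\d+)/(\d{18})/")


def parse_edgar_url(url: str) -> tuple:
    """Return (zero-padded CIK, dashed accession number) from a filing URL."""
    parsed = urlparse(url)
    # Inline-viewer URLs ("/ix?doc=...") carry the archive path in the query.
    doc = parse_qs(parsed.query).get("doc", [None])[0]
    match = _ARCHIVE_RE.search(doc if doc else parsed.path)
    if match is None:
        raise ValueError(f"not a recognized EDGAR filing URL: {url!r}")
    cik, accession = match.groups()
    return cik.zfill(10), f"{accession[:10]}-{accession[10:12]}-{accession[12:]}"


viewer = parse_edgar_url(
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
)
plain = parse_edgar_url(
    "https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm"
)
```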

Find latest filings by company ticker or CIK:

``` python
from sec_downloader.types import RequestedFilings

metadatas = dl.get_filing_metadatas(
    RequestedFilings(ticker_or_cik="MSFT", form_type="10-K", limit=2)
)
print(metadatas)
```

    [FilingMetadata(accession_number='0000950170-23-035122',
                    form_type='10-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/789019/000095017023035122/msft-20230630.htm',
                    items='',
                    primary_doc_description='10-K',
                    filing_date='2023-07-27',
                    report_date='2023-06-30',
                    cik='0000789019',
                    company_name='MICROSOFT CORP',
                    tickers=[Ticker(symbol='MSFT', exchange='Nasdaq')]),
     FilingMetadata(accession_number='0001564590-22-026876',
                    form_type='10-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/789019/000156459022026876/msft-10k_20220630.htm',
                    items='',
                    primary_doc_description='10-K',
                    filing_date='2022-07-28',
                    report_date='2022-06-30',
                    cik='0000789019',
                    company_name='MICROSOFT CORP',
                    tickers=[Ticker(symbol='MSFT', exchange='Nasdaq')])]

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("2/msft/10-K")
    metadatas = dl.get_filing_metadatas("2/789019/10-K")
    metadatas = dl.get_filing_metadatas("2/0000789019/10-K")
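
Because only metadata has been fetched at this point, the list can be narrowed before any document is downloaded. The sketch below filters by filing date using a stand-in dataclass, `FilingMetadataSketch`, that mirrors a few of the fields printed above; it is a hypothetical substitute so the example runs without network access.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass
class FilingMetadataSketch:
    """Stand-in with a subset of the fields shown in the printed output."""
    accession_number: str
    form_type: str
    filing_date: str  # ISO format, e.g. '2023-07-27'


def filed_on_or_after(
    metadatas: Iterable[FilingMetadataSketch], cutoff: date
) -> List[FilingMetadataSketch]:
    """Keep only filings whose filing_date is on or after the cutoff."""
    return [m for m in metadatas if date.fromisoformat(m.filing_date) >= cutoff]


sample = [
    FilingMetadataSketch("0000950170-23-035122", "10-K", "2023-07-27"),
    FilingMetadataSketch("0001564590-22-026876", "10-K", "2022-07-28"),
]
recent = filed_on_or_after(sample, date(2023, 1, 1))
```

Only the entries left in `recent` would then need their `primary_doc_url` downloaded.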

The parameters `limit` and `form_type` are optional. If omitted, `limit`
defaults to `1`, and `form_type` defaults to `"10-Q"`.

``` python
metadatas = dl.get_filing_metadatas("NFLX")
print(metadatas)
```

    [FilingMetadata(accession_number='0001065280-23-000273',
                    form_type='10-Q',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/1065280/000106528023000273/nflx-20230930.htm',
                    items='',
                    primary_doc_description='10-Q',
                    filing_date='2023-10-20',
                    report_date='2023-09-30',
                    cik='0001065280',
                    company_name='NETFLIX INC',
                    tickers=[Ticker(symbol='NFLX', exchange='Nasdaq')])]

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("nflx")
    metadatas = dl.get_filing_metadatas("1/NFLX")
    metadatas = dl.get_filing_metadatas("NFLX/10-Q")
    metadatas = dl.get_filing_metadatas("1/NFLX/10-Q")
    metadatas = dl.get_filing_metadatas(RequestedFilings(ticker_or_cik="NFLX"))
    metadatas = dl.get_filing_metadatas(RequestedFilings(limit=1, ticker_or_cik="NFLX", form_type="10-Q"))
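
To illustrate how the `limit/company/form_type` shorthand and its defaults fit together, here is a rough sketch of a parser. `RequestSketch`, `parse_request`, and the heuristic for telling a form type apart from a ticker or CIK are all assumptions made for this example; the library's real parsing rules may differ, and accession-number strings are deliberately not handled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestSketch:
    """Hypothetical stand-in for RequestedFilings with the documented defaults."""
    ticker_or_cik: str
    form_type: str = "10-Q"
    limit: int = 1


def _looks_like_form_type(part: str) -> bool:
    # Simplistic heuristic: '10-Q' or '8-K' mix digits with other characters,
    # while a ticker ('NFLX') has no digits and a CIK ('789019') has only digits.
    return any(c.isdigit() for c in part) and any(not c.isdigit() for c in part)


def parse_request(query: str) -> RequestSketch:
    parts = [p.strip() for p in query.split("/")]
    if not all(parts):
        raise ValueError(f"empty component in request: {query!r}")
    if len(parts) == 1:
        return RequestSketch(ticker_or_cik=parts[0].upper())
    if len(parts) == 2:
        first, second = parts
        if _looks_like_form_type(second):
            return RequestSketch(ticker_or_cik=first.upper(), form_type=second)
        return RequestSketch(ticker_or_cik=second.upper(), limit=int(first))
    if len(parts) == 3:
        limit, company, form_type = parts
        return RequestSketch(company.upper(), form_type, int(limit))
    raise ValueError(f"too many components in request: {query!r}")


default_request = parse_request("nflx")
two_annual_reports = parse_request("2/msft/10-K")
```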

## Download the HTML files

After obtaining the Primary Document URL, for example from the metadata,
you can proceed to download the HTML using this URL.

``` python
for metadata in metadatas:
    html = dl.download_filing(url=metadata.primary_doc_url).decode()
    print(html[:50])
    break  # same for all filings, let's just print the first one
```

    '<?xml version="1.0" ?><!--XBRL Document Created wi'
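
Downloaded HTML stays in memory unless you write it out yourself. As one possible approach, the sketch below stores the HTML next to a JSON file holding its metadata, named after the accession number. `FilingMetadataSketch` and `save_filing` are hypothetical names, and the HTML string is a placeholder rather than a real filing.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class FilingMetadataSketch:
    """Stand-in with a subset of the fields shown in the printed output."""
    accession_number: str
    form_type: str
    primary_doc_url: str


def save_filing(out_dir: Path, metadata: FilingMetadataSketch, html: str) -> Path:
    """Write '<accession>.htm' and '<accession>.json' into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    html_path = out_dir / f"{metadata.accession_number}.htm"
    html_path.write_text(html, encoding="utf-8")
    json_path = out_dir / f"{metadata.accession_number}.json"
    json_path.write_text(json.dumps(asdict(metadata), indent=2), encoding="utf-8")
    return html_path


meta = FilingMetadataSketch(
    accession_number="0001065280-23-000273",
    form_type="10-Q",
    primary_doc_url="https://www.sec.gov/Archives/edgar/data/1065280/000106528023000273/nflx-20230930.htm",
)
with tempfile.TemporaryDirectory() as tmp:
    saved = save_filing(Path(tmp) / "filings", meta, "<html>placeholder</html>")
    saved_name = saved.name
    sidecar = json.loads(saved.with_suffix(".json").read_text(encoding="utf-8"))
```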

# Alternative implementation: Wrapper

Files are downloaded to a temporary folder, immediately read into
memory, and then deleted. Let’s demonstrate how to download a single
file (latest 10-Q filing details in HTML format) to memory. The “glob”
pattern is used to select which files are read to memory.

``` python
from sec_edgar_downloader import Downloader as SecEdgarDownloader
from sec_downloader.download_storage import DownloadStorage

ONLY_HTML = "**/*.htm*"

storage = DownloadStorage(filter_pattern=ONLY_HTML)
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-Q", "AAPL", limit=1, download_details=True)
# all files are now deleted and only stored in memory

content = storage.get_file_contents()[0].content
print(f"{content[:50]}...")
```

    "<?xml version='1.0' encoding='ASCII'?>\n<html xmlns..."

Downloading multiple documents:

``` python
storage = DownloadStorage()
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-K", "GOOG", limit=2)
# all files are now deleted and only stored in memory

for path, content in storage.get_file_contents():
    print(f"Path: {path}\nContent [len={len(content)}]: {content[:30]}...\n")
```

    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-24-000022/full-submission.txt\n'
     'Content [len=13927595]: <SEC-DOCUMENT>0001652044-24-00...\n')
    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-23-000016/full-submission.txt\n'
     'Content [len=15264470]: <SEC-DOCUMENT>0001652044-23-00...\n')
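
The pattern described in this section (download to a temporary folder, read the matching files into memory, delete the folder) can be sketched with the standard library alone. `InMemoryStorageSketch` below is a simplified, hypothetical illustration of that idea and not the actual `DownloadStorage` implementation; the files are written by hand instead of being downloaded.

```python
import tempfile
from pathlib import Path
from typing import List, NamedTuple


class FileContent(NamedTuple):
    path: str
    content: str


class InMemoryStorageSketch:
    def __init__(self, filter_pattern: str = "**/*") -> None:
        self.filter_pattern = filter_pattern
        self._files: List[FileContent] = []

    def __enter__(self) -> str:
        # Hand the caller a scratch folder to download into.
        self._tmp = tempfile.TemporaryDirectory()
        return self._tmp.name

    def __exit__(self, exc_type, exc, tb) -> None:
        root = Path(self._tmp.name)
        try:
            # Read every file matching the glob pattern into memory.
            for p in sorted(root.glob(self.filter_pattern)):
                if p.is_file():
                    text = p.read_text(encoding="utf-8", errors="replace")
                    self._files.append(FileContent(p.relative_to(root).as_posix(), text))
        finally:
            self._tmp.cleanup()  # the folder is deleted either way

    def get_file_contents(self) -> List[FileContent]:
        return list(self._files)


storage = InMemoryStorageSketch(filter_pattern="**/*.htm*")
with storage as path:
    filing_dir = Path(path) / "AAPL" / "10-Q"
    filing_dir.mkdir(parents=True)
    (filing_dir / "primary-document.html").write_text("<html>demo</html>", encoding="utf-8")
    (filing_dir / "full-submission.txt").write_text("not selected", encoding="utf-8")

files = storage.get_file_contents()
```

Reading the files in `__exit__` keeps the caller's `with` block free of cleanup logic, at the cost of holding every matching file in memory at once.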

# Contributing

Follow these steps to install the project locally for development:

1.  Install the project with the command `pip install -e ".[dev]"`.

> **Note** We highly recommend using virtual environments for Python
> development. If you’d like to use virtual environments, follow these
> steps instead:
>
> - Create a virtual environment `python3 -m venv .venv`
> - Activate the virtual environment `source .venv/bin/activate`
> - Install the project with the command `pip install -e ".[dev]"`

            
