# sec-downloader
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
<a href="https://github.com/elijas/sec-downloader/actions/workflows/test.yaml"><img alt="GitHub Workflow Status" src="https://img.shields.io/github/actions/workflow/status/elijas/sec-downloader/test.yaml?label=build"></a>
<a href="https://pypi.org/project/sec-downloader/"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/sec-downloader"></a>
<a href="https://badge.fury.io/py/sec-downloader"><img src="https://badge.fury.io/py/sec-downloader.svg" alt="PyPI version" /></a>
<a href="LICENSE"><img src="https://img.shields.io/github/license/elijas/sec-downloader.svg" alt="Licence"></a>
A better version of `sec-edgar-downloader`. Includes an alternative
implementation (a wrapper instead of a fork), to keep compatibility with
new `sec-edgar-downloader` releases. This library partially uses
[nbdev](https://nbdev.fast.ai/).
# Features
Advantages over `sec-edgar-downloader`:
**Flexibility in Download Process**
- Tailored for choosing *what*, *where*, and *how* to download.
- Files stored in memory for faster operations and no unnecessary disk
clutter.
**Separate Metadata and File Downloads**
- Easily skip unneeded files.
- Download metadata first, then selectively download files.
- Option to save metadata for better organization.
**More Input Options**
- Ticker or CIK (e.g., `AAPL`, `0000320193`) for latest filings.
- Accession Number (e.g., `0000320193-23-000077`). Not supported in
`sec-edgar-downloader`.
- SEC EDGAR URL (e.g.,
`https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm`).
Not supported in `sec-edgar-downloader`.
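The identifiers listed above appear in several spellings throughout this README (CIKs with or without leading zeros, accession numbers with or without dashes). The sketch below shows one way to bring them to a canonical form. The helper names `normalize_cik` and `normalize_accession_number` are hypothetical and are not part of the `sec-downloader` API; they only illustrate the 10-digit CIK and the 10-2-6 digit layout of an accession number.

``` python
import re


def normalize_cik(cik: str) -> str:
    """Zero-pad a CIK to 10 digits, e.g. '320193' -> '0000320193'."""
    digits = cik.strip()
    if not re.fullmatch(r"[0-9]{1,10}", digits):
        raise ValueError(f"not a CIK: {cik!r}")
    return digits.zfill(10)


def normalize_accession_number(value: str) -> str:
    """Return the dashed 10-2-6 form of an 18-digit accession number."""
    digits = value.strip().replace("-", "")
    if not re.fullmatch(r"[0-9]{18}", digits):
        raise ValueError(f"not an accession number: {value!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


print(normalize_cik("320193"))
print(normalize_accession_number("000032019323000077"))
```

Both spellings of the same identifier map to one canonical string, which is convenient when identifiers are used as dictionary keys or file names.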
# Install
``` sh
pip install sec-downloader
```
# How to use
## Download the metadata
> **Note** The company name and email address are used to form a
> user-agent string that complies with SEC EDGAR’s fair access policy
> for programmatic downloading.
> [Source](https://www.sec.gov/os/webmaster-faq#code-support)
``` python
from sec_downloader import Downloader
dl = Downloader("MyCompanyName", "email@example.com")
```
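As a rough illustration of the note above, the sketch below joins a company name and an email address into a single `User-Agent` header value. The helper name `build_headers` and the exact `"<name> <email>"` layout are assumptions made for illustration; the library may build the string differently.

``` python
from typing import Dict


def build_headers(company_name: str, email: str) -> Dict[str, str]:
    """Build request headers that identify the caller (hypothetical helper)."""
    if not company_name.strip() or "@" not in email:
        raise ValueError("a company name and a valid email address are required")
    # Assumed layout: company name, a space, then the contact email.
    return {"User-Agent": f"{company_name.strip()} {email.strip()}"}


print(build_headers("MyCompanyName", "email@example.com"))
```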
Find a filing with an Accession Number:

``` python
metadatas = dl.get_filing_metadatas("AAPL/0000320193-23-000077")
print(metadatas)
```

    [FilingMetadata(accession_number='0000320193-23-000077',
                    form_type='10-Q',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm',
                    items='',
                    primary_doc_description='10-Q',
                    filing_date='2023-08-04',
                    report_date='2023-07-01',
                    cik='0000320193',
                    company_name='Apple Inc.',
                    tickers=[Ticker(symbol='AAPL', exchange='Nasdaq')])]

Alternatively, any of these calls returns the same result:

    metadatas = dl.get_filing_metadatas("aapl/000032019323000077")
    metadatas = dl.get_filing_metadatas("320193/000032019323000077")
    metadatas = dl.get_filing_metadatas("320193/0000320193-23-000077")
    metadatas = dl.get_filing_metadatas("0000320193/0000320193-23-000077")
    # CompanyAndAccessionNumber must be imported first
    # (likely from sec_downloader.types, alongside RequestedFilings)
    metadatas = dl.get_filing_metadatas(CompanyAndAccessionNumber(ticker_or_cik="320193", accession_number="0000320193-23-000077"))

Find the filing matching an SEC EDGAR Filing URL. Only the CIK and
Accession Number are used from the URL:

``` python
metadatas = dl.get_filing_metadatas(
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
)
print(metadatas)
```

    [FilingMetadata(accession_number='0001193125-23-272204',
                    form_type='8-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm',
                    items='2.02,9.01',
                    primary_doc_description='8-K',
                    filing_date='2023-11-07',
                    report_date='2023-11-04',
                    cik='0001067983',
                    company_name='BERKSHIRE HATHAWAY INC',
                    tickers=[Ticker(symbol='BRK-B', exchange='NYSE'),
                             Ticker(symbol='BRK-A', exchange='NYSE')])]

Alternatively, you can pass URLs in other formats and get the same
result:

    metadatas = dl.get_filing_metadatas("https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm")

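To make concrete how only the CIK and Accession Number can be taken from a filing URL, here is a minimal sketch using a regular expression. The names `FilingRef` and `parse_filing_url` are hypothetical, and the pattern is an assumption based on the two URL shapes shown above; it is not the library's own parser.

``` python
import re
from typing import NamedTuple


class FilingRef(NamedTuple):
    cik: str
    accession_number: str


# Matches ".../Archives/edgar/data/<CIK>/<18-digit accession number>/..."
_FILING_PATH = re.compile(r"/Archives/edgar/data/([0-9]{1,10})/([0-9]{18})(?:/|$)")


def parse_filing_url(url: str) -> FilingRef:
    """Extract a zero-padded CIK and a dashed accession number from a URL."""
    match = _FILING_PATH.search(url)
    if match is None:
        raise ValueError(f"not a recognised EDGAR filing URL: {url!r}")
    cik, acc = match.groups()
    return FilingRef(cik.zfill(10), f"{acc[:10]}-{acc[10:12]}-{acc[12:]}")


viewer_url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
plain_url = "https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm"
print(parse_filing_url(viewer_url))
print(parse_filing_url(plain_url) == parse_filing_url(viewer_url))
```

Because the document file name at the end of the path is ignored, both URL shapes resolve to the same filing reference.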
Find latest filings by company ticker or CIK:

``` python
from sec_downloader.types import RequestedFilings

metadatas = dl.get_filing_metadatas(
    RequestedFilings(ticker_or_cik="MSFT", form_type="10-K", limit=2)
)
print(metadatas)
```

    [FilingMetadata(accession_number='0000950170-23-035122',
                    form_type='10-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/789019/000095017023035122/msft-20230630.htm',
                    items='',
                    primary_doc_description='10-K',
                    filing_date='2023-07-27',
                    report_date='2023-06-30',
                    cik='0000789019',
                    company_name='MICROSOFT CORP',
                    tickers=[Ticker(symbol='MSFT', exchange='Nasdaq')]),
     FilingMetadata(accession_number='0001564590-22-026876',
                    form_type='10-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/789019/000156459022026876/msft-10k_20220630.htm',
                    items='',
                    primary_doc_description='10-K',
                    filing_date='2022-07-28',
                    report_date='2022-06-30',
                    cik='0000789019',
                    company_name='MICROSOFT CORP',
                    tickers=[Ticker(symbol='MSFT', exchange='Nasdaq')])]

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("2/msft/10-K")
    metadatas = dl.get_filing_metadatas("2/789019/10-K")
    metadatas = dl.get_filing_metadatas("2/0000789019/10-K")

The parameters `limit` and `form_type` are optional. If omitted, `limit`
defaults to 1 and `form_type` defaults to `10-Q`.

``` python
metadatas = dl.get_filing_metadatas("NFLX")
print(metadatas)
```

    [FilingMetadata(accession_number='0001065280-23-000273',
                    form_type='10-Q',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/1065280/000106528023000273/nflx-20230930.htm',
                    items='',
                    primary_doc_description='10-Q',
                    filing_date='2023-10-20',
                    report_date='2023-09-30',
                    cik='0001065280',
                    company_name='NETFLIX INC',
                    tickers=[Ticker(symbol='NFLX', exchange='Nasdaq')])]

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("nflx")
    metadatas = dl.get_filing_metadatas("1/NFLX")
    metadatas = dl.get_filing_metadatas("NFLX/10-Q")
    metadatas = dl.get_filing_metadatas("1/NFLX/10-Q")
    metadatas = dl.get_filing_metadatas(RequestedFilings(ticker_or_cik="NFLX"))
    metadatas = dl.get_filing_metadatas(RequestedFilings(limit=1, ticker_or_cik="NFLX", form_type="10-Q"))

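The string shorthands above follow a `limit/ticker_or_cik/form_type` layout in which `limit` and `form_type` may be left out. The sketch below shows how such a string could be expanded with the documented defaults. `Query` and `parse_query` are hypothetical names, and the two-part heuristic is an assumption made for illustration, not the library's actual parsing logic; strings that carry an Accession Number are not handled here.

``` python
from typing import NamedTuple


class Query(NamedTuple):
    ticker_or_cik: str
    form_type: str = "10-Q"  # documented default
    limit: int = 1           # documented default


def parse_query(text: str) -> Query:
    """Expand '[limit/]ticker_or_cik[/form_type]' into a Query."""
    parts = text.strip().split("/")
    if len(parts) == 1:
        return Query(parts[0].upper())
    if len(parts) == 3:
        limit, company, form_type = parts
        return Query(company.upper(), form_type, int(limit))
    if len(parts) == 2:
        first, second = parts
        # Assumed heuristic: a number followed by a part without digits
        # is read as "limit/ticker"; anything else as "company/form_type".
        if first.isdigit() and not any(ch.isdigit() for ch in second):
            return Query(second.upper(), limit=int(first))
        return Query(first.upper(), form_type=second)
    raise ValueError(f"unsupported query: {text!r}")


for text in ("nflx", "1/NFLX", "NFLX/10-Q", "1/NFLX/10-Q", "2/msft/10-K"):
    print(text, "->", parse_query(text))
```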
## Download the HTML files
After obtaining the Primary Document URL, for example from the metadata,
you can proceed to download the HTML using this URL.

``` python
for metadata in metadatas:
    html = dl.download_filing(url=metadata.primary_doc_url).decode()
    print(html[:50])
    break  # same for all filings, let's just print the first one
```

    '<?xml version="1.0" ?><!--XBRL Document Created wi'

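Since metadata and files are fetched separately, the metadata can be filtered before any document is downloaded. The sketch below illustrates that selection step on plain data. `Meta` is a hypothetical stand-in that carries only three of the `FilingMetadata` fields printed above, and `urls_to_download` is an illustrative helper, not a library function.

``` python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class Meta:
    """Stand-in for FilingMetadata with a subset of its printed fields."""
    form_type: str
    filing_date: str  # ISO "YYYY-MM-DD", as in the output above
    primary_doc_url: str


def urls_to_download(metadatas: Iterable[Meta], form_type: str, since: date) -> List[str]:
    """Keep only filings of one form type filed on or after `since`."""
    return [
        m.primary_doc_url
        for m in metadatas
        if m.form_type == form_type and date.fromisoformat(m.filing_date) >= since
    ]


sample = [
    Meta("10-K", "2023-07-27", "https://www.sec.gov/Archives/edgar/data/789019/000095017023035122/msft-20230630.htm"),
    Meta("10-K", "2022-07-28", "https://www.sec.gov/Archives/edgar/data/789019/000156459022026876/msft-10k_20220630.htm"),
]
selected = urls_to_download(sample, form_type="10-K", since=date(2023, 1, 1))
print(selected)
```

Each selected URL can then be passed to `dl.download_filing(url=...)` as in the loop above.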
# Alternative implementation: Wrapper
Files are downloaded to a temporary folder, immediately read into
memory, and then deleted. The example below downloads a single file
(the latest 10-Q filing details in HTML format) into memory. A glob
pattern selects which files are read into memory.

``` python
from sec_edgar_downloader import Downloader as SecEdgarDownloader
from sec_downloader.download_storage import DownloadStorage

ONLY_HTML = "**/*.htm*"

storage = DownloadStorage(filter_pattern=ONLY_HTML)
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-Q", "AAPL", limit=1, download_details=True)
# all files are now deleted and only stored in memory

content = storage.get_file_contents()[0].content
print(f"{content[:50]}...")
```

    "<?xml version='1.0' encoding='ASCII'?>\n<html xmlns..."

Downloading multiple documents:

``` python
storage = DownloadStorage()
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-K", "GOOG", limit=2)
# all files are now deleted and only stored in memory

for path, content in storage.get_file_contents():
    print(f"Path: {path}\nContent [len={len(content)}]: {content[:30]}...\n")
```

    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-24-000022/full-submission.txt\n'
     'Content [len=13927595]: <SEC-DOCUMENT>0001652044-24-00...\n')
    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-23-000016/full-submission.txt\n'
     'Content [len=15264470]: <SEC-DOCUMENT>0001652044-23-00...\n')

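The download-read-delete pattern described in this section can be sketched with the standard library alone. The function `read_then_discard` and the `fake_download` callback below are hypothetical stand-ins written for illustration; they are not how `DownloadStorage` is implemented, only a minimal model of the same idea (temporary folder, glob filter, contents kept in memory).

``` python
import tempfile
from pathlib import Path
from typing import Callable, Dict


def read_then_discard(write_files: Callable[[Path], None], pattern: str = "**/*") -> Dict[str, str]:
    """Let `write_files` fill a temporary folder, read matching files, then delete it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_files(root)
        # The dict is fully built before the folder is removed on exit.
        return {
            p.relative_to(root).as_posix(): p.read_text()
            for p in sorted(root.glob(pattern))
            if p.is_file()
        }


def fake_download(root: Path) -> None:
    """Pretend to be a downloader by writing two small files."""
    folder = root / "AAPL" / "10-Q"
    folder.mkdir(parents=True)
    (folder / "primary-document.html").write_text("<html>filing</html>")
    (folder / "full-submission.txt").write_text("<SEC-DOCUMENT>")


contents = read_then_discard(fake_download, pattern="**/*.htm*")
print(contents)
```

Only the files matching the glob pattern survive in memory; everything written to the temporary folder is removed when the `with` block exits.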
# Contributing
Follow these steps to install the project locally for development:
1. Install the project with the command `pip install -e ".[dev]"`.
> **Note** We highly recommend using virtual environments for Python
> development. If you’d like to use virtual environments, follow these
> steps instead:
>
> - Create a virtual environment `python3 -m venv .venv`
> - Activate the virtual environment `source .venv/bin/activate`
> - Install the project with the command `pip install -e ".[dev]"`
Raw data
{
"_id": null,
"home_page": "https://github.com/Elijas/sec-downloader",
"name": "sec-downloader",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "nbdev jupyter notebook python",
"author": "Elijas",
"author_email": "4084885+Elijas@users.noreply.github.com",
"download_url": "https://files.pythonhosted.org/packages/f3/3e/3804ce8afeeb6155f415d0902d37154243a7955ebae1851a540162ea42b8/sec-downloader-0.11.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "Useful extensions for sec-edgar-downloader.",
"version": "0.11.1",
"project_urls": {
"Homepage": "https://github.com/Elijas/sec-downloader"
},
"split_keywords": [
"nbdev",
"jupyter",
"notebook",
"python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "a579ad90ebabbc3d4f8c2fed8b7900f661340089eaa6540265a56785970767cf",
"md5": "f10112e00b3c653c262ba5e189868cdd",
"sha256": "57b09dcc1286ef2e357da2f90b6baf2ecb959a64140fdb8e7fcfd3301be74bdb"
},
"downloads": -1,
"filename": "sec_downloader-0.11.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f10112e00b3c653c262ba5e189868cdd",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 11386,
"upload_time": "2024-03-27T09:27:39",
"upload_time_iso_8601": "2024-03-27T09:27:39.023004Z",
"url": "https://files.pythonhosted.org/packages/a5/79/ad90ebabbc3d4f8c2fed8b7900f661340089eaa6540265a56785970767cf/sec_downloader-0.11.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f33e3804ce8afeeb6155f415d0902d37154243a7955ebae1851a540162ea42b8",
"md5": "d0603360e14cb264499f86a74551ba07",
"sha256": "48ff5199b91d0f5393650e028bfefbb9f2f8e33665014cd75e8ed688339519c2"
},
"downloads": -1,
"filename": "sec-downloader-0.11.1.tar.gz",
"has_sig": false,
"md5_digest": "d0603360e14cb264499f86a74551ba07",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 13656,
"upload_time": "2024-03-27T09:27:40",
"upload_time_iso_8601": "2024-03-27T09:27:40.838141Z",
"url": "https://files.pythonhosted.org/packages/f3/3e/3804ce8afeeb6155f415d0902d37154243a7955ebae1851a540162ea42b8/sec-downloader-0.11.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-27 09:27:40",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Elijas",
"github_project": "sec-downloader",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "sec-downloader"
}