llama-index-readers-sec-filings


- Name: llama-index-readers-sec-filings
- Version: 0.1.5
- Summary: llama-index readers sec_filings integration
- Upload time: 2024-05-20 16:48:45
- Maintainer: Athe-kunal
- Author: Your Name
- Requires Python: <4.0,>=3.8.1
- License: MIT
- Keywords: 10-k, 10-q, sec filings, finance

# SEC DATA DOWNLOADER

```bash
pip install llama-index-readers-sec-filings
```

Please check out this repo, where I am building an SEC Question Answering Agent: [SEC-QA](https://github.com/Athe-kunal/SEC-QA-Agent)

This repository downloads the full text of SEC filings (10-K and 10-Q). Amended documents are not currently supported, but support will be added in the near future.

Install the required dependencies:

```bash
pip install -r requirements.txt
```

The SEC Downloader expects 5 attributes:

- `tickers`: A list of valid ticker symbols
- `amount`: The number of documents to download
- `filing_type`: The filing type, either 10-K or 10-Q
- `num_workers`: The number of workers for multithreading and multiprocessing. Multithreading is used at the ticker level and multiprocessing at the year level for a given ticker
- `include_amends`: Whether to include amendments
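
The five attributes above can be checked before any download starts. The following is a minimal sketch, not part of the package: `LoaderArgs` is a hypothetical helper, and the default values for `num_workers` and `include_amends` are assumptions chosen only for illustration.

```python
from dataclasses import asdict, dataclass
from typing import List

# The two filing types named in the attribute list above
VALID_FILING_TYPES = {"10-K", "10-Q"}


@dataclass
class LoaderArgs:
    """Hypothetical container for the five loader attributes (not part of the package)."""

    tickers: List[str]
    amount: int
    filing_type: str
    num_workers: int = 2  # assumed default, for illustration only
    include_amends: bool = False  # assumed default, for illustration only

    def __post_init__(self) -> None:
        # Reject obviously invalid values early.
        if not self.tickers or not all(isinstance(t, str) and t for t in self.tickers):
            raise ValueError("tickers must be a non-empty list of ticker symbols")
        if self.amount < 1:
            raise ValueError("amount must be at least 1")
        if self.filing_type not in VALID_FILING_TYPES:
            raise ValueError("filing_type must be '10-K' or '10-Q'")
        if self.num_workers < 1:
            raise ValueError("num_workers must be at least 1")


args = LoaderArgs(tickers=["TSLA", "AAPL"], amount=3, filing_type="10-Q", num_workers=4)
kwargs = asdict(args)  # plain dict keyed by the five attribute names
```

Assuming the loader accepts these names as keyword arguments, `kwargs` could then be passed on as `SECFilingsLoader(**kwargs)`.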

## Usage

```python
from llama_index.readers.sec_filings import SECFilingsLoader

loader = SECFilingsLoader(tickers=["TSLA"], amount=3, filing_type="10-K")
loader.load_data()
```

The data is downloaded into the following directories and sub-directories:

```yaml
- AAPL
  - 2018
    - 10-K.json
  - 2019
    - 10-K.json
  - 2020
    - 10-K.json
  - 2021
    - 10-K.json
    - 10-Q_12.json
  - 2022
    - 10-K.json
    - 10-Q_03.json
    - 10-Q_06.json
    - 10-Q_12.json
  - 2023
    - 10-Q_04.json
- GOOGL
  - 2018
    - 10-K.json
  - 2019
    - 10-K.json
  - 2020
    - 10-K.json
  - 2021
    - 10-K.json
    - 10-Q_09.json
  - 2022
    - 10-K.json
    - 10-Q_03.json
    - 10-Q_06.json
    - 10-Q_09.json
  - 2023
    - 10-Q_03.json
- TSLA
  - 2018
    - 10-K.json
  - 2019
    - 10-K.json
  - 2020
    - 10-K.json
  - 2021
    - 10-K.json
    - 10-KA.json
    - 10-Q_09.json
  - 2022
    - 10-K.json
    - 10-Q_03.json
    - 10-Q_06.json
    - 10-Q_09.json
  - 2023
    - 10-Q_03.json
```

Each ticker has its own folder. 10-K data is saved inside the folder for its year, and 10-Q data is saved in the folder for its year with the month in the file name: `10-Q_03.json` holds the 10-Q data for March. Amended documents are also stored in their respective year.
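
The naming scheme above can be turned back into structured fields. The sketch below is not part of the package: `parse_filing_path` and `index_filings` are hypothetical helpers, the root directory is whatever folder contains the ticker folders, and the pattern assumes that an amended 10-Q would follow the same `A` suffix seen in `10-KA.json`.

```python
import re
from pathlib import Path
from typing import List, Optional, Tuple

# File names seen in the layout: 10-K.json, 10-KA.json, 10-Q_03.json.
# "10-QA" is an assumption by analogy with "10-KA".
FILE_PATTERN = re.compile(r"^(10-[KQ]A?)(?:_(\d{2}))?\.json$")

Filing = Tuple[str, int, str, Optional[int]]


def parse_filing_path(path: Path) -> Optional[Filing]:
    """Return (ticker, year, filing_type, month) for a path like TSLA/2022/10-Q_03.json."""
    match = FILE_PATTERN.match(path.name)
    if match is None or not path.parent.name.isdigit():
        return None  # not a file that follows the naming scheme
    filing_type, month = match.group(1), match.group(2)
    ticker = path.parent.parent.name
    year = int(path.parent.name)
    return ticker, year, filing_type, int(month) if month else None


def index_filings(root: Path) -> List[Filing]:
    """Collect every recognised filing under root, sorted by ticker, year, type and month."""
    found = [parse_filing_path(p) for p in root.glob("*/*/*.json")]
    return sorted(
        (f for f in found if f is not None),
        key=lambda f: (f[0], f[1], f[2], f[3] or 0),
    )
```

For example, `parse_filing_path(Path("TSLA/2022/10-Q_03.json"))` yields the ticker `TSLA`, the year `2022`, the filing type `10-Q` and the month `3`, while a 10-K file has no month and gets `None` in that position.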

## EXAMPLES

This loader can be used with both LangChain and LlamaIndex.

### LlamaIndex

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

from llama_index.readers.sec_filings import SECFilingsLoader

# Download three 10-K filings for Tesla
loader = SECFilingsLoader(tickers=["TSLA"], amount=3, filing_type="10-K")
loader.load_data()

# Forward slashes work on every OS and avoid accidental backslash escapes
documents = SimpleDirectoryReader("data/TSLA/2022").load_data()
index = VectorStoreIndex.from_documents(documents)

# An index is queried through a query engine
query_engine = index.as_query_engine()
response = query_engine.query("What are the risk factors of Tesla for the year 2022?")
print(response)
```

### LangChain

```python
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.document_loaders import DirectoryLoader
from langchain.indexes import VectorstoreIndexCreator

from llama_index.readers.sec_filings import SECFilingsLoader

loader = SECFilingsLoader(tickers=["TSLA"], amount=3, filing_type="10-K")
loader.load_data()

# Forward slashes work on every OS and avoid accidental backslash escapes
dir_loader = DirectoryLoader("data/TSLA/2022")

index = VectorstoreIndexCreator().from_loaders([dir_loader])
retriever = index.vectorstore.as_retriever()
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(), chain_type="stuff", retriever=retriever
)

query = "What are the risk factors of Tesla for the year 2022?"
qa.run(query)
```

## REFERENCES

1. Unstructured SEC Filings API: [repo link](https://github.com/Unstructured-IO/pipeline-sec-filings/tree/main)
2. SEC Edgar Downloader: [repo link](https://github.com/jadchaar/sec-edgar-downloader)
