llama-index-readers-sec-filings


- Name: llama-index-readers-sec-filings
- Version: 0.1.3
- Summary: llama-index readers sec_filings integration
- Upload time: 2024-02-21 20:48:44
- Maintainer: Athe-kunal
- Author: Your Name
- Requires Python: >=3.8.1,<4.0
- License: MIT
- Keywords: 10-k, 10-q, sec filings, finance
- Requirements: No requirements were recorded.

# SEC DATA DOWNLOADER

Please check out the SEC Question Answering Agent repo that I am building: [SEC-QA](https://github.com/Athe-kunal/SEC-QA-Agent)

This repository downloads all the text from SEC documents (10-K and 10-Q). Currently, it does not support amended documents, but that will be added in the near future.

Install the required dependencies

```
pip install -r requirements.txt
```

The SEC Downloader expects 5 attributes:

- tickers: A list of valid ticker symbols
- amount: The number of documents to download
- filing_type: The filing type, either 10-K or 10-Q
- num_workers: The number of workers used for multi-threading and multi-processing. Multi-threading is applied at the ticker level and multi-processing at the year level for a given ticker
- include_amends: Whether to include amended filings
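
The ticker-level threading described for `num_workers` can be pictured with a small sketch. This is not the loader's actual implementation: `fetch_filing`, `fetch_ticker`, and `fetch_all` are hypothetical names, the download step is a stand-in that does no network I/O, and the year-level step is a plain loop here rather than a process pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def fetch_filing(ticker: str, year: int, filing_type: str) -> str:
    # Hypothetical stand-in for the real download; returns the target path only.
    return f"{ticker}/{year}/{filing_type}.json"


def fetch_ticker(ticker: str, years: List[int], filing_type: str) -> List[str]:
    # The README describes multi-processing at the year level;
    # this sketch loops over the years sequentially to stay short.
    return [fetch_filing(ticker, year, filing_type) for year in years]


def fetch_all(
    tickers: List[str],
    years: List[int],
    filing_type: str = "10-K",
    num_workers: int = 2,
) -> Dict[str, List[str]]:
    if filing_type not in {"10-K", "10-Q"}:
        raise ValueError(f"unsupported filing_type: {filing_type!r}")
    # Each ticker is handled by a worker thread, with at most num_workers at once.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = pool.map(lambda t: fetch_ticker(t, years, filing_type), tickers)
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(tickers, results))
```

Calling `fetch_all(["AAPL", "TSLA"], [2021, 2022])` returns one list of paths per ticker, in the order the tickers were given.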

## Usage

```python
from llama_index import download_loader

SECFilingsLoader = download_loader("SECFilingsLoader")

loader = SECFilingsLoader(tickers=["TSLA"], amount=3, filing_type="10-K")
loader.load_data()
```

It downloads the data into the following directories and sub-directories:

```yaml
- AAPL
  - 2018
    - 10-K.json
  - 2019
    - 10-K.json
  - 2020
    - 10-K.json
  - 2021
    - 10-K.json
    - 10-Q_12.json
  - 2022
    - 10-K.json
    - 10-Q_03.json
    - 10-Q_06.json
    - 10-Q_12.json
  - 2023
    - 10-Q_04.json
- GOOGL
  - 2018
    - 10-K.json
  - 2019
    - 10-K.json
  - 2020
    - 10-K.json
  - 2021
    - 10-K.json
    - 10-Q_09.json
  - 2022
    - 10-K.json
    - 10-Q_03.json
    - 10-Q_06.json
    - 10-Q_09.json
  - 2023
    - 10-Q_03.json
- TSLA
  - 2018
    - 10-K.json
  - 2019
    - 10-K.json
  - 2020
    - 10-K.json
  - 2021
    - 10-K.json
    - 10-KA.json
    - 10-Q_09.json
  - 2022
    - 10-K.json
    - 10-Q_03.json
    - 10-Q_06.json
    - 10-Q_09.json
  - 2023
    - 10-Q_03.json
```

Each ticker has its own folder, with one sub-folder per year. 10-K data is stored inside its respective year, and 10-Q data is stored in the respective year with the month appended to the file name: `10-Q_03.json` holds the March 10-Q data. Amended documents (for example `10-KA.json`) are also stored in their respective year.
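
The naming scheme above can be read back programmatically. The following is a minimal sketch, assuming the `<ticker>/<year>/<file>.json` layout shown in the listing; `parse_filing_name`, `iter_filings`, and the `Filing` tuple are hypothetical helpers, not part of the loader.

```python
import re
from pathlib import Path
from typing import Iterator, NamedTuple, Optional, Tuple

# Matches names such as 10-K.json, 10-KA.json and 10-Q_03.json.
_FILING_RE = re.compile(r"^(10-[KQ]A?)(?:_(\d{2}))?\.json$")


class Filing(NamedTuple):
    ticker: str
    year: int
    form: str
    month: Optional[int]
    path: Path


def parse_filing_name(name: str) -> Optional[Tuple[str, Optional[int]]]:
    """Split a file name into (form, month); month is None when absent."""
    match = _FILING_RE.match(name)
    if match is None:
        return None
    form, month = match.groups()
    return (form, int(month) if month else None)


def iter_filings(root: Path) -> Iterator[Filing]:
    """Yield one Filing per recognised JSON file under root/<ticker>/<year>/."""
    for path in sorted(root.glob("*/*/*.json")):
        ticker, year = path.parts[-3], path.parts[-2]
        parsed = parse_filing_name(path.name)
        # Skip anything that does not follow the documented layout.
        if parsed is None or not year.isdigit():
            continue
        yield Filing(ticker, int(year), parsed[0], parsed[1], path)
```

For example, `parse_filing_name("10-Q_03.json")` gives `("10-Q", 3)`, while `parse_filing_name("10-K.json")` gives `("10-K", None)`.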

## EXAMPLES

This loader can be used with both Langchain and LlamaIndex.

### LlamaIndex

```python
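from llama_index import SimpleDirectoryReader, VectorStoreIndex, download_loader

SECFilingsLoader = download_loader("SECFilingsLoader")

# Download three 10-K filings for Tesla.
loader = SECFilingsLoader(tickers=["TSLA"], amount=3, filing_type="10-K")
loader.load_data()

# Forward slashes avoid backslash escapes such as "\202" in the path string.
documents = SimpleDirectoryReader("data/TSLA/2022").load_data()
index = VectorStoreIndex.from_documents(documents)

# Queries go through a query engine built from the index.
query_engine = index.as_query_engine()
response = query_engine.query("What are the risk factors of Tesla for the year 2022?")
print(response)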
```
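
When a directory path is written with backslashes in a regular Python string, sequences such as `\202` are read as escape codes, so the string may not contain the text it appears to. One way to sidestep this is to build the path with `pathlib`. The helper below is a small sketch; `filing_dir` is a hypothetical name, and the `data` root is taken from the example path.

```python
from pathlib import Path


def filing_dir(root: str, ticker: str, year: int) -> Path:
    """Build <root>/<ticker>/<year> without hand-written separators."""
    return Path(root) / ticker / str(year)


# as_posix() renders the path with forward slashes on every platform.
print(filing_dir("data", "TSLA", 2022).as_posix())  # data/TSLA/2022
```

The resulting path, converted with `str()`, can be passed wherever the examples use a directory string.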

### Langchain

```python
from llama_index import download_loader
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.document_loaders import DirectoryLoader
from langchain.indexes import VectorstoreIndexCreator

SECFilingsLoader = download_loader("SECFilingsLoader")

loader = SECFilingsLoader(tickers=["TSLA"], amount=3, filing_type="10-K")
loader.load_data()

dir_loader = DirectoryLoader("data/TSLA/2022")

index = VectorstoreIndexCreator().from_loaders([dir_loader])
retriever = index.vectorstore.as_retriever()
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(), chain_type="stuff", retriever=retriever
)

query = "What are the risk factors of Tesla for the year 2022?"
qa.run(query)
```

## REFERENCES

1. Unstructured SEC Filings API: [repo link](https://github.com/Unstructured-IO/pipeline-sec-filings/tree/main)
2. SEC Edgar Downloader: [repo link](https://github.com/jadchaar/sec-edgar-downloader)

            

        