# github-docs-scraper
A simple CLI tool that scrapes a GitHub repository (public or, optionally, private) and combines all the Markdown files it finds into a single file.
That file can then be uploaded to ChatGPT, DeepSeek, Qwen, or a similar assistant.
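The combining step can be pictured with a minimal sketch. It assumes the file contents have already been fetched into a mapping of path to text; the function name `combine_markdown`, the source-comment separator, and the sort order are illustrative assumptions, not the tool's actual implementation.

```python
def combine_markdown(files: dict[str, str]) -> str:
    """Join Markdown files into one document, one section per file.

    Hypothetical helper: `files` maps a repository path to its text content.
    """
    sections = []
    for path in sorted(files):
        # Skip anything that is not a Markdown file.
        if not path.lower().endswith((".md", ".markdown")):
            continue
        body = files[path].strip()
        # Label each section with its source path so the origin stays traceable.
        sections.append(f"<!-- source: {path} -->\n\n{body}\n")
    return "\n".join(sections)


# Example input: two Markdown files and one file that should be ignored.
example = {
    "docs/guide.md": "# Guide\n\nHow to use it.\n",
    "README.md": "# Project\n",
    "src/main.py": "print('hi')\n",
}
combined = combine_markdown(example)
print(combined)
```

Tagging each section with its path is one way to keep the combined file navigable once it has been uploaded; the real tool may use a different separator or ordering.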
## Usage
Create a `.env.local` file with the following variables:
- `REPO_OWNER`: The owner of the GitHub repository.
- `REPO_NAME`: The name of the GitHub repository.
- `GITHUB_TOKEN`: A GitHub personal access token with read access to the repository.
For instance:
```
REPO_OWNER=your_org_name
REPO_NAME=your_repo_name
GITHUB_TOKEN=your_github_token
```
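As a rough illustration of how these three variables might be consumed, the sketch below parses `KEY=VALUE` lines and builds request headers for the GitHub REST API. The helpers `parse_env` and `github_headers` are hypothetical names; the tool itself may load `.env.local` differently (for example through a dotenv library). The `Accept` and `Authorization: Bearer` headers follow GitHub's documented REST API conventions.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def github_headers(token: str = "") -> dict[str, str]:
    """Build GitHub REST API headers; the token is only sent when provided."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


# Stand-in for the contents of a .env.local file (placeholder values).
sample = """
REPO_OWNER=your_org_name
REPO_NAME=your_repo_name
GITHUB_TOKEN=your_github_token
"""
config = parse_env(sample)
headers = github_headers(config.get("GITHUB_TOKEN", ""))
```

Leaving the `Authorization` header out when no token is set keeps the same code path usable for public repositories, which is an assumption about how the optional private-repository support could work.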
## Installation
Requires Python 3.13 or later and `uv`. Install the dependencies, then run the tool:
```bash
uv sync
uv run github-docs-scraper
```
## Raw data
{
"_id": null,
"home_page": null,
"name": "github-docs-scraper",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.13",
"maintainer_email": null,
"keywords": "github, markdown, scraper",
"author": null,
"author_email": "Jacobus Geluk <jacobus.geluk@ekgf.org>",
"download_url": "https://files.pythonhosted.org/packages/ce/52/d72fd549d7af0c66e4f715be279a07616536d04a0a49ee4c0ce4e6f20acb/github_docs_scraper-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2025 Object Management Group Copyright (c) 2025 Jacobus Geluk Permission is hereby granted...",
"summary": "CLI tool to scrape GitHub repositories and combine Markdown files",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/your_org/github-docs-scraper"
},
"split_keywords": [
"github",
" markdown",
" scraper"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "6b677c9cfdd6aa640b826608c07686a1825247e0e4efb2e6b3f08e33ea66fd8e",
"md5": "4f0a2d67249aa83b528cd8c6d52616b0",
"sha256": "22471fa0cac829aed738e458f9b5084d207bd1000c298c638aefc4310cb5d4bf"
},
"downloads": -1,
"filename": "github_docs_scraper-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4f0a2d67249aa83b528cd8c6d52616b0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.13",
"size": 2128,
"upload_time": "2025-02-05T01:27:08",
"upload_time_iso_8601": "2025-02-05T01:27:08.655770Z",
"url": "https://files.pythonhosted.org/packages/6b/67/7c9cfdd6aa640b826608c07686a1825247e0e4efb2e6b3f08e33ea66fd8e/github_docs_scraper-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "ce52d72fd549d7af0c66e4f715be279a07616536d04a0a49ee4c0ce4e6f20acb",
"md5": "d16e40721d05ffa492e643dc1ebb9e06",
"sha256": "aca05762c998f689e183e069c00c1096384f2f2e51e4c3d0f14e17d331a8a1c1"
},
"downloads": -1,
"filename": "github_docs_scraper-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "d16e40721d05ffa492e643dc1ebb9e06",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.13",
"size": 3978,
"upload_time": "2025-02-05T01:27:10",
"upload_time_iso_8601": "2025-02-05T01:27:10.539011Z",
"url": "https://files.pythonhosted.org/packages/ce/52/d72fd549d7af0c66e4f715be279a07616536d04a0a49ee4c0ce4e6f20acb/github_docs_scraper-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-05 01:27:10",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "your_org",
"github_project": "github-docs-scraper",
"github_not_found": true,
"lcname": "github-docs-scraper"
}