# Reddit2Text
`reddit2text` is a Python library that **transforms any Reddit thread into clean, readable text data**.
Perfect for *feeding to an LLM, performing textual/data analysis, or simply archiving for offline use*, `reddit2text` offers a straightforward interface to access and convert content from Reddit.
## Table of Contents
- [Features](#features)
- [Installation](#installation)
- [Quickstart](#quickstart)
  - [Example Code](#example)
  - [Example Output](#output)
- [Configs](#configs)
- [Contributions](#contributions)
- [License](#license)
<a id="features"></a>
## Features
- Convert any Reddit thread (the post + all its comments) into structured text.
- Include all comments, with the ability to specify the maximum comment depth.
- Configure a custom comment delimiter, for visual separation of nested comments.
> **Have a Feature Idea?**
>
> Simply ***open an issue on GitHub*** and tell us what should be added to the next release!
<a id="installation"></a>
## Installation
Install with pip:
```sh
pip3 install reddit2text
```
<a id="quickstart"></a>
## Quickstart
**First**, you need to create a Reddit app to get your **client_id** and **client_secret**. Follow the instructions on [Reddit's API documentation](https://www.reddit.com/wiki/api) to set up your application.
**Then**, replace the `client_id`, `client_secret`, and `user_agent` with your credentials.
The user agent can be anything you like, but we recommend following this convention according to Reddit's guidelines: `'<app type>:<app name>:<version> (by <your username>)'`
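
If you would rather not hardcode credentials, one option is to read them from environment variables. The sketch below is only an illustration: `load_reddit_credentials` and the variable names `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, and `REDDIT_USERNAME` are assumptions made for this example and are not part of `reddit2text`. It returns a dictionary whose keys match the three constructor arguments described above.

```python
import os


def load_reddit_credentials(env=None):
    """Collect Reddit2Text keyword arguments from environment variables.

    The variable names used here are hypothetical; pick whatever fits your setup.
    """
    env = os.environ if env is None else env
    required = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))

    username = env.get("REDDIT_USERNAME", "unknown")
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        # Follows '<app type>:<app name>:<version> (by <your username>)'
        "user_agent": f"script:my_app:v1.0 (by u/{username})",
    }


# Demonstration with an explicit mapping instead of the real environment.
example_env = {
    "REDDIT_CLIENT_ID": "123abc",
    "REDDIT_CLIENT_SECRET": "123abc",
    "REDDIT_USERNAME": "reddit2text",
}
print(load_reddit_credentials(example_env)["user_agent"])
```

The resulting dictionary could then be unpacked into the constructor, for example `Reddit2Text(**load_reddit_credentials())`.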
<a id="example"></a>
*Here's an example:*
```python
from reddit2text import Reddit2Text
r2t = Reddit2Text(
    # example credentials
    client_id='123abc',
    client_secret='123abc',
    user_agent='script:my_app:v1.0 (by u/reddit2text)'
)
# The URL must have the post ID after the /comments/ to work, e.g. `1buyr0g`
URL = 'https://www.reddit.com/r/MadeMeSmile/comments/1buyr0g/ryan_reynolds_being_wholesome/'
output = r2t.textualize_post(URL)
print(output)
```
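
Because the URL must contain the post ID after `/comments/`, it can help to check a URL before passing it to `textualize_post`. The helper below is a hypothetical sketch, not part of `reddit2text`, and its regular expression assumes that post IDs consist only of letters and digits.

```python
import re
from typing import Optional

# Captures the segment that follows /comments/ in a Reddit thread URL.
POST_ID_PATTERN = re.compile(r"/comments/([A-Za-z0-9]+)")


def extract_post_id(url: str) -> Optional[str]:
    """Return the post ID found after /comments/, or None if there is none."""
    match = POST_ID_PATTERN.search(url)
    return match.group(1) if match else None


url = 'https://www.reddit.com/r/MadeMeSmile/comments/1buyr0g/ryan_reynolds_being_wholesome/'
print(extract_post_id(url))  # 1buyr0g
print(extract_post_id('https://www.reddit.com/r/MadeMeSmile/'))  # None
```

A URL for which this returns `None` is missing the post ID and would likely need to be corrected before use.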
<a id="output"></a>
Here is a truncated example of the output from the code above:
https://pastebin.com/mmHFJtcc
<a id="configs"></a>
## Extra Configuration
- **max_comment_depth**: Maximum depth of comments to output, counting the top-level comment as the first level. Set it to `None` or `-1` to include all comments, which is the default behavior.
- **comment_delim**: String or character used to indent comments according to their nesting level. Defaults to `|` to mimic Reddit.
```python
r2t = Reddit2Text(
    # credentials ...
    max_comment_depth=3,  # limit every comment chain to 3 levels, counting the top-level comment
    comment_delim='#'     # each comment is preceded by this string, repeated once per nesting level
)
```
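
To make the two options more concrete, here is a small standalone sketch of how a depth limit and a repeated delimiter could interact. It does not use `reddit2text` and does not claim to reproduce its exact output format: `render_comments` and the nested-dictionary thread structure are assumptions made purely for illustration.

```python
def render_comments(comments, delim="|", max_depth=None, depth=1):
    """Render nested comments, prefixing each one with `delim` repeated `depth` times.

    A `max_depth` of None or -1 means no limit; the top-level comment is depth 1.
    """
    if max_depth is not None and max_depth != -1 and depth > max_depth:
        return []
    lines = []
    for comment in comments:
        lines.append(f"{delim * depth} {comment['body']}")
        lines.extend(
            render_comments(comment.get("replies", []), delim, max_depth, depth + 1)
        )
    return lines


# Hypothetical thread: one top-level comment with two levels of replies.
thread = [
    {"body": "top-level comment", "replies": [
        {"body": "first reply", "replies": [
            {"body": "nested reply", "replies": []},
        ]},
    ]},
]

print("\n".join(render_comments(thread, delim="#", max_depth=2)))
# # top-level comment
# ## first reply
```

With `max_depth=2`, the third-level reply is dropped, while `None` or `-1` would keep all three lines.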
<a id="contributions"></a>
## Contributions
Contributions to reddit2text are welcome. Please submit pull requests or issues to our GitHub repository.
<a id="license"></a>
## License
reddit2text is released under the MIT License. See the LICENSE file for more details.
Raw data
{
"_id": null,
"home_page": null,
"name": "reddit2text",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "python, reddit, text conversion, reddit api, praw, reddit to text, reddit comments, social media analysis",
"author": "Nicholas Hansen-Feruch",
"author_email": "nicholas.feruch@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/35/a3/fa7ade4567b39b6f507f9575e5acef6bde9a91852b3452717aeaeade2640/reddit2text-0.0.9.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Convert Reddit posts to text",
"version": "0.0.9",
"project_urls": null,
"split_keywords": [
"python",
" reddit",
" text conversion",
" reddit api",
" praw",
" reddit to text",
" reddit comments",
" social media analysis"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e4f5e9eb5d5b1f851aba31d91e0d07dd7e1e44001db69733d82fe419f2398d48",
"md5": "90bf93cee60fda12a0939fab3e94631d",
"sha256": "c950f7872a589f5382223b81813f361e7945a54d6c7d1d9cbf95491f67f4cb4a"
},
"downloads": -1,
"filename": "reddit2text-0.0.9-py3-none-any.whl",
"has_sig": false,
"md5_digest": "90bf93cee60fda12a0939fab3e94631d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 8551,
"upload_time": "2024-04-07T19:35:27",
"upload_time_iso_8601": "2024-04-07T19:35:27.336095Z",
"url": "https://files.pythonhosted.org/packages/e4/f5/e9eb5d5b1f851aba31d91e0d07dd7e1e44001db69733d82fe419f2398d48/reddit2text-0.0.9-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "35a3fa7ade4567b39b6f507f9575e5acef6bde9a91852b3452717aeaeade2640",
"md5": "ebfb342f5963fd1a2c14d2c80713e19a",
"sha256": "b2defa149e841a9a5142bc82b121d276b056eaa3f87e1e528a3337c9ded6b349"
},
"downloads": -1,
"filename": "reddit2text-0.0.9.tar.gz",
"has_sig": false,
"md5_digest": "ebfb342f5963fd1a2c14d2c80713e19a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 9126,
"upload_time": "2024-04-07T19:35:28",
"upload_time_iso_8601": "2024-04-07T19:35:28.375934Z",
"url": "https://files.pythonhosted.org/packages/35/a3/fa7ade4567b39b6f507f9575e5acef6bde9a91852b3452717aeaeade2640/reddit2text-0.0.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-07 19:35:28",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "reddit2text"
}