reddit2text


- **Name:** reddit2text
- **Version:** 0.0.9
- **Summary:** Convert Reddit posts to text
- **Upload time:** 2024-04-07 19:35:28
- **Author:** Nicholas Hansen-Feruch
- **Requires Python:** >=3.6
- **Keywords:** python, reddit, text conversion, reddit api, praw, reddit to text, reddit comments, social media analysis
            
# Reddit2Text

`reddit2text` is *the* Python library designed to effortlessly **transform any Reddit thread into clean, readable text data**.

Perfect for *feeding to an LLM, performing textual/data analysis, or simply archiving for offline use*, `reddit2text` offers a straightforward interface to access and convert content from Reddit.

## Table of Contents
- [Features](#features)
- [Installation](#installation)
- [Quickstart](#quickstart)
  - [Example Code](#example)
  - [Example Output](#output)
- [Configs](#configs)
- [Contributions](#contributions)
- [License](#license)

<a id="features"></a>

## Features
- Convert any Reddit thread (the post + all its comments) into structured text.
- Include all comments, with the ability to specify the maximum comment depth.
- Configure a custom comment delimiter to visually separate nested comments.

> **Have a Feature Idea?**
>
> Simply ***open an issue on GitHub*** and tell us what should be added to the next release!

<a id="installation"></a>

## Installation
Install with pip:
```sh
pip3 install reddit2text
```

<a id="quickstart"></a>

## Quickstart
**First**, you need to create a Reddit app to get your **client_id** and **client_secret**. Follow the instructions on [Reddit's API documentation](https://www.reddit.com/wiki/api) to set up your application.

**Then**, replace the `client_id`, `client_secret`, and `user_agent` with your credentials.
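
If you would rather not hardcode these values in source, one option is to read them from environment variables. The following is a minimal sketch, not part of `reddit2text`: the helper `load_reddit_credentials` and the variable names `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, and `REDDIT_USER_AGENT` are assumptions chosen for illustration.

```python
import os

# Hypothetical environment variable names; use whatever fits your setup.
ENV_KEYS = {
    "client_id": "REDDIT_CLIENT_ID",
    "client_secret": "REDDIT_CLIENT_SECRET",
    "user_agent": "REDDIT_USER_AGENT",
}


def load_reddit_credentials(env=os.environ):
    """Collect the three credential values from an environment mapping."""
    missing = [var for var in ENV_KEYS.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    # Keys match the keyword arguments named in this README.
    return {name: env[var] for name, var in ENV_KEYS.items()}
```

The returned dictionary uses the same three names as the constructor arguments, so it can be unpacked with `Reddit2Text(**load_reddit_credentials())`.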

The user agent can be anything you like, but we recommend following this convention according to Reddit's guidelines: `'<app type>:<app name>:<version> (by <your username>)'`
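
As a small illustration of that convention, the sketch below assembles the string from its four parts. `build_user_agent` is a hypothetical helper written for this README, not a function provided by `reddit2text`.

```python
def build_user_agent(app_type, app_name, version, username):
    """Format '<app type>:<app name>:<version> (by <your username>)'."""
    return f"{app_type}:{app_name}:{version} (by {username})"


print(build_user_agent("script", "my_app", "v1.0", "u/reddit2text"))
# prints: script:my_app:v1.0 (by u/reddit2text)
```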

<a id="example"></a>

*Here's an example:*
```python
from reddit2text import Reddit2Text

r2t = Reddit2Text(
    # example credentials
    client_id='123abc',
    client_secret='123abc',
    user_agent='script:my_app:v1.0 (by u/reddit2text)'
)

# The URL must have the post ID after the /comments/ to work, e.g. `1buyr0g`
URL = 'https://www.reddit.com/r/MadeMeSmile/comments/1buyr0g/ryan_reynolds_being_wholesome/'

output = r2t.textualize_post(URL)
print(output)
```
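
To check a URL before passing it in, you could confirm that it carries a post ID after `/comments/`. This is a rough sketch under the assumption that the ID is the alphanumeric path segment immediately following `/comments/`; `extract_post_id` is a hypothetical helper and does not reflect how `reddit2text` parses URLs internally.

```python
import re
from typing import Optional

# Assumed pattern: the ID is the path segment right after "/comments/".
_POST_ID = re.compile(r"/comments/([A-Za-z0-9]+)")


def extract_post_id(url: str) -> Optional[str]:
    """Return the segment following /comments/, or None if it is absent."""
    match = _POST_ID.search(url)
    return match.group(1) if match else None


print(extract_post_id("https://www.reddit.com/r/MadeMeSmile/comments/1buyr0g/ryan_reynolds_being_wholesome/"))
# prints: 1buyr0g
print(extract_post_id("https://www.reddit.com/r/MadeMeSmile/"))
# prints: None
```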

<a id="output"></a>

Here is a truncated example of the output produced by the code above:
https://pastebin.com/mmHFJtcc

<a id="configs"></a>

## Extra Configuration
- **max_comment_depth**: Maximum depth of comments to output. The top-most comment counts toward this depth. Set to `None` or `-1` to include all comments (the default).
- **comment_delim**: String (or single character) used to indent comments according to their nesting level. Defaults to `|` to mimic Reddit.

```python
r2t = Reddit2Text(
    # credentials ...
    max_comment_depth=3,  # limit each comment chain to 3 levels, counting the top-most comment
    comment_delim='#'  # each comment level will be preceded by multiples of this string
)
```
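
To make the two options more concrete, here is a standalone sketch of depth-limited rendering with a repeated delimiter. It is not the `reddit2text` implementation, and the library's real output format may differ. The nested-dictionary comment structure, the `render_comments` function, and the choice to treat the top-most comment as depth 1 are all assumptions made for illustration.

```python
def render_comments(comments, delim="|", max_depth=None, depth=1):
    """Render nested comments, prefixing each with `delim` repeated per level."""
    unlimited = max_depth is None or max_depth == -1
    if not unlimited and depth > max_depth:
        return []  # deeper replies are dropped
    lines = []
    for comment in comments:
        lines.append(f"{delim * depth} {comment['body']}")
        replies = comment.get("replies", [])
        lines.extend(render_comments(replies, delim, max_depth, depth + 1))
    return lines


# Hypothetical thread: one top-level comment with two levels of replies.
thread = [
    {"body": "top-level comment", "replies": [
        {"body": "first reply", "replies": [
            {"body": "nested reply"},
        ]},
    ]},
]

print("\n".join(render_comments(thread, delim="#", max_depth=2)))
# prints:
# # top-level comment
# ## first reply
```

With `max_depth=None` or `-1`, the same call returns all three comments.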

<a id="contributions"></a>

## Contributions
Contributions to reddit2text are welcome. Please submit pull requests or issues to our GitHub repository.

<a id="license"></a>

## License
reddit2text is released under the MIT License. See the LICENSE file for more details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "reddit2text",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "python, reddit, text conversion, reddit api, praw, reddit to text, reddit comments, social media analysis",
    "author": "Nicholas Hansen-Feruch",
    "author_email": "nicholas.feruch@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/35/a3/fa7ade4567b39b6f507f9575e5acef6bde9a91852b3452717aeaeade2640/reddit2text-0.0.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Convert Reddit posts to text",
    "version": "0.0.9",
    "project_urls": null,
    "split_keywords": [
        "python",
        " reddit",
        " text conversion",
        " reddit api",
        " praw",
        " reddit to text",
        " reddit comments",
        " social media analysis"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e4f5e9eb5d5b1f851aba31d91e0d07dd7e1e44001db69733d82fe419f2398d48",
                "md5": "90bf93cee60fda12a0939fab3e94631d",
                "sha256": "c950f7872a589f5382223b81813f361e7945a54d6c7d1d9cbf95491f67f4cb4a"
            },
            "downloads": -1,
            "filename": "reddit2text-0.0.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "90bf93cee60fda12a0939fab3e94631d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 8551,
            "upload_time": "2024-04-07T19:35:27",
            "upload_time_iso_8601": "2024-04-07T19:35:27.336095Z",
            "url": "https://files.pythonhosted.org/packages/e4/f5/e9eb5d5b1f851aba31d91e0d07dd7e1e44001db69733d82fe419f2398d48/reddit2text-0.0.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "35a3fa7ade4567b39b6f507f9575e5acef6bde9a91852b3452717aeaeade2640",
                "md5": "ebfb342f5963fd1a2c14d2c80713e19a",
                "sha256": "b2defa149e841a9a5142bc82b121d276b056eaa3f87e1e528a3337c9ded6b349"
            },
            "downloads": -1,
            "filename": "reddit2text-0.0.9.tar.gz",
            "has_sig": false,
            "md5_digest": "ebfb342f5963fd1a2c14d2c80713e19a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 9126,
            "upload_time": "2024-04-07T19:35:28",
            "upload_time_iso_8601": "2024-04-07T19:35:28.375934Z",
            "url": "https://files.pythonhosted.org/packages/35/a3/fa7ade4567b39b6f507f9575e5acef6bde9a91852b3452717aeaeade2640/reddit2text-0.0.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-07 19:35:28",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "reddit2text"
}
        