# overcast-to-sqlite

Version 0.7.0 (uploaded 2024-08-09 22:41:44), requires Python >=3.11. Keywords: datasette, overcast, sqlite, podcasts, podcast, transcripts.

[![PyPI](https://img.shields.io/pypi/v/overcast-to-sqlite.svg)](https://pypi.org/project/overcast-to-sqlite/)
[![Lint](https://github.com/hbmartin/overcast-to-sqlite/actions/workflows/lint.yml/badge.svg)](https://github.com/hbmartin/overcast-to-sqlite/actions/workflows/lint.yml)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Code style: black](https://img.shields.io/badge/🐧️-black-000000.svg)](https://github.com/psf/black)
[![Checked with pytype](https://img.shields.io/badge/🦆-pytype-437f30.svg)](https://google.github.io/pytype/)
[![Versions](https://img.shields.io/pypi/pyversions/overcast-to-sqlite.svg)](https://pypi.python.org/pypi/overcast-to-sqlite)
[![discord](https://img.shields.io/discord/823971286308356157?logo=discord&label=&color=323338)](https://discord.gg/EE7Hx4Kbny)
[![twitter](https://img.shields.io/badge/@hmartin-00aced.svg?logo=twitter&logoColor=black)](https://twitter.com/hmartin)

Save listening history and feed/episode info from Overcast to a SQLite database. Try exploring your podcast listening habits with [Datasette](https://datasette.io/)!

- [How to install](#how-to-install)
- [Authentication](#authentication)
- [Fetching and saving updates](#fetching-and-saving-updates)
- [Extending and saving full feeds](#extending-and-saving-full-feeds)
- [Downloading transcripts](#downloading-transcripts)

## How to install

    $ pip install overcast-to-sqlite

Or to upgrade:

    $ pip install --upgrade overcast-to-sqlite

## Authentication

Run this command to log in to Overcast (note: neither your password nor your email is saved, only the auth cookie):

    $ overcast-to-sqlite auth

This will create a file called `auth.json` in your current directory containing the required value. To save the file at a different path or filename, use the `--auth=myauth.json` option.

If you do not wish to save this information you can manually download the "All data" file [from the Overcast account page](https://overcast.fm/account) and pass it into the save command as described below.

## Fetching and saving updates

The `save` command retrieves all Overcast info and stores playlists, podcast feeds, and episodes in their respective tables with a primary key `overcastId`. 

    $ overcast-to-sqlite save

By default, this saves to `overcast.db`. To use a different database file, pass its path as an argument:

    $ overcast-to-sqlite save someother.db

By default, if an `auth.json` file is present, the command uses the cookie from that file. You can point to a different location using `-a`:

    $ overcast-to-sqlite save -a /path/to/auth.json

Alternatively, you can skip authentication by passing in an OPML file you downloaded from Overcast:

    $ overcast-to-sqlite save --load /path/to/overcast.opml

By default, the save command will save any OPML file it downloads adjacent to the database file in `archive/overcast/`. You can disable this behavior with `--no-archive` or `-na`.
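To make "adjacent to the database file" concrete, here is a minimal Python sketch that computes where the archive folder would sit and lists what is in it. It assumes "adjacent" means the `archive/` folder shares the database's parent directory; the helper names `default_archive_dir` and `list_archived_files` are hypothetical and not part of the package.

```python
from pathlib import Path


def default_archive_dir(db_path: str, kind: str = "overcast") -> Path:
    """Return the archive directory expected next to the database file.

    Assumption: the archive/ folder shares the database's parent directory.
    This helper is illustrative and not part of overcast-to-sqlite.
    """
    return Path(db_path).parent / "archive" / kind


def list_archived_files(db_path: str, kind: str = "overcast") -> list[Path]:
    """List archived files, or return an empty list if nothing was archived."""
    archive_dir = default_archive_dir(db_path, kind)
    if not archive_dir.is_dir():
        return []
    return sorted(p for p in archive_dir.iterdir() if p.is_file())
```

Passing `kind="feeds"` would point the same helpers at `archive/feeds/`.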

For increased reporting verbosity, use the `-v` flag.
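Once `save` has run, the resulting database can be inspected with Python's standard `sqlite3` module. The sketch below does not assume any table names; it reads them from the database and reports which column is the primary key. The function name `describe_tables` is hypothetical.

```python
import sqlite3


def describe_tables(conn: sqlite3.Connection) -> dict[str, list[tuple[str, bool]]]:
    """Map each table name to (column name, is primary key) pairs."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        quoted = table.replace('"', '""')
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[table] = [(col[1], col[5] > 0) for col in info]
    return schema
```

Calling `describe_tables(sqlite3.connect("overcast.db"))` should show `overcastId` flagged as the primary key on the tables that `save` creates, if the description above holds for your version.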

## Extending and saving full feeds

The `extend` command downloads the XML files for all feeds you are subscribed to and extracts their tags and attributes. These are stored in separate tables, `feeds_extended` and `episodes_extended`, with primary keys `xmlUrl` and `enclosureUrl` respectively. (See points 4 and 5 below for more information.)

    $ overcast-to-sqlite extend

Like the save command, this will attempt to archive feeds to `archive/feeds/` by default. This can be disabled with `--no-archive` or `-na`.

It also supports the `-v` flag to print additional information.

There are a few caveats for this functionality:

1. The first invocation downloads and parses an XML file for each feed you are subscribed to. (Subsequent invocations only require this for new episodes loaded by `save`.) Because this command may take a long time to run if you have many feeds, it is recommended to use the `-v` flag to observe progress.
2. This will increase the size of your database by approximately 2 MB per feed, so may result in a large file if you subscribe to many feeds.
3. Certain feeds may not load due to e.g. authentication, rate limiting, or other issues. These will be logged to the console and the feed will be skipped. Likewise, an episode may appear in your episodes table but not in the extended information if it is no longer available.
4. The `_extended` tables use URLs as their primary key. This may potentially lead to unjoinable / orphaned episodes if the enclosure URL (i.e. URL of the audio file) has changed since Overcast stored it.
5. There is no guarantee of which columns will be present in these tables aside from URL, title, and description. This command attempts to capture and normalize all XML tags contained in the feed so it is likely that many columns will be created and only a few rows will have values for uncommon tags/attributes.
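Caveat 4 can be checked with a `LEFT JOIN` that lists episodes with no matching row in `episodes_extended`. This is a sketch under assumptions: the base table is assumed to be named `episodes` and to store the audio URL in a column called `enclosureUrl`; adjust `url_column` if your schema differs. The function name is hypothetical.

```python
import sqlite3


def find_orphaned_episodes(
    conn: sqlite3.Connection,
    url_column: str = "enclosureUrl",  # assumed column name in `episodes`
) -> list[tuple]:
    """Return (overcastId, url) for episodes missing from episodes_extended."""
    query = f"""
        SELECT e.overcastId, e."{url_column}"
        FROM episodes AS e
        LEFT JOIN episodes_extended AS x
            ON x.enclosureUrl = e."{url_column}"
        WHERE x.enclosureUrl IS NULL
        ORDER BY e.overcastId
    """
    return conn.execute(query).fetchall()
```

An empty result means every episode joined cleanly; any rows returned are candidates for a changed enclosure URL or a feed that was skipped.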

Any suggestions for improving on these caveats are welcome. Please [open an issue](https://github.com/hbmartin/overcast-to-sqlite/issues)!
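To see how sparse the extended tables are (caveat 5), one option is to count the non-NULL values in each column. The sketch below reads the column list from the database rather than assuming it; the function name `column_fill_counts` is hypothetical.

```python
import sqlite3


def column_fill_counts(
    conn: sqlite3.Connection, table: str = "episodes_extended"
) -> dict[str, int]:
    """Return the number of non-NULL values for every column in the table."""
    quoted_table = table.replace('"', '""')
    columns = [
        row[1] for row in conn.execute(f'PRAGMA table_info("{quoted_table}")')
    ]
    counts = {}
    for column in columns:
        quoted_col = column.replace('"', '""')
        # COUNT(column) skips NULLs; empty strings still count as values.
        (count,) = conn.execute(
            f'SELECT COUNT("{quoted_col}") FROM "{quoted_table}"'
        ).fetchone()
        counts[column] = count
    return counts
```

Columns with very low counts correspond to the uncommon tags and attributes described above.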

## Downloading transcripts

The `transcripts` command downloads episode transcripts where they are available.

The `save` and `extend` commands MUST be run prior to this.
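The required ordering can be scripted. The following Python sketch builds the three documented commands in order and runs them with `subprocess`, stopping at the first failure. It assumes the CLI is on your `PATH` and exits with a non-zero status when a step fails; the helper names are hypothetical.

```python
import subprocess


def pipeline_commands(verbose: bool = False) -> list[list[str]]:
    """Build the documented commands in the order they must run."""
    commands = [
        ["overcast-to-sqlite", "save"],
        ["overcast-to-sqlite", "extend"],
        ["overcast-to-sqlite", "transcripts"],
    ]
    if verbose:
        commands = [cmd + ["-v"] for cmd in commands]
    return commands


def run_pipeline(verbose: bool = False) -> None:
    """Run each step in order, stopping at the first failure."""
    for cmd in pipeline_commands(verbose):
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumption: the CLI signals failure this way).
        subprocess.run(cmd, check=True)
```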

Episodes with a "podcast:transcript:url" value will be downloaded from that URL and the download's location will then be stored in "transcriptDownloadPath". 

    $ overcast-to-sqlite transcripts

Like the previous commands, this saves transcripts to `archive/transcripts/<feed title>/<episode title>` by default.

A different path can be set with the `-p`/`--path` flag.

It also supports the `-v` flag to print additional information.

There is also a `-s` flag to only download transcripts for starred episodes.
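After a run, you may want to check which episodes advertise a transcript and which have been downloaded. This sketch assumes the `podcast:transcript:url` and `transcriptDownloadPath` columns live in the `episodes_extended` table alongside `title`; that placement is a guess, so pass a different `table` if your database differs. The function name is hypothetical.

```python
import sqlite3


def transcript_status(
    conn: sqlite3.Connection, table: str = "episodes_extended"
) -> list[tuple]:
    """Return (title, transcript URL, download path) for episodes with a transcript URL."""
    quoted_table = table.replace('"', '""')
    # The column name contains colons, so it must be double-quoted in SQL.
    query = f"""
        SELECT title, "podcast:transcript:url", transcriptDownloadPath
        FROM "{quoted_table}"
        WHERE "podcast:transcript:url" IS NOT NULL
        ORDER BY title
    """
    return conn.execute(query).fetchall()
```

Rows whose third value is `None` have a transcript URL but no recorded download yet.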

## See also

- [Datasette](https://datasette.io/)
- [Podcast Transcript Convert](https://github.com/hbmartin/podcast-transcript-convert/)
- [Overcast Parser](https://github.com/hbmartin/overcast_parser)
- [Podcast Archiver](https://github.com/janw/podcast-archiver)

## Development

Pull requests are very welcome! For major changes, please open an issue first to discuss what you would like to change.

### Setup

```bash
git clone git@github.com:hbmartin/overcast-to-sqlite.git
cd overcast-to-sqlite
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python -m overcast_to_sqlite.cli all -v
```

### Code Formatting

This project is linted with [ruff](https://docs.astral.sh/ruff/) and uses [Black](https://github.com/ambv/black) code formatting.

## Authors

* [Harold Martin](https://www.linkedin.com/in/harold-martin-98526971/) - harold.martin at gmail

            
