feed-to-sqlite

Name: feed-to-sqlite
Version: 0.6.1
Summary: Save an RSS or Atom feed to a SQLite database
Author: Chris Amico
Requires Python: >=3.9
Uploaded: 2025-08-29 01:40:29
[![Tests](https://github.com/eyeseast/feed-to-sqlite/workflows/Test/badge.svg)](https://github.com/eyeseast/feed-to-sqlite/actions?query=workflow%3ATest)
[![PyPI](https://img.shields.io/pypi/v/feed-to-sqlite.svg)](https://pypi.org/project/feed-to-sqlite/)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/eyeseast/feed-to-sqlite/blob/master/LICENSE)

# feed-to-sqlite

Download an RSS or Atom feed and save it to a SQLite database. This is meant to work well with [datasette](https://github.com/simonw/datasette).

## Installation

```sh
pip install feed-to-sqlite
```

## CLI Usage

Let's grab the Atom feeds for [items I've shared on NewsBlur](http://chrisamico.newsblur.com/social/rss/35501/chrisamico) and [my Instapaper favorites](https://www.instapaper.com/starred/rss/13475/qUh7yaOUGOSQeANThMyxXdYnho) and save each to its own table.

```sh
feed-to-sqlite feeds.db http://chrisamico.newsblur.com/social/rss/35501/chrisamico https://www.instapaper.com/starred/rss/13475/qUh7yaOUGOSQeANThMyxXdYnho
```

This will use a SQLite database called `feeds.db`, creating it if necessary. By default, each feed gets its own table, named based on a slugified version of the feed's title.

To load all items from multiple feeds into a common (or pre-existing) table, pass a `--table` argument:

```sh
feed-to-sqlite feeds.db --table links <url> <url>
```

That will put all items in a table called `links`.

Each feed also creates an entry in a `feeds` table containing top-level metadata for each feed. Each item will have a foreign key to the originating feed. This is especially useful if combining feeds into a shared table.
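That foreign key makes cross-feed queries straightforward once items share a table. Below is a minimal sketch using Python's built-in `sqlite3`; the column names (`feed` as the foreign key on the items table, `id` and `title` on `feeds`) are assumptions for illustration and may differ from the actual schema, so adjust them to match your database:

```python
import sqlite3


def items_per_feed(db_path: str, table: str = "links"):
    """Count items per feed in a feed-to-sqlite database.

    Assumes the items table has a foreign key column named "feed"
    pointing at feeds.id, and that feeds has a "title" column.
    These names are illustrative; check the real schema.
    """
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            f"""
            SELECT feeds.title, COUNT(*) AS n
            FROM {table} JOIN feeds ON {table}.feed = feeds.id
            GROUP BY feeds.title
            ORDER BY n DESC
            """
        ).fetchall()
    finally:
        conn.close()
```

The same join works directly in datasette's SQL interface, which is one reason a shared table plus a `feeds` metadata table is a convenient layout.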

## Python API

One function, `ingest_feed`, does most of the work here. The following will create a database called `feeds.db` and download my NewsBlur shared items into a new table called `links`.

```python
from feed_to_sqlite import ingest_feed

url = "http://chrisamico.newsblur.com/social/rss/35501/chrisamico"

ingest_feed("feeds.db", url=url, table_name="links")
```

### Transforming data on ingest

When working in Python directly, it's possible to pass in a function to transform rows before they're saved to the database.

The `normalize` argument to `ingest_feed` is a function that will be called on each feed item, useful for fixing links or doing additional work.

Its signature is `normalize(table, entry, feed_details, client)`:

- `table` is a SQLite table ([from sqlite-utils](https://sqlite-utils.datasette.io/en/stable/python-api.html#accessing-tables))
- `entry` is one feed item, as a dictionary
- `feed_details` is a dictionary of top-level feed information
- `client` is an instance of `httpx.Client`, which can be used for outgoing HTTP requests during normalization

That function should return a dictionary representing the row to be saved. Returning a falsey value for a given row will cause that row to be skipped.
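As a sketch, here is a `normalize` function that strips `utm_*` tracking parameters from each item's link and skips entries that have no link at all. The tracking-parameter logic is illustrative, not part of the library; only the callback signature and the skip-on-falsy behavior come from the documentation above:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize(table, entry, feed_details, client):
    """Clean each item's link before it is saved.

    Returning None (a falsy value) causes the row to be skipped.
    """
    link = entry.get("link")
    if not link:
        return None  # skip items without a link

    # Drop utm_* query parameters, keep everything else.
    parts = urlsplit(link)
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if not k.startswith("utm_")]
    entry["link"] = urlunsplit(parts._replace(query=urlencode(query)))
    return entry
```

Pass it along when ingesting: `ingest_feed("feeds.db", url=url, table_name="links", normalize=normalize)`.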

## Development

Tests use [pytest](https://docs.pytest.org/). Run `pytest tests/` to run the test suite.
