# datasette-atom

* Version: 0.9
* Home page: https://github.com/simonw/datasette-atom
* Summary: Datasette plugin that adds a .atom output format
* Upload time: 2023-03-14 03:51:32
* Author: Simon Willison
* License: Apache License, Version 2.0

[![PyPI](https://img.shields.io/pypi/v/datasette-atom.svg)](https://pypi.org/project/datasette-atom/)
[![Changelog](https://img.shields.io/github/v/release/simonw/datasette-atom?include_prereleases&label=changelog)](https://github.com/simonw/datasette-atom/releases)
[![Tests](https://github.com/simonw/datasette-atom/workflows/Test/badge.svg)](https://github.com/simonw/datasette-atom/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/datasette-atom/blob/main/LICENSE)

Datasette plugin that adds support for generating [Atom feeds](https://validator.w3.org/feed/docs/atom.html) with the results of a SQL query.

## Installation

Install this plugin in the same environment as Datasette to enable the `.atom` output extension.

    $ pip install datasette-atom

## Usage

To create an Atom feed you need to define a custom SQL query that returns a required set of columns:

* `atom_id` - a unique ID for each row. [This article](https://web.archive.org/web/20080211143232/http://diveintomark.org/archives/2004/05/28/howto-atom-id) has suggestions about ways to create these IDs.
* `atom_title` - a title for that row.
* `atom_updated` - an [RFC 3339](http://www.faqs.org/rfcs/rfc3339.html) timestamp representing the last time the entry was modified in a significant way. This can usually be the time that the row was created.
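
As an illustration of the `atom_id` and `atom_updated` values described above, here is a minimal Python sketch that builds a tag URI-style ID and an RFC 3339 timestamp outside of SQL. The helper names `make_atom_id` and `make_atom_updated` and the `example.com` domain are hypothetical; the plugin itself only requires that your query returns the columns.

```python
from datetime import datetime, timezone


def make_atom_id(domain: str, created: datetime, row_id: int) -> str:
    """Build a tag URI of the form tag:<domain>,<YYYY-MM-DD>:<row_id>."""
    return f"tag:{domain},{created.date().isoformat()}:{row_id}"


def make_atom_updated(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 timestamp."""
    if moment.tzinfo is None:
        # RFC 3339 requires an explicit offset, so reject naive datetimes.
        raise ValueError("RFC 3339 timestamps need an explicit UTC offset")
    return moment.isoformat(timespec="seconds")


# Hypothetical sample values, for illustration only.
created = datetime(2019, 10, 23, 21, 32, 12, tzinfo=timezone.utc)
print(make_atom_id("example.com", created, 1))
print(make_atom_updated(created))
```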

The following columns are optional:

* `atom_content` - content that should be shown in the feed. This will be treated as a regular string, so any embedded HTML tags will be escaped when they are displayed.
* `atom_content_html` - content that should be shown in the feed. This will be treated as an HTML string and sanitized using [Bleach](https://github.com/mozilla/bleach) to ensure it contains no malicious code before being returned as part of a `<content type="html">` Atom element. If both `atom_content` and `atom_content_html` are provided, `atom_content_html` is used.
* `atom_link` - a URL that should be used as the link that the feed entry points to.
* `atom_author_name` - the name of the author of the entry. If you provide this you can also provide `atom_author_uri` and `atom_author_email` with a URL and e-mail address for that author.
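
To make the escaping behavior of `atom_content` concrete, the following Python sketch shows what escaping does to a string containing HTML tags. It uses only the standard library's `html.escape` as a stand-in; it does not reproduce the plugin's own escaping or its Bleach configuration, and the sample string is made up.

```python
import html

# Hypothetical content value containing HTML tags.
sample = "<p>Open <b>daily</b></p>"

# After escaping, a feed reader displays the tags as literal characters
# instead of interpreting them as markup.
escaped = html.escape(sample)
print(escaped)
```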

A query that returns these columns can then be served as an Atom feed by adding the `.atom` extension to the query's URL.
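
Before publishing a feed it can help to confirm that a query produces the three required columns. The sketch below does this with Python's built-in `sqlite3` module against an in-memory database. The `missing_atom_columns` helper, the table layout, and the `example.com` domain are assumptions made for illustration and are not part of the plugin.

```python
import sqlite3

REQUIRED_COLUMNS = {"atom_id", "atom_title", "atom_updated"}


def missing_atom_columns(conn: sqlite3.Connection, sql: str) -> set:
    """Return the required Atom columns that the query does not produce."""
    cursor = conn.execute(sql)
    # cursor.description holds one tuple per result column; item 0 is the name.
    returned = {description[0] for description in cursor.description}
    return REQUIRED_COLUMNS - returned


# Hypothetical schema used only to make the example runnable.
conn = sqlite3.connect(":memory:")
conn.execute("create table museums (id integer primary key, name text, created text)")

complete = """
select
  'tag:example.com,' || substr(created, 0, 11) || ':' || id as atom_id,
  name as atom_title,
  created as atom_updated
from museums
"""
incomplete = "select name as atom_title from museums"

print(missing_atom_columns(conn, complete))
print(missing_atom_columns(conn, incomplete))
```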

## Example

Here is an example SQL query which generates an Atom feed for new entries on [www.niche-museums.com](https://www.niche-museums.com/):

```sql
select
  'tag:niche-museums.com,' || substr(created, 0, 11) || ':' || id as atom_id,
  name as atom_title,
  created as atom_updated,
  'https://www.niche-museums.com/browse/museums/' || id as atom_link,
  coalesce(
    '<img src="' || photo_url || '?w=800&amp;h=400&amp;fit=crop&amp;auto=compress">',
    ''
  ) || '<p>' || description || '</p>' as atom_content_html
from
  museums
order by
  created desc
limit
  15
```

You can try this query by [pasting it in here](https://www.niche-museums.com/browse) - then click the `.atom` link to see it as an Atom feed.
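
To see what the `atom_id` and `atom_link` expressions in that query evaluate to, here is a small Python sketch that runs them against an in-memory SQLite database. The table contents are invented for the example; only the two expressions are taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table museums (id integer primary key, name text, created text)")
# Hypothetical row, for illustration only.
conn.execute(
    "insert into museums (id, name, created) values (?, ?, ?)",
    (1, "Example Museum", "2019-10-23T21:32:12-07:00"),
)

row = conn.execute(
    """
    select
      'tag:niche-museums.com,' || substr(created, 0, 11) || ':' || id as atom_id,
      'https://www.niche-museums.com/browse/museums/' || id as atom_link
    from museums
    """
).fetchone()

atom_id, atom_link = row
# substr(created, 0, 11) keeps the first ten characters: the YYYY-MM-DD date.
print(atom_id)
print(atom_link)
```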

## Using a canned query

Datasette's [canned query mechanism](https://docs.datasette.io/en/stable/sql_queries.html#canned-queries) is a useful way to configure feeds. If a canned query definition has a `title`, that title will be used as the title of the Atom feed.

Here's an example, defined using a `metadata.yaml` file:

```yaml
databases:
  browse:
    queries:
      feed:
        title: Niche Museums
        sql: |-
          select
            'tag:niche-museums.com,' || substr(created, 0, 11) || ':' || id as atom_id,
            name as atom_title,
            created as atom_updated,
            'https://www.niche-museums.com/browse/museums/' || id as atom_link,
            coalesce(
              '<img src="' || photo_url || '?w=800&amp;h=400&amp;fit=crop&amp;auto=compress">',
              ''
            ) || '<p>' || description || '</p>' as atom_content_html
          from
            museums
          order by
            created desc
          limit
            15
```
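
A consumer of such a feed can read the feed title and entry titles with Python's standard library. The sketch below parses a hand-written Atom document that stands in for a real response; it is not actual plugin output, and the `summarize_feed` helper is hypothetical. With the configuration above, the feed would presumably be served at a path such as `/browse/feed.atom`, following Datasette's database and query name URL pattern.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hand-written stand-in for a feed response; not actual plugin output.
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Niche Museums</title>
  <entry>
    <id>tag:niche-museums.com,2019-10-23:1</id>
    <title>Example Museum</title>
    <updated>2019-10-23T21:32:12-07:00</updated>
  </entry>
</feed>
"""


def summarize_feed(xml_text: str) -> tuple:
    """Return the feed title and a list of entry titles."""
    root = ET.fromstring(xml_text)
    feed_title = root.findtext("atom:title", namespaces=ATOM_NS)
    entry_titles = [
        entry.findtext("atom:title", namespaces=ATOM_NS)
        for entry in root.findall("atom:entry", ATOM_NS)
    ]
    return feed_title, entry_titles


print(summarize_feed(sample_feed))
```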

## Disabling HTML filtering

The HTML allow-list used by Bleach for the `atom_content_html` column can be found in the `clean(html)` function at the bottom of [datasette_atom/__init__.py](https://github.com/simonw/datasette-atom/blob/main/datasette_atom/__init__.py).

You can disable Bleach entirely for Atom feeds generated using a canned query. You should only do this if you are certain that no user-provided HTML could be included in the `atom_content_html` value.

Here's how to do that in `metadata.json`:

```json
{
  "plugins": {
    "datasette-atom": {
      "allow_unsafe_html_in_canned_queries": true
    }
  }
}
```

Setting this to `true` will disable Bleach filtering for all canned queries across all databases.

You can disable Bleach filtering just for a specific list of canned queries like so:

```json
{
  "plugins": {
    "datasette-atom": {
      "allow_unsafe_html_in_canned_queries": {
        "museums": ["latest", "moderation"]
      }
    }
  }
}
```

This will disable Bleach just for the canned queries called `latest` and `moderation` in the `museums.db` database.
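
The two forms of the setting can be summarized in a short Python sketch. This is one reading of the behavior described above, not the plugin's actual implementation; the `unsafe_html_allowed` function is hypothetical.

```python
def unsafe_html_allowed(plugin_config: dict, database: str, query_name: str) -> bool:
    """Decide whether Bleach filtering is disabled for one canned query."""
    setting = plugin_config.get("allow_unsafe_html_in_canned_queries")
    if setting is True:
        # true disables filtering for every canned query in every database.
        return True
    if isinstance(setting, dict):
        # A dictionary maps database names to lists of canned query names.
        return query_name in setting.get(database, [])
    # Missing or unrecognized setting: keep filtering enabled.
    return False


per_query = {"allow_unsafe_html_in_canned_queries": {"museums": ["latest", "moderation"]}}
everything = {"allow_unsafe_html_in_canned_queries": True}

print(unsafe_html_allowed(per_query, "museums", "latest"))
print(unsafe_html_allowed(per_query, "museums", "feed"))
print(unsafe_html_allowed(everything, "browse", "feed"))
```

Defaulting to `False` when the setting is absent keeps the safe behavior unless it is explicitly switched off.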

            
