| | |
| --- | --- |
| Name | Pushl |
| Version | 0.4.0 |
| Summary | A conduit for pushing changes in a feed to the rest of the IndieWeb |
| Home page | https://plaidweb.site/ |
| Repository | https://github.com/PlaidWeb/Pushl |
| Author | fluffy |
| Maintainer | None |
| License | MIT |
| Requires Python | <4.0.0,>=3.10.0 |
| Upload time | 2025-01-01 18:33:28 |
| docs_url | None |
# Pushl
A simple tool that parses content feeds and sends out appropriate push notifications (WebSub, webmention, etc.) when they change.
See http://publ.beesbuzz.biz/blog/113-Some-thoughts-on-WebMention for the motivation.
## Features
* Supports any feed supported by [feedparser](https://github.com/kurtmckee/feedparser)
and [mf2py](https://github.com/microformats/mf2py) (RSS, Atom, HTML pages containing
`h-entry`, etc.)
* Will send WebSub notifications for feeds which declare a WebSub hub
* Will send WebMention notifications for entries discovered on those feeds or specified directly
* Can perform autodiscovery of additional feeds on entry pages
* Can do a full backfill on Atom feeds configured with [RFC 5005](https://tools.ietf.org/html/rfc5005)
* When configured to use a cache directory, can detect entry deletions and updates to implement the webmention update and delete protocols (as well as saving some time and bandwidth)
## Site setup
If you want to support WebSub, have your feed implement [the WebSub protocol](https://indieweb.org/WebSub). The short version is that you should have a `<link rel="hub" href="http://path/to/hub" />` in your feed's top-level element.
There are a number of WebSub hubs available; I use [Superfeedr](http://pubsubhubbub.superfeedr.com).
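As a minimal sketch (the feed and hub URLs are placeholders), an Atom feed declaring a hub might look like:
```xml
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example site</title>
  <!-- the feed's own URL -->
  <link rel="self" href="https://example.com/feed.xml" />
  <!-- the WebSub hub that Pushl will ping -->
  <link rel="hub" href="https://pubsubhubbub.superfeedr.com/" />
  <!-- entries follow -->
</feed>
```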
For [WebMentions](https://indieweb.org/Webmention), configure your site templates with the various microformats; by default, Pushl will use the following tags as the top-level entry container, in descending order of priority:
* Anything with a `class` of `h-entry`
* An `<article>` tag
* Anything with a `class` of `entry`
For more information on how to configure your site templates, see the [microformats h-entry specification](http://microformats.org/wiki/h-entry).
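For illustration, a minimal `h-entry` container might be marked up like this (all URLs, dates, and content are hypothetical):
```html
<article class="h-entry">
  <h1 class="p-name">An example post</h1>
  <!-- permalink for the entry -->
  <a class="u-url" href="https://example.com/2025/01/an-example-post">Permalink</a>
  <time class="dt-published" datetime="2025-01-01T12:00:00Z">January 1, 2025</time>
  <div class="e-content">
    <p>Post body with a <a href="https://other.example.com/some-page">link that can receive a webmention</a>.</p>
  </div>
</article>
```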
### mf2 feed notes
If you're using an mf2 feed (i.e. an HTML-formatted page with `h-entry` declarations), only entries with a `u-url` property will be used for sending webmentions; further, Pushl will retrieve the page from that URL to ensure it has the full content. (This is to work around certain setups where the `h-feed` only shows summary text.)
Also, there is technically no requirement for an HTML page to declare an `h-feed`; all entities marked up with `h-entry` will be consumed.
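A sketch of such an mf2 feed page (URLs hypothetical) might look like:
```html
<div class="h-feed">
  <article class="h-entry">
    <!-- has a u-url, so Pushl fetches the full entry from it and sends webmentions -->
    <a class="u-url p-name" href="https://example.com/entry-1">Entry one</a>
    <p class="p-summary">Summary text only; the full content lives at the permalink.</p>
  </article>
  <article class="h-entry">
    <!-- no u-url, so this entry is not used for sending webmentions -->
    <p class="p-name">Entry two</p>
  </article>
</div>
```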
## Installation
You can install it using `pip` with e.g.:
```bash
pip3 install pushl
```
However, I recommend installing it in a virtual environment with e.g.:
```bash
virtualenv3 $HOME/pushl
$HOME/pushl/bin/pip3 install pushl
```
and then placing a symlink to `$HOME/pushl/bin/pushl` in a directory on your `$PATH`, e.g.:
```bash
ln -s $HOME/pushl/bin/pushl $HOME/bin/pushl
```
## Usage
### Basic
```bash
pushl -c $HOME/var/pushl-cache http://example.com/feed.xml
```
While you can run it without the `-c` argument, its use is highly recommended: it makes subsequent runs less spammy and allows Pushl to detect changes and deletions.
### Sending pings from individual entries
If you just want to send webmentions from an entry page without processing an entire feed, the `-e/--entry` flag indicates that the following URLs are pages or entries, rather than feeds; e.g.
```bash
pushl -e http://example.com/some/page
```
will simply send the webmentions for that page.
### Additional feed discovery
The `-r/--recurse` flag will discover any additional feeds that are declared on entries and process them as well. This is useful if you have per-category feeds that you would also like to send WebSub notifications on. For example, [my site](http://beesbuzz.biz) has per-category feeds which are discoverable from individual entries, so `pushl -r http://beesbuzz.biz/feed` will send WebSub notifications for all of the categories which have recent changes.
Note that using `-r` and `-e` together will also cause any feed declared on the entry page to be processed. While it is tempting to use this in a feed autodiscovery context, e.g.
```bash
pushl -re http://example.com/blog/
```
this will also send webmentions from the blog page itself, which is probably *not* what you want to have happen.
### Backfilling old content
If your feed implements [RFC 5005](https://tools.ietf.org/html/rfc5005), the `-a` flag will scan past entries for WebMention as well. It is recommended to only use this flag when doing an initial backfill, as it can end up taking a long time on larger sites (and possibly make endpoint operators very grumpy at you). To send updates of much older entries it's better to just use `-e` to do it on a case-by-case basis.
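For example, a one-off initial backfill might look like this (the cache path and feed URL are placeholders):
```bash
pushl -a -c $HOME/var/pushl-cache https://example.com/feed.xml
```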
### Dual-protocol/multi-domain websites
If your website is reachable at multiple URLs (for example, both http and https, or multiple domain names), you generally only want WebMentions to be sent from the canonical URL. The best solution is to use `<link rel="canonical">` to declare which URL is the real one; Pushl will use that when sending the mentions. So, for example:
```bash
pushl -r https://example.com/feed http://example.com/feed http://alt-domain.example.com/feed
```
As long as both `http://example.com` and `http://alt-domain.example.com` declare the `https://example.com` version as canonical, only the webmentions from `https://example.com` will be sent.
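A sketch of that declaration, placed in the `<head>` of each entry page regardless of which hostname or scheme served it (the URL is a placeholder):
```html
<!-- served identically from http://example.com, https://example.com, and http://alt-domain.example.com -->
<link rel="canonical" href="https://example.com/some/entry">
```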
If, for some reason, you can't use `rel="canonical"` you can use the `-s/--websub-only` flag on Pushl to have it only send WebSub notifications for that feed; for example:
```bash
pushl -r https://example.com/feed -s https://other.example.com/feed
```
will send both Webmention and WebSub for `https://example.com` but only WebSub for `https://other.example.com`.
## Automated updates
`pushl` can be run from a cron job, although it's a good idea to use `flock -n` to prevent multiple instances from stomping on each other. An example cron job for updating a site might look like:
```crontab
*/5 * * * * flock -n $HOME/.pushl-lock pushl -rc $HOME/.pushl-cache http://example.com/feed
```
### My setup
In my setup, I have `pushl` installed in my website's pipenv:
```bash
cd $HOME/beesbuzz.biz
pipenv install pushl
```
and created this script as `$HOME/beesbuzz.biz/pushl.sh`:
```bash
#!/bin/bash
cd "$(dirname "$0")"
LOG=logs/pushl-$(date +%Y%m%d.log)

# redirect log output
if [ "$1" == "quiet" ] ; then
    exec >> "$LOG" 2>&1
else
    # duplicate stdout and stderr to the log and the terminal
    exec > >(tee -a "$LOG") 2>&1
fi
# add timestamp
date
# run pushl
flock -n $HOME/var/pushl/run.lock $HOME/.local/bin/pipenv run pushl -rvvkc $HOME/var/pushl \
https://beesbuzz.biz/feed\?push=1 \
http://publ.beesbuzz.biz/feed\?push=1 \
https://tumblr.beesbuzz.biz/rss \
https://novembeat.com/feed\?push=1 \
http://beesbuzz.biz/feed\?push=1 \
-s http://beesbuzz.biz/feed-summary https://beesbuzz.biz/feed-summary
# while we're at it, clean out the log and pushl cache directory
find logs $HOME/var/pushl -type f -mtime +30 -print -delete
```
Then I have a cron job:
```crontab
*/15 * * * * $HOME/beesbuzz.biz/pushl.sh quiet
```
which runs it every 15 minutes.
I also have a [git deployment hook](http://publ.beesbuzz.biz/441) for my website, and its final step (after restarting `gunicorn`) is to run `pushl.sh`, in case a maximum latency of 15 minutes just isn't fast enough.