linkpreview


| Field | Value |
| --- | --- |
| Name | linkpreview |
| Version | 0.11.0 |
| Summary | Get link (URL) preview |
| Upload time | 2024-09-27 18:52:05 |
| Author | MeyT |
| License | MIT |
| Keywords | link preview web htmlparse schema.org opengraph twittercard url |
| Requirements | No requirements were recorded. |

# linkpreview

[![Build Status](https://github.com/meyt/linkpreview/actions/workflows/main.yaml/badge.svg)](https://github.com/meyt/linkpreview/actions)
[![Coverage Status](https://coveralls.io/repos/github/meyt/linkpreview/badge.svg?branch=master)](https://coveralls.io/github/meyt/linkpreview?branch=master)
[![pypi](https://img.shields.io/pypi/pyversions/linkpreview.svg)](https://pypi.python.org/pypi/linkpreview)

Get a link preview in Python.

Gathering data from:

1. [OpenGraph](https://ogp.me/) meta tags
2. [TwitterCard](https://developer.twitter.com/en/docs/tweets/optimize-with-cards/overview/abouts-cards) meta tags
3. [Microdata](<https://en.wikipedia.org/wiki/Microdata_(HTML)>) attributes
4. [JSON-LD](https://en.wikipedia.org/wiki/JSON-LD) blocks
5. Generic HTML tags (`h1`, `p`, `img`)
6. Readable parts of the URL
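
To make the first source concrete, here is a minimal sketch of reading OpenGraph properties from a page using only the standard library. It illustrates the idea and is not linkpreview's implementation; `OpenGraphParser` and `extract_opengraph` are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags into a dict (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        if prop.startswith("og:") and "content" in attr_map:
            # Keep the first occurrence of each property, without the "og:" prefix.
            self.properties.setdefault(prop[3:], attr_map["content"])


def extract_opengraph(html):
    """Return a dict of OpenGraph properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


sample = """
<html><head>
<meta property="og:title" content="Example title">
<meta property="og:description" content="Example description">
<meta property="og:image" content="/static/cover.png">
</head><body></body></html>
"""
og = extract_opengraph(sample)
print(og["title"])
```

A real extractor would presumably fall back to the other sources in the list when a property is missing; this sketch covers only the OpenGraph step.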

## Install

```
pip install linkpreview
```

## Usage

### Basic

```python
from linkpreview import link_preview

url = "http://localhost"
content = """
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width">
    <!-- ... --->
    <title>a title</title>
  </head>
  <body>
  <!-- ... --->
  </body>
</html>
"""
preview = link_preview(url, content)
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```
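
The `absolute_image` and `absolute_favicon` attributes presumably resolve relative paths against the page URL. Assuming that behaviour, the resolution itself can be reproduced with the standard library's `urljoin`. The paths below are made up for illustration.

```python
from urllib.parse import urljoin

# Hypothetical page URL and image references, for illustration only.
page_url = "http://localhost/articles/intro"

root_relative = urljoin(page_url, "/static/cover.png")
path_relative = urljoin(page_url, "cover.png")
already_absolute = urljoin(page_url, "https://cdn.example.com/cover.png")

print(root_relative)
print(path_relative)
print(already_absolute)
```

A root-relative path replaces the whole path of the page URL, a plain relative path replaces only its last segment, and an absolute URL is returned unchanged.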

### Fetch link content automatically

```python
from linkpreview import link_preview

preview = link_preview("http://github.com/")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```

### `lxml` as the parser

Highly recommended for better performance.

[Install](https://lxml.de/installation.html) `lxml` and use it like this:

```python
from linkpreview import link_preview

preview = link_preview("http://github.com/", parser="lxml")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```

### Advanced

```python
from linkpreview import Link, LinkPreview, LinkGrabber

url = "http://github.com"
grabber = LinkGrabber(
    initial_timeout=20,
    maxsize=1048576,
    receive_timeout=10,
    chunk_size=1024,
)
content, url = grabber.get_content(url)
link = Link(url, content)
preview = LinkPreview(link, parser="lxml")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```
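
The names `maxsize` and `chunk_size` suggest that the grabber reads the response in fixed-size pieces and stops at a size limit. The sketch below shows that general technique on an in-memory stream. It is an assumption about the intent, not the library's code: `read_limited` is a hypothetical helper, and this version truncates at the limit, whereas the real grabber may behave differently (for example by raising an error).

```python
import io


def read_limited(stream, maxsize, chunk_size):
    """Read from stream in chunk_size pieces, collecting at most maxsize bytes."""
    chunks = []
    total = 0
    while total < maxsize:
        # Never request more than the remaining budget.
        chunk = stream.read(min(chunk_size, maxsize - total))
        if not chunk:
            break  # end of stream
        chunks.append(chunk)
        total += len(chunk)
    return b"".join(chunks)


body = io.BytesIO(b"x" * 5000)  # stands in for a network response
data = read_limited(body, maxsize=2048, chunk_size=1024)
print(len(data))
```

Capping the download this way keeps memory use bounded when a link points at a very large file.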

Extend default headers:

```python
content, url = grabber.get_content(url, headers={'user-agent': 'Twitterbot'})
```

Ignore default headers:

```python
content, url = grabber.get_content(
  url,
  headers={'user-agent': 'Twitterbot', 'accept': '*/*'},
  replace_headers=True,
)
```

Use preset headers:

```python
content, url = grabber.get_content(url, headers='googlebot')
```

Available presets:
`firefox`,
`chrome`,
`googlebot`,
`twitterbot`,
`telegrambot`,
`imessagebot`

If you already have a parsed `BeautifulSoup` object:

```python
from bs4 import BeautifulSoup
from linkpreview import Link, LinkPreview

url = "http://example.com"
content = "<h1>Hello</h1>"
soup = BeautifulSoup(content, "html.parser")
link = Link(url, content)
preview = LinkPreview(link, soup=soup)
print("title:", preview.title)
```



            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "linkpreview",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "link preview web htmlparse schema.org opengraph twittercard url",
    "author": "MeyT",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/15/e0/7add03bd40f7f20dc5661e11e6e2137dc0a1062b01070699b420859de899/linkpreview-0.11.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Get link (URL) preview",
    "version": "0.11.0",
    "project_urls": null,
    "split_keywords": [
        "link",
        "preview",
        "web",
        "htmlparse",
        "schema.org",
        "opengraph",
        "twittercard",
        "url"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a14b04c4740668ee84b37a2cb7d5e38111a399407a7ac81bc1c3e7efe2950b94",
                "md5": "bd3128d1ac9d37f50d52fba5c0621847",
                "sha256": "9f4dbd9abf0cdff6a5c8ca0e4133509c02ecf531ed6ea8c9e31da7e1cc510e8e"
            },
            "downloads": -1,
            "filename": "linkpreview-0.11.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "bd3128d1ac9d37f50d52fba5c0621847",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 21654,
            "upload_time": "2024-09-27T18:52:04",
            "upload_time_iso_8601": "2024-09-27T18:52:04.197896Z",
            "url": "https://files.pythonhosted.org/packages/a1/4b/04c4740668ee84b37a2cb7d5e38111a399407a7ac81bc1c3e7efe2950b94/linkpreview-0.11.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "15e07add03bd40f7f20dc5661e11e6e2137dc0a1062b01070699b420859de899",
                "md5": "19f8dbac1eabf0d14bed400a42ff08d9",
                "sha256": "af30d3d1d86358d8fce9fa7bf9976f0a7ef0b213645072f58e916a87782ccbb5"
            },
            "downloads": -1,
            "filename": "linkpreview-0.11.0.tar.gz",
            "has_sig": false,
            "md5_digest": "19f8dbac1eabf0d14bed400a42ff08d9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 15277,
            "upload_time": "2024-09-27T18:52:05",
            "upload_time_iso_8601": "2024-09-27T18:52:05.481578Z",
            "url": "https://files.pythonhosted.org/packages/15/e0/7add03bd40f7f20dc5661e11e6e2137dc0a1062b01070699b420859de899/linkpreview-0.11.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-27 18:52:05",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "linkpreview"
}
        