linkpreview


| Field | Value |
| --- | --- |
| Name | linkpreview |
| Version | 0.9.0 |
| home_page | None |
| Summary | Get link (URL) preview |
| upload_time | 2024-03-31 15:36:30 |
| maintainer | None |
| docs_url | None |
| author | MeyT |
| requires_python | None |
| license | MIT |
| keywords | link preview web htmlparse schema.org opengraph twittercard url |
| requirements | No requirements were recorded. |

# linkpreview

[![Build Status](https://github.com/meyt/linkpreview/actions/workflows/main.yaml/badge.svg)](https://github.com/meyt/linkpreview/actions)
[![Coverage Status](https://coveralls.io/repos/github/meyt/linkpreview/badge.svg?branch=master)](https://coveralls.io/github/meyt/linkpreview?branch=master)
[![pypi](https://img.shields.io/pypi/pyversions/linkpreview.svg)](https://pypi.python.org/pypi/linkpreview)

Get a link preview in Python.

Data is gathered from:

1. [OpenGraph](https://ogp.me/) meta tags
2. [TwitterCard](https://developer.twitter.com/en/docs/tweets/optimize-with-cards/overview/abouts-cards) meta tags
3. [Microdata](https://en.wikipedia.org/wiki/Microdata_(HTML)) meta tags
4. [JSON-LD](https://en.wikipedia.org/wiki/JSON-LD) meta tags
5. Generic HTML tags (`h1`, `p`, `img`)
6. URL readable parts
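
To make the first source concrete, here is a minimal standard-library sketch of reading OpenGraph `og:*` meta tags and falling back to the `<title>` element. It is not linkpreview's implementation; the names `OpenGraphSketch` and `sketch_title` are hypothetical and only illustrate the kind of lookup the list above describes.

```python
from html.parser import HTMLParser


class OpenGraphSketch(HTMLParser):
    """Collect og:* meta tags and the <title> text (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.og = {}
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            prop = attrs.get("property") or ""
            if prop.startswith("og:") and attrs.get("content"):
                # Keep the first value seen for each property.
                self.og.setdefault(prop[3:], attrs["content"])
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and self.title is None:
            self.title = data.strip()


def sketch_title(html):
    """Prefer og:title, then fall back to the <title> element."""
    parser = OpenGraphSketch()
    parser.feed(html)
    return parser.og.get("title") or parser.title


html = """
<html><head>
  <meta property="og:title" content="OG title">
  <title>a title</title>
</head><body></body></html>
"""
print(sketch_title(html))
```

The same pattern (try the richest source first, then fall back) extends to the other sources in the list.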

## Install

```
pip install linkpreview
```

## Usage

### Basic

```python
from linkpreview import link_preview

url = "http://localhost"
content = """
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width">
    <!-- ... -->
    <title>a title</title>
  </head>
  <body>
  <!-- ... -->
  </body>
</html>
"""
preview = link_preview(url, content)
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```
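
The `absolute_image` and `absolute_favicon` properties are printed alongside their plain counterparts. The sketch below shows one plausible way such a value can be derived using only the standard library. That linkpreview resolves relative paths against the page URL in this manner is an assumption, and `to_absolute` is a hypothetical helper.

```python
from urllib.parse import urljoin


def to_absolute(page_url, candidate):
    """Resolve a possibly relative image or favicon path against the page URL."""
    if not candidate:
        return None
    return urljoin(page_url, candidate)


# A root-relative path is resolved against the page's scheme and host.
print(to_absolute("http://localhost/blog/post", "/static/cover.png"))
# An already absolute URL is returned unchanged.
print(to_absolute("http://localhost/blog/post", "https://cdn.example.com/a.png"))
```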

### Automatically fetch link content

```python
from linkpreview import link_preview

preview = link_preview("http://github.com/")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```

### `lxml` as XML parser

Highly recommended for better performance.

[Install](https://lxml.de/installation.html) `lxml` and use it like this:

```python
from linkpreview import link_preview

preview = link_preview("http://github.com/", parser="lxml")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```
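
If `lxml` may or may not be installed, the parser name can be picked at runtime. This is a minimal sketch, assuming the `parser` argument also accepts `"html.parser"` (Beautiful Soup's name for Python's built-in parser; that linkpreview forwards this value is an assumption). `choose_parser` is a hypothetical helper, not part of the library.

```python
from importlib.util import find_spec


def choose_parser():
    """Return "lxml" when the package is importable, else "html.parser"."""
    return "lxml" if find_spec("lxml") is not None else "html.parser"


parser = choose_parser()
print("parser:", parser)
# The result could then be passed on, for example:
# preview = link_preview("http://github.com/", parser=parser)
```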

### Advanced

```python
from linkpreview import Link, LinkPreview, LinkGrabber

url = "http://github.com"
grabber = LinkGrabber(
    initial_timeout=20,
    maxsize=1048576,
    receive_timeout=10,
    chunk_size=1024,
)
content, url = grabber.get_content(url)
link = Link(url, content)
preview = LinkPreview(link, parser="lxml")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)
```
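
The `maxsize` and `chunk_size` arguments suggest that the body is read in chunks up to a size cap. The sketch below illustrates that idea on an in-memory stream. It is not the library's code: `read_capped` and `TooLarge` are hypothetical names, and whether linkpreview raises or truncates when the cap is exceeded is not stated here (this sketch raises).

```python
import io


class TooLarge(Exception):
    """Raised when the body exceeds the configured cap (hypothetical)."""


def read_capped(stream, maxsize=1048576, chunk_size=1024):
    """Read a binary stream chunk by chunk, failing once maxsize is exceeded."""
    received = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        received.extend(chunk)
        if len(received) > maxsize:
            raise TooLarge(f"body larger than {maxsize} bytes")
    return bytes(received)


body = read_capped(io.BytesIO(b"x" * 3000), maxsize=4096, chunk_size=1024)
print(len(body))
```

Reading in small chunks lets the caller stop early on oversized responses instead of downloading them in full.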


Extend default headers:
```python
content, url = grabber.get_content(url, headers={'user-agent': 'Twitterbot'})
```

Ignore default headers:
```python
content, url = grabber.get_content(
  url,
  headers={'user-agent': 'Twitterbot', 'accept': '*/*'},
  replace_headers=True,
)
```
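
The difference between extending and ignoring the default headers can be sketched with plain dictionaries. The defaults below are made up for illustration (the library's real default headers are not listed here), and `build_headers` is a hypothetical helper that mirrors the `headers` and `replace_headers` arguments shown above.

```python
# Hypothetical defaults; the library's real default headers are not listed here.
DEFAULT_HEADERS = {"user-agent": "Mozilla/5.0", "accept": "text/html"}


def build_headers(headers=None, replace_headers=False):
    """Merge custom headers over the defaults, or use the custom ones alone."""
    custom = {k.lower(): v for k, v in (headers or {}).items()}
    if replace_headers:
        return custom
    return {**DEFAULT_HEADERS, **custom}


# Extend: the custom user-agent wins, the default accept header is kept.
print(build_headers({"user-agent": "Twitterbot"}))
# Replace: only the custom headers are sent.
print(build_headers({"user-agent": "Twitterbot", "accept": "*/*"}, replace_headers=True))
```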

Use preset headers:
```python
content, url = grabber.get_content(url, headers='googlebot')
```

Available presets:
`firefox`,
`chrome`,
`googlebot`,
`twitterbot`,
`telegrambot`,
`imessagebot`




            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "linkpreview",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "link preview web htmlparse schema.org opengraph twittercard url",
    "author": "MeyT",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/04/67/cfb73c2a320345859c8fd883a5bcab35ea22810fac4e9342c278463cae3a/linkpreview-0.9.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Get link (URL) preview",
    "version": "0.9.0",
    "project_urls": null,
    "split_keywords": [
        "link",
        "preview",
        "web",
        "htmlparse",
        "schema.org",
        "opengraph",
        "twittercard",
        "url"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "67cce4c848b75017e59a627a247006e8bec02f88a7222a2ae932e3d31cc93171",
                "md5": "7335b5dc093d08c41ddec140b7a665de",
                "sha256": "06b7da1c6ebc25f4962f6e65b0e29885eefdaaa568f0fffbda485b293a89e7a1"
            },
            "downloads": -1,
            "filename": "linkpreview-0.9.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7335b5dc093d08c41ddec140b7a665de",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 20654,
            "upload_time": "2024-03-31T15:36:29",
            "upload_time_iso_8601": "2024-03-31T15:36:29.079542Z",
            "url": "https://files.pythonhosted.org/packages/67/cc/e4c848b75017e59a627a247006e8bec02f88a7222a2ae932e3d31cc93171/linkpreview-0.9.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0467cfb73c2a320345859c8fd883a5bcab35ea22810fac4e9342c278463cae3a",
                "md5": "2a8e70dbba28258608266861b830a90a",
                "sha256": "f5a1f178953501f17f606015e979245bba80e0d63ee4caa79ea625cbf52415ef"
            },
            "downloads": -1,
            "filename": "linkpreview-0.9.0.tar.gz",
            "has_sig": false,
            "md5_digest": "2a8e70dbba28258608266861b830a90a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 14610,
            "upload_time": "2024-03-31T15:36:30",
            "upload_time_iso_8601": "2024-03-31T15:36:30.508680Z",
            "url": "https://files.pythonhosted.org/packages/04/67/cfb73c2a320345859c8fd883a5bcab35ea22810fac4e9342c278463cae3a/linkpreview-0.9.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-31 15:36:30",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "linkpreview"
}
        