xml-sitemap-writer


Name: xml-sitemap-writer
Version: 0.6.0
Home page: https://github.com/pigs-will-fly/py-xml-sitemap-writer
Summary: Python3 package for writing large XML sitemaps with no external dependencies
Upload time: 2024-11-22 22:46:34
Author: Maciej Brencz
Requires Python: >=3.9
License: MIT
Requirements: No requirements were recorded.

# py-xml-sitemap-writer
[![PyPI](https://img.shields.io/pypi/v/xml-sitemap-writer.svg)](https://pypi.python.org/pypi/xml-sitemap-writer)
[![Downloads](https://pepy.tech/badge/xml-sitemap-writer)](https://pepy.tech/project/xml-sitemap-writer)
[![Coverage Status](https://coveralls.io/repos/github/pigs-will-fly/py-xml-sitemap-writer/badge.svg?branch=master)](https://coveralls.io/github/pigs-will-fly/py-xml-sitemap-writer?branch=master)

Python3 package for writing large [XML sitemaps](https://www.sitemaps.org/index.html) with no external dependencies.

```
pip install xml-sitemap-writer
```

## Usage

This package is meant to **generate sitemaps with hundreds of thousands of URLs** in a **memory-efficient way** by
making use of **iterators to populate the sitemap** with URLs.

```python
from typing import Iterator
from xml_sitemap_writer import XMLSitemap

def get_products_for_sitemap() -> Iterator[str]:
    """
    Replace the logic below with a query from your database.
    """
    for idx in range(1, 1000001):
        yield f"/product/{idx}.html"  # URLs should be relative to what you provide as "root_url" below

with XMLSitemap(path='/your/web/root', root_url='https://your.site.io') as sitemap:
    sitemap.add_section('products')
    sitemap.add_urls(get_products_for_sitemap())
```
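
The generator above is a placeholder for a database query. As a minimal sketch of what that might look like, the block below streams rows from SQLite one at a time instead of loading them all into memory. The `products` table, its `id` column, and the `iter_product_urls` name are assumptions made for illustration; they are not part of `xml-sitemap-writer`.

```python
import sqlite3
from typing import Iterator


def iter_product_urls(conn: sqlite3.Connection) -> Iterator[str]:
    """Lazily yield relative product URLs, one database row at a time."""
    # Hypothetical schema: a "products" table with an integer "id" column.
    cursor = conn.execute("SELECT id FROM products ORDER BY id")
    for (product_id,) in cursor:
        yield f"/product/{product_id}.html"


# A small in-memory database stands in for a real one here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO products (id) VALUES (?)", [(1,), (2,), (3,)])

urls = list(iter_product_urls(conn))
print(urls)
```

An iterator like this could be passed to `sitemap.add_urls()` in place of `get_products_for_sitemap()` in the `with XMLSitemap(...)` block above, which otherwise stays the same.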

Running this code generates a `sitemap.xml` index file and numbered, gzipped sub-sitemap files such as `sitemap-products-001.xml.gz`. The index looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
	<!-- Powered by https://github.com/pigs-will-fly/py-xml-sitemap-writer -->
	<!-- 100000 urls -->
	<sitemap><loc>https://your.site.io/sitemap-products-001.xml.gz</loc></sitemap>
	<sitemap><loc>https://your.site.io/sitemap-products-002.xml.gz</loc></sitemap>
    ...
</sitemapindex>
```

Each gzipped sub-sitemap holds up to 15,000 URLs:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
	<url><loc>https://your.site.io/product/1.html</loc></url>
	<url><loc>https://your.site.io/product/2.html</loc></url>
	<url><loc>https://your.site.io/product/3.html</loc></url>
    ...
</urlset>
<!-- 15000 urls in the sitemap -->
```
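
To check a generated sub-sitemap, something like the following sketch may help. It uses only the standard library to count the `<url>` entries in a gzipped file. `count_urls` is a hypothetical helper, not part of this package, and the sample file written here merely imitates the output shown above.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def count_urls(path: str) -> int:
    """Count <url> entries in a gzipped sitemap by streaming through it."""
    count = 0
    with gzip.open(path, "rb") as handle:
        for _event, element in ET.iterparse(handle, events=("end",)):
            if element.tag == f"{SITEMAP_NS}url":
                count += 1
                element.clear()  # drop the children of entries already counted
    return count


# Hand-written sample that imitates a sub-sitemap (assumed content, for the demo only).
sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    b"\t<url><loc>https://your.site.io/product/1.html</loc></url>\n"
    b"\t<url><loc>https://your.site.io/product/2.html</loc></url>\n"
    b"\t<url><loc>https://your.site.io/product/3.html</loc></url>\n"
    b"</urlset>\n"
    b"<!-- 3 urls in the sitemap -->\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sitemap-products-001.xml.gz")
    with gzip.open(path, "wb") as handle:
        handle.write(sample)
    total = count_urls(path)

print(total)
```

Streaming with `iterparse` keeps memory use low even for a full sub-sitemap, which matches the package's focus on large sitemaps.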

To make your sitemap easier to discover, add its URL to your `/robots.txt` file:

```
Sitemap: https://your.site.io/sitemap.xml
```
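
If `robots.txt` is maintained by a deployment script, the line could be added automatically. The sketch below is one possible approach and is not something this package provides: `ensure_sitemap_line` is a hypothetical helper that appends the `Sitemap:` line only when it is missing, so running it repeatedly is harmless.

```python
import tempfile
from pathlib import Path


def ensure_sitemap_line(robots_path: Path, sitemap_url: str) -> bool:
    """Append a "Sitemap:" line to robots.txt unless it is already present.

    Returns True when the file was modified.
    """
    line = f"Sitemap: {sitemap_url}"
    existing = robots_path.read_text(encoding="utf-8") if robots_path.exists() else ""
    if line in (entry.strip() for entry in existing.splitlines()):
        return False
    # Make sure the new line starts on its own row.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    robots_path.write_text(existing + separator + line + "\n", encoding="utf-8")
    return True


# Demo against a temporary file rather than a real web root.
with tempfile.TemporaryDirectory() as tmp:
    robots = Path(tmp) / "robots.txt"
    robots.write_text("User-agent: *\nAllow: /\n", encoding="utf-8")
    first = ensure_sitemap_line(robots, "https://your.site.io/sitemap.xml")
    second = ensure_sitemap_line(robots, "https://your.site.io/sitemap.xml")
    content = robots.read_text(encoding="utf-8")

print(first, second)
```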

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/pigs-will-fly/py-xml-sitemap-writer",
    "name": "xml-sitemap-writer",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": "Maciej Brencz",
    "author_email": "maciej.brencz@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/bc/34/f218ecba596a0adc4ff45e970480fb47469b3d4f7862892272df71fc7dc3/xml_sitemap_writer-0.6.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Python3 package for writing large XML sitemaps with no external dependencies",
    "version": "0.6.0",
    "project_urls": {
        "Homepage": "https://github.com/pigs-will-fly/py-xml-sitemap-writer"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b0664e81bd82bf3b6d626bdf5f0a189bb289bf505c50d5c00fd3f77ab7c6b4bc",
                "md5": "b90396a100a8d38bf40fd5b6009b43e9",
                "sha256": "89bf146d9e9f1473e4b8abdba8ffa5b57028435684e23b26cecf3865c90f2b12"
            },
            "downloads": -1,
            "filename": "xml_sitemap_writer-0.6.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b90396a100a8d38bf40fd5b6009b43e9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 5920,
            "upload_time": "2024-11-22T22:46:33",
            "upload_time_iso_8601": "2024-11-22T22:46:33.002546Z",
            "url": "https://files.pythonhosted.org/packages/b0/66/4e81bd82bf3b6d626bdf5f0a189bb289bf505c50d5c00fd3f77ab7c6b4bc/xml_sitemap_writer-0.6.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bc34f218ecba596a0adc4ff45e970480fb47469b3d4f7862892272df71fc7dc3",
                "md5": "d6168f97a06cfa7c9233722e2a426ff5",
                "sha256": "b196b7743996c83702d66f081edfa470cbe09079b46d623d6ecc9e2b3517a940"
            },
            "downloads": -1,
            "filename": "xml_sitemap_writer-0.6.0.tar.gz",
            "has_sig": false,
            "md5_digest": "d6168f97a06cfa7c9233722e2a426ff5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 5752,
            "upload_time": "2024-11-22T22:46:34",
            "upload_time_iso_8601": "2024-11-22T22:46:34.611754Z",
            "url": "https://files.pythonhosted.org/packages/bc/34/f218ecba596a0adc4ff45e970480fb47469b3d4f7862892272df71fc7dc3/xml_sitemap_writer-0.6.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-22 22:46:34",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "pigs-will-fly",
    "github_project": "py-xml-sitemap-writer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "xml-sitemap-writer"
}
        