spip2md


- Name: spip2md
- Version: 0.1.1
- Home page: https://git.irsamc.ups-tlse.fr/LCPQ/spip2md
- Summary: Generate a static website with plain Markdown+YAML files from a SPIP CMS database
- Upload time: 2023-06-23 14:49:41
- Author: Guilhem Fauré
- Requires Python: >=3.9,<4.0
- License: GPL-2.0
- Keywords: markdown, static website, spip, converter, exporter

---
lang: en
---

# SPIP Database to Markdown

`spip2md` is a little Python app that exports a SPIP database into a plain-text
Markdown + YAML repository, usable with static site generators.

## Features

`spip2md` can currently:

- Export every section (`spip_rubriques`), with every article (`spip_articles`) it
  contains
  - Replace author IDs (`spip_auteurs`) with the authors’ names (in the YAML block)
  - Generate a separate file for each language found in `<multi>` blocks
  - Copy over all the attached files (`spip_documents`), with proper links
  - Convert the SPIP [markup language](https://www.spip.net/fr_article1578.html) to Markdown
  - Convert SPIP ID-based internal links (like `<art123>`) into normal, path-based links
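
As an illustration of the last two kinds of conversion, here is a minimal sketch that uses only the standard library. It is not `spip2md`’s actual code: the function names, the `ARTICLE_PATHS` lookup table and the example path are hypothetical, and the `<multi>[fr]…[en]…</multi>` layout is assumed from SPIP’s usual multilingual block syntax.

```python
import re

# Hypothetical lookup table: SPIP article ID -> path of the exported file.
ARTICLE_PATHS = {123: "research/my-article/index.en.md"}

MULTI_BLOCK = re.compile(r"<multi>(.*?)</multi>", re.DOTALL)
LANG_TAG = re.compile(r"\[([a-z_]{2,})\]")
ART_LINK = re.compile(r"<art(\d+)>")


def pick_language(text: str, lang: str) -> str:
    """Keep only the `lang` variant of every <multi> block in `text`."""

    def replace(match: re.Match) -> str:
        # Splitting on a pattern with one capture group yields
        # [prefix, code1, text1, code2, text2, ...].
        parts = LANG_TAG.split(match.group(1))
        variants = dict(zip(parts[1::2], parts[2::2]))
        # Fall back to the untagged prefix when the language is missing.
        return variants.get(lang, parts[0]).strip()

    return MULTI_BLOCK.sub(replace, text)


def link_articles(text: str) -> str:
    """Replace <artN> references with Markdown links to known paths."""

    def replace(match: re.Match) -> str:
        path = ARTICLE_PATHS.get(int(match.group(1)))
        # Leave unknown IDs untouched rather than guessing a target.
        return f"[{path}]({path})" if path else match.group(0)

    return ART_LINK.sub(replace, text)
```

Calling `pick_language(text, "en")` once per exported language, followed by `link_articles`, would produce one body per language with path-based links.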

## Dependencies

`spip2md` requires Python 3.9 or later.

`spip2md` uses three Python libraries (as defined in `pyproject.toml`):

- Peewee, with a driver for your database:
  - pymysql (MySQL/MariaDB)
- PyYaml
- python-slugify (unidecode variant preferred)
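
To check an environment against these requirements before running a traditional install, a small script along the following lines can help. This is only a sketch: `check_environment` is a hypothetical helper, not part of `spip2md`, and the module names are the usual import names of the listed packages (PyYAML is imported as `yaml`, python-slugify as `slugify`).

```python
import sys
from importlib.util import find_spec

# Import names of the libraries listed above.
REQUIRED_MODULES = ("peewee", "pymysql", "yaml", "slugify")


def check_environment(min_version=(3, 9)):
    """Return (python_ok, missing_modules) for the current interpreter."""
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level module cannot be located.
    missing = [name for name in REQUIRED_MODULES if find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing modules:", ", ".join(missing) or "none")
```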

## Installation

### Simple `pip` method

Install the package with `pip install spip2md` (or `python -m pip install spip2md`
if the `pip` command is not on your `$PATH`).

Assuming your `$PATH` contains your `pip` install directory, you can now run
`spip2md` as a normal command.

### Traditional method

Clone this Git repository with `git clone` and `cd` into the created directory.

Either make sure the dependencies are installed system-wide, or create a
Python virtual environment and install them inside it.

You can then run `spip2md` as a Python module with `python -m spip2md`.
If you did not `cd` into this repository’s directory, replace `spip2md` with the
path to the `spip2md` directory.

## Configuration and Usage

Make sure you have access to the SPIP database you want to export on a
MySQL/MariaDB server. By default, `spip2md` expects a database named `spip` hosted on
`localhost`, with a user named `spip` whose password is `password`; you can
change these and other settings in the YAML configuration file.
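
The relationship between these defaults and user-supplied values can be sketched as follows. The `DbSettings` class and `apply_overrides` function are hypothetical names used for illustration only; they mirror the documented setting names but are not `spip2md`’s internal API, and the host and password in the example are made up.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class DbSettings:
    """Connection defaults as documented in this README."""

    db: str = "spip"
    db_host: str = "localhost"
    db_user: str = "spip"
    db_pass: str = "password"


def apply_overrides(defaults: DbSettings, overrides: dict) -> DbSettings:
    """Return a copy of `defaults` with the known keys replaced."""
    known_names = {field.name for field in fields(defaults)}
    # Ignore keys that are not connection settings (e.g. output_dir).
    known = {key: value for key, value in overrides.items() if key in known_names}
    return replace(defaults, **known)


# Example: values that might come from a parsed spip2md.yml file.
settings = apply_overrides(
    DbSettings(),
    {"db_host": "db.example.org", "db_pass": "s3cret", "output_dir": "out/"},
)
```

Anything not overridden keeps its default, so a config file only needs to list the settings that differ.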

If you want to copy over attached files such as images, you also need access to
the data directory of your SPIP website, usually named `IMG`. Either rename it to
`data` in your current working directory, or set the `data_dir` setting to its path.

### YAML configuration file

To configure `spip2md`, place a file named `spip2md.yml` in one of the standard \*nix
configuration locations, pass its path as a command-line argument, or run the
program with a `spip2md.yml` file in your working directory.
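
A lookup of this kind could be sketched as below. The exact locations and their precedence are assumptions, since this README does not spell them out: the sketch tries an explicit path first, then the working directory, then `$XDG_CONFIG_HOME` (falling back to `~/.config`). The function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional

CONFIG_NAME = "spip2md.yml"


def candidate_paths(cli_path: Optional[str] = None) -> list[Path]:
    """List places to look for the config file, in an assumed order."""
    candidates = []
    if cli_path:
        # An explicitly given path takes priority over everything else.
        candidates.append(Path(cli_path))
    candidates.append(Path.cwd() / CONFIG_NAME)
    xdg_home = os.environ.get("XDG_CONFIG_HOME")
    config_home = Path(xdg_home) if xdg_home else Path.home() / ".config"
    candidates.append(config_home / CONFIG_NAME)
    return candidates


def find_config(cli_path: Optional[str] = None) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None."""
    return next((p for p in candidate_paths(cli_path) if p.is_file()), None)
```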

Here are the *default configuration options*, with comments explaining their meaning:

```yaml
# Data source settings
db: spip # Name of the database
db_host: localhost # Host of the database
db_user: spip # The database user
db_pass: password # The database password
data_dir: data # The directory in which SPIP images & files are stored

# Data destination settings
export_languages: ["en"] # Array of languages to export, as two-letter language codes
# If set, directories will be created only for this language, according to this
# language’s titles. Other languages will be written along with correct url: attribute
storage_language: null
output_dir: output/ # The directory in which files will be written

# Destination directories names settings
# Prepend ID to directory slug, preventing collisions
# If false, a counter will be appended in case of name collision
prepend_id: false
# Prepend lang of the object to directory slug, preventing collisions between langs
prepend_lang: false
title_max_length: 42 # Maximum length (chars) of a single filename

# Text body processing settings
remove_html: true # Should we clean remaining HTML blocks
metadata_markup: false # Should we keep markup (Markdown) in metadata fields, like title
unknown_char_replacement: ?? # String to replace broken encoding that cannot be repaired
prepend_h1: false # Add title of articles as Markdown h1, looks better on certain themes
# Array of objects with 2 or 3 values, allowing to move some fields into others.
# {src: moved_field_name, dest: destination_field_name, repr: "how to merge them"}
# repr is formatted with "{}" being the moved field, and "_" the destination one
# For example, to append a field "subtitle" to a field "title":
#   - src: subtitle
#     dest: title
#     repr: "{} _" # (this is the default repr)
move_fields: []
# Some taxonomies (Spip Mots types) to not export, typically specific to Spip functions
ignore_taxonomies: ["Gestion du site", "Gestion des articles", "Mise en page"]
rename_taxonomies: { equipes: "tag-equipes" } # Rename taxonomies (prevents conflicts)

# Ignored data settings
export_drafts: true # Should we export drafts
export_empty: true # Should we export empty articles
ignore_patterns: [] # List of regexes: matching sections or articles will be ignored

# Settings you probably don’t want to modify
clear_log: true # Clear logfile between runs instead of appending to it
clear_output: true # Clear output dir between runs instead of merging into it

logfile: log-spip2md.log # Name of the logs file
loglevel: WARNING # Refer to Python’s loglevels

export_filetype: md # Filetype of exported text files
```
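
To make the directory naming settings more concrete, here is a rough sketch of how `title_max_length`, `prepend_id` and the collision counter could interact. It is not the project’s implementation: `spip2md` relies on python-slugify, whereas this stand-in `slugify` uses only the standard library, and the `-2`, `-3` counter format and the `directory_name` helper are assumptions.

```python
import re
import unicodedata


def slugify(title: str, max_length: int = 42) -> str:
    """Return a lowercase ASCII slug, truncated to max_length characters."""
    # Decompose accented characters, then drop the non-ASCII remainder.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_title = decomposed.encode("ascii", "ignore").decode("ascii")
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return slug[:max_length].rstrip("-")


def directory_name(title, obj_id, taken, prepend_id=False, max_length=42):
    """Pick a directory name and record it in the `taken` set."""
    slug = slugify(title, max_length)
    if prepend_id:
        # IDs are unique, so no collision check is needed here.
        name = f"{obj_id}-{slug}"
    else:
        name, counter = slug, 1
        while name in taken:
            # Append a counter until the name is free (assumed format).
            counter += 1
            name = f"{slug}-{counter}"
    taken.add(name)
    return name
```

With `prepend_id: false`, two sections sharing a title would get distinct names thanks to the counter; with `prepend_id: true`, the ID alone keeps them apart.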

## External links

- SPIP [Database structure](https://www.spip.net/fr_article713.html)

## TODO

These tables seem to contain less useful information,
but this needs to be investigated:

- `spip_evenements`
- `spip_meta`
- `spip_mots`
- `spip_syndic_articles`
- `spip_mots_liens`
- `spip_zones_liens`
- `spip_groupes_mots`
- `spip_meslettres`
- `spip_messages`
- `spip_syndic`
- `spip_zones`

These tables seem technical and SPIP-specific:

- `spip_depots`
- `spip_depots_plugins`
- `spip_jobs`
- `spip_ortho_cache`
- `spip_paquets`
- `spip_plugins`
- `spip_referers`
- `spip_referers_articles`
- `spip_types_documents`
- `spip_versions`
- `spip_versions_fragments`
- `spip_visites`
- `spip_visites_articles`

These tables are empty:

- `spip_breves`
- `spip_evenements_participants`
- `spip_forum`
- `spip_jobs_liens`
- `spip_ortho_dico`
- `spip_petitions`
- `spip_resultats`
- `spip_signatures`
- `spip_test`
- `spip_urls`

The program spip2md is provided under:

     SPDX-License-Identifier: GPL-2.0-only

It is therefore distributed under the terms of the GNU General Public License, version 2 only,
as published at:

     https://www.gnu.org/licenses/old-licenses/gpl-2.0.html

            
