jats

- Name: jats
- Version: 0.1.0
- Summary: JATS: Convert JATS XML articles to Markdown with peer review extraction
- Upload time: 2025-10-19 17:04:56
- Author: A. Sina Booeshaghi
- Requires Python: >=3.10
- Keywords: jats, xml, markdown, converter, publishing, biorxiv, elife, peer-review
- Requirements: No requirements were recorded.

# jats - JATS XML Parser

A Python CLI tool for converting JATS (Journal Article Tag Suite) XML files to Markdown format, with support for extracting peer review comments and author responses.

## Overview

jats parses JATS XML files from scientific publishers (bioRxiv, eLife, etc.) and converts them to clean, readable Markdown. It's particularly useful for working with preprint manuscripts and their associated peer review materials.

### Key Features

- Convert JATS XML articles to Markdown
- Extract peer review comments and author responses from multi-article XML files
- Support for bioRxiv manifest files (optional metadata)
- Organize reviews and responses by revision round
- Simple CLI that writes to stdout or to a file

## Installation

### Prerequisites

- Python 3.10 or later
- A local copy of the jats source tree (the commands below install it in editable mode)

### Install with uv (recommended)

```bash
cd jats
uv pip install -e .
```

### Install with pip

```bash
cd jats
pip install -e .
```

## Usage

### Basic Conversion

Convert a JATS XML file to Markdown:

```bash
# Output to stdout
jats convert article.xml

# Output to file
jats convert article.xml -o article.md

# With bioRxiv manifest file (optional)
jats convert article.xml -m manifest.xml -o article.md
```
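When many files need converting, the same commands can be assembled from a script. The following is a minimal sketch that only builds the argument lists for the documented `convert`, `-m`, and `-o` options; the helper name `build_convert_command` and the output directory name are hypothetical, and nothing is executed.

```python
from pathlib import Path


def build_convert_command(xml_path, out_dir, manifest=None):
    """Return the argv list for one `jats convert` call (nothing is run)."""
    xml_path = Path(xml_path)
    # Name the Markdown file after the XML file, e.g. article.xml -> article.md
    out_path = Path(out_dir) / (xml_path.stem + ".md")
    cmd = ["jats", "convert", str(xml_path)]
    if manifest is not None:
        cmd += ["-m", str(manifest)]
    cmd += ["-o", str(out_path)]
    return cmd


# Hypothetical input files; replace with real paths.
commands = [build_convert_command(p, "markdown") for p in ["a.xml", "b.xml"]]
for cmd in commands:
    print(" ".join(cmd))
    # To actually run a command: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.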

### Extract Peer Reviews

Extract peer review comments and author responses from JATS XML files that include sub-articles (common in eLife and some bioRxiv articles):

```bash
# Extract reviews and responses to separate files
jats convert article.xml -o article.md -r output_base

# Creates:
# - output_base_reviews.md    (all review comments, organized by round)
# - output_base_responses.md  (all author responses, organized by round)
```

The `-r` flag extracts `<sub-article>` elements whose `article-type` attribute is one of the following:
- **Review comments**: decision-letter, referee-report, editor-report, reviewer-report
- **Author responses**: author-comment, reply

Reviews and responses are grouped by revision round, read from the JATS4R `peer-review-revision-round` metadata. Sub-articles without this metadata are assigned to round 1.
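The classification and grouping described here can be sketched with the standard library. This is an illustration, not the tool's actual implementation: the sample XML, the function names, and the assumption that the round is stored in a `<custom-meta>` name/value pair inside the sub-article are all mine.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

REVIEW_TYPES = {"decision-letter", "referee-report", "editor-report", "reviewer-report"}
RESPONSE_TYPES = {"author-comment", "reply"}

# Hypothetical, heavily trimmed sample document.
SAMPLE = """<article>
  <sub-article article-type="referee-report" id="sa1">
    <front-stub>
      <custom-meta-group>
        <custom-meta>
          <meta-name>peer-review-revision-round</meta-name>
          <meta-value>2</meta-value>
        </custom-meta>
      </custom-meta-group>
    </front-stub>
  </sub-article>
  <sub-article article-type="author-comment" id="sa2"/>
</article>"""


def revision_round(sub_article):
    """Read the revision round, falling back to 1 when it is absent."""
    for meta in sub_article.iter("custom-meta"):
        if meta.findtext("meta-name", "").strip() == "peer-review-revision-round":
            value = meta.findtext("meta-value", "").strip()
            if value.isdigit():
                return int(value)
    return 1


def group_sub_articles(root):
    """Split sub-article ids into reviews and responses, keyed by round."""
    reviews, responses = defaultdict(list), defaultdict(list)
    for sub in root.iter("sub-article"):
        kind = sub.get("article-type")
        if kind in REVIEW_TYPES:
            reviews[revision_round(sub)].append(sub.get("id"))
        elif kind in RESPONSE_TYPES:
            responses[revision_round(sub)].append(sub.get("id"))
    return dict(reviews), dict(responses)


reviews, responses = group_sub_articles(ET.fromstring(SAMPLE))
print(reviews)    # {2: ['sa1']}
print(responses)  # {1: ['sa2']}
```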

## Examples

### Convert bioRxiv Preprint

```bash
jats convert 2023.01.01.12345.xml -o paper.md
```

### Convert eLife Article with Peer Reviews

```bash
# Convert main article and extract reviews
jats convert elife-12345-v1.xml -o paper.md -r elife-12345-v1

# Output files:
# - paper.md                        (main article)
# - elife-12345-v1_reviews.md      (peer review comments)
# - elife-12345-v1_responses.md    (author responses)
```

### bioRxiv with Manifest

```bash
# manifest.xml provides additional metadata
jats convert article.xml -m manifest.xml -o article.md
```

## Input File Format

jats expects JATS XML files following the [JATS (Journal Article Tag Suite)](https://jats.nlm.nih.gov/) standard. This format is used by:

- **bioRxiv** and **medRxiv** preprint servers
- **eLife** journal
- **PubMed Central** (PMC)
- Many other scientific publishers

### JATS XML Structure

A typical JATS XML file contains:
- `<front>`: Article metadata (title, authors, abstract)
- `<body>`: Main article content organized in sections
- `<back>`: References, acknowledgments, etc.
- `<sub-article>`: Optional peer review materials (eLife, some bioRxiv)
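To see this layout in practice, the top-level elements can be listed with Python's standard `xml.etree.ElementTree` module. The sample document below is a hypothetical, heavily trimmed stand-in for a real article.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal document with the four parts listed above.
SAMPLE = """<article>
  <front><article-meta><title-group>
    <article-title>A sample title</article-title>
  </title-group></article-meta></front>
  <body><sec><title>Introduction</title><p>Text.</p></sec></body>
  <back><ref-list/></back>
  <sub-article article-type="referee-report"/>
</article>"""

root = ET.fromstring(SAMPLE)

# Direct children of <article>, in document order.
top_level = [child.tag for child in root]
print(top_level)  # ['front', 'body', 'back', 'sub-article']

# The article title lives under <front>.
title = root.findtext("front/article-meta/title-group/article-title")
print(title)  # A sample title
```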

### Manifest Files (bioRxiv)

bioRxiv articles may include an optional `manifest.xml` file that provides:
- Collection/category information
- Version history
- Links to published versions
- Peer review URLs

## Output Format

The generated Markdown contains:

- Article title as H1 heading
- Authors with affiliations
- Abstract
- Body sections with appropriate heading levels
- Inline figures with captions
- References (when available)
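One way the title and nested sections could map to heading levels is sketched below. It is an assumption about the approach, not the tool's own converter: the function names are hypothetical, the title becomes H1, top-level `<sec>` elements start at H2, and each nesting level adds one `#`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with one nested section.
SAMPLE = """<article>
  <front><article-meta><title-group>
    <article-title>A sample title</article-title>
  </title-group></article-meta></front>
  <body>
    <sec><title>Results</title><p>First paragraph.</p>
      <sec><title>Subsection</title><p>Nested paragraph.</p></sec>
    </sec>
  </body>
</article>"""


def render_sec(sec, level):
    """Render one <sec> and its nested sections as Markdown blocks."""
    lines = []
    title = sec.findtext("title")
    if title:
        lines.append("#" * level + " " + title.strip())
    for child in sec:
        if child.tag == "p":
            # itertext() also collects text inside inline markup.
            lines.append("".join(child.itertext()).strip())
        elif child.tag == "sec":
            lines.extend(render_sec(child, level + 1))
    return lines


def to_markdown(root):
    """Article title as H1, then body sections starting at H2."""
    title = root.findtext("front/article-meta/title-group/article-title", "")
    lines = ["# " + title.strip()]
    for sec in root.findall("body/sec"):
        lines.extend(render_sec(sec, 2))
    return "\n\n".join(lines)


markdown = to_markdown(ET.fromstring(SAMPLE))
print(markdown)
```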

### Peer Review Output

When using `-r`, peer review materials are extracted to separate Markdown files:

**Reviews file** (`*_reviews.md`):
```markdown
# Revision Round 1

## Reviewer 1

[Review content...]

---

## Reviewer 2

[Review content...]
```

**Responses file** (`*_responses.md`):
```markdown
# Revision Round 1

## Author Response

[Response content...]
```
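The reviews layout shown above can be produced from grouped review texts with a few lines of Python. This is a sketch under stated assumptions: the function name `render_reviews` and the input shape (a dict mapping round number to a list of review texts) are hypothetical, and reviewers are numbered by their position in the list.

```python
def render_reviews(rounds):
    """Render {round_number: [review_text, ...]} in the reviews-file layout."""
    parts = []
    for round_number in sorted(rounds):
        parts.append(f"# Revision Round {round_number}")
        blocks = [
            f"## Reviewer {index}\n\n{text}"
            for index, text in enumerate(rounds[round_number], start=1)
        ]
        # Reviews within a round are separated by a horizontal rule.
        parts.append("\n\n---\n\n".join(blocks))
    return "\n\n".join(parts) + "\n"


output = render_reviews({1: ["First review.", "Second review."]})
print(output)
```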

## Development

### Running Tests

```bash
# Install development dependencies
uv pip install -e ".[dev]"

# Run tests
pytest
```

### Project Structure

```
jats/
├── jats/
│   ├── __init__.py
│   ├── main.py         # CLI entry point
│   ├── parser.py       # JATS XML parsing
│   ├── converter.py    # Markdown conversion
│   └── models.py       # Data models
├── tests/
│   ├── test_*.py       # Test files
│   └── *.xml           # Test fixtures
├── pyproject.toml      # Package configuration
└── README.md
```

See [DEVELOPMENT.md](DEVELOPMENT.md) for detailed development documentation and the code style guide.

## JATS Resources

- [JATS Documentation](https://jats.nlm.nih.gov/)
- [JATS4R (JATS for Reuse)](https://jats4r.org/) - Recommendations for peer review tagging
- [bioRxiv JATS XML](https://www.biorxiv.org/about/FAQ#JATS)
- [eLife JATS XML](https://elifesciences.org/labs/c079f973/jats-xml-a-format-for-archiving-and-exchanging-scientific-content)

## License

MIT

## Support

For issues or questions, please open an issue on GitHub.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "jats",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "jats, xml, markdown, converter, publishing, biorxiv, elife, peer-review",
    "author": "A. Sina Booeshaghi",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/a9/0a/5f0eafa7483bc069cb1ff77e43c6c274f168d9f95a1792677dfec243012d/jats-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "JATS: Convert JATS XML articles to Markdown with peer review extraction",
    "version": "0.1.0",
    "project_urls": null,
    "split_keywords": [
        "jats",
        " xml",
        " markdown",
        " converter",
        " publishing",
        " biorxiv",
        " elife",
        " peer-review"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9e9d1f3225af1a39d0660c8b211744b1a1ff9d2a16b377ccb8d5c55ab185bf94",
                "md5": "311906fd8ad813ceff22c5a9891f7e6a",
                "sha256": "99f2d5b725f69d279c8ab3185c574f8d8e13e7dd5e39835d32f982833006e1a8"
            },
            "downloads": -1,
            "filename": "jats-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "311906fd8ad813ceff22c5a9891f7e6a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 14564,
            "upload_time": "2025-10-19T17:04:55",
            "upload_time_iso_8601": "2025-10-19T17:04:55.316467Z",
            "url": "https://files.pythonhosted.org/packages/9e/9d/1f3225af1a39d0660c8b211744b1a1ff9d2a16b377ccb8d5c55ab185bf94/jats-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a90a5f0eafa7483bc069cb1ff77e43c6c274f168d9f95a1792677dfec243012d",
                "md5": "2cc21af729c56ae80abc2bca5ba528cb",
                "sha256": "cea7a7d33152f0b8f5ecfb6cff2cf48b49850ea0735e7041552f4c8bab0b46b5"
            },
            "downloads": -1,
            "filename": "jats-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "2cc21af729c56ae80abc2bca5ba528cb",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 16898,
            "upload_time": "2025-10-19T17:04:56",
            "upload_time_iso_8601": "2025-10-19T17:04:56.301857Z",
            "url": "https://files.pythonhosted.org/packages/a9/0a/5f0eafa7483bc069cb1ff77e43c6c274f168d9f95a1792677dfec243012d/jats-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-19 17:04:56",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "jats"
}
        