# Tree-Sitter Wikitext Parser
This repository contains the implementation of a **Tree-Sitter** parser for **Wikitext**, a markup language used by MediaWiki.
Try the parser in the [playground](https://tree-sitter-wikitext.toolforge.org/).
## Overview
Tree-Sitter is a powerful parser generator tool and incremental parsing library. It is designed to build concrete syntax trees for source files and efficiently update them as the source changes. This project leverages Tree-Sitter to parse Wikitext, enabling structured analysis and manipulation of MediaWiki content.
## Features
- **Incremental Parsing**: Efficiently updates syntax trees as the source changes.
- **Language Agnostic**: Can be embedded in applications written in C, Python, Go, Rust, Node.js, and Swift.
- **Robust Parsing**: Handles syntax errors gracefully to provide useful results.
- **Custom Grammar**: Implements a grammar tailored for Wikitext.
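Incremental parsing depends on telling Tree-Sitter which byte range changed. In the Rust binding that description is a `tree_sitter::InputEdit`, passed to `Tree::edit` before re-parsing with the old tree. The sketch below uses only the standard library to show one way of deriving those offsets by comparing the old and new text. `Position`, `EditSpan`, `position_at`, and `compute_edit` are hypothetical names for illustration, not part of this repository; `EditSpan` simply mirrors the field names of `InputEdit`.

```rust
/// Zero-based row and byte column; a stand-in for `tree_sitter::Point`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Position {
    pub row: usize,
    pub column: usize,
}

/// Hypothetical struct mirroring the fields of `tree_sitter::InputEdit`.
#[derive(Debug, PartialEq, Eq)]
pub struct EditSpan {
    pub start_byte: usize,
    pub old_end_byte: usize,
    pub new_end_byte: usize,
    pub start_position: Position,
    pub old_end_position: Position,
    pub new_end_position: Position,
}

/// Convert a byte offset into a row/column pair by counting newlines.
pub fn position_at(text: &str, byte: usize) -> Position {
    let before = &text.as_bytes()[..byte];
    let row = before.iter().filter(|&&b| b == b'\n').count();
    // The current line starts just after the last newline, or at 0.
    let line_start = before
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    Position { row, column: byte - line_start }
}

/// Describe the change from `old` to `new` as a single replaced span,
/// found by trimming the common prefix and the common suffix.
pub fn compute_edit(old: &str, new: &str) -> EditSpan {
    let (o, n) = (old.as_bytes(), new.as_bytes());
    let prefix = o.iter().zip(n).take_while(|(a, b)| a == b).count();
    // The suffix must not overlap the prefix in either string.
    let max_suffix = o.len().min(n.len()) - prefix;
    let suffix = o
        .iter()
        .rev()
        .zip(n.iter().rev())
        .take_while(|(a, b)| a == b)
        .count()
        .min(max_suffix);
    let (old_end, new_end) = (o.len() - suffix, n.len() - suffix);
    EditSpan {
        start_byte: prefix,
        old_end_byte: old_end,
        new_end_byte: new_end,
        start_position: position_at(old, prefix),
        old_end_position: position_at(old, old_end),
        new_end_position: position_at(new, new_end),
    }
}
```

In a real application the six values would be copied into a `tree_sitter::InputEdit`, applied with `tree.edit(&edit)`, and the new text parsed with `parser.parse(new_source, Some(&tree))`. Editors that already track edits can skip the text comparison and fill the fields directly.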
## Repository Structure
- **`src/`**: Contains the core C implementation of the parser.
- **`bindings/`**: Language-specific bindings for Python, Go, Node.js, Rust, and Swift.
- **`grammar.js`**: Defines the grammar for Wikitext.
- **`queries/`**: Contains Tree-Sitter query files for extracting specific syntax patterns.
- **`tests/`**: Unit tests for validating the parser's functionality.
## Installation
### Prerequisites
- A C compiler (e.g., GCC or Clang)
- [Node.js](https://nodejs.org/) (for building the grammar)
- Python 3.10+ (optional, for Python bindings)
### Build Instructions
1. Clone the repository:
```bash
git clone https://github.com/santhoshtr/tree-sitter-wikitext.git
cd tree-sitter-wikitext
```
2. Build the parser:
```bash
npm install
```
3. (Optional) Build language-specific bindings:
   - **Python**: Run `python setup.py build`.
   - **Rust**: Use `cargo build`.
   - **Go**: Use `go build`.
## Usage
### Embedding in Applications
The parser can be embedded in applications written in various languages. For example:
- **Python**: Use the `tree-sitter` Python module to load and use the parser.
- **Node.js**: Import the parser as a Node.js module.
- **Rust**: Use the `tree-sitter` crate to integrate the parser.
### Example: Parsing Wikitext in Rust
```rust
use tree_sitter::Parser;

fn main() {
    // Create a new parser and load the Wikitext grammar
    let mut parser = Parser::new();
    parser
        .set_language(&tree_sitter_wikitext::LANGUAGE.into())
        .expect("Error loading wikitext grammar");

    // Parse a Wikitext string
    let source_code = "== Heading ==\nThis is a paragraph.\n";
    let tree = parser.parse(source_code, None).unwrap();

    // Print the syntax tree as an S-expression
    println!("{}", tree.root_node().to_sexp());
}
```
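The `to_sexp()` call in the example above returns the whole tree as a single-line S-expression, which becomes hard to read for larger inputs. The following is a minimal sketch of an indenting helper that uses only the standard library. `pretty_sexp` is a hypothetical name, and the node names in the usage comment are placeholders rather than the node types this grammar actually produces.

```rust
/// Re-indent a single-line S-expression so that each child node
/// starts on its own line, indented two spaces per nesting level.
/// A space that follows a field name (`name: (child)`) is kept,
/// so the field stays on the same line as its node.
pub fn pretty_sexp(sexp: &str) -> String {
    let mut out = String::new();
    let mut depth = 0usize;
    let mut prev = '\0';
    for ch in sexp.chars() {
        match ch {
            '(' => {
                depth += 1;
                out.push(ch);
            }
            ')' => {
                depth = depth.saturating_sub(1);
                out.push(ch);
            }
            // A separator between siblings: break the line and indent.
            ' ' if prev != ':' => {
                out.push('\n');
                out.push_str(&"  ".repeat(depth));
            }
            _ => out.push(ch),
        }
        prev = ch;
    }
    out
}

// Usage with a placeholder tree:
//   pretty_sexp("(document (heading (text)) (paragraph (text)))")
// yields
//   (document
//     (heading
//       (text))
//     (paragraph
//       (text)))
```

The helper splits on every space, so entries such as `(MISSING ...)` that contain a space inside one node would be broken across lines; it is meant for quick inspection, not as a faithful printer.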
### Using with Neovim
Check out the repository, then add the following configuration to the `init.lua` of your Neovim installation:
```lua
--- Refer https://github.com/nvim-treesitter/nvim-treesitter
local parser_config = require("nvim-treesitter.parsers").get_parser_configs()
parser_config.wikitext = {
  install_info = {
    url = "~/path/to/tree-sitter-wikitext", -- local path or git repo
    files = { "src/parser.c" }, -- note that some parsers also require src/scanner.c or src/scanner.cc
    -- optional entries:
    branch = "main", -- default branch in case of git repo if different from master
    generate_requires_npm = false, -- if stand-alone parser without npm dependencies
    requires_generate_from_grammar = false, -- if folder contains pre-generated src/parser.c
  },
  filetype = "wikitext", -- if filetype does not match the parser name
}
vim.filetype.add({
  pattern = {
    [".*%.wikitext"] = "wikitext", -- `%.` matches a literal dot in a Lua pattern
  },
})
```
Link the `queries` folder of `tree-sitter-wikitext` to the `queries/wikitext` folder of your Neovim configuration:
```bash
cd ~/.config/nvim
mkdir -p queries
ln -s path/to/tree-sitter-wikitext/queries queries/wikitext
```
Reopen Neovim and open any file with the `.wikitext` extension. You should see syntax highlighting. You can also inspect the syntax tree with the `:InspectTree` command.
To run queries against a buffer, run `:EditQuery wikitext`, which opens a scratch buffer. Write your Tree-Sitter query there, then, in normal mode, move the cursor over a capture name; the corresponding text in the source buffer is highlighted.
## Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository.
2. Create a new branch for your feature or bug fix.
3. Submit a pull request with a detailed description of your changes.
## License
This project is licensed under the MIT License. See the `LICENSE.md` file for details.