[![build status](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml/badge.svg)](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/asottile/tokenize-rt/main.svg)](https://results.pre-commit.ci/latest/github/asottile/tokenize-rt/main)
tokenize-rt
===========
The stdlib `tokenize` module does not properly roundtrip. This wrapper
around the stdlib provides two additional tokens `ESCAPED_NL` and
`UNIMPORTANT_WS`, and a `Token` data type. Use `src_to_tokens` and
`tokens_to_src` to roundtrip.
This library is useful if you're writing a refactoring tool based on python
tokenization.
## Installation
```bash
pip install tokenize-rt
```
## Usage
### datastructures
#### `tokenize_rt.Offset(line=None, utf8_byte_offset=None)`
A token offset, useful as a key when cross referencing the `ast` and the
tokenized source.
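
For illustration, a minimal sketch of cross referencing tokens against `ast`
nodes by offset (this example is not from the upstream docs; it assumes the
common pattern of keying on `(lineno, col_offset)`, where CPython's
`ast.col_offset` is a utf-8 byte offset and so lines up with
`Offset.utf8_byte_offset`):

```python
import ast

from tokenize_rt import Offset, src_to_tokens

src = 'x = 1\ny = 2\n'

# offsets of every assignment statement in the parsed source
assignment_offsets = {
    Offset(node.lineno, node.col_offset)
    for node in ast.walk(ast.parse(src))
    if isinstance(node, ast.Assign)
}

for token in src_to_tokens(src):
    if token.offset in assignment_offsets:
        print('token starting an assignment:', token)
```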
#### `tokenize_rt.Token(name, src, line=None, utf8_byte_offset=None)`
Construct a token
- `name`: one of the token names listed in `token.tok_name` or
`ESCAPED_NL` or `UNIMPORTANT_WS`
- `src`: token's source as text
- `line`: the line number that this token appears on.
- `utf8_byte_offset`: the utf8 byte offset at which this token appears in the
 line.
#### `tokenize_rt.Token.offset`
Retrieves an `Offset` for this token.
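
For example (a small illustrative snippet, not from the upstream docs):

```pycon
>>> from tokenize_rt import Token
>>> tok = Token('NAME', 'print', line=1, utf8_byte_offset=0)
>>> tok.offset
Offset(line=1, utf8_byte_offset=0)
```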
### converting to and from `Token` representations
#### `tokenize_rt.src_to_tokens(text: str) -> List[Token]`
#### `tokenize_rt.tokens_to_src(Iterable[Token]) -> str`
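
These two functions are inverses of each other -- that is the library's
roundtrip guarantee. A quick check:

```pycon
>>> from tokenize_rt import src_to_tokens, tokens_to_src
>>> src = 'x = 1  # comment\n'
>>> tokens_to_src(src_to_tokens(src)) == src
True
```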
### additional tokens added by `tokenize-rt`
#### `tokenize_rt.ESCAPED_NL`
#### `tokenize_rt.UNIMPORTANT_WS`
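
Both are plain string constants naming the extra token types. A short sketch
(an illustrative example, not from the upstream docs) showing them appear when
tokenizing a backslash-continued statement:

```python
from tokenize_rt import ESCAPED_NL, UNIMPORTANT_WS, src_to_tokens

src = 'x = 1 + \\\n    2\n'  # statement continued with a backslash
names = {token.name for token in src_to_tokens(src)}

assert ESCAPED_NL in names      # the '\\\n' continuation "token"
assert UNIMPORTANT_WS in names  # spaces around '=' / '+' and the indentation
```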
### helpers
#### `tokenize_rt.NON_CODING_TOKENS`
A `frozenset` containing tokens which may appear between other tokens without
affecting control flow or code:
- `COMMENT`
- `ESCAPED_NL`
- `NL`
- `UNIMPORTANT_WS`
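
For example, a rewriter often wants the next "real" token after a given index;
a hypothetical helper (`next_coding_index` is not part of the library) might
look like:

```python
from tokenize_rt import NON_CODING_TOKENS, src_to_tokens

def next_coding_index(tokens, i):
    """Return the index of the first token after ``i`` that isn't a
    comment, blank line, escaped newline, or insignificant whitespace."""
    i += 1
    while tokens[i].name in NON_CODING_TOKENS:
        i += 1
    return i

tokens = src_to_tokens('x = (  # comment\n    1\n)\n')
open_paren = next(i for i, tok in enumerate(tokens) if tok.src == '(')
assert tokens[next_coding_index(tokens, open_paren)].src == '1'
```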
#### `tokenize_rt.parse_string_literal(text: str) -> Tuple[str, str]`
parse a string literal into its prefix and string content
```pycon
>>> parse_string_literal('f"foo"')
('f', '"foo"')
```
#### `tokenize_rt.reversed_enumerate(Sequence[Token]) -> Iterator[Tuple[int, Token]]`
yields `(index, token)` pairs. Useful for rewriting source.
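
For instance, a hypothetical comment-stripping rewrite (`strip_comments` is
not part of the library) can mutate the token list while iterating in reverse
so that earlier indices stay valid:

```python
from tokenize_rt import reversed_enumerate, src_to_tokens, tokens_to_src

def strip_comments(src):
    tokens = src_to_tokens(src)
    for i, token in reversed_enumerate(tokens):
        if token.name == 'COMMENT':
            # blank out the comment; surrounding whitespace is left alone
            tokens[i] = token._replace(src='')
    return tokens_to_src(tokens)

print(repr(strip_comments('x = 1  # drop me\n')))  # -> 'x = 1  \n'
```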
#### `tokenize_rt.rfind_string_parts(Sequence[Token], i) -> Tuple[int, ...]`
find the indices of the string parts of a (joined) string literal
- `i` should start at the end of the string literal
- returns `()` (an empty tuple) for things which are not string literals
```pycon
>>> tokens = src_to_tokens('"foo" "bar".capitalize()')
>>> rfind_string_parts(tokens, 2)
(0, 2)
>>> tokens = src_to_tokens('("foo" "bar").capitalize()')
>>> rfind_string_parts(tokens, 4)
(1, 3)
```
## Differences from `tokenize`
- `tokenize-rt` adds `ESCAPED_NL` for a backslash-escaped newline "token"
- `tokenize-rt` adds `UNIMPORTANT_WS` for whitespace (discarded in `tokenize`)
- `tokenize-rt` normalizes string prefixes, even if they are not parsed -- for
instance, this means you'll see `Token('STRING', "f'foo'", ...)` even in
python 2.
- `tokenize-rt` normalizes python 2 long literals (`4l` / `4L`) and octal
literals (`0755`) in python 3 (for easier rewriting of python 2 code while
running python 3).
## Sample usage
- https://github.com/asottile/add-trailing-comma
- https://github.com/asottile/future-annotations
- https://github.com/asottile/future-fstrings
- https://github.com/asottile/pyupgrade
- https://github.com/asottile/yesqa