tokenize-rt

Name: tokenize-rt
Version: 5.2.0
Home page: https://github.com/asottile/tokenize-rt
Summary: A wrapper around the stdlib `tokenize` which roundtrips.
Upload time: 2023-07-30 22:45:28
Author: Anthony Sottile
Requires Python: >=3.8
License: MIT
[![build status](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml/badge.svg)](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/asottile/tokenize-rt/main.svg)](https://results.pre-commit.ci/latest/github/asottile/tokenize-rt/main)

tokenize-rt
===========

The stdlib `tokenize` module does not properly roundtrip.  This wrapper
around the stdlib provides two additional tokens `ESCAPED_NL` and
`UNIMPORTANT_WS`, and a `Token` data type.  Use `src_to_tokens` and
`tokens_to_src` to roundtrip.

This library is useful if you're writing a refactoring tool based on python
tokenization.

## Installation

```bash
pip install tokenize-rt
```

## Usage

### datastructures

#### `tokenize_rt.Offset(line=None, utf8_byte_offset=None)`

A token offset, useful as a key when cross-referencing the `ast` and the
tokenized source.
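
For example, a minimal sketch of looking up the token that corresponds to an
`ast` node (CPython's `col_offset` is a UTF-8 byte offset, which is what
`Offset` expects):

```pycon
>>> import ast
>>> from tokenize_rt import Offset, src_to_tokens
>>> src = 'x = 5\n'
>>> node = ast.parse(src).body[0].value
>>> target = Offset(node.lineno, node.col_offset)
>>> tok = next(t for t in src_to_tokens(src) if t.offset == target)
>>> tok.name, tok.src
('NUMBER', '5')
```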

#### `tokenize_rt.Token(name, src, line=None, utf8_byte_offset=None)`

Construct a token.

- `name`: one of the token names listed in `token.tok_name` or
  `ESCAPED_NL` or `UNIMPORTANT_WS`
- `src`: the token's source as text
- `line`: the line number that this token appears on
- `utf8_byte_offset`: the UTF-8 byte offset within the line at which this
  token appears

#### `tokenize_rt.Token.offset`

Retrieves an `Offset` for this token.
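
A short sketch of constructing a token by hand and reading its offset back
(`Token` is a named tuple, so fields can be given positionally or by name):

```pycon
>>> from tokenize_rt import Token
>>> tok = Token('NAME', 'x', line=1, utf8_byte_offset=0)
>>> tok.offset
Offset(line=1, utf8_byte_offset=0)
```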

### converting to and from `Token` representations

#### `tokenize_rt.src_to_tokens(text: str) -> List[Token]`

#### `tokenize_rt.tokens_to_src(Iterable[Token]) -> str`
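
A minimal roundtrip sketch -- the full token stream also contains bookkeeping
tokens such as `NEWLINE` and `ENDMARKER`, elided from the filter below:

```pycon
>>> from tokenize_rt import src_to_tokens, tokens_to_src
>>> src = 'x = 5\n'
>>> tokens = src_to_tokens(src)
>>> tokens_to_src(tokens) == src
True
>>> [t.name for t in tokens if t.name in ('NAME', 'OP', 'NUMBER', 'UNIMPORTANT_WS')]
['NAME', 'UNIMPORTANT_WS', 'OP', 'UNIMPORTANT_WS', 'NUMBER']
```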

### additional tokens added by `tokenize-rt`

#### `tokenize_rt.ESCAPED_NL`

#### `tokenize_rt.UNIMPORTANT_WS`
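
A quick check that both extra tokens show up -- whitespace and the
backslash-continuation each survive as explicit tokens:

```pycon
>>> from tokenize_rt import src_to_tokens
>>> names = {t.name for t in src_to_tokens('x = \\\n    5\n')}
>>> 'ESCAPED_NL' in names and 'UNIMPORTANT_WS' in names
True
```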

### helpers

#### `tokenize_rt.NON_CODING_TOKENS`

A `frozenset` containing tokens which may appear between others while not
affecting control flow or code:
- `COMMENT`
- `ESCAPED_NL`
- `NL`
- `UNIMPORTANT_WS`
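
For example, a hypothetical helper (not part of `tokenize-rt`) that skips past
non-coding tokens to find the next token that affects the code:

```python
from tokenize_rt import NON_CODING_TOKENS, src_to_tokens


def next_coding_token_index(tokens, i):
    """Return the index of the first "coding" token after index ``i``."""
    i += 1
    while tokens[i].name in NON_CODING_TOKENS:
        i += 1
    return i


tokens = src_to_tokens('x = (  # comment\n    5\n)\n')
# find the '(' and then the next token that is not whitespace / comment / NL
open_paren = next(i for i, tok in enumerate(tokens) if tok.src == '(')
j = next_coding_token_index(tokens, open_paren)
print(tokens[j].name, tokens[j].src)  # NUMBER 5
```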

#### `tokenize_rt.parse_string_literal(text: str) -> Tuple[str, str]`

parse a string literal into its prefix and string content

```pycon
>>> parse_string_literal('f"foo"')
('f', '"foo"')
```

#### `tokenize_rt.reversed_enumerate(Sequence[Token]) -> Iterator[Tuple[int, Token]]`

yields `(index, token)` pairs.  Useful for rewriting source.
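
A small sketch of the usual rewrite pattern: iterate in reverse so that
replacing or removing tokens doesn't shift the indices of tokens you haven't
visited yet (`Token` is a named tuple, so `_replace` works here):

```pycon
>>> from tokenize_rt import reversed_enumerate, src_to_tokens, tokens_to_src
>>> tokens = src_to_tokens('x = 5\n')
>>> for i, token in reversed_enumerate(tokens):
...     if token.name == 'NUMBER':
...         tokens[i] = token._replace(src='6')
...
>>> tokens_to_src(tokens)
'x = 6\n'
```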

#### `tokenize_rt.rfind_string_parts(Sequence[Token], i) -> Tuple[int, ...]`

find the indices of the string parts of a (joined) string literal

- `i` should start at the end of the string literal
- returns `()` (an empty tuple) for things which are not string literals

```pycon
>>> tokens = src_to_tokens('"foo" "bar".capitalize()')
>>> rfind_string_parts(tokens, 2)
(0, 2)
>>> tokens = src_to_tokens('("foo" "bar").capitalize()')
>>> rfind_string_parts(tokens, 4)
(1, 3)
```

## Differences from `tokenize`

- `tokenize-rt` adds `ESCAPED_NL` for a backslash-escaped newline "token"
- `tokenize-rt` adds `UNIMPORTANT_WS` for whitespace (discarded in `tokenize`)
- `tokenize-rt` normalizes string prefixes, even if the running python version
  does not parse them -- for instance, this means you'll see
  `Token('STRING', "f'foo'", ...)` even in python 2.
- `tokenize-rt` normalizes python 2 long literals (`4l` / `4L`) and octal
  literals (`0755`) in python 3 (for easier rewriting of python 2 code while
  running python 3).

## Sample usage

- https://github.com/asottile/add-trailing-comma
- https://github.com/asottile/future-annotations
- https://github.com/asottile/future-fstrings
- https://github.com/asottile/pyupgrade
- https://github.com/asottile/yesqa

            
