code-diff


* Name: code-diff
* Version: 0.1.3
* Home page: https://github.com/cedricrupb/code_diff
* Summary: Fast AST based code differencing in Python
* Upload time: 2025-01-14 09:31:16
* Author: Cedric Richter
* Requires Python: >=3.8
* Keywords: code, differencing, ast, cst, program, language processing
* Requirements: code-tokenize, apted

# Code Diff
------------------------------------------------
> Fast AST based code differencing in Python

Software projects constantly evolve as new features are integrated and existing implementations are improved. Keeping track of this progress means tracking individual code changes. Code differencing provides a way to identify the smallest code change between two implementations.

**code.diff** provides a fast alternative to standard code differencing techniques, with a focus on AST-based code differencing. The library includes a fast reimplementation of the [**GumTree**](https://github.com/GumTreeDiff/gumtree) algorithm. Because it relies on a best-effort AST parser, it can also generate AST code changes for individual code snippets, even when they are not complete programs. Many programming languages are supported, including Python, Java and JavaScript.
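
To make the contrast with line-based differencing concrete, here is a minimal sketch that uses only Python's standard library (`difflib` and `ast`), not code.diff itself. The helper names `line_diff` and `ast_label_changes` are made up for this illustration. The lockstep walk is an assumption that only holds when both trees have the same shape; a real tree differencing algorithm such as GumTree also handles inserted, deleted and moved nodes.

```python
import ast
import difflib

OLD = 'def my_func():\n    print("Hello World")\n'
NEW = 'def say_helloworld():\n    print("Hello World")\n'


def line_diff(old_src, new_src):
    """Return the removed ('- ') and added ('+ ') lines of a textual diff."""
    diff = difflib.ndiff(old_src.splitlines(), new_src.splitlines())
    return [line for line in diff if line[:2] in ("- ", "+ ")]


def ast_label_changes(old_src, new_src):
    """Compare two same-shaped Python ASTs node by node (hypothetical helper).

    Returns (node type, field name, old value, new value) tuples for every
    scalar field that differs. Raises ValueError if the shapes diverge.
    """
    changes = []
    old_nodes = ast.walk(ast.parse(old_src))
    new_nodes = ast.walk(ast.parse(new_src))
    for old_node, new_node in zip(old_nodes, new_nodes):
        if type(old_node) is not type(new_node):
            raise ValueError("trees differ in shape; a real tree diff is needed")
        fields = zip(ast.iter_fields(old_node), ast.iter_fields(new_node))
        for (name, old_value), (_, new_value) in fields:
            # Only compare scalar labels; child nodes are visited by ast.walk.
            if isinstance(old_value, (str, int, float)) and old_value != new_value:
                changes.append((type(old_node).__name__, name, old_value, new_value))
    return changes


print(line_diff(OLD, NEW))
# ['- def my_func():', '+ def say_helloworld():']
print(ast_label_changes(OLD, NEW))
# [('FunctionDef', 'name', 'my_func', 'say_helloworld')]
```

The textual diff reports a whole line as removed and another as added, while the AST comparison isolates the single identifier that changed.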


## Installation
The package is tested under Python 3 (the package metadata requires Python >=3.8). It can be installed via:
```
pip install code-diff
```

## Usage
code.diff can compute the difference between two versions of nearly any program code in a few lines:
```python
import code_diff as cd

# Python
output = cd.difference(
    '''
        def my_func():
            print("Hello World")
    ''',
    '''
        def say_helloworld():
            print("Hello World")
    ''',
    lang="python")

# Output: my_func -> say_helloworld

output.edit_script()

# Output: 
# [
#  Update((identifier:my_func, line 1:12 - 1:19), say_helloworld)
#]


# Java
output = cd.difference(
    '''
        int x = x + 1;
    ''',
    '''
        int x = x / 2;
    ''',
    lang="java")

# Output: x + 1 -> x / 2

output.edit_script()

# Output: [
#  Insert(/:/, (binary_operator, line 0:4 - 0:9), 1),
#  Update((integer:1, line 0:8 - 0:9), 2),
#  Delete((+:+, line 0:6 - 0:7))
#]


```
## Language support
code.diff supports most programming languages for which an AST can be computed. To parse code into an AST, it relies on two libraries:
* [**code.tokenize:**](https://github.com/cedricrupb/code_tokenize) A Python frontend for tree-sitter that parses and tokenizes program code.

* [**tree-sitter:**](https://tree-sitter.github.io/tree-sitter/) A best-effort AST parser supporting
many programming languages including Python, Java and JavaScript.

To decide whether your code can be handled by code.diff, please review the libraries above.
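
Why a best-effort parser matters for snippets can be seen with a small standard-library experiment. The sketch below does not use tree-sitter or code.diff; it only shows that a strict parser (Python's own `ast.parse`) rejects an indented or truncated snippet that has been passed in as-is. The helper name `strict_parse_ok` is hypothetical.

```python
import ast
import textwrap

# An indented snippet, as it would appear inside a triple-quoted string.
SNIPPET = '''
        def my_func():
            print("Hello World")
    '''


def strict_parse_ok(source):
    """Return True if Python's strict parser accepts the source (hypothetical helper)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


print(strict_parse_ok(SNIPPET))                   # False: unexpected indent
print(strict_parse_ok(textwrap.dedent(SNIPPET)))  # True once the indentation is removed
print(strict_parse_ok("x = (1 +"))                # False: truncated expression
```

A strict parser therefore needs cleaned-up, complete input, whereas a best-effort parser still produces a tree for imperfect input.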

**GumTree:** To compute an edit script between a source and a target AST, we employ a Python reimplementation of the [GumTree](https://github.com/GumTreeDiff/gumtree) algorithm. Note, however, that the computed edit scripts depend heavily on the AST representation of the given code. Therefore, an AST edit script computed with code.diff might differ significantly from the one computed by GumTree.
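
The dependence on the tree representation is easier to see in a toy version of the matching step. The sketch below is neither code.diff's implementation nor the full GumTree algorithm. It only illustrates, in simplified form, the idea behind GumTree's first phase: greedily pairing identical subtrees, tallest first. The `Node` class, the node labels and all function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy AST node; frozen dataclasses compare and hash structurally."""
    label: str
    children: tuple = ()


def subtrees(node):
    """Yield the node and all its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from subtrees(child)


def height(node):
    return 1 + max((height(child) for child in node.children), default=0)


def match_identical_subtrees(src, dst):
    """Greedily pair identical subtrees, tallest first (simplified sketch)."""
    matched_src, matched_dst = set(), set()  # ids of already matched nodes
    pairs = []
    dst_nodes = list(subtrees(dst))
    for s in sorted(subtrees(src), key=height, reverse=True):
        if id(s) in matched_src:
            continue
        for d in dst_nodes:
            if id(d) not in matched_dst and s == d:
                pairs.append((s, d))
                # Matching a subtree also matches everything below it.
                matched_src.update(id(n) for n in subtrees(s))
                matched_dst.update(id(n) for n in subtrees(d))
                break
    return pairs, matched_src, matched_dst


def unmatched_labels(root, matched_ids):
    return [n.label for n in subtrees(root) if id(n) not in matched_ids]


# Assumed toy trees for `x + 1` and `x / 2`, with the operator as a child node.
src = Node("binary_expression", (Node("identifier:x"), Node("+"), Node("integer:1")))
dst = Node("binary_expression", (Node("identifier:x"), Node("/"), Node("integer:2")))

pairs, in_src, in_dst = match_identical_subtrees(src, dst)
print([(s.label, d.label) for s, d in pairs])  # [('identifier:x', 'identifier:x')]
print(unmatched_labels(src, in_src))           # ['binary_expression', '+', 'integer:1']
print(unmatched_labels(dst, in_dst))           # ['binary_expression', '/', 'integer:2']
```

Only the nodes left unmatched need to be covered by edit operations, so a parser that models the operator differently (for example as part of the parent's label rather than as a child node) would leave a different set of nodes unmatched and lead to a different edit script.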


## Release history
* 0.1.2
    * Fix of the release information
    * Fix bug in 0.1.1 release
    * Package now usable by installing from PyPI
* 0.1.0
    * Initial functionality
    * Documentation
    * SStuB Testing

## Project Info
The goal of this project is to provide developers with easy access to AST-based code differencing. It is currently developed as a helper library for internal research projects and will therefore only be updated as needed.

Feel free to open an issue if anything unexpected
happens. 

[Cedric Richter](https://uol.de/informatik/formale-methoden/team/cedric-richter) - [@cedricrichter](https://twitter.com/cedrichter) - cedric.richter@uni-oldenburg.de

Distributed under the MIT license. See ``LICENSE`` for more information.



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/cedricrupb/code_diff",
    "name": "code-diff",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Cedric Richter <cedricr.upb@gmail.com>",
    "keywords": "code, differencing, AST, CST, program, language processing",
    "author": "Cedric Richter",
    "author_email": "Cedric Richter <cedricr.upb@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/55/88/8ce5dc2ea13d63f989dd365984a8b4ecaf28ab328f63fe860d0848d455ed/code_diff-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Fast AST based code differencing in Python",
    "version": "0.1.3",
    "project_urls": {
        "Bug Reports": "https://github.com/cedricrupb/code_diff/issues",
        "Download": "https://github.com/cedricrupb/code_diff/archive/refs/tags/v0.1.3.tar.gz",
        "Homepage": "https://github.com/cedricrupb/code_diff",
        "Source": "https://github.com/cedricrupb/code_diff"
    },
    "split_keywords": [
        "code",
        " differencing",
        " ast",
        " cst",
        " program",
        " language processing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "99f63e4f598f6570f461e519b941333cce5db2db71635e00db1083e6aba32613",
                "md5": "eeb621fa1fda96e8bcf7a47734cd3276",
                "sha256": "25a470e3cc591ae92dcec5a084b93af4145f54fd6d7379d5c0bf5e8800b4334a"
            },
            "downloads": -1,
            "filename": "code_diff-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "eeb621fa1fda96e8bcf7a47734cd3276",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 23267,
            "upload_time": "2025-01-14T09:31:15",
            "upload_time_iso_8601": "2025-01-14T09:31:15.368938Z",
            "url": "https://files.pythonhosted.org/packages/99/f6/3e4f598f6570f461e519b941333cce5db2db71635e00db1083e6aba32613/code_diff-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "55888ce5dc2ea13d63f989dd365984a8b4ecaf28ab328f63fe860d0848d455ed",
                "md5": "7c85dac65b3c005c6cc225fce39e727c",
                "sha256": "02235e5419e1db87392bf742c33e1c3c65893d81cb61035cc9d7743e50809255"
            },
            "downloads": -1,
            "filename": "code_diff-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "7c85dac65b3c005c6cc225fce39e727c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 25164,
            "upload_time": "2025-01-14T09:31:16",
            "upload_time_iso_8601": "2025-01-14T09:31:16.762936Z",
            "url": "https://files.pythonhosted.org/packages/55/88/8ce5dc2ea13d63f989dd365984a8b4ecaf28ab328f63fe860d0848d455ed/code_diff-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-14 09:31:16",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "cedricrupb",
    "github_project": "code_diff",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "code-tokenize",
            "specs": [
                [
                    ">=",
                    "0.1.0"
                ]
            ]
        },
        {
            "name": "apted",
            "specs": [
                [
                    ">=",
                    "1.0.3"
                ]
            ]
        }
    ],
    "lcname": "code-diff"
}
        