Datalog Disassembly
===================
Ddisasm is a *fast* disassembler which is *accurate* enough for the
resulting assembly code to be reassembled. Ddisasm is implemented in
Datalog, a declarative logic programming language, and uses
[souffle](https://github.com/souffle-lang/souffle) to compile its
disassembly rules and heuristics. The disassembler first parses ELF/PE
file information and decodes a superset of possible instructions to
create an initial set of Datalog facts. These facts are analyzed to
identify *code location*, *symbolization*, and *function boundaries*.
The results of this analysis, a refined set of Datalog facts, are then
translated to
the [GTIRB](https://github.com/grammatech/gtirb) intermediate
representation for binary analysis and reverse engineering. The
[GTIRB pretty printer](https://github.com/grammatech/gtirb-pprinter)
may then be used to pretty print the GTIRB to reassemblable assembly
code.
## Binary Support
Binary formats:
- ELF (Linux)
- PE (Windows)
Instruction Set Architectures (ISAs):
- x86_32
- x86_64
- ARM32
- ARM64
- MIPS32
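
Before pointing Ddisasm at a file, it can help to confirm that the file is in one of the binary formats listed above. The following is a minimal sketch, not part of Ddisasm: the helper name `binary_format` is hypothetical, and it only inspects the leading magic bytes (`0x7f 'E' 'L' 'F'` for ELF, `'M' 'Z'` for the DOS header that begins a PE file). It does not check the ISA, and an `MZ` prefix alone does not prove that a file is a valid PE.

```bash
# Hypothetical helper (not part of Ddisasm): guess the container format
# of a file from its leading magic bytes.
binary_format() {
    local magic
    # Read the first four bytes and render them as one lowercase hex string.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        7f454c46) echo "ELF" ;;   # 0x7f 'E' 'L' 'F'
        4d5a*)    echo "PE" ;;    # 'M' 'Z' (DOS header at the start of PE files)
        *)        echo "unknown"; return 1 ;;
    esac
}
```

For instance, `binary_format ex` would report the format of a compiled example before it is passed to `ddisasm`.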
## Getting Started
You can run a prebuilt version of Ddisasm using Docker:
```bash
docker pull grammatech/ddisasm:latest
```
Ddisasm can be used to disassemble a binary into the [GTIRB](https://github.com/grammatech/gtirb) representation.
We can try it with one of the examples included in the repository.
First, start the Ddisasm docker container:
```bash
docker run -v "$PWD/examples":/examples -it grammatech/ddisasm:latest
```
Within the Docker container, let us build one of the examples:
```bash
apt update && apt install gcc -y
cd /examples/ex1
gcc ex.c -o ex
```
Now we can proceed to disassemble the binary:
```bash
ddisasm ex --ir ex.gtirb
```
Once you have the GTIRB representation, you can make programmatic changes to the
binary using [GTIRB](https://github.com/grammatech/gtirb) or [gtirb-rewriting](https://github.com/grammatech/gtirb-rewriting).
Then, you can use [gtirb-pprinter](https://github.com/grammatech/gtirb-pprinter) (included in the Docker image) to produce
a new version of the binary:
```bash
gtirb-pprinter ex.gtirb -b ex_rewritten
```
Internally, `gtirb-pprinter` generates an assembly file and invokes the compiler/assembler (e.g. gcc)
to produce a new binary. It takes care of generating all the necessary command-line
options, including compilation options, library dependencies, and version linker scripts.
You can also use `gtirb-pprinter` to generate an assembly listing for manual modification:
```bash
gtirb-pprinter ex.gtirb --asm ex.s
```
This assembly listing can then be manually recompiled:
```bash
gcc -nostartfiles ex.s -o ex_rewritten
```
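
The disassemble-then-rebuild steps above can be chained in a small script. This is only a sketch under stated assumptions: the function names `run` and `rewrite_binary` and the `DRY_RUN` variable are hypothetical and not shipped with Ddisasm, and the only tool options used are the ones shown in this README (`--ir` and `-b`).

```bash
# Hypothetical wrapper (not shipped with Ddisasm).
# run: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rewrite_binary INPUT [OUTPUT]
# Disassemble INPUT to INPUT.gtirb, then let gtirb-pprinter rebuild it.
# OUTPUT defaults to INPUT_rewritten.
rewrite_binary() {
    local input="$1"
    local output="${2:-${input}_rewritten}"
    # Stop before pretty-printing if disassembly fails.
    run ddisasm "$input" --ir "$input.gtirb" &&
    run gtirb-pprinter "$input.gtirb" -b "$output"
}
```

With `DRY_RUN=1` set, `rewrite_binary ex` prints the two commands without executing them, which is a cheap way to check the file names first. Any programmatic change to the GTIRB file would go between the two steps.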
Please take a look at our [documentation](https://grammatech.github.io/ddisasm/) for additional information.
## [Documentation](https://grammatech.github.io/ddisasm/)
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md)
## External Contributors
* Programming Language Group, The University of Sydney: Initial support for ARM64.
* GitHub user gogo2464: Documentation refactoring.
## Cite
1. [Datalog Disassembly](https://www.usenix.org/conference/usenixsecurity20/presentation/flores-montoya)
```bibtex
@inproceedings {flores-montoya2020,
author = {Antonio Flores-Montoya and Eric Schulte},
title = {Datalog Disassembly},
booktitle = {29th USENIX Security Symposium (USENIX Security 20)},
year = {2020},
isbn = {978-1-939133-17-5},
pages = {1075--1092},
url = {https://www.usenix.org/conference/usenixsecurity20/presentation/flores-montoya},
publisher = {USENIX Association},
month = aug,
}
```
2. [GTIRB](https://arxiv.org/abs/1907.02859)
```bibtex
@misc{schulte2020gtirb,
title={GTIRB: Intermediate Representation for Binaries},
author={Eric Schulte and Jonathan Dorn and Antonio Flores-Montoya and Aaron Ballman and Tom Johnson},
year={2020},
eprint={1907.02859},
archivePrefix={arXiv},
primaryClass={cs.PL}
}
```
Raw data
{
"_id": null,
"home_page": "https://github.com/grammatech/ddisasm",
"name": "ddisasm",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "reverse-engineering, disassembler, binary-analysis, intermediate-representation, binary-rewriting, gtirb",
"author": "GrammaTech, Inc.",
"author_email": "gtirb@grammatech.com",
"download_url": null,
"platform": null,
"bugtrack_url": null,
"license": "AGPL-3.0",
"summary": "A fast and accurate disassembler",
"version": "1.9.0",
"project_urls": {
"Homepage": "https://github.com/grammatech/ddisasm"
},
"split_keywords": [
"reverse-engineering",
" disassembler",
" binary-analysis",
" intermediate-representation",
" binary-rewriting",
" gtirb"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "96412612e021e4f5a8d4e7858fccf47c3f646641d812a45a79ec662255a3b237",
"md5": "6b58f923c9953302310baa0d792a575d",
"sha256": "4a1d25a7edc161217fb568d9a9eb8067a29f64ab9277797b183b5cc4136dd841"
},
"downloads": -1,
"filename": "ddisasm-1.9.0-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
"has_sig": false,
"md5_digest": "6b58f923c9953302310baa0d792a575d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 25629100,
"upload_time": "2024-06-13T22:25:04",
"upload_time_iso_8601": "2024-06-13T22:25:04.849695Z",
"url": "https://files.pythonhosted.org/packages/96/41/2612e021e4f5a8d4e7858fccf47c3f646641d812a45a79ec662255a3b237/ddisasm-1.9.0-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-13 22:25:04",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "grammatech",
"github_project": "ddisasm",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "ddisasm"
}