<p align=center>
<img src="https://avatars.githubusercontent.com/u/90545451?s=200&v=4"/>
</p>

# EVM Storage Layout Extractor

This library ingests [EVM](https://ethereum.org/en/developers/docs/evm/)
[bytecode](https://ethereum.org/en/developers/docs/evm/opcodes/) and discovers an approximate storage
layout for the contract that the bytecode describes. It is _not_ intended to be a full decompiler;
it is instead a tool **highly** specialised for this discovery process.

See our [announcement](https://blog.smlxl.io/announcing-bytecode-generated-storage-layouts-on-evm-storage-96761758d397) for more details,
or read the [deep-dive post on our blog](https://blog.smlxl.io/a-deep-dive-into-our-storage-layout-extractor-51554185d8af).

This discovery process works, in broad strokes, as follows:

1. Bytecode is ingested and disassembled into an instruction stream that is amenable to analysis.
This is a sequence of opcodes equivalent to the bytecode.
2. The stream of instructions is executed symbolically on a specialised EVM implementation. This
execution is both **speculative** and **total**, exploring all possible code paths that can
influence the type attributed to a given storage location.
3. For each value seen in the program during execution, the VM builds a **symbolic value** (a small
tree structure) that represents the operations performed on that particular piece of "data".
4. These execution trees are passed to a type inference process. This process starts by _lifting_,
which turns low-level constructs into more-general high-level ones. The results of this are then
fed to _inference rules_ that output **type inference judgements** about the trees they analyse.
Finally, these inferences are combined with a _unifier_ to perform whole-program type inference.
5. The resolved types associated with each storage slot are then turned into a **storage layout**
that describes the type of each storage slot that was encountered.
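
Steps 1 to 3 above can be illustrated with a minimal, self-contained sketch. It does not use this library's API: the `Op` and `Sym` types and the `disassemble` and `execute` functions are hypothetical names invented for the example, only five opcodes are supported, and execution is straight-line rather than speculative and total.

```rust
/// A tiny, hypothetical subset of EVM opcodes (not the library's own types).
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Push1(u8),
    Add,
    SLoad,
    SStore,
    Stop,
}

/// A symbolic value: a small tree recording how a piece of data was computed.
#[derive(Debug, Clone, PartialEq)]
enum Sym {
    Const(u8),
    Add(Box<Sym>, Box<Sym>),
    SLoad(Box<Sym>),
}

/// Step 1: disassemble raw bytes into an instruction stream.
fn disassemble(code: &[u8]) -> Result<Vec<Op>, String> {
    let mut ops = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let op = match code[pc] {
            0x00 => Op::Stop,
            0x01 => Op::Add,
            0x54 => Op::SLoad,
            0x55 => Op::SStore,
            0x60 => {
                // PUSH1 carries one immediate byte after the opcode.
                pc += 1;
                Op::Push1(*code.get(pc).ok_or("truncated PUSH1")?)
            }
            other => return Err(format!("unsupported opcode 0x{other:02x}")),
        };
        ops.push(op);
        pc += 1;
    }
    Ok(ops)
}

/// Pops one symbolic value, reporting underflow as an error.
fn pop(stack: &mut Vec<Sym>) -> Result<Sym, String> {
    stack.pop().ok_or_else(|| "stack underflow".to_string())
}

/// Steps 2 and 3: run the stream on a symbolic stack and return the
/// (key, value) trees seen at each SSTORE.
fn execute(ops: &[Op]) -> Result<Vec<(Sym, Sym)>, String> {
    let mut stack: Vec<Sym> = Vec::new();
    let mut stores = Vec::new();
    for op in ops {
        match op {
            Op::Push1(byte) => stack.push(Sym::Const(*byte)),
            Op::Add => {
                let a = pop(&mut stack)?;
                let b = pop(&mut stack)?;
                stack.push(Sym::Add(Box::new(a), Box::new(b)));
            }
            Op::SLoad => {
                let key = pop(&mut stack)?;
                stack.push(Sym::SLoad(Box::new(key)));
            }
            Op::SStore => {
                // SSTORE pops the slot key first, then the value to write.
                let key = pop(&mut stack)?;
                let value = pop(&mut stack)?;
                stores.push((key, value));
            }
            Op::Stop => break,
        }
    }
    Ok(stores)
}

fn main() {
    // PUSH1 0; SLOAD; PUSH1 1; ADD; PUSH1 0; SSTORE; STOP
    let code: [u8; 10] = [0x60, 0x00, 0x54, 0x60, 0x01, 0x01, 0x60, 0x00, 0x55, 0x00];
    let ops = disassemble(&code).expect("valid bytecode");
    for (key, value) in execute(&ops).expect("no stack underflow") {
        println!("slot {key:?} <- {value:?}");
    }
}
```

The tree recorded for the store shows that slot 0 receives its own previous contents plus a constant. This is the kind of structural evidence that the later type inference stage works from.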

For more information on the process, with specific reference to concrete pieces of code, see the
documentation in [`lib.rs`](src/lib.rs). This also provides a basic usage example for the library,
though more complex ones can be found in the [tests](tests).
## Extending the Library
The main way to improve the layouts this library produces is to extend the type inference
engine by writing new **lifting passes** or **inference rules**. You can find more information
on this process in the documentation
on [extending the library](./docs/Extending%20the%20Library.md).
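
To give a feel for what these two extension points do, here is a minimal sketch under stated assumptions. The `LiftingPass` and `InferenceRule` traits, the `Expr` and `Judgement` types, and both implementations are hypothetical and are not the library's API; the linked documentation describes the real interfaces. The lifting pass rewrites a low-level bit mask into a higher-level "truncate" node, and a deliberately naive rule then reads a type judgement off that node.

```rust
/// A hypothetical expression tree; the library's own representation differs.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Const(u64),
    SLoad(Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    /// High-level form produced by lifting: only the low `bits` bits of `value`.
    Truncate { value: Box<Expr>, bits: u32 },
}

/// A hypothetical type judgement emitted by an inference rule.
#[derive(Debug, Clone, PartialEq)]
enum Judgement {
    Unknown,
    UnsignedInt { bits: u32 },
}

/// Illustrative trait names, not the library's API.
trait LiftingPass {
    fn lift(&self, expr: Expr) -> Expr;
}

trait InferenceRule {
    fn infer(&self, expr: &Expr) -> Judgement;
}

/// Returns `n` when `mask` equals `2^n - 1` for some `n > 0`.
fn low_mask_bits(mask: u64) -> Option<u32> {
    if mask != 0 && (mask & mask.wrapping_add(1)) == 0 {
        Some(64 - mask.leading_zeros())
    } else {
        None
    }
}

/// Returns the mask width when `expr` is a constant low-bit mask.
fn mask_bits(expr: &Expr) -> Option<u32> {
    match expr {
        Expr::Const(mask) => low_mask_bits(*mask),
        _ => None,
    }
}

/// Lifting pass: rewrite `x & (2^n - 1)` into `Truncate { x, n }`.
struct MaskToTruncate;

impl LiftingPass for MaskToTruncate {
    fn lift(&self, expr: Expr) -> Expr {
        match expr {
            Expr::And(lhs, rhs) => {
                let (lhs, rhs) = (self.lift(*lhs), self.lift(*rhs));
                // AND is commutative, so accept the mask on either side.
                if let Some(bits) = mask_bits(&rhs) {
                    Expr::Truncate { value: Box::new(lhs), bits }
                } else if let Some(bits) = mask_bits(&lhs) {
                    Expr::Truncate { value: Box::new(rhs), bits }
                } else {
                    Expr::And(Box::new(lhs), Box::new(rhs))
                }
            }
            Expr::SLoad(key) => Expr::SLoad(Box::new(self.lift(*key))),
            Expr::Truncate { value, bits } => {
                Expr::Truncate { value: Box::new(self.lift(*value)), bits }
            }
            constant @ Expr::Const(_) => constant,
        }
    }
}

/// Naive inference rule: a truncation to `n` bits suggests an `n`-bit unsigned integer.
struct TruncateIsUnsigned;

impl InferenceRule for TruncateIsUnsigned {
    fn infer(&self, expr: &Expr) -> Judgement {
        match expr {
            Expr::Truncate { bits, .. } => Judgement::UnsignedInt { bits: *bits },
            _ => Judgement::Unknown,
        }
    }
}

fn main() {
    // SLOAD(0) & 0xff: a masked read of storage slot 0.
    let raw = Expr::And(
        Box::new(Expr::SLoad(Box::new(Expr::Const(0)))),
        Box::new(Expr::Const(0xff)),
    );
    let lifted = MaskToTruncate.lift(raw);
    println!("{:?} : {:?}", lifted, TruncateIsUnsigned.infer(&lifted));
}
```

Separating the two stages means the rule only has to recognise one high-level shape, while the pass absorbs the low-level variations (here, the mask appearing on either side of the `And`).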
## Contributing
If you want to contribute code or documentation (non-code contributions are _always_ welcome) to
this project, please take a look at our [contributing](./docs/CONTRIBUTING.md) documentation. It
provides an overview of how to get up and running, as well as what the contribution process looks
like for the library.

You can also reach us in our [Telegram group](https://t.me/+zw0fuNoYg39hZWRh) if you have any questions.
Raw data
{
"_id": null,
"home_page": "https://github.com/smlxlio/storage-layout-extractor",
"name": "storage-layout-extractor",
"maintainer": null,
"docs_url": null,
"requires_python": "",
"maintainer_email": null,
"keywords": "smlxl,evm,decompiler",
"author": "smlXL",
"author_email": null,
"download_url": null,
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A library for performing bytecode-based discovery of storage layouts for contracts that run on the EVM.",
"version": "0.5.4",
"project_urls": {
"Homepage": "https://github.com/smlxlio/storage-layout-extractor",
"Source Code": "https://github.com/smlxlio/storage-layout-extractor"
},
"split_keywords": [
"smlxl",
"evm",
"decompiler"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5b31c8897534c957553d9da38e7bb6e5a5995a44e4a8a1818931118ecacc5b9b",
"md5": "76eddccae363c4e8ae0c30e67bc4b3df",
"sha256": "bd0b3db98d156dee66b5be2ca29db4b3937e3e003a4cdcce4784126c94d3c268"
},
"downloads": -1,
"filename": "storage_layout_extractor-0.5.4-cp39-cp39-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "76eddccae363c4e8ae0c30e67bc4b3df",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": null,
"size": 456033,
"upload_time": "2023-11-14T09:34:18",
"upload_time_iso_8601": "2023-11-14T09:34:18.336648Z",
"url": "https://files.pythonhosted.org/packages/5b/31/c8897534c957553d9da38e7bb6e5a5995a44e4a8a1818931118ecacc5b9b/storage_layout_extractor-0.5.4-cp39-cp39-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-14 09:34:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "smlxlio",
"github_project": "storage-layout-extractor",
"github_not_found": true,
"lcname": "storage-layout-extractor"
}