| Field | Value |
| --- | --- |
| Name | lightning-thunder |
| Version | 0.2.4 |
| Summary | Lightning Thunder is a source-to-source compiler for PyTorch, enabling PyTorch programs to run on different hardware accelerators and graph compilers. |
| Upload time | 2025-06-24 10:50:09 |
| Author | None |
| Maintainer | None |
| Home page | None |
| Docs URL | None |
| Requires Python | <3.14,>=3.10 |
| License | None |
| Keywords | deep learning, AI, compiler |
| Requirements | No requirements were recorded. |

<div align='center'>
# Give your PyTorch models superpowers ⚡
</div>
<div align="center">
<img alt="Thunder" src="https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/LightningThunderLightModewByline.png#gh-light-mode-only" width="400px" style="max-width: 100%;">
<img alt="Thunder" src="https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/LightningThunderDarkModewByline.png#gh-dark-mode-only" width="400px" style="max-width: 100%;">
<br/>
<br/>
 
<strong>Source-to-source compiler for PyTorch.</strong>
Fast. Understandable. Extensible.
</div>
______________________________________________________________________
**Thunder** makes optimizing PyTorch models easy, augmenting them with custom kernels, fusions, quantization, distributed strategies, and more.
For **end users**, Thunder comes with plugins that provide model speed-ups out of the box, for optimal utilization of the latest generation of hardware.
For **performance experts**, Thunder is the most ergonomic framework for understanding, modifying, and optimizing AI models through composable transformations.
<div align='center'>
<pre>
✅ Run PyTorch 40% faster ✅ Quantization ✅ Kernel fusion
✅ Training recipes ✅ FP4/FP6/FP8 precision ✅ Distributed TP/PP/DP
✅ Inference recipes ✅ Ready for NVIDIA Blackwell ✅ CUDA Graphs
✅ LLMs, non LLMs and more ✅ Custom Triton kernels ✅ Compose all the above
</pre>
</div>
<div align='center'>
[License](https://github.com/Lightning-AI/lightning-thunder/blob/main/LICENSE)
[CI testing](https://github.com/Lightning-AI/lightning-thunder/actions/workflows/ci-testing.yml)
[CI checks](https://github.com/Lightning-AI/lightning-thunder/actions/workflows/ci-checks.yml)
[Documentation](https://lightning-thunder.readthedocs.io/en/latest/?badge=latest)
[pre-commit.ci](https://results.pre-commit.ci/latest/github/Lightning-AI/lightning-thunder/main)
</div>
<div align="center">
<div style="text-align: center;">
<a target="_blank" href="#quick-start" style="margin: 0 10px;">Quick start</a> •
<a target="_blank" href="#examples" style="margin: 0 10px;">Examples</a> •
<a target="_blank" href="#performance" style="margin: 0 10px;">Performance</a> •
<!-- <a target="_blank" href="#hosting-options" style="margin: 0 10px;">Hosting</a> • -->
<a target="_blank" href="https://lightning.ai/docs/thunder/latest/" style="margin: 0 10px;">Docs</a>
</div>
</div>
 
<!--
<div align="center">
<a target="_blank" href="https://lightning.ai/docs/thunder/home/get-started">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/get-started-badge.svg" height="36px" alt="Get started"/>
</a>
</div>
-->
 
<div align="center">
<img alt="Thunder" src="https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/pretrain_perf.png" width="800px" style="max-width: 100%;">
</div>
# Quick start
Install Thunder via pip ([more options](https://lightning.ai/docs/thunder/latest/fundamentals/installation.html)):
```bash
pip install torch==2.6.0 torchvision==0.21 nvfuser-cu124-torch26
pip install lightning-thunder
```
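A quick sanity check that the install resolved correctly (a minimal sketch; it only imports the package and reports the installed version):

```python
# Minimal install check: fails on import if the install is broken,
# then reports the installed distribution version.
import importlib.metadata

import thunder  # noqa: F401

print(importlib.metadata.version("lightning-thunder"))
```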
<details>
<summary>Advanced install options</summary>
### Blackwell support
For Blackwell you'll need CUDA 12.8:
```bash
pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu128
pip install --pre nvfuser-cu128 --extra-index-url https://pypi.nvidia.com
pip install lightning-thunder
```
### Install additional executors
These are optional; feel free to mix and match:
```bash
# cuDNN SDPA
pip install nvidia-cudnn-frontend
# Float8 support (this will compile from source, be patient)
pip install "transformer_engine[pytorch]"
```
### Install Thunder bleeding edge
```bash
pip install git+https://github.com/Lightning-AI/lightning-thunder.git@main
```
### Install Thunder for development
```bash
git clone https://github.com/Lightning-AI/lightning-thunder.git
cd lightning-thunder
pip install -e .
```
</details>
### Hello world
Define a function or a PyTorch module:
```python
import torch.nn as nn
model = nn.Sequential(nn.Linear(2048, 4096), nn.ReLU(), nn.Linear(4096, 64))
```
Optimize it with Thunder:
```python
import thunder
import torch
thunder_model = thunder.compile(model)
x = torch.randn(64, 2048)
y = thunder_model(x)
torch.testing.assert_close(y, model(x))
```
## Examples
### Speed up LLM training
Install LitGPT (without updating other dependencies):
```bash
pip install --no-deps 'litgpt[all]'
```
and run
```python
import thunder
import torch
import litgpt
with torch.device("cuda"):
    model = litgpt.GPT.from_name("Llama-3.2-1B").to(torch.bfloat16)
thunder_model = thunder.compile(model)
inp = torch.ones((1, 2048), device="cuda", dtype=torch.int64)
out = thunder_model(inp)
out.sum().backward()
```
### Speed up HuggingFace BERT inference
Install Hugging Face Transformers (version `4.50.2` or later is recommended):
```bash
pip install -U transformers
```
and run
```python
import thunder
import torch
import transformers
model_name = "bert-large-uncased"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
with torch.device("cuda"):
    model = transformers.AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16
    )
    model.requires_grad_(False)
    model.eval()

    inp = tokenizer(["Hello world!"], return_tensors="pt")
thunder_model = thunder.compile(model)
out = thunder_model(**inp)
print(out)
```
### Speed up HuggingFace DeepSeek R1 distill inference
Install Hugging Face Transformers (version `4.50.2` or later is recommended):
```bash
pip install -U transformers
```
and run
```python
import torch
import transformers
import thunder
model_name = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
with torch.device("cuda"):
    model = transformers.AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16
    )
    model.requires_grad_(False)
    model.eval()

    inp = tokenizer(["Hello world! Here's a long story"], return_tensors="pt")
thunder_model = thunder.compile(model)
out = thunder_model.generate(
    **inp, do_sample=False, cache_implementation="static", max_new_tokens=100
)
print(out)
```
To get an idea of the speedups, just run
```bash
python examples/quickstart/hf_llm.py
```
Here's what you get on an L4 machine from [Lightning Studio](https://lightning.ai):
```bash
Eager: 2273.22ms
Thunder: 1254.39ms
```
81% faster 🏎️! Quite the speedup ⚡️
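If you'd rather measure it inline, below is a rough timing sketch, continuing from the DeepSeek example above; it reuses `model`, `thunder_model`, and `inp` from that snippet and is only an illustration, not the contents of `examples/quickstart/hf_llm.py`.

```python
# Illustrative timing sketch: compare eager vs. Thunder generation latency.
# Assumes `model`, `thunder_model` and `inp` from the DeepSeek example above.
import time

import torch


def generate_ms(m, iters=3):
    # one warm-up call so compilation/caching is excluded from the measurement
    m.generate(**inp, do_sample=False, cache_implementation="static", max_new_tokens=100)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        m.generate(**inp, do_sample=False, cache_implementation="static", max_new_tokens=100)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3


print(f"Eager: {generate_ms(model):.2f}ms")
print(f"Thunder: {generate_ms(thunder_model):.2f}ms")
```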
### Speed up Vision Transformer inference
```python
import thunder
import torch
import torchvision as tv
with torch.device("cuda"):
    model = tv.models.vit_b_16()
    model.requires_grad_(False)
    model.eval()

    inp = torch.randn(128, 3, 224, 224)
out = model(inp)
thunder_model = thunder.compile(model)
out = thunder_model(inp)
```
### Benchmarking HF models
The script `examples/quickstart/hf_benchmarks.py` demonstrates how to benchmark a model for text generation, forward pass, forward pass with loss, and a full forward + backward computation.
On an H100 with torch=2.7.0 and nvfuser-cu126-torch27, running deepseek-ai/DeepSeek-R1-Distill-Llama-1.5B, the Thunder executors (nvFuser and torch.compile) achieve the following speedups:
```
Text generation:
Thunder (nvfuser): 3.36× faster
Thunder (torch.compile): 3.42× faster
Forward pass:
Thunder (nvfuser): 1.51× faster
Thunder (torch.compile): 1.63× faster
Forward pass + loss:
Thunder (nvfuser): 1.55× faster
Thunder (torch.compile): 1.64× faster
Forward + backward:
Thunder (nvfuser): 1.51× faster
Thunder (torch.compile): 1.69× faster
```
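For reference, here is a minimal, self-contained sketch of the non-generation workloads listed above (forward, forward + loss, forward + backward) on a small stand-in model; it only illustrates the idea and is not the actual `examples/quickstart/hf_benchmarks.py`.

```python
# Illustration of the measured workloads on a small stand-in model.
import torch
import torch.nn as nn
import thunder

with torch.device("cuda"):
    model = nn.Sequential(nn.Linear(1024, 2048), nn.ReLU(), nn.Linear(2048, 256))

thunder_model = thunder.compile(model)

x = torch.randn(64, 1024, device="cuda")
target = torch.randint(0, 256, (64,), device="cuda")

out = thunder_model(x)                                  # forward pass
loss = torch.nn.functional.cross_entropy(out, target)   # forward pass + loss
loss.backward()                                         # forward + backward
```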
## Plugins
Plugins are a way to apply optimizations to a model, such as parallelism and quantization.
Thunder comes with a few plugins included out of the box, but it's easy to write new ones.
- scale up with distributed strategies such as DDP, FSDP, and TP
- optimize numerical precision with FP8, MXFP8
- save memory with quantization
- reduce latency with CUDA Graphs
- debug and profile
For example, to reduce CPU overheads via CUDA Graphs, pass "reduce-overhead"
to the `plugins=` argument of `thunder.compile`:
```python
thunder_model = thunder.compile(model, plugins="reduce-overhead")
```
This may or may not make a big difference. The point of Thunder is that you can easily
swap optimizations in and out and explore the best combination for your setup.
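As a minimal sketch of that workflow (only the `"reduce-overhead"` plugin name below comes from this README; other plugin names should be checked against the docs):

```python
# Swap an optimization in and out: the same model compiled with and without CUDA Graphs.
import torch
import torch.nn as nn
import thunder

with torch.device("cuda"):
    model = nn.Sequential(nn.Linear(2048, 4096), nn.ReLU(), nn.Linear(4096, 64))

baseline = thunder.compile(model)
graphed = thunder.compile(model, plugins="reduce-overhead")

x = torch.randn(64, 2048, device="cuda")
# Same numerics, different execution plan; time both to pick what works best for your setup.
torch.testing.assert_close(baseline(x), graphed(x))
```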
## How it works
Thunder works in three stages:
1. ⚡️ It acquires your model by interpreting Python bytecode and producing a straight-line Python program
1. ⚡️ It transforms the computation trace to make it distributed, change precision, and more
1. ⚡️ It routes parts of the trace for execution
- fusion (`NVFuser`, `torch.compile`)
- specialized libraries (e.g. `cuDNN SDPA`, `TransformerEngine`)
- custom Triton and CUDA kernels
- PyTorch eager operations
 
<div align="center">
<img alt="Thunder" src="https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/how_it_works.png" width="800px" style="max-width: 100%;">
</div>
 
This is what the trace looks like for a simple MLP:
```python
import thunder
import torch
import torch.nn as nn
model = nn.Sequential(nn.Linear(1024, 2048), nn.ReLU(), nn.Linear(2048, 256))
thunder_model = thunder.compile(model)
y = thunder_model(torch.randn(4, 1024))
print(thunder.last_traces(thunder_model)[-1])
```
This is the acquired trace, ready to be transformed and executed:
```python
def computation(input, t_0_bias, t_0_weight, t_2_bias, t_2_weight):
    # input: "cuda:0 f32[4, 1024]"
    # t_0_bias: "cuda:0 f32[2048]"
    # t_0_weight: "cuda:0 f32[2048, 1024]"
    # t_2_bias: "cuda:0 f32[256]"
    # t_2_weight: "cuda:0 f32[256, 2048]"
    t3 = ltorch.linear(input, t_0_weight, t_0_bias)  # t3: "cuda:0 f32[4, 2048]"
    t6 = ltorch.relu(t3, False)  # t6: "cuda:0 f32[4, 2048]"
    t10 = ltorch.linear(t6, t_2_weight, t_2_bias)  # t10: "cuda:0 f32[4, 256]"
    return (t10,)
```
Note how Thunder's intermediate representation is just (a subset of) Python!
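Because `thunder.last_traces` is indexed with `[-1]` above, it returns the full history of traces, so you can also inspect every intermediate stage (a small sketch, assuming `thunder_model` from the MLP example is still in scope):

```python
# Walk the trace history, from the acquired trace to the final executable one.
for i, trace in enumerate(thunder.last_traces(thunder_model)):
    print(f"--- trace stage {i} ---")
    print(trace)
```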
## Performance
Thunder is fast. Here are the speed-ups obtained on a pre-training task using LitGPT on H100 and B200 hardware, relative to PyTorch eager.
<div align="center">
<img alt="Thunder" src="https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/pretrain_perf.png" width="800px" style="max-width: 100%;">
</div>
# Community
Thunder is an open source project developed in collaboration with the community, with significant contributions from NVIDIA.
💬 [Get help on Discord](https://discord.com/invite/XncpTy7DSt)
📋 [License: Apache 2.0](https://github.com/Lightning-AI/lightning-thunder/blob/main/LICENSE)
Raw data
```json
{
"_id": null,
"home_page": null,
"name": "lightning-thunder",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.14,>=3.10",
"maintainer_email": null,
"keywords": "deep learning, AI, compiler",
"author": null,
"author_email": "Lightning AI <support@lightning.ai>",
"download_url": "https://files.pythonhosted.org/packages/71/b0/7702a5ad66be56794022542174c4cf11b7723c88d87c678a2cac8812f7a5/lightning_thunder-0.2.4.tar.gz",
"platform": null,
"description": "<div align='center'>\n\n# Give your PyTorch models superpowers \u26a1\n\n</div>\n\n<div align=\"center\">\n<img alt=\"Thunder\" src=\"https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/LightningThunderLightModewByline.png#gh-light-mode-only\" width=\"400px\" style=\"max-width: 100%;\">\n<img alt=\"Thunder\" src=\"https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/LightningThunderDarkModewByline.png#gh-dark-mode-only\" width=\"400px\" style=\"max-width: 100%;\">\n<br/>\n<br/>\n\n \n\n<strong>Source-to-source compiler for PyTorch.</strong>\nFast. Understandable. Extensible.\n\n</div>\n\n______________________________________________________________________\n\n**Thunder** makes optimizing PyTorch models easy, augmenting them with custom kernels, fusions, quantization, distributed strategies, and more.\n\nFor **end users**, Thunder comes with plugins that provide model speed-ups out of the box, for optimal utilization of last generation hardware.\n\nFor **performance experts**, Thunder is the most ergonomic framework for understanding, modifying, and optimizing AI models through composable transformations.\n\n<div align='center'>\n\n<pre>\n\u2705 Run PyTorch 40% faster \u2705 Quantization \u2705 Kernel fusion \n\u2705 Training recipes \u2705 FP4/FP6/FP8 precision \u2705 Distributed TP/PP/DP \n\u2705 Inference recipes \u2705 Ready for NVIDIA Blackwell \u2705 CUDA Graphs \n\u2705 LLMs, non LLMs and more \u2705 Custom Triton kernels \u2705 Compose all the above\n</pre>\n\n</div>\n\n<div align='center'>\n\n[](https://github.com/Lightning-AI/lightning-thunder/blob/main/LICENSE)\n[](https://github.com/Lightning-AI/lightning-thunder/actions/workflows/ci-testing.yml)\n[](https://github.com/Lightning-AI/lightning-thunder/actions/workflows/ci-checks.yml)\n[](https://lightning-thunder.readthedocs.io/en/latest/?badge=latest)\n[](https://results.pre-commit.ci/latest/github/Lightning-AI/lightning-thunder/main)\n\n</div>\n\n<div align=\"center\">\n <div style=\"text-align: center;\">\n <a target=\"_blank\" href=\"#quick-start\" style=\"margin: 0 10px;\">Quick start</a> \u2022\n <a target=\"_blank\" href=\"#examples\" style=\"margin: 0 10px;\">Examples</a> \u2022\n <a target=\"_blank\" href=\"#performance\" style=\"margin: 0 10px;\">Performance</a> \u2022\n <!-- <a target=\"_blank\" href=\"#hosting-options\" style=\"margin: 0 10px;\">Hosting</a> \u2022 -->\n <a target=\"_blank\" href=\"https://lightning.ai/docs/thunder/latest/\" style=\"margin: 0 10px;\">Docs</a>\n </div>\n</div>\n\n \n\n<!--\n<div align=\"center\">\n<a target=\"_blank\" href=\"https://lightning.ai/docs/thunder/home/get-started\">\n <img src=\"https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/get-started-badge.svg\" height=\"36px\" alt=\"Get started\"/>\n</a>\n</div>\n-->\n\n \n\n<div align=\"center\">\n<img alt=\"Thunder\" src=\"https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/pretrain_perf.png\" width=\"800px\" style=\"max-width: 100%;\">\n</div>\n\n# Quick start\n\nInstall Thunder via pip ([more options](https://lightning.ai/docs/thunder/latest/fundamentals/installation.html)):\n\n```bash\npip install torch==2.6.0 torchvision==0.21 nvfuser-cu124-torch26\n\npip install lightning-thunder\n```\n\n<details>\n <summary>Advanced install options</summary>\n\n### Blackwell support\n\nFor Blackwell you'll need CUDA 12.8\n\n```bash\npip install --pre torch torchvision --index-url 
https://download.pytorch.org/whl/nightly/cu128\npip install --pre nvfuser-cu128 --extra-index-url https://pypi.nvidia.com\n\npip install lightning-thunder\n```\n\n### Install additional executors\n\nThese are optional, feel free to mix and match\n\n```bash\n# cuDNN SDPA\npip install nvidia-cudnn-frontend\n\n# Float8 support (this will compile from source, be patient)\npip install \"transformer_engine[pytorch]\"\n```\n\n### Install Thunder bleeding edge\n\n```bash\npip install git+https://github.com/Lightning-AI/lightning-thunder.git@main\n```\n\n### Install Thunder for development\n\n```bash\ngit clone https://github.com/Lightning-AI/lightning-thunder.git\ncd lightning-thunder\npip install -e .\n```\n\n</details>\n\n### Hello world\n\nDefine a function or a torch module:\n\n```python\nimport torch.nn as nn\n\nmodel = nn.Sequential(nn.Linear(2048, 4096), nn.ReLU(), nn.Linear(4096, 64))\n```\n\nOptimize it with Thunder:\n\n```python\nimport thunder\nimport torch\n\nthunder_model = thunder.compile(model)\n\nx = torch.randn(64, 2048)\n\ny = thunder_model(x)\n\nassert torch.testing.assert_close(y, model(x))\n```\n\n## Examples\n\n### Speed up LLM training\n\nInstall LitGPT (without updating other dependencies)\n\n```\npip install --no-deps 'litgpt[all]'\n```\n\nand run\n\n```python\nimport thunder\nimport torch\nimport litgpt\n\nwith torch.device(\"cuda\"):\n model = litgpt.GPT.from_name(\"Llama-3.2-1B\").to(torch.bfloat16)\n\nthunder_model = thunder.compile(model)\n\ninp = torch.ones((1, 2048), device=\"cuda\", dtype=torch.int64)\n\nout = thunder_model(inp)\nout.sum().backward()\n```\n\n### Speed up HuggingFace BERT inference\n\nInstall Hugging Face Transformers (recommended version is `4.50.2` and above)\n\n```\npip install -U transformers\n```\n\nand run\n\n```python\nimport thunder\nimport torch\nimport transformers\n\nmodel_name = \"bert-large-uncased\"\n\ntokenizer = transformers.AutoTokenizer.from_pretrained(model_name)\n\nwith torch.device(\"cuda\"):\n model = transformers.AutoModelForCausalLM.from_pretrained(\n model_name, torch_dtype=torch.bfloat16\n )\n model.requires_grad_(False)\n model.eval()\n\n inp = tokenizer([\"Hello world!\"], return_tensors=\"pt\")\n\nthunder_model = thunder.compile(model)\n\nout = thunder_model(**inp)\nprint(out)\n```\n\n### Speed up HuggingFace DeepSeek R1 distill inference\n\nInstall Hugging Face Transformers (recommended version is `4.50.2` and above)\n\n```\npip install -U transformers\n```\n\nand run\n\n```python\nimport torch\nimport transformers\nimport thunder\n\nmodel_name = \"deepseek-ai/DeepSeek-R1-Distill-Llama-8B\"\n\ntokenizer = transformers.AutoTokenizer.from_pretrained(model_name)\n\nwith torch.device(\"cuda\"):\n model = transformers.AutoModelForCausalLM.from_pretrained(\n model_name, torch_dtype=torch.bfloat16\n )\n model.requires_grad_(False)\n model.eval()\n\n inp = tokenizer([\"Hello world! Here's a long story\"], return_tensors=\"pt\")\n\nthunder_model = thunder.compile(model)\n\nout = thunder_model.generate(\n **inp, do_sample=False, cache_implementation=\"static\", max_new_tokens=100\n)\nprint(out)\n```\n\nTo get an idea of the speedups, just run\n\n```bash\npython examples/quickstart/hf_llm.py\n```\n\nHere what you get on a L4 machine from [Lightning Studio](https://lightning.ai):\n\n```bash\nEager: 2273.22ms\nThunder: 1254.39ms\n```\n\n81% faster \ud83c\udfce\ufe0f! 
Quite the speedup \u26a1\ufe0f\n\n### Speed up Vision Transformer inference\n\n```python\nimport thunder\nimport torch\nimport torchvision as tv\n\nwith torch.device(\"cuda\"):\n model = tv.models.vit_b_16()\n model.requires_grad_(False)\n model.eval()\n\n inp = torch.randn(128, 3, 224, 224)\n\nout = model(inp)\n\nthunder_model = thunder.compile(model)\n\nout = thunder_model(inp)\n```\n\n### Benchmarking HF models\n\nThe script `examples/quickstart/hf_benchmarks.py` demonstrates how to benchmark a model for text generation, forward pass, forward pass with loss, and a full forward + backward computation.\n\nOn an H100 with torch=2.7.0 and nvfuser-cu126-torch27, running deepseek-ai/DeepSeek-R1-Distill-Llama-1.5B, the thunder executors (NVFuser and torch.compile) achieve the following speedups:\n\n```\nText generation:\nThunder (nvfuser): 3.36\u00d7 faster\nThunder (torch.compile): 3.42\u00d7 faster\n\nForward pass:\nThunder (nvfuser): 1.51\u00d7 faster\nThunder (torch.compile): 1.63\u00d7 faster\n\nForward pass + loss:\nThunder (nvfuser): 1.55\u00d7 faster\nThunder (torch.compile): 1.64\u00d7 faster\n\nForward + backward:\nThunder (nvfuser): 1.51\u00d7 faster\nThunder (torch.compile): 1.69\u00d7 faster\n```\n\n## Plugins\n\nPlugins are a way to apply optimizations to a model, such as parallelism and quantization.\n\nThunder comes with a few plugins included of the box, but it's easy to write new ones.\n\n- scale up with distributed strategies with DDP, FSDP, TP ()\n- optimize numerical precision with FP8, MXFP8\n- save memory with quantization\n- reduce latency with CUDAGraphs\n- debugging and profiling\n\nFor example, in order to reduce CPU overheads via CUDAGraphs you can add \"reduce-overhead\"\nto the `plugins=` argument of `thunder.compile`:\n\n```python\nthunder_model = thunder.compile(model, plugins=\"reduce-overhead\")\n```\n\nThis may or may not make a big difference. The point of Thunder is that you can easily\nswap optimizations in and out and explore the best combination for your setup.\n\n## How it works\n\nThunder works in three stages:\n\n1. \u26a1\ufe0f It acquires your model by interpreting Python bytecode and producing a straight-line Python program\n\n1. \ufe0f\u26a1\ufe0f It transforms the computation trace to make it distributed, change precision\n\n1. \u26a1\ufe0f It routes parts of the trace for execution\n\n - fusion (`NVFuser`, `torch.compile`)\n - specialized libraries (e.g. 
`cuDNN SDPA`, `TransformerEngine`)\n - custom Triton and CUDA kernels\n - PyTorch eager operations\n\n \n\n<div align=\"center\">\n<img alt=\"Thunder\" src=\"https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/how_it_works.png\" width=\"800px\" style=\"max-width: 100%;\">\n</div>\n\n \n\nThis is how the trace looks like for a simple MLP:\n\n```python\nimport thunder\nimport torch.nn as nn\n\nmodel = nn.Sequential(nn.Linear(1024, 2048), nn.ReLU(), nn.Linear(2048, 256))\n\nthunder_model = thunder.compile(model)\ny = thunder_model(torch.randn(4, 1024))\n\nprint(thunder.last_traces(thunder_model)[-1])\n```\n\nThis is the acquired trace, ready to be transformed and executed:\n\n```python\ndef computation(input, t_0_bias, t_0_weight, t_2_bias, t_2_weight):\n# input: \"cuda:0 f32[4, 1024]\"\n# t_0_bias: \"cuda:0 f32[2048]\"\n# t_0_weight: \"cuda:0 f32[2048, 1024]\"\n# t_2_bias: \"cuda:0 f32[256]\"\n# t_2_weight: \"cuda:0 f32[256, 2048]\"\nt3 = ltorch.linear(input, t_0_weight, t_0_bias) # t3: \"cuda:0 f32[4, 2048]\"\nt6 = ltorch.relu(t3, False) # t6: \"cuda:0 f32[4, 2048]\"\nt10 = ltorch.linear(t6, t_2_weight, t_2_bias) # t10: \"cuda:0 f32[4, 256]\"\nreturn (t10,)\n```\n\nNote how Thunder's intermediate representation is just (a subset of) Python!\n\n## Performance\n\nThunder is fast. Here are the speed-ups obtained on a pre-training task using LitGPT on H100 and B200 hardware, relative to PyTorch eager.\n\n<div align=\"center\">\n<img alt=\"Thunder\" src=\"https://github.com/Lightning-AI/lightning-thunder/raw/0.2.4/docs/source/_static/images/pretrain_perf.png\" width=\"800px\" style=\"max-width: 100%;\">\n</div>\n\n# Community\n\nThunder is an open source project, developed in collaboration with the community with significant contributions from NVIDIA.\n\n\ud83d\udcac [Get help on Discord](https://discord.com/invite/XncpTy7DSt)\n\ud83d\udccb [License: Apache 2.0](https://github.com/Lightning-AI/litserve/blob/main/LICENSE)\n",
"bugtrack_url": null,
"license": null,
"summary": "Lightning Thunder is a source-to-source compiler for PyTorch, enabling PyTorch programs to run on different hardware accelerators and graph compilers.",
"version": "0.2.4",
"project_urls": {
"Bug Tracker": "https://github.com/Lightning-AI/lightning-thunder/issues",
"Documentation": "https://lightning-thunder.rtfd.io/en/latest/",
"Homepage": "https://github.com/Lightning-AI/lightning-thunder",
"Source": "https://github.com/Lightning-AI/lightning-thunder"
},
"split_keywords": [
"deep learning",
" ai",
" compiler"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5111a057cf09ded9825857531353367cd1e3ccce4a9bea542dfdab568adc96d8",
"md5": "ba73d7c3d873bc8c9f77c612f584f230",
"sha256": "72b4662085b739dc2cab1bff0838fd4e7d8082c5f6735d3cd4eff4e27a07fa10"
},
"downloads": -1,
"filename": "lightning_thunder-0.2.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ba73d7c3d873bc8c9f77c612f584f230",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.14,>=3.10",
"size": 938996,
"upload_time": "2025-06-24T10:50:08",
"upload_time_iso_8601": "2025-06-24T10:50:08.743862Z",
"url": "https://files.pythonhosted.org/packages/51/11/a057cf09ded9825857531353367cd1e3ccce4a9bea542dfdab568adc96d8/lightning_thunder-0.2.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "71b07702a5ad66be56794022542174c4cf11b7723c88d87c678a2cac8812f7a5",
"md5": "9e4b180f5f3bda6ed269911ccb2d5f1e",
"sha256": "b715fa73766003d9b3a52adf01c2f561b32490dc79fc9c96c70189033ed07a66"
},
"downloads": -1,
"filename": "lightning_thunder-0.2.4.tar.gz",
"has_sig": false,
"md5_digest": "9e4b180f5f3bda6ed269911ccb2d5f1e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.14,>=3.10",
"size": 606792,
"upload_time": "2025-06-24T10:50:09",
"upload_time_iso_8601": "2025-06-24T10:50:09.968869Z",
"url": "https://files.pythonhosted.org/packages/71/b0/7702a5ad66be56794022542174c4cf11b7723c88d87c678a2cac8812f7a5/lightning_thunder-0.2.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-06-24 10:50:09",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Lightning-AI",
"github_project": "lightning-thunder",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "lightning-thunder"
}
```