| Field | Value |
| --- | --- |
| Name | bodo |
| Version | 2025.4 |
| home_page | None |
| Summary | High-Performance Python Compute Engine for Data and AI |
| upload_time | 2025-04-07 21:42:43 |
| maintainer | None |
| docs_url | None |
| author | Bodo.ai |
| requires_python | <3.13,>=3.10 |
| license | None |
| keywords | data, analytics, cluster |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

<!--
NOTE: the example in this file is covered by tests in bodo/tests/test_quickstart_docs.py. Any changes to the examples in this file should also update the corresponding unit test.
-->

<h3 align="center">
<a href="https://docs.bodo.ai/latest/" target="_blank"><b>Docs</b></a>
·
<a href="https://bodocommunity.slack.com/join/shared_invite/zt-qwdc8fad-6rZ8a1RmkkJ6eOX1X__knA#/shared-invite/email" target="_blank"><b>Slack</b></a>
·
<a href="https://www.bodo.ai/benchmarks/" target="_blank"><b>Benchmarks</b></a>
</h3>
# Bodo: High-Performance Python Compute Engine for Data and AI
Bodo is a cutting-edge compute engine for large-scale Python data processing. Powered by an innovative auto-parallelizing just-in-time (JIT) compiler, Bodo transforms Python programs into highly optimized, parallel binaries without requiring code rewrites, which makes Bodo [20x to 240x faster](https://github.com/bodo-ai/Bodo/tree/main/benchmarks/nyc_taxi) than alternatives on the linked NYC Taxi benchmark!
<img src="benchmarks/img/nyc-taxi-benchmark.png" alt="NYC Taxi Benchmark" width="500"/>
Unlike traditional distributed computing frameworks, Bodo:
- Seamlessly supports native Python APIs like Pandas and NumPy.
- Eliminates runtime overheads common in driver-executor models by leveraging Message Passing Interface (MPI) technology for true distributed execution.
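
To make the contrast with driver-executor models more concrete, here is a minimal sketch in plain Python of the SPMD (single program, multiple data) style that MPI programs typically follow: every rank runs the same code on its own contiguous block of rows, and partial results are combined at the end. This is an illustration only, not Bodo's actual implementation; the helper name `block_range` and the block-partitioning scheme are assumptions made for the example, and the loop over ranks stands in for what would be separate processes in a real MPI program.

```python
def block_range(n_rows: int, n_ranks: int, rank: int) -> range:
    """Rows owned by `rank` when n_rows are split into contiguous, near-equal blocks.

    Hypothetical helper for illustration; not part of Bodo's API.
    """
    base, extra = divmod(n_rows, n_ranks)
    # The first `extra` ranks each take one additional row.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


n_rows, n_ranks = 10, 4

# Each "rank" computes a partial sum over only the rows it owns.
partial_sums = []
for rank in range(n_ranks):
    rows = block_range(n_rows, n_ranks, rank)
    partial_sums.append(sum(rows))
    print(f"rank {rank}: rows {list(rows)}, partial sum {partial_sums[-1]}")

# In a real MPI program this final step would be a collective reduction.
total = sum(partial_sums)
print("total:", total)
```

Because every rank works on its own block and no central driver hands out tasks, the only coordination in this sketch is the final combine step.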
## Goals
Bodo makes Python run much (much!) faster than it normally does!
1. **Exceptional Performance:**
Deliver HPC-grade performance and scalability for Python data workloads as if the code were written in C++/MPI, whether running on a laptop or across large cloud clusters.
2. **Easy to Use:**
Easily integrate into Python workflows with a simple decorator, and support native Pandas and NumPy APIs.
3. **Interoperable:**
Stay compatible with the regular Python ecosystem, and selectively speed up only the functions that Bodo supports.
4. **Integration with Modern Data Infrastructure:**
Provide robust support for industry-leading data platforms like Apache Iceberg and Snowflake, enabling smooth interoperability with existing ecosystems.
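
The "simple decorator" and "selectively speed up" goals can be sketched as follows: one function is decorated with `bodo.jit`, while the code around it stays ordinary Python. This is a minimal sketch, not an official example. The function names are hypothetical, and the `try`/`except` fallback to a no-op decorator is an assumption added only so the snippet still runs where Bodo is not installed.

```python
try:
    import bodo

    jit = bodo.jit  # the real decorator, when Bodo is installed
except ImportError:
    def jit(func):
        """Assumed stand-in: a no-op decorator so this sketch runs without Bodo."""
        return func


@jit
def sum_of_squares(n):
    # Numeric loop: the kind of function one might choose to compile.
    total = 0
    for i in range(n):
        total += i * i
    return total


def describe(n):
    # Ordinary, undecorated Python: runs in the regular interpreter
    # and simply calls the decorated function.
    return f"sum of squares below {n} is {sum_of_squares(n)}"


print(describe(10))
```

Only `sum_of_squares` is handed to the compiler; `describe` and the rest of the program are untouched, which is what makes incremental adoption possible.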
## Non-goals
1. *Full Python Language Support:*
We are currently focused on a targeted subset of Python used for data-intensive and computationally heavy workloads, rather than supporting the entire Python syntax and all library APIs.
2. *Non-Data Workloads:*
Bodo prioritizes applications in data engineering, data science, and AI/ML. It is not designed for general-purpose use cases that are not data-centric.
3. *Real-time Compilation:*
While compilation time is improving, Bodo is not yet optimized for scenarios requiring very short compilation times (e.g., workloads with execution times of only a few seconds).
## Key Features
- Automatic optimization & parallelization of Python programs using Pandas and NumPy.
- Linear scalability from laptops to large-scale clusters and supercomputers.
- Advanced scalable I/O support for Iceberg, Snowflake, Parquet, CSV, and JSON with automatic filter pushdown and column pruning for optimized data access.
- High-performance SQL engine that is natively integrated into Python.
See the Bodo documentation to learn more: https://docs.bodo.ai/
## Installation
Note: Bodo requires Python 3.10, 3.11, or 3.12.
Bodo can be installed using Pip or Conda:
```bash
pip install -U bodo
```
or
```bash
conda create -n Bodo python=3.12 -c conda-forge
conda activate Bodo
conda install bodo -c bodo.ai -c conda-forge
```
Bodo currently supports Linux x86, macOS (both x86 and ARM), and Windows. Linux ARM support (and more) is coming soon!
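
Before installing, it can help to confirm that the active interpreter falls inside the supported range (`>=3.10,<3.13`, as stated in the package metadata). The following is a small sketch using only the standard library; the helper name `is_supported_python` is hypothetical and not part of Bodo.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the interpreter satisfies `>=3.10,<3.13`.

    Hypothetical helper for illustration; not part of Bodo.
    """
    major_minor = tuple(version_info[:2])
    return (3, 10) <= major_minor < (3, 13)


status = "supported" if is_supported_python() else "not supported"
print(f"Python {sys.version.split()[0]} is {status} by this Bodo release")
```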
## Example Code
Here is an example of Pandas code that reads and processes a sample Parquet dataset with Bodo.
```python
import pandas as pd
import numpy as np
import bodo
import time

# Generate sample data
NUM_GROUPS = 30
NUM_ROWS = 20_000_000

df = pd.DataFrame({
    "A": np.arange(NUM_ROWS) % NUM_GROUPS,
    "B": np.arange(NUM_ROWS)
})
df.to_parquet("my_data.pq")

@bodo.jit(cache=True)
def computation():
    t1 = time.time()
    df = pd.read_parquet("my_data.pq")
    df2 = pd.DataFrame({"A": df.apply(lambda r: 0 if r.A == 0 else (r.B // r.A), axis=1)})
    df2.to_parquet("out.pq")
    print("Execution time:", time.time() - t1)

computation()
```
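
To make the row-wise expression in the example concrete, the sketch below applies the same rule to the first few rows in plain Python, with no Bodo or Pandas required. It rebuilds columns `A` and `B` the way the example defines them (`A = i % NUM_GROUPS`, `B = i`). The helper name `ratio_or_zero` is hypothetical and exists only for this illustration.

```python
NUM_GROUPS = 30


def ratio_or_zero(a: int, b: int) -> int:
    """Same rule as the lambda in the example: 0 when A is 0, otherwise B // A."""
    return 0 if a == 0 else b // a


# First 35 rows of (A, B), built the same way as the sample data.
rows = [(i % NUM_GROUPS, i) for i in range(35)]
result = [ratio_or_zero(a, b) for a, b in rows]

print(result[:5])     # rows 0..4   -> [0, 1, 1, 1, 1]
print(result[30:35])  # rows 30..34 -> [0, 31, 16, 11, 8]
```

The guard on `A == 0` matters because `A` wraps back to zero every `NUM_GROUPS` rows, so an unguarded floor division would fail on those rows.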
## How to Contribute
Please read our latest [project contribution guide](CONTRIBUTING.md).
## Getting involved
You can join our community and collaborate with other contributors by joining our [Slack channel](https://bodocommunity.slack.com/join/shared_invite/zt-qwdc8fad-6rZ8a1RmkkJ6eOX1X__knA#/shared-invite/email) – we’re excited to hear your ideas and help you get started!
Raw data
{
"_id": null,
"home_page": null,
"name": "bodo",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.10",
"maintainer_email": null,
"keywords": "data, analytics, cluster",
"author": "Bodo.ai",
"author_email": null,
"download_url": null,
"platform": null,
"description": "<!--\nNOTE: the example in this file is covered by tests in bodo/tests/test_quickstart_docs.py. Any changes to the examples in this file should also update the corresponding unit test.\n -->\n\n\n\n<h3 align=\"center\">\n <a href=\"https://docs.bodo.ai/latest/\" target=\"_blank\"><b>Docs</b></a>\n · \n <a href=\"https://bodocommunity.slack.com/join/shared_invite/zt-qwdc8fad-6rZ8a1RmkkJ6eOX1X__knA#/shared-invite/email\" target=\"_blank\"><b>Slack</b></a>\n · \n <a href=\"https://www.bodo.ai/benchmarks/\" target=\"_blank\"><b>Benchmarks</b></a>\n</h3>\n\n# Bodo: High-Performance Python Compute Engine for Data and AI\n\nBodo is a cutting edge compute engine for large scale Python data processing. Powered by an innovative auto-parallelizing just-in-time compiler, Bodo transforms Python programs into highly optimized, parallel binaries without requiring code rewrites, which makes Bodo [20x to 240x faster](https://github.com/bodo-ai/Bodo/tree/main/benchmarks/nyc_taxi) compared to alternatives!\n\n<img src=\"benchmarks/img/nyc-taxi-benchmark.png\" alt=\"NYC Taxi Benchmark\" width=\"500\"/>\n\nUnlike traditional distributed computing frameworks, Bodo:\n- Seamlessly supports native Python APIs like Pandas and NumPy.\n- Eliminates runtime overheads common in driver-executor models by leveraging Message Passing Interface (MPI) tech for true distributed execution.\n\n## Goals\n\nBodo makes Python run much (much!) faster than it normally does!\n\n1. **Exceptional Performance:**\nDeliver HPC-grade performance and scalability for Python data workloads as if the code was written in C++/MPI, whether running on a laptop or across large cloud clusters.\n\n2. **Easy to Use:**\nEasily integrate into Python workflows with a simple decorator, and support native Pandas and NumPy APIs.\n\n3. **Interoperable:**\nCompatible with regular Python ecosystem, and can selectively speed up only the functions that are Bodo supported.\n\n4. 
**Integration with Modern Data Infrastructure:**\nProvide robust support for industry-leading data platforms like Apache Iceberg and Snowflake, enabling smooth interoperability with existing ecosystems.\n\n\n## Non-goals\n\n1. *Full Python Language Support:*\nWe are currently focused on a targeted subset of Python used for data-intensive and computationally heavy workloads, rather than supporting the entire Python syntax and all library APIs.\n\n2. *Non-Data Workloads:*\nPrioritize applications in data engineering, data science, and AI/ML. Bodo is not designed for general-purpose use cases that are non-data-centric.\n\n3. *Real-time Compilation:*\nWhile compilation time is improving, Bodo is not yet optimized for scenarios requiring very short compilation times (e.g., workloads with execution times of only a few seconds).\n\n\n## Key Features\n\n- Automatic optimization & parallelization of Python programs using Pandas and NumPy.\n- Linear scalability from laptops to large-scale clusters and supercomputers.\n- Advanced scalable I/O support for Iceberg, Snowflake, Parquet, CSV, and JSON with automatic filter pushdown and column pruning for optimized data access.\n- High performance SQL Engine that is natively integrated into Python.\n\nSee Bodo documentation to learn more: https://docs.bodo.ai/\n\n\n## Installation\n\nNote: Bodo requires Python 3.10, 3.11, or 3.12.\n\nBodo can be installed using Pip or Conda:\n\n```bash\npip install -U bodo\n```\n\nor\n\n```bash\nconda create -n Bodo python=3.12 -c conda-forge\nconda activate Bodo\nconda install bodo -c bodo.ai -c conda-forge\n```\n\nBodo works with Linux x86, both Mac x86 and Mac ARM, and Windows right now. 
We will have Linux ARM support (and more) coming soon!\n\n## Example Code\n\nHere is an example Pandas code that reads and processes a sample Parquet dataset with Bodo.\n\n\n```python\nimport pandas as pd\nimport numpy as np\nimport bodo\nimport time\n\n# Generate sample data\nNUM_GROUPS = 30\nNUM_ROWS = 20_000_000\n\ndf = pd.DataFrame({\n \"A\": np.arange(NUM_ROWS) % NUM_GROUPS,\n \"B\": np.arange(NUM_ROWS)\n})\ndf.to_parquet(\"my_data.pq\")\n\n@bodo.jit(cache=True)\ndef computation():\n t1 = time.time()\n df = pd.read_parquet(\"my_data.pq\")\n df2 = pd.DataFrame({\"A\": df.apply(lambda r: 0 if r.A == 0 else (r.B // r.A), axis=1)})\n df2.to_parquet(\"out.pq\")\n print(\"Execution time:\", time.time() - t1)\n\ncomputation()\n```\n\n## How to Contribute\n\nPlease read our latest [project contribution guide](CONTRIBUTING.md).\n\n## Getting involved\n\nYou can join our community and collaborate with other contributors by joining our [Slack channel](https://bodocommunity.slack.com/join/shared_invite/zt-qwdc8fad-6rZ8a1RmkkJ6eOX1X__knA#/shared-invite/email) \u2013 we\u2019re excited to hear your ideas and help you get started!\n\n[](https://codecov.io/github/bodo-ai/Bodo)",
"bugtrack_url": null,
"license": null,
"summary": "High-Performance Python Compute Engine for Data and AI",
"version": "2025.4",
"project_urls": {
"Documentation": "https://docs.bodo.ai",
"Homepage": "https://bodo.ai",
"Repository": "https://github.com/bodo-ai/Bodo"
},
"split_keywords": [
"data",
" analytics",
" cluster"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5da25c0487964f525b0bf818a6b41973b847c02afef1bc6e0c1ffa3d5266aa11",
"md5": "8cd084117031902d91a9521a83a47864",
"sha256": "c5cc6436e19b5cb1029394161224a23f41fbaf24b1fa4a6c4cd4c8c1be58792d"
},
"downloads": -1,
"filename": "bodo-2025.4-cp310-cp310-macosx_10_15_x86_64.whl",
"has_sig": false,
"md5_digest": "8cd084117031902d91a9521a83a47864",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.13,>=3.10",
"size": 47312970,
"upload_time": "2025-04-07T21:42:43",
"upload_time_iso_8601": "2025-04-07T21:42:43.613517Z",
"url": "https://files.pythonhosted.org/packages/5d/a2/5c0487964f525b0bf818a6b41973b847c02afef1bc6e0c1ffa3d5266aa11/bodo-2025.4-cp310-cp310-macosx_10_15_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "cf0cb15f0aa623f6b5f2c3986426c5a0afb5ed8633291829e19ef9928e1997c9",
"md5": "883854ac1bc03f2e112532f433cf5dfd",
"sha256": "6d0198372c3df2b5ddf89b7e65f3f62040f571f61848c13ec617050102d7ef7b"
},
"downloads": -1,
"filename": "bodo-2025.4-cp310-cp310-macosx_12_0_arm64.whl",
"has_sig": false,
"md5_digest": "883854ac1bc03f2e112532f433cf5dfd",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.13,>=3.10",
"size": 33448018,
"upload_time": "2025-04-07T21:42:47",
"upload_time_iso_8601": "2025-04-07T21:42:47.236142Z",
"url": "https://files.pythonhosted.org/packages/cf/0c/b15f0aa623f6b5f2c3986426c5a0afb5ed8633291829e19ef9928e1997c9/bodo-2025.4-cp310-cp310-macosx_12_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "3b14bc9dfa9faeb0f8f4db947fdaf28865948afe0e8ce6e069f02f8eed8fb195",
"md5": "f46fe6b968b0e27b3b86582d342e9b07",
"sha256": "c5ecfbe7a09a65d66e50d5cc07ce52afe5255b5e2b91707e74bc07b7eef869c7"
},
"downloads": -1,
"filename": "bodo-2025.4-cp310-cp310-manylinux_2_28_x86_64.whl",
"has_sig": false,
"md5_digest": "f46fe6b968b0e27b3b86582d342e9b07",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.13,>=3.10",
"size": 49760828,
"upload_time": "2025-04-07T21:42:51",
"upload_time_iso_8601": "2025-04-07T21:42:51.030377Z",
"url": "https://files.pythonhosted.org/packages/3b/14/bc9dfa9faeb0f8f4db947fdaf28865948afe0e8ce6e069f02f8eed8fb195/bodo-2025.4-cp310-cp310-manylinux_2_28_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "0479cf2fbe0542f8f15ddbe9f04d1b17586698a43064c69a82f97f2b5b65b381",
"md5": "39a075e36bf85f1a8c128107704847c7",
"sha256": "503b4a3f66b7219f33fafb569eddaf5fd0989bcb22515db1924b2c039c973c3e"
},
"downloads": -1,
"filename": "bodo-2025.4-cp310-cp310-win_amd64.whl",
"has_sig": false,
"md5_digest": "39a075e36bf85f1a8c128107704847c7",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.13,>=3.10",
"size": 13050924,
"upload_time": "2025-04-07T21:42:53",
"upload_time_iso_8601": "2025-04-07T21:42:53.809219Z",
"url": "https://files.pythonhosted.org/packages/04/79/cf2fbe0542f8f15ddbe9f04d1b17586698a43064c69a82f97f2b5b65b381/bodo-2025.4-cp310-cp310-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "42e77d1f8a8954d7c3efaef6804bbac80f5bdd148f524d6a12426733f5dbd16d",
"md5": "7256eba7e5aa0619a7eef1cd25df4ace",
"sha256": "69203810149e90b2ad02c1623f37564171532a69153a612f0762b82c1a28149f"
},
"downloads": -1,
"filename": "bodo-2025.4-cp311-cp311-macosx_10_15_x86_64.whl",
"has_sig": false,
"md5_digest": "7256eba7e5aa0619a7eef1cd25df4ace",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": "<3.13,>=3.10",
"size": 47302586,
"upload_time": "2025-04-07T21:42:56",
"upload_time_iso_8601": "2025-04-07T21:42:56.437522Z",
"url": "https://files.pythonhosted.org/packages/42/e7/7d1f8a8954d7c3efaef6804bbac80f5bdd148f524d6a12426733f5dbd16d/bodo-2025.4-cp311-cp311-macosx_10_15_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "38b74c71132649396a199d07f6a707460ca0b4e845f608b4d54b459480bce8cd",
"md5": "47ec5855ed063b6ed05a2c7986abaf90",
"sha256": "a03421b8030075f9fd5d865b3d1c59e3333e830c13e0bc371d8188d302dca0f8"
},
"downloads": -1,
"filename": "bodo-2025.4-cp311-cp311-macosx_12_0_arm64.whl",
"has_sig": false,
"md5_digest": "47ec5855ed063b6ed05a2c7986abaf90",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": "<3.13,>=3.10",
"size": 33433557,
"upload_time": "2025-04-07T21:43:00",
"upload_time_iso_8601": "2025-04-07T21:43:00.366899Z",
"url": "https://files.pythonhosted.org/packages/38/b7/4c71132649396a199d07f6a707460ca0b4e845f608b4d54b459480bce8cd/bodo-2025.4-cp311-cp311-macosx_12_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5852a25d3460f580b900106328565d98a381b7d24f0d176a8b713b8a4df1953d",
"md5": "68a3330761436c736aaf85855bd94e09",
"sha256": "14a3f080dc2107b62b8009ecd6fecdd0eb3d988b2147207dc517e8e10a7527c5"
},
"downloads": -1,
"filename": "bodo-2025.4-cp311-cp311-manylinux_2_28_x86_64.whl",
"has_sig": false,
"md5_digest": "68a3330761436c736aaf85855bd94e09",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": "<3.13,>=3.10",
"size": 49996128,
"upload_time": "2025-04-07T21:43:03",
"upload_time_iso_8601": "2025-04-07T21:43:03.684843Z",
"url": "https://files.pythonhosted.org/packages/58/52/a25d3460f580b900106328565d98a381b7d24f0d176a8b713b8a4df1953d/bodo-2025.4-cp311-cp311-manylinux_2_28_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "e779f737c086db7ce4bbbf467720f10de34e5d53af575a5eb61ad31be1c78b7f",
"md5": "89587c1a89dacce8f25477ba097ee203",
"sha256": "e3b9d231dfee06d0da6383fc5c085ccd93db8aa9a2fc8e4a2c8fa7a2c6723725"
},
"downloads": -1,
"filename": "bodo-2025.4-cp311-cp311-win_amd64.whl",
"has_sig": false,
"md5_digest": "89587c1a89dacce8f25477ba097ee203",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": "<3.13,>=3.10",
"size": 13063035,
"upload_time": "2025-04-07T21:43:07",
"upload_time_iso_8601": "2025-04-07T21:43:07.496959Z",
"url": "https://files.pythonhosted.org/packages/e7/79/f737c086db7ce4bbbf467720f10de34e5d53af575a5eb61ad31be1c78b7f/bodo-2025.4-cp311-cp311-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "77f026a5eb054d93c29a80451c0390672331450735dd7e66bd133d0d2527eab3",
"md5": "5ff913e027a191a3aedc6be3fee9f083",
"sha256": "49537077cd77f71d49c0be2d8eedc3c75a649c4a3bc8363495304864ca5f2421"
},
"downloads": -1,
"filename": "bodo-2025.4-cp312-cp312-macosx_10_15_x86_64.whl",
"has_sig": false,
"md5_digest": "5ff913e027a191a3aedc6be3fee9f083",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": "<3.13,>=3.10",
"size": 47286190,
"upload_time": "2025-04-07T21:43:10",
"upload_time_iso_8601": "2025-04-07T21:43:10.583269Z",
"url": "https://files.pythonhosted.org/packages/77/f0/26a5eb054d93c29a80451c0390672331450735dd7e66bd133d0d2527eab3/bodo-2025.4-cp312-cp312-macosx_10_15_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "48a667b0feb21a78bd8fc97ae5c718095039168047ced520ea65a6a172cb76a6",
"md5": "7164427f7abdfc78af3329931034a1e4",
"sha256": "01222976a3ed2bb1f85057cc9b5b70a9231126d062da19a9641aa12d09bc9ed9"
},
"downloads": -1,
"filename": "bodo-2025.4-cp312-cp312-macosx_12_0_arm64.whl",
"has_sig": false,
"md5_digest": "7164427f7abdfc78af3329931034a1e4",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": "<3.13,>=3.10",
"size": 33421172,
"upload_time": "2025-04-07T21:43:13",
"upload_time_iso_8601": "2025-04-07T21:43:13.626698Z",
"url": "https://files.pythonhosted.org/packages/48/a6/67b0feb21a78bd8fc97ae5c718095039168047ced520ea65a6a172cb76a6/bodo-2025.4-cp312-cp312-macosx_12_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "6b9e0b92b846ccee2cc142a7b4ffc88d4fd042e49d1f83763283a4eba2a311b8",
"md5": "08235e2d79ed26912677d86c3c336135",
"sha256": "039592872ee475e7cdfbecb7b0b4c2a54e550325a58838e3f8bf73574fe6bea6"
},
"downloads": -1,
"filename": "bodo-2025.4-cp312-cp312-manylinux_2_28_x86_64.whl",
"has_sig": false,
"md5_digest": "08235e2d79ed26912677d86c3c336135",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": "<3.13,>=3.10",
"size": 50179390,
"upload_time": "2025-04-07T21:43:17",
"upload_time_iso_8601": "2025-04-07T21:43:17.095413Z",
"url": "https://files.pythonhosted.org/packages/6b/9e/0b92b846ccee2cc142a7b4ffc88d4fd042e49d1f83763283a4eba2a311b8/bodo-2025.4-cp312-cp312-manylinux_2_28_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "1f75924a93c051a94b53ac14e3a77d7a945ec69e163eb24f7f3437bf374f6ebb",
"md5": "7d0d2fc099effd3a30413475c3fad475",
"sha256": "6c53a5bdd6fc356dbf66ef9a49a0f19a1f1ae13e38cf7d0a207247460422c4b3"
},
"downloads": -1,
"filename": "bodo-2025.4-cp312-cp312-win_amd64.whl",
"has_sig": false,
"md5_digest": "7d0d2fc099effd3a30413475c3fad475",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": "<3.13,>=3.10",
"size": 13185109,
"upload_time": "2025-04-07T21:43:20",
"upload_time_iso_8601": "2025-04-07T21:43:20.465037Z",
"url": "https://files.pythonhosted.org/packages/1f/75/924a93c051a94b53ac14e3a77d7a945ec69e163eb24f7f3437bf374f6ebb/bodo-2025.4-cp312-cp312-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-04-07 21:42:43",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "bodo-ai",
"github_project": "Bodo",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "bodo"
}