polars


- Name: polars
- Version: 1.14.0
- Summary: Blazingly fast DataFrame library
- Upload time: 2024-11-17 18:50:20
- Requires Python: >=3.9
- Keywords: dataframe, arrow, out-of-core

<h1 align="center">
  <a href="https://pola.rs">
    <img src="https://raw.githubusercontent.com/pola-rs/polars-static/master/banner/polars_github_banner.svg" alt="Polars logo">
  </a>
</h1>

<div align="center">
  <a href="https://crates.io/crates/polars">
    <img src="https://img.shields.io/crates/v/polars.svg" alt="crates.io Latest Release"/>
  </a>
  <a href="https://pypi.org/project/polars/">
    <img src="https://img.shields.io/pypi/v/polars.svg" alt="PyPi Latest Release"/>
  </a>
  <a href="https://www.npmjs.com/package/nodejs-polars">
    <img src="https://img.shields.io/npm/v/nodejs-polars.svg" alt="NPM Latest Release"/>
  </a>
  <a href="https://rpolars.r-universe.dev">
    <img src="https://rpolars.r-universe.dev/badges/polars" alt="R-universe Latest Release"/>
  </a>
  <a href="https://doi.org/10.5281/zenodo.7697217">
    <img src="https://zenodo.org/badge/DOI/10.5281/zenodo.7697217.svg" alt="DOI Latest Release"/>
  </a>
</div>

<p align="center">
  <b>Documentation</b>:
  <a href="https://docs.pola.rs/api/python/stable/reference/index.html">Python</a>
  -
  <a href="https://docs.rs/polars/latest/polars/">Rust</a>
  -
  <a href="https://pola-rs.github.io/nodejs-polars/index.html">Node.js</a>
  -
  <a href="https://pola-rs.github.io/r-polars/index.html">R</a>
  |
  <b>StackOverflow</b>:
  <a href="https://stackoverflow.com/questions/tagged/python-polars">Python</a>
  -
  <a href="https://stackoverflow.com/questions/tagged/rust-polars">Rust</a>
  -
  <a href="https://stackoverflow.com/questions/tagged/nodejs-polars">Node.js</a>
  -
  <a href="https://stackoverflow.com/questions/tagged/r-polars">R</a>
  |
  <a href="https://docs.pola.rs/">User guide</a>
  |
  <a href="https://discord.gg/4UfP5cfBE7">Discord</a>
</p>

## Polars: Blazingly fast DataFrames in Rust, Python, Node.js, R, and SQL

Polars is a DataFrame interface on top of an OLAP Query Engine implemented in Rust using
[Apache Arrow Columnar Format](https://arrow.apache.org/docs/format/Columnar.html) as the memory
model.

- Lazy | eager execution
- Multi-threaded
- SIMD
- Query optimization
- Powerful expression API
- Hybrid Streaming (larger-than-RAM datasets)
- Rust | Python | Node.js | R | ...

To learn more, read the [user guide](https://docs.pola.rs/).

## Python

```python
>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "A": [1, 2, 3, 4, 5],
...         "fruits": ["banana", "banana", "apple", "apple", "banana"],
...         "B": [5, 4, 3, 2, 1],
...         "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
...     }
... )

# embarrassingly parallel execution & very expressive query language
>>> df.sort("fruits").select(
...     "fruits",
...     "cars",
...     pl.lit("fruits").alias("literal_string_fruits"),
...     pl.col("B").filter(pl.col("cars") == "beetle").sum(),
...     pl.col("A").filter(pl.col("B") > 2).sum().over("cars").alias("sum_A_by_cars"),
...     pl.col("A").sum().over("fruits").alias("sum_A_by_fruits"),
...     pl.col("A").reverse().over("fruits").alias("rev_A_by_fruits"),
...     pl.col("A").sort_by("B").over("fruits").alias("sort_A_by_B_by_fruits"),
... )
shape: (5, 8)
┌──────────┬──────────┬──────────────┬─────┬─────────────┬─────────────┬─────────────┬─────────────┐
│ fruits   ┆ cars     ┆ literal_stri ┆ B   ┆ sum_A_by_ca ┆ sum_A_by_fr ┆ rev_A_by_fr ┆ sort_A_by_B │
│ ---      ┆ ---      ┆ ng_fruits    ┆ --- ┆ rs          ┆ uits        ┆ uits        ┆ _by_fruits  │
│ str      ┆ str      ┆ ---          ┆ i64 ┆ ---         ┆ ---         ┆ ---         ┆ ---         │
│          ┆          ┆ str          ┆     ┆ i64         ┆ i64         ┆ i64         ┆ i64         │
╞══════════╪══════════╪══════════════╪═════╪═════════════╪═════════════╪═════════════╪═════════════╡
│ "apple"  ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 7           ┆ 4           ┆ 4           │
│ "apple"  ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 7           ┆ 3           ┆ 3           │
│ "banana" ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 8           ┆ 5           ┆ 5           │
│ "banana" ┆ "audi"   ┆ "fruits"     ┆ 11  ┆ 2           ┆ 8           ┆ 2           ┆ 2           │
│ "banana" ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 8           ┆ 1           ┆ 1           │
└──────────┴──────────┴──────────────┴─────┴─────────────┴─────────────┴─────────────┴─────────────┘
```

## SQL

```python
>>> df = pl.scan_csv("docs/assets/data/iris.csv")
>>> ## OPTION 1
>>> # run SQL queries on frame-level
>>> df.sql("""
... SELECT species,
...   AVG(sepal_length) AS avg_sepal_length
... FROM self
... GROUP BY species
... """).collect()
shape: (3, 2)
┌────────────┬──────────────────┐
│ species    ┆ avg_sepal_length │
│ ---        ┆ ---              │
│ str        ┆ f64              │
╞════════════╪══════════════════╡
│ Virginica  ┆ 6.588            │
│ Versicolor ┆ 5.936            │
│ Setosa     ┆ 5.006            │
└────────────┴──────────────────┘
>>> ## OPTION 2
>>> # use pl.sql() to operate on the global context
>>> df2 = pl.LazyFrame({
...     "species": ["Setosa", "Versicolor", "Virginica"],
...     "blooming_season": ["Spring", "Summer", "Fall"],
... })
>>> pl.sql("""
... SELECT df.species,
...     AVG(df.sepal_length) AS avg_sepal_length,
...     df2.blooming_season
... FROM df
... LEFT JOIN df2 ON df.species = df2.species
... GROUP BY df.species, df2.blooming_season
... """).collect()
```

SQL commands can also be run directly from your terminal using the Polars CLI:

```bash
# run an inline SQL query
> polars -c "SELECT species, AVG(sepal_length) AS avg_sepal_length, AVG(sepal_width) AS avg_sepal_width FROM read_csv('docs/assets/data/iris.csv') GROUP BY species;"

# run interactively
> polars
Polars CLI v0.3.0
Type .help for help.

> SELECT species, AVG(sepal_length) AS avg_sepal_length, AVG(sepal_width) AS avg_sepal_width FROM read_csv('docs/assets/data/iris.csv') GROUP BY species;
```

Refer to the [Polars CLI repository](https://github.com/pola-rs/polars-cli) for more information.

## Performance 🚀🚀

### Blazingly fast

Polars is very fast. In fact, it is one of the best-performing solutions available. See the
results of the [PDS-H benchmarks](https://www.pola.rs/benchmarks.html).

### Lightweight

Polars is also very lightweight. It comes with zero required dependencies, and this shows in the
import times:

- polars: 70ms
- numpy: 104ms
- pandas: 520ms
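
Import times like those above vary with the machine, the Python version, and disk caches. A minimal sketch for measuring them yourself is shown below; `import_time_ms` is a hypothetical helper written for this example, not part of Polars. It times the import in a fresh interpreter so that already-loaded modules do not skew the result.

```python
import importlib.util
import subprocess
import sys


def import_time_ms(module: str, repeats: int = 3) -> float:
    """Best-of-N time (in ms) to import `module` in a fresh interpreter."""
    snippet = (
        "import time; start = time.perf_counter(); "
        f"import {module}; "
        "print((time.perf_counter() - start) * 1000)"
    )
    timings = []
    for _ in range(repeats):
        completed = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            check=True,
        )
        # The timing is the last line printed by the child process.
        timings.append(float(completed.stdout.strip().splitlines()[-1]))
    return min(timings)


for name in ("polars", "numpy", "pandas"):
    # Skip libraries that are not installed in this environment.
    if importlib.util.find_spec(name) is None:
        print(f"{name}: not installed")
        continue
    print(f"{name}: {import_time_ms(name):.0f}ms")
```

Taking the minimum over a few runs reduces noise from cold file-system caches; the absolute numbers will still differ from the ones listed above.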

### Handles larger-than-RAM data

If your data does not fit into memory, Polars' query engine can process your query (or parts of
it) in a streaming fashion. This drastically reduces memory requirements, so you might be able to
process a 250GB dataset on your laptop. Call `collect(streaming=True)` to run the query in
streaming mode. (This might be a little slower, but it is still very fast!)

## Setup

### Python

Install the latest Polars version with:

```sh
pip install polars
```

We also have a conda package (`conda install -c conda-forge polars`); however, pip is the preferred
way to install Polars.

To install Polars with all optional dependencies:

```sh
pip install 'polars[all]'
```

You can also install a subset of the optional dependencies:

```sh
pip install 'polars[numpy,pandas,pyarrow]'
```

See the [User Guide](https://docs.pola.rs/user-guide/installation/#feature-flags) for more details
on optional dependencies.

To see the current Polars version and a full list of its optional dependencies, run:

```python
import polars as pl
pl.show_versions()
```

Releases are frequent at the moment (weekly or every few days), so it is worth updating Polars
regularly to get the latest bug fixes and features.

### Rust

You can take the latest release from `crates.io`, or, if you want the latest features and
performance improvements, point to the `main` branch of this repo.

```toml
polars = { git = "https://github.com/pola-rs/polars", rev = "<optional git tag>" }
```

Requires Rust version `>=1.80`.

## Contributing

Want to contribute? Read our [contributing guide](https://docs.pola.rs/development/contributing/).

## Python: compile Polars from source

If you want a bleeding-edge release or maximal performance, you should compile Polars from source.

This can be done by going through the following steps in sequence:

1. Install the latest [Rust compiler](https://www.rust-lang.org/tools/install)
2. Install [maturin](https://maturin.rs/): `pip install maturin`
3. `cd py-polars` and choose one of the following:
   - `make build`, slow binary with debug assertions and symbols, fast compile times
   - `make build-release`, fast binary without debug assertions, minimal debug symbols, long compile
     times
   - `make build-nodebug-release`, same as build-release but without any debug symbols, slightly
     faster to compile
   - `make build-debug-release`, same as build-release but with full debug symbols, slightly slower
     to compile
   - `make build-dist-release`, fastest binary, extreme compile times

By default the binary is compiled with optimizations turned on for a modern CPU. Specify `LTS_CPU=1`
with the command if your CPU is older and does not support e.g. AVX2.

Note that the Rust crate implementing the Python bindings is called `py-polars` to distinguish it
from the wrapped Rust crate `polars` itself. However, both the Python package and the Python module
are named `polars`, so you can `pip install polars` and `import polars`.

## Using custom Rust functions in Python

Extending Polars with UDFs compiled in Rust is easy. We expose PyO3 extensions for `DataFrame` and
`Series` data structures. See more in https://github.com/pola-rs/pyo3-polars.

## Going big...

Do you expect more than 2^32 (~4.3 billion) rows? Compile Polars with the `bigidx` feature flag or,
for Python users, run `pip install polars-u64-idx`.

Don't use this unless you hit the row boundary as the default build of Polars is faster and consumes
less memory.
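
To make the row boundary concrete, here is a small sketch. `needs_big_index` is a hypothetical helper for this example, and the threshold is simply the 2^32 figure quoted above. The optional part at the end uses `pl.get_index_type()`, which should report the index type of the installed build, and is skipped when Polars is not available.

```python
# Row threshold quoted in this section: 2^32 rows.
ROW_LIMIT_U32 = 2**32


def needs_big_index(n_rows: int) -> bool:
    """Return True if the expected row count exceeds the quoted boundary."""
    return n_rows > ROW_LIMIT_U32


print(f"{ROW_LIMIT_U32:,}")             # 4,294,967,296
print(needs_big_index(1_000_000_000))   # False: the default build is enough
print(needs_big_index(10_000_000_000))  # True: consider the big-index build

# Optional: inspect which index type the installed build uses.
try:
    import polars as pl

    print(pl.get_index_type())
except (ImportError, AttributeError):
    # Polars missing, or this version lacks the helper; nothing to report.
    pass
```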

## Legacy

Do you want Polars to run on an old CPU (e.g. dating from before 2011), or on an `x86-64` build of
Python on Apple Silicon under Rosetta? Run `pip install polars-lts-cpu`. This version of Polars
is compiled without [AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) target features.
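
If you are unsure whether a machine falls into this category, a rough check on x86-64 Linux is to look at the CPU flags the kernel reports. The sketch below is only a heuristic: `cpu_flags` and `suggest_package` are hypothetical helpers, and treating the `avx2` flag as the deciding feature is an assumption made for this example, not a statement of the exact requirements of the default build.

```python
import platform
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def suggest_package(flags: set[str]) -> str:
    """Heuristic (assumption): no AVX2 flag means the legacy build is safer."""
    return "polars" if "avx2" in flags else "polars-lts-cpu"


cpuinfo = Path("/proc/cpuinfo")
# Only meaningful on x86-64 Linux; other platforms are left alone.
if platform.machine() in {"x86_64", "AMD64"} and cpuinfo.exists():
    print(suggest_package(cpu_flags(cpuinfo.read_text())))
else:
    print("No /proc/cpuinfo-based check available on this platform")
```

The parsing is separated from the file access so the logic can be exercised with sample text on any operating system.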

## Sponsors

[<img src="https://www.jetbrains.com/company/brand/img/jetbrains_logo.png" height="50" alt="JetBrains logo" />](https://www.jetbrains.com)


            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "polars",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "dataframe, arrow, out-of-core",
    "author": null,
    "author_email": "Ritchie Vink <ritchie46@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/10/5c/3376329b60b960cf0e615d6a2e13d57a59809665ee87960ad2edefd77a4e/polars-1.14.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Blazingly fast DataFrame library",
    "version": "1.14.0",
    "project_urls": {
        "Changelog": "https://github.com/pola-rs/polars/releases",
        "Documentation": "https://docs.pola.rs/api/python/stable/reference/index.html",
        "Homepage": "https://www.pola.rs/",
        "Repository": "https://github.com/pola-rs/polars"
    },
    "split_keywords": [
        "dataframe",
        " arrow",
        " out-of-core"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9177cee560ffa16842787f9524afeedc82c04dceb50ac42ab2ef6d3c2840a602",
                "md5": "63b1999e7175790c1899063f2180f211",
                "sha256": "f346177c6f3442e8e61eadc4830d588348bf3383b0100d1c942b5615813be16e"
            },
            "downloads": -1,
            "filename": "polars-1.14.0-cp39-abi3-macosx_10_12_x86_64.whl",
            "has_sig": false,
            "md5_digest": "63b1999e7175790c1899063f2180f211",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 34292998,
            "upload_time": "2024-11-17T18:49:23",
            "upload_time_iso_8601": "2024-11-17T18:49:23.287843Z",
            "url": "https://files.pythonhosted.org/packages/91/77/cee560ffa16842787f9524afeedc82c04dceb50ac42ab2ef6d3c2840a602/polars-1.14.0-cp39-abi3-macosx_10_12_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3caef084dbb5d80599d7bbabbc21aa42a129d7ac55afafa70190987a525c9694",
                "md5": "13a326ab48e7494f7dd26e7a240e911f",
                "sha256": "588b5622b3a73be874a8e432d45c8a122662c09ce5ba2d5e5966f6dacce2b914"
            },
            "downloads": -1,
            "filename": "polars-1.14.0-cp39-abi3-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "13a326ab48e7494f7dd26e7a240e911f",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 30178132,
            "upload_time": "2024-11-17T18:49:28",
            "upload_time_iso_8601": "2024-11-17T18:49:28.359254Z",
            "url": "https://files.pythonhosted.org/packages/3c/ae/f084dbb5d80599d7bbabbc21aa42a129d7ac55afafa70190987a525c9694/polars-1.14.0-cp39-abi3-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dd3af03ee80d8dba47b3fc10d02191ee1690b8d4791626da5ea0a29435bd9b24",
                "md5": "2778dfc610102bd72982c0e74185343c",
                "sha256": "0bc46ad6ceeec5d9d881f09c7c1811844e851980735f8455981cdea456e08f5c"
            },
            "downloads": -1,
            "filename": "polars-1.14.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "2778dfc610102bd72982c0e74185343c",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 35522548,
            "upload_time": "2024-11-17T18:49:31",
            "upload_time_iso_8601": "2024-11-17T18:49:31.435708Z",
            "url": "https://files.pythonhosted.org/packages/dd/3a/f03ee80d8dba47b3fc10d02191ee1690b8d4791626da5ea0a29435bd9b24/polars-1.14.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "47a071f9a9a207820fee6aa09fc21d88961eeb9772fb39b6b9922ad4cdb42df3",
                "md5": "fec2e77f87eebb8a8844f9cc8239541c",
                "sha256": "3fc0cf084f848799379e8eba14733ae0e9d66a0fa8ec41719df82ed29c827003"
            },
            "downloads": -1,
            "filename": "polars-1.14.0-cp39-abi3-manylinux_2_24_aarch64.whl",
            "has_sig": false,
            "md5_digest": "fec2e77f87eebb8a8844f9cc8239541c",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 31893694,
            "upload_time": "2024-11-17T18:49:35",
            "upload_time_iso_8601": "2024-11-17T18:49:35.048093Z",
            "url": "https://files.pythonhosted.org/packages/47/a0/71f9a9a207820fee6aa09fc21d88961eeb9772fb39b6b9922ad4cdb42df3/polars-1.14.0-cp39-abi3-manylinux_2_24_aarch64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9c844310aa0d4b526a60ee6e88e50514123682696bf37bc5f6c9298f6902e411",
                "md5": "26f1e0bb22e3c65d3a234a319f41e7ff",
                "sha256": "5ca507d162f88a44e1a945feecfa474fda0b66f378336d69b9ee23917da670c3"
            },
            "downloads": -1,
            "filename": "polars-1.14.0-cp39-abi3-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "26f1e0bb22e3c65d3a234a319f41e7ff",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 35280144,
            "upload_time": "2024-11-17T18:49:38",
            "upload_time_iso_8601": "2024-11-17T18:49:38.644400Z",
            "url": "https://files.pythonhosted.org/packages/9c/84/4310aa0d4b526a60ee6e88e50514123682696bf37bc5f6c9298f6902e411/polars-1.14.0-cp39-abi3-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "105c3376329b60b960cf0e615d6a2e13d57a59809665ee87960ad2edefd77a4e",
                "md5": "bcf7ddd557f3cd65f48cbeb197b09ad0",
                "sha256": "e34fbeca4664fba754a12d0a66b36569c4c9e5a0116108d9362067a0ca596b4d"
            },
            "downloads": -1,
            "filename": "polars-1.14.0.tar.gz",
            "has_sig": false,
            "md5_digest": "bcf7ddd557f3cd65f48cbeb197b09ad0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 4147056,
            "upload_time": "2024-11-17T18:50:20",
            "upload_time_iso_8601": "2024-11-17T18:50:20.198281Z",
            "url": "https://files.pythonhosted.org/packages/10/5c/3376329b60b960cf0e615d6a2e13d57a59809665ee87960ad2edefd77a4e/polars-1.14.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-17 18:50:20",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "pola-rs",
    "github_project": "polars",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "polars"
}
        