| Field | Value |
| --- | --- |
| Name | h5dataframe |
| Version | 0.2.3 |
| home_page | None |
| Summary | Drop-in replacement for pandas DataFrames that allows to store data on an hdf5 file and manipulate data directly from that hdf5 file without loading it in memory. |
| upload_time | 2025-09-10 22:19:21 |
| maintainer | None |
| docs_url | None |
| author | Matteo Bouvier |
| requires_python | >=3.10 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# h5dataframe
Drop-in replacement for pandas DataFrames that stores data in an HDF5 file and lets you manipulate that data directly from the file, without loading it into memory.
# Warning
This is very much a **work in progress**: some features might not work yet or might cause bugs.
**Save** your data elsewhere before converting it to an H5DataFrame.
If you miss a feature from pandas DataFrames, please file an issue or feel free to contribute.
# Overview
This library provides the `H5DataFrame` object, replacing the regular `pandas.DataFrame`.
An `H5DataFrame` can be created from a `pandas.DataFrame` or from a dictionary mapping column names to column values (`column_name -> column_values`).
```python
>>> import pandas as pd
>>> from h5dataframe import H5DataFrame
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]},
...                   index=['r1', 'r2', 'r3'])
>>> hdf = H5DataFrame(df)
>>> hdf
    a  b
r1  1  4
r2  2  5
r3  3  6
[RAM]
[3 rows x 2 columns]
```
At this point, all the data is still loaded in RAM, as indicated by the `[RAM]` marker on the second-to-last line of the output. To write the data to an HDF5 file, use the `H5DataFrame.write()` method.
```python
>>> hdf.write('path/to/file.h5')
>>> hdf
    a  b
r1  1  4
r2  2  5
r3  3  6
[FILE]
[3 rows x 2 columns]
```
The `H5DataFrame` is now backed by an HDF5 file and only loads data into RAM when requested.
Alternatively, an `H5DataFrame` can be read directly from a previously created HDF5 file with the `H5DataFrame.read()` method.
```python
>>> from h5dataframe import H5Mode
>>> H5DataFrame.read('path/to/file.h5', mode=H5Mode.READ)
    a  b
r1  1  4
r2  2  5
r3  3  6
[FILE]
[3 rows x 2 columns]
```
The default mode is `READ` (`'r'`), which creates a **read-only** `H5DataFrame`. To modify the data, use `mode=H5Mode.READ_WRITE` (`'r+'`).
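The mode strings `'r'` and `'r+'` are the same ones accepted by Python's built-in `open()` (and by `h5py.File`). As a rough analogy only, and not h5dataframe's own code, the sketch below uses a plain text file to show what the two modes mean: a handle opened with `'r'` rejects writes, while `'r+'` allows an existing file to be modified in place. The helper `try_write` is hypothetical, and whether `H5DataFrame` signals a write attempt in read-only mode with a similar error is an assumption that this README does not confirm.

```python
import io
import os
import tempfile


def try_write(path: str, mode: str) -> bool:
    """Return True if a file opened with `mode` accepts a write (hypothetical helper)."""
    with open(path, mode) as handle:
        try:
            handle.write("x")
            return True
        except io.UnsupportedOperation:
            # Raised by the built-in file object when it was not opened for writing.
            return False


# Create a small throwaway file to open in both modes.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("abc")
    example_path = tmp.name

read_only_ok = try_write(example_path, "r")    # 'r': read-only, the write is rejected
read_write_ok = try_write(example_path, "r+")  # 'r+': existing file, read and write

with open(example_path) as handle:
    final_content = handle.read()

os.remove(example_path)
print(read_only_ok, read_write_ok, final_content)
```

Only the `'r+'` handle changes the file, which is why opening with the default mode is the safer choice when the data only needs to be inspected.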
# Installation
From pip:
```shell
pip install h5dataframe
```
From source:
```shell
git clone git@github.com:Vidium/h5dataframe.git
```
Raw data
{
"_id": null,
"home_page": null,
"name": "h5dataframe",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": "Matteo Bouvier",
"author_email": "Matteo Bouvier <matteo.bouvier@hotmail.fr>",
"download_url": "https://files.pythonhosted.org/packages/51/b7/4c0d744fb34104565e6b1251555bebaa384a5560e3a9cf49e54ff2d03331/h5dataframe-0.2.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Drop-in replacement for pandas DataFrames that allows to store data on an hdf5 file and manipulate data directly from that hdf5 file without loading it in memory.",
"version": "0.2.3",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "fce9f54181b30f06432df1ee9e1bb18c43128e35c3d572167462a7c68fbc941d",
"md5": "f2eeeb9f630973ada5af5637453b45f3",
"sha256": "189ba906fc8ef6fc46f65a7af26cc8d55fb4a6847201161c63c66ffaac72f352"
},
"downloads": -1,
"filename": "h5dataframe-0.2.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f2eeeb9f630973ada5af5637453b45f3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 14609,
"upload_time": "2025-09-10T22:19:20",
"upload_time_iso_8601": "2025-09-10T22:19:20.108367Z",
"url": "https://files.pythonhosted.org/packages/fc/e9/f54181b30f06432df1ee9e1bb18c43128e35c3d572167462a7c68fbc941d/h5dataframe-0.2.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "51b74c0d744fb34104565e6b1251555bebaa384a5560e3a9cf49e54ff2d03331",
"md5": "88781210a28e918580cf26f67448fb58",
"sha256": "54db846e3b897f01c34850c72bf96ad97a40d7b2760e1c41337bab956c516456"
},
"downloads": -1,
"filename": "h5dataframe-0.2.3.tar.gz",
"has_sig": false,
"md5_digest": "88781210a28e918580cf26f67448fb58",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 11511,
"upload_time": "2025-09-10T22:19:21",
"upload_time_iso_8601": "2025-09-10T22:19:21.439111Z",
"url": "https://files.pythonhosted.org/packages/51/b7/4c0d744fb34104565e6b1251555bebaa384a5560e3a9cf49e54ff2d03331/h5dataframe-0.2.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-10 22:19:21",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "h5dataframe"
}