# lakeFS High-Level Python SDK
The lakeFS High-Level SDK for Python provides developers with the following features:
1. Simpler programming interface with less configuration
2. Inferring identity from environment
3. Better abstractions for common, more complex operations (I/O, transactions, imports)
## Requirements
Python 3.9+
## Installation & Usage
### pip install
```sh
pip install lakefs
```
### Import the package
```python
import lakefs
```
## Getting Started
Follow the [installation procedure](#installation--usage), then use the following snippet as a quick start:
```python
import lakefs
from lakefs.client import Client
# The default client authenticates with the lakeFS server using configured credentials,
# taken from environment variables or a .lakectl.yaml file if either exists
repo = lakefs.repository(repository_id="my-repo")
# Or explicitly initialize and provide a Client object
clt = Client(username="<lakefs_access_key_id>", password="<lakefs_secret_access_key>", host="<lakefs_endpoint>")
repo = lakefs.Repository(repository_id="my-repo", client=clt)
# From this point, proceed using the package according to documentation
main_branch = repo.create(storage_namespace="<storage_namespace>").branch(branch_id="main")
...
```
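Because the default client infers credentials from the environment, it can help to check what is actually set before constructing a repository. The sketch below is a minimal, standard-library-only check. The three variable names are an assumption (they are the ones lakectl conventionally uses) and `missing_lakefs_env` is a hypothetical helper, not part of the SDK, so verify the names against the documentation for your SDK version.

```python
import os

# Assumed variable names (shared with lakectl); verify them for your SDK version.
LAKECTL_ENV_VARS = (
    "LAKECTL_SERVER_ENDPOINT_URL",
    "LAKECTL_CREDENTIALS_ACCESS_KEY_ID",
    "LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY",
)


def missing_lakefs_env(environ=None):
    """Return the assumed lakeFS variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in LAKECTL_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_lakefs_env()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All assumed lakeFS environment variables are set.")
```

If any variable is reported as missing and no `.lakectl.yaml` file exists, passing an explicit `Client` as shown above is the safer route.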
## Examples
### Print sizes of all objects in lakefs://repo/main~2
```py
ref = lakefs.Repository("repo").ref("main~2")
for obj in ref.objects():
    print(f"{obj.path}: {obj.size_bytes}")
```
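The same listing can feed a small aggregation. The following sketch sums sizes per top-level path segment. `size_by_top_level` is a hypothetical helper that only relies on the `path` and `size_bytes` attributes used in the listing above, and the `SimpleNamespace` records are stand-ins so the example runs without a lakeFS server.

```python
from collections import defaultdict
from types import SimpleNamespace


def size_by_top_level(objects):
    """Sum size_bytes per first path segment for objects exposing .path and .size_bytes."""
    totals = defaultdict(int)
    for obj in objects:
        head, sep, _ = obj.path.partition("/")
        # Objects without a "/" in their path are grouped under "(root)".
        key = head + "/" if sep else "(root)"
        totals[key] += obj.size_bytes
    return dict(totals)


# Stand-in records; with the SDK you would pass ref.objects() instead.
sample = [
    SimpleNamespace(path="raw/a.csv", size_bytes=100),
    SimpleNamespace(path="raw/b.csv", size_bytes=50),
    SimpleNamespace(path="README.md", size_bytes=7),
]
print(size_by_top_level(sample))
```

With a real repository, calling `size_by_top_level(ref.objects())` should give the same kind of summary, assuming the listing yields only object records.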
### Difference between two branches
```py
for i in lakefs.Repository("repo").ref("main").diff("twig"):
    print(i)
```
You can also use [ref expressions][lakefs-spec-ref] here; for instance,
`.diff("main~2")` also works. Ref expressions are the lakeFS analogue of
[how Git specifies revisions][git-spec-rev].
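A diff is often easier to read as a summary than as a raw listing. The sketch below tallies changes by kind. It assumes each diff entry exposes a `type` attribute with values such as `"added"` or `"removed"`; that attribute name, the `summarize_changes` helper, and the stand-in records are assumptions for illustration and should be checked against the SDK's model documentation.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_changes(changes):
    """Count diff entries by their .type attribute (assumed field name)."""
    return Counter(change.type for change in changes)


# Stand-in records; with the SDK you would pass the result of .diff(...) instead.
sample = [
    SimpleNamespace(type="added", path="data/new.parquet"),
    SimpleNamespace(type="removed", path="data/old.parquet"),
    SimpleNamespace(type="added", path="data/extra.parquet"),
]
for kind, count in sorted(summarize_changes(sample).items()):
    print(f"{kind}: {count}")
```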
### Search a stored object for a string
```py
with lakefs.Repository("repo").ref("main").object("path/to/data").reader(mode="r") as f:
    for line in f:
        if "quick" in line:
            print(line)
```
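Since the reader is iterated line by line like a text file, the search logic can be factored into a function that works on any iterable of lines. This is a minimal sketch: `find_lines` is a hypothetical helper, and `io.StringIO` stands in for the lakeFS reader so the example runs locally.

```python
import io


def find_lines(lines, needle):
    """Yield (line_number, line_without_newline) for every line containing needle."""
    for number, line in enumerate(lines, start=1):
        if needle in line:
            yield number, line.rstrip("\n")


# Stand-in for a text-mode reader opened on a stored object.
text = io.StringIO("the quick brown fox\njumps over\nthe lazy dog, quickly\n")
for number, line in find_lines(text, "quick"):
    print(f"{number}: {line}")
```

Passing the reader `f` from the example above in place of `text` should behave the same way, assuming it yields `str` lines in `mode="r"`.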
### Upload and commit some data
```py
with lakefs.Repository("golden").branch("main").object("path/to/new").writer(mode="wb") as f:
    f.write(b"my data")
# Returns a Reference
lakefs.Repository("golden").branch("main").commit("added my data using lakeFS high-level SDK")
# Prints "my data"
with lakefs.Repository("golden").branch("main").object("path/to/new").reader(mode="r") as f:
    for line in f:
        print(line)
```
Unlike references, branches are writable. This example wouldn't work with a ref, because a ref can only be read.
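The write-then-commit pattern can be wrapped in one function that accepts any branch-like object. In the sketch below, `upload_and_commit` uses only the `object(...).writer(mode="wb")` and `commit(...)` calls shown above; `FakeBranch` and `FakeObject` are hypothetical in-memory stand-ins, included only so the example runs without a lakeFS server.

```python
import io


def upload_and_commit(branch, path, data, message):
    """Write bytes to path on a branch-like object, then commit and return the result."""
    with branch.object(path).writer(mode="wb") as f:
        f.write(data)
    return branch.commit(message)


# Hypothetical in-memory stand-ins, not part of the SDK.
class FakeObject:
    def __init__(self, store, path):
        self.store, self.path = store, path
        self.buffer = None

    def writer(self, mode="wb"):
        return self

    def __enter__(self):
        self.buffer = io.BytesIO()
        return self.buffer

    def __exit__(self, *exc):
        # Persist the written bytes when the with-block ends.
        self.store[self.path] = self.buffer.getvalue()
        return False


class FakeBranch:
    def __init__(self):
        self.staged, self.commits = {}, []

    def object(self, path):
        return FakeObject(self.staged, path)

    def commit(self, message):
        self.commits.append((message, dict(self.staged)))
        return len(self.commits)  # stand-in for the Reference the SDK returns


branch = FakeBranch()
upload_and_commit(branch, "path/to/new", b"my data", "added my data")
print(branch.commits)
```

With the SDK, the `branch` argument would be something like `lakefs.Repository("golden").branch("main")`.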
## Tests
To run the tests using `pytest`, first clone the lakeFS Git repository:
```sh
git clone https://github.com/treeverse/lakeFS.git
cd lakeFS/clients/python-wrapper
```
### Unit Tests
Inside the `tests` folder, execute `pytest utests` to run the unit tests.
### Integration Tests
See the [testing documentation](https://github.com/treeverse/lakeFS/blob/master/clients/python-wrapper/tests/integration/README.md) for more information.
## Documentation
[lakeFS Python SDK](https://pydocs-lakefs.lakefs.io/)
## Author
services@treeverse.io
[git-spec-rev]: https://git-scm.com/docs/git-rev-parse#_specifying_revisions
[lakefs-spec-ref]: https://docs.lakefs.io/understand/model.html#ref-expressions
Raw data
{
"_id": null,
"home_page": "https://github.com/treeverse/lakeFS/tree/master/clients/python-wrapper",
"name": "lakefs",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "OpenAPI, OpenAPI-Generator, lakeFS API, Python Wrapper",
"author": "Treeverse",
"author_email": "services@treeverse.io",
"download_url": "https://files.pythonhosted.org/packages/e6/1f/c92142e7174ea0dc3df70dea2abb1fb46704cb65a1498f0af537b2f536ba/lakefs-0.8.0.tar.gz",
"platform": null,
"description": "# lakeFS High-Level Python SDK\n\nlakeFS High Level SDK for Python, provides developers with the following features:\n1. Simpler programming interface with less configuration\n2. Inferring identity from environment \n3. Better abstractions for common, more complex operations (I/O, transactions, imports)\n\n## Requirements\n\nPython 3.9+\n\n## Installation & Usage\n\n### pip install\n\n```sh\npip install lakefs\n```\n\n### Import the package\n\n```python\nimport lakefs\n```\n\n## Getting Started\n\nPlease follow the [installation procedure](#installation--usage) and afterward refer to the following example snippet for a quick start:\n\n```python\n\nimport lakefs\nfrom lakefs.client import Client\n\n# Using default client will attempt to authenticate with lakeFS server using configured credentials\n# If environment variables or .lakectl.yaml file exist \nrepo = lakefs.repository(repository_id=\"my-repo\")\n\n# Or explicitly initialize and provide a Client object \nclt = Client(username=\"<lakefs_access_key_id>\", password=\"<lakefs_secret_access_key>\", host=\"<lakefs_endpoint>\")\nrepo = lakefs.Repository(repository_id=\"my-repo\", client=clt)\n\n# From this point, proceed using the package according to documentation\nmain_branch = repo.create(storage_namespace=\"<storage_namespace>\").branch(branch_id=\"main\")\n...\n```\n\n## Examples\n\n### Print sizes of all objects in lakefs://repo/main~2\n\n```py\nref = lakefs.Repository(\"repo\").ref(\"main~2\")\nfor obj in ref.objects():\n print(f\"{o.path}: {o.size_bytes}\")\n```\n\n### Difference between two branches\n\n```py\nfor i in lakefs.Repository(\"repo\").ref(\"main\").diff(\"twig\"):\n print(i)\n```\n\nYou can also use the [ref expression][lakefs-spec-ref]s here, for instance\n`.diff(\"main~2\")` also works. 
Ref expressions are the lakeFS analogues of\n[how Git specifies revisions][git-spec-rev].\n\n### Search a stored object for a string\n\n```py\nwith lakefs.Repository(\"repo\").ref(\"main\").object(\"path/to/data\").reader(mode=\"r\") as f:\n for l in f:\n if \"quick\" in l:\n\t print(l)\n```\n\n### Upload and commit some data\n\n```py\nwith lakefs.Repository(\"golden\").branch(\"main\").object(\"path/to/new\").writer(mode=\"wb\") as f:\n f.write(b\"my data\")\n\n# Returns a Reference\nlakefs.Repository(\"golden\").branch(\"main\").commit(\"added my data using lakeFS high-level SDK\")\n\n# Prints \"my data\"\nwith lakefs.Repository(\"golden\").branch(\"main\").object(\"path/to/new\").reader(mode=\"r\") as f:\n for l in f:\n print(l)\n```\n\nUnlike references, branches are readable. This example couldn't work if we used a ref.\n\n## Tests\n\nTo run the tests using `pytest`, first clone the lakeFS git repository\n\n```sh\ngit clone https://github.com/treeverse/lakeFS.git\ncd lakefs/clients/python-wrapper\n```\n\n### Unit Tests\n\nInside the `tests` folder, execute `pytest utests` to run the unit tests.\n\n### Integration Tests\n\nSee [testing documentation](https://github.com/treeverse/lakeFS/blob/master/clients/python-wrapper/tests/integration/README.md) for more information\n\n## Documentation\n\n[lakeFS Python SDK](https://pydocs-lakefs.lakefs.io/) \n\n## Author\n\nservices@treeverse.io\n\n[git-spec-rev]: https://git-scm.com/docs/git-rev-parse#_specifying_revisions\n[lakefs-spec-ref]: https://docs.lakefs.io/understand/model.html#ref-expressions\n",
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "lakeFS Python SDK Wrapper",
"version": "0.8.0",
"project_urls": {
"Homepage": "https://github.com/treeverse/lakeFS/tree/master/clients/python-wrapper"
},
"split_keywords": [
"openapi",
" openapi-generator",
" lakefs api",
" python wrapper"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c65ea62181941014fcb017bc56466ec6cfbe4a91ef81d1bd65d25a4c1f388e6b",
"md5": "88d75c721c644ff85ab761230c650005",
"sha256": "2c548f79bcb36487a14d842edf9610fb7ee0a809d2ac6a44d391a8cb5dbe1a85"
},
"downloads": -1,
"filename": "lakefs-0.8.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "88d75c721c644ff85ab761230c650005",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 50982,
"upload_time": "2025-01-03T20:06:21",
"upload_time_iso_8601": "2025-01-03T20:06:21.529263Z",
"url": "https://files.pythonhosted.org/packages/c6/5e/a62181941014fcb017bc56466ec6cfbe4a91ef81d1bd65d25a4c1f388e6b/lakefs-0.8.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e61fc92142e7174ea0dc3df70dea2abb1fb46704cb65a1498f0af537b2f536ba",
"md5": "5fc51de6f7703d662986b0e6b4110ad0",
"sha256": "56962b94a6cf251a5c9d283c3dca839df58cde97670a91cac9244be589528d75"
},
"downloads": -1,
"filename": "lakefs-0.8.0.tar.gz",
"has_sig": false,
"md5_digest": "5fc51de6f7703d662986b0e6b4110ad0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 41096,
"upload_time": "2025-01-03T20:06:24",
"upload_time_iso_8601": "2025-01-03T20:06:24.540474Z",
"url": "https://files.pythonhosted.org/packages/e6/1f/c92142e7174ea0dc3df70dea2abb1fb46704cb65a1498f0af537b2f536ba/lakefs-0.8.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-03 20:06:24",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "treeverse",
"github_project": "lakeFS",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "lakefs"
}