| Field | Value |
|---|---|
| Name | daplapath |
| Version | 1.0.7 |
| Summary | A pathlib.Path class for dapla |
| upload_time | 2024-10-04 18:47:18 |
| author | ort |
| requires_python | <4,>=3.10 |
| license | MIT |
| home_page | None |
| maintainer | None |
| docs_url | None |
| requirements | No requirements were recorded. |
# dapla-path
pathlib.Path for dapla
Created by:
ort <ort@ssb.no>
---
# Path (dapla)
```python
import dapla as dp
import pandas as pd
from daplapath.path import Path
```
```python
folder = Path('ssb-kart-data-delt-prod/analyse_data/klargjorte-data/2024')
folder
```
'ssb-kart-data-delt-prod/analyse_data/klargjorte-data/2024'
## Works like a string
```python
folder.startswith("ssb")
```
True
```python
dp.FileClient.get_gcs_file_system().exists(folder)
```
True
## With methods and attributes à la pathlib.Path
```python
folder.exists()
```
True
```python
folder.is_dir()
```
True
```python
file = folder / "ABAS_kommune_utenhav_p2024_v1.parquet"
file
```
'ssb-kart-data-delt-prod/analyse_data/klargjorte-data/2024/ABAS_kommune_utenhav_p2024_v1.parquet'
```python
file.parent
```
'ssb-kart-data-delt-prod/analyse_data/klargjorte-data/2024'
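The behaviour shown so far (string methods such as `startswith` working next to `/` and `.parent`) can be imitated with a `str` subclass. The sketch below is only an illustration of that idea; `StrPath` is a hypothetical name, and nothing here is taken from daplapath's actual implementation.

```python
from pathlib import PurePosixPath


class StrPath(str):
    """Hypothetical stand-in: a str subclass with a few pathlib-style helpers."""

    def __truediv__(self, other: str) -> "StrPath":
        # Join with "/" the way pathlib's "/" operator does.
        return StrPath(f"{self.rstrip('/')}/{other}")

    @property
    def parent(self) -> "StrPath":
        # Delegate the path logic to pathlib, then wrap the result again.
        return StrPath(str(PurePosixPath(self).parent))

    @property
    def name(self) -> str:
        # Final path component, as in pathlib.
        return PurePosixPath(self).name


folder = StrPath("bucket/analyse_data/2024")
file = folder / "kommune_p2024_v1.parquet"

print(file.startswith("bucket"))  # plain str method still works
print(file.parent == folder)      # pathlib-style attribute
```

Because the object is still a `str`, it can be passed to any function that expects a plain path string, which is presumably why the `exists(folder)` call on the GCS file system above works without conversion.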
## And some pandas attributes
Without reading the file
```python
file.columns
```
    Index(['OBJTYPE', 'NAVN', 'KOMMUNENR', 'FYLKE', 'AREAL_GDB', 'SHAPE_Length',
           'SHAPE_Area', 'geometry'],
          dtype='object')
```python
file.dtypes
```
    OBJTYPE         string
    NAVN            string
    KOMMUNENR       string
    FYLKE           string
    AREAL_GDB       double
    SHAPE_Length    double
    SHAPE_Area      double
    geometry        binary
    dtype: object
```python
file.shape
```
(481, 8)
## Versioning
```python
file.version_number
```
1
```python
print(file.versions())
```
    timestamp            mb (int)
    2024-05-19 12:31:02  941       .../ABAS_kommune_utenhav_p2024.parquet
    2024-08-16 16:15:10  941       .../ABAS_kommune_utenhav_p2024_v1.parquet
    Name: path, dtype: object
```python
file.latest_version()
```
'ssb-kart-data-delt-prod/analyse_data/klargjorte-data/2024/ABAS_kommune_utenhav_p2024_v1.parquet'
```python
file.highest_numbered_version()
```
'ssb-kart-data-delt-prod/analyse_data/klargjorte-data/2024/ABAS_kommune_utenhav_p2024_v1.parquet'
```python
# highest_numbered_version + 1
file.new_version()
```
'ssb-kart-data-delt-prod/analyse_data/klargjorte-data/2024/ABAS_kommune_utenhav_p2024_v2.parquet'
```python
# always False
file.new_version().exists()
```
False
```python
# finds/removes the version number with a regex search
file._version_pattern
```
'_v(\\d+)'
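To make the role of the `_v(\d+)` pattern concrete, here is a minimal sketch of how a version number could be read, stripped, and replaced with the standard `re` module. The helper names (`version_number`, `strip_version`, `with_version`) are hypothetical and chosen for this illustration; they are not daplapath functions.

```python
import posixpath
import re
from typing import Optional

VERSION_PATTERN = r"_v(\d+)"  # the same pattern as shown above


def version_number(path: str) -> Optional[int]:
    """Return the version number found in the path, or None if there is none."""
    match = re.search(VERSION_PATTERN, path)
    return int(match.group(1)) if match else None


def strip_version(path: str) -> str:
    """Remove the version suffix: 'x_p2024_v1.parquet' -> 'x_p2024.parquet'."""
    return re.sub(VERSION_PATTERN, "", path)


def with_version(path: str, number: int) -> str:
    """Insert '_v<number>' just before the file extension."""
    stem, extension = posixpath.splitext(strip_version(path))
    return f"{stem}_v{number}{extension}"


path = "data/2024/ABAS_kommune_utenhav_p2024_v1.parquet"
print(version_number(path))
print(with_version(path, version_number(path) + 1))
```

Note that this sketch applies the pattern to the whole path string, so a directory name containing `_v1` would also match; whether the library guards against that is not stated in the source.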
## Branch tree
File tree with hyperlinks. Clicking a path copies it.
```python
print(
    Path("ssb-kart-data-delt-prod/analyse_data/klargjorte-data").tree()
)
```
    ssb-kart-data-delt-prod/analyse_data/klargjorte-data /
        └──2000 /
            └──SSB_tettsted_flate_p2000.parquet
            └──SSB_tettsted_flate_p2000_v1.parquet
        └──2002 /
            └──SSB_tettsted_flate_p2002.parquet
            └──SSB_tettsted_flate_p2002_v1.parquet
        └──2003 /
            └──SSB_tettsted_flate_p2003.parquet
            └──SSB_tettsted_flate_p2003_v1.parquet
        └──2004 /
            └──SSB_tettsted_flate_p2004.parquet
            └──SSB_tettsted_flate_p2004_v1.parquet
        └──2005 /
            └──SSB_tettsted_flate_p2005.parquet
            └──SSB_tettsted_flate_p2005_v1.parquet
        └──2006 /
            └──SSB_tettsted_flate_p2006.parquet
            └──SSB_tettsted_flate_p2006_v1.parquet
        └──2007 /
            └──SSB_tettsted_flate_p2007.parquet
            └──SSB_tettsted_flate_p2007_v1.parquet
        └──2008 /
            └──SSB_tettsted_flate_p2008.parquet
            └──SSB_tettsted_flate_p2008_v1.parquet
            └──SSB_tettsted_ringbuffer_p2008.parquet
            └──(...)
        └──2009 /
            └──SSB_tettsted_flate_p2009.parquet
            └──SSB_tettsted_flate_p2009_v1.parquet
        └──2010 /
            └──SOL_arealressurs_flate_p2010.parquet
            └──SOL_arealressurs_flate_p2010_v1.parquet
        └──2011 /
            └──SOL_Arstat_flate_p2011.parquet
            └──SOL_Arstat_flate_p2011_v1.parquet
            └──SSB_tettsted_flate_p2011.parquet
            └──(...)
        └──2012 /
            └──ABAS_fylke_flate_p2012_v1.parquet
            └──ABAS_fylke_linje_p2012_v1.parquet
            └──ABAS_grunnkrets_flate_p2012_v1.parquet
            └──(...)
        └──2013 /
            └──ABAS_fylke_flate_p2013_v1.parquet
            └──ABAS_kommune_flate_p2013_v1.parquet
            └──DEK_eiendom_flate_p2013_v1.parquet
            └──(...)
        └──2014 /
            └──DEK_eiendom_flate_p2014_v1.parquet
            └──FKB_anlegg_flate_p2014_v1.parquet
            └──FKB_anlegg_linje_p2014_v1.parquet
            └──(...)
        └──2015 /
            └──ABAS_grunnkrets_flate_p2015_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2015_v1.parquet
            └──ABAS_kommune_flate_p2015_v1.parquet
            └──(...)
        └──2016 /
            └──ABAS_fylke_flate_p2016_v1.parquet
            └──ABAS_grunnkrets_flate_p2016_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2016_v1.parquet
            └──(...)
        └──2017 /
            └──ABAS_fylke_flate_p2017_v1.parquet
            └──ABAS_grunnkrets_flate_p2017_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2017_v1.parquet
            └──(...)
        └──2018 /
            └──ABAS_fylke_flate_p2018_v1.parquet
            └──ABAS_grunnkrets_flate_p2018_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2018_v1.parquet
            └──(...)
        └──2019 /
            └──ABAS_fylke_flate_p2019_v1.parquet
            └──ABAS_grunnkrets_flate_p2019_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2019_v1.parquet
            └──(...)
        └──2020 /
            └──ABAS_fylke_flate_p2020_v1.parquet
            └──ABAS_grunnkrets_flate_p2020_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2020_v1.parquet
            └──(...)
        └──2021 /
            └──ABAS_fylke_flate_p2021_v1.parquet
            └──ABAS_grunnkrets_flate_p2021_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2021_v1.parquet
            └──(...)
        └──2022 /
            └──ABAS_fylke_flate_p2022_v1.parquet
            └──ABAS_grunnkrets_flate_p2022_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2022_v1.parquet
            └──(...)
        └──2023 /
            └──ABAS_KnrGamle_p2023_v1.parquet
            └──ABAS_fylke_flate_p2023_v1.parquet
            └──ABAS_grunnkrets_flate_p2023_v1.parquet
            └──(...)
        └──2024 /
            └──ABAS_fylke_flate_p2024_v1.parquet
            └──ABAS_grunnkrets_flate_p2024_v1.parquet
            └──ABAS_grunnkrets_utenhav_p2024_v1.parquet
            └──(...)
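The tree output above lists at most three files per folder and then a `(...)` marker. A rough sketch of how such a listing could be produced from a flat list of paths is shown here. The function `render_tree` and its `max_files` parameter are assumptions made for this example, and the hyperlink/copy behaviour is not reproduced.

```python
from collections import defaultdict


def render_tree(root, paths, max_files=3):
    """Render files one level below `root`, grouped by directory.

    Assumes every path starts with `root` and sits exactly one folder deep.
    """
    by_dir = defaultdict(list)
    for path in paths:
        relative = path[len(root):].lstrip("/")
        directory, _, filename = relative.partition("/")
        by_dir[directory].append(filename)

    lines = [f"{root} /"]
    for directory in sorted(by_dir):
        lines.append(f"    └──{directory} /")
        files = sorted(by_dir[directory])
        for filename in files[:max_files]:
            lines.append(f"        └──{filename}")
        if len(files) > max_files:
            # Signal that the folder holds more files than are shown.
            lines.append("        └──(...)")
    return "\n".join(lines)


example_paths = [
    "bucket/data/2008/a.parquet",
    "bucket/data/2008/b.parquet",
    "bucket/data/2008/c.parquet",
    "bucket/data/2008/d.parquet",
    "bucket/data/2009/a.parquet",
]
print(render_tree("bucket/data", example_paths))
```

Truncating each folder keeps the overview short even when a folder holds hundreds of files, at the cost of hiding most of them; `ls`, described next, gives the full listing.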
## ls - get file paths, timestamps and sizes
With paths that are copied (like Ctrl + C) when you click on them.
```python
files_in_dir = file.parent.ls()
print(files_in_dir)
```
    timestamp            mb (int)
    2024-04-19 11:44:12  11        .../ABAS_kommune_flate_p2024_v1.parquet
    2024-04-19 11:45:47  0         .../N50_JernbaneStasjon_punkt_p2024.parquet
                         0         .../N50_JernbaneStasjon_punkt_p2024_v1.parquet
                         0         .../N50_lufthavn_punkt_p2024.parquet
                         0         .../N50_lufthavn_punkt_p2024_v1.parquet
                                   ...
    2024-08-21 14:47:12  861       .../SSB_hav_flate_p2024.parquet
    2024-08-23 14:59:30  152       .../SSB_tettsted_flate_p2024_v1.parquet
    2024-08-23 14:59:36  152       .../SSB_tettsted_kommune_flate_p2024_v1.parquet
    2024-08-23 15:34:21  1122      .../SSB_tettsted_kommune_ringbuffer_p2024_v1.parquet
    2024-08-23 17:11:32  740       .../NVDB_veg_linje_p2024_v1.parquet
    Name: path, Length: 127, dtype: object
```python
# subclass of pandas.Series
type(files_in_dir)
```
daplapath.path.PathSeries
```python
print(files_in_dir.loc[lambda x: x.gb > 10].keep_latest_versions())
```
    timestamp            mb (int)
    2024-07-18 00:13:09  17646     .../FKB_arealressurs_flate_p2024_v1.parquet
    2024-08-20 14:03:16  19717     .../FKB_gronnstruktur_flate_p2024_v1.parquet
    Name: path, dtype: object
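The README does not say how `keep_latest_versions` decides which version is the latest. One plausible reading, sketched below with plain lists, is to group paths by their name without the version suffix and keep the highest-numbered one. This is an assumption for illustration (the real method might use timestamps instead), and the standalone function here is hypothetical.

```python
import re

VERSION_PATTERN = r"_v(\d+)"


def keep_latest_versions(paths):
    """Keep only the highest-numbered version of each file.

    Assumption: an unversioned file counts as version 0.
    """
    latest = {}
    for path in paths:
        match = re.search(VERSION_PATTERN, path)
        number = int(match.group(1)) if match else 0
        key = re.sub(VERSION_PATTERN, "", path)  # name without version suffix
        if key not in latest or number > latest[key][0]:
            latest[key] = (number, path)
    return [path for _, path in latest.values()]


paths = [
    "d/a_p2024.parquet",
    "d/a_p2024_v1.parquet",
    "d/b_p2024_v1.parquet",
    "d/b_p2024_v2.parquet",
    "d/c_p2024.parquet",
]
print(keep_latest_versions(paths))
```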
```python
# the paths are still Path objects
type(files_in_dir.iloc[0])
```
daplapath.path.Path
```python
# select only the files
print(folder.ls().files)
```
    timestamp            mb (int)
    2024-04-19 11:44:12  11        .../ABAS_kommune_flate_p2024_v1.parquet
    2024-04-19 11:45:47  0         .../N50_JernbaneStasjon_punkt_p2024.parquet
                         0         .../N50_JernbaneStasjon_punkt_p2024_v1.parquet
                         0         .../N50_lufthavn_punkt_p2024.parquet
                         0         .../N50_lufthavn_punkt_p2024_v1.parquet
                                   ...
    2024-08-21 14:47:12  861       .../SSB_hav_flate_p2024.parquet
    2024-08-23 14:59:30  152       .../SSB_tettsted_flate_p2024_v1.parquet
    2024-08-23 14:59:36  152       .../SSB_tettsted_kommune_flate_p2024_v1.parquet
    2024-08-23 15:34:21  1122      .../SSB_tettsted_kommune_ringbuffer_p2024_v1.parquet
    2024-08-23 17:11:32  740       .../NVDB_veg_linje_p2024_v1.parquet
    Name: path, Length: 127, dtype: object
```python
print(folder.ls().dirs)
```
Series([], Name: path, dtype: object)
```python
# same as .loc with x.str.contains
print(folder.ls().containing("kommune"))
```
    timestamp            mb (int)
    2024-04-19 11:44:12  11        .../ABAS_kommune_flate_p2024_v1.parquet
    2024-05-19 12:31:02  941       .../ABAS_kommune_utenhav_p2024.parquet
    2024-06-24 14:25:14  11        .../ABAS_kommune_flate_p2024.parquet
    2024-08-16 16:15:10  941       .../ABAS_kommune_utenhav_p2024_v1.parquet
    2024-08-23 14:59:36  152       .../SSB_tettsted_kommune_flate_p2024_v1.parquet
    2024-08-23 15:34:21  1122      .../SSB_tettsted_kommune_ringbuffer_p2024_v1.parquet
    Name: path, dtype: object
```python
print(file.parent.parent.ls(recursive=True).files)
```
    timestamp            mb (int)
    2024-04-19 11:43:21  0         .../2022/N50_JernbaneStasjon_punkt_p2022_v1.parquet
    2024-04-19 11:43:22  0         .../2022/N50_lufthavn_punkt_p2022_v1.parquet
    2024-04-19 11:43:23  0         .../2022/NVE_Vindturbin_punkt_p2022_v1.parquet
                         0         .../2022/NVE_Trafostasjon_punkt_p2022_v1.parquet
    2024-04-19 11:43:24  0         .../2022/S100_TekniskSit_flate_p2022_v1.parquet
                                   ...
    2024-08-21 14:47:12  861       .../2024/SSB_hav_flate_p2024.parquet
    2024-08-23 14:59:30  152       .../2024/SSB_tettsted_flate_p2024_v1.parquet
    2024-08-23 14:59:36  152       .../2024/SSB_tettsted_kommune_flate_p2024_v1.parquet
    2024-08-23 15:34:21  1122      .../2024/SSB_tettsted_kommune_ringbuffer_p2024_v1.parquet
    2024-08-23 17:11:32  740       .../2024/NVDB_veg_linje_p2024_v1.parquet
    Length: 1323, dtype: object
## Write to testpath
```python
testpath = Path('ssb-areal-data-produkt-prod/arealstat/temp/test_df_p2023_v1.parquet')
# delete files first
for version in testpath.versions():
    version.rm_file()
testpath.exists()
```
False
```python
df = pd.DataFrame({"x": [1,2,3], "y": [*"abc"]})
dp.write_pandas(df, testpath)
testpath.exists()
```
True
```python
testpath.latest_version()
```
'ssb-areal-data-produkt-prod/arealstat/temp/test_df_p2023_v1.parquet'
```python
# highest_numbered_version + 1
testpath.new_version()
```
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    Cell In[31], line 2
          1 # highest_numbered_version + 1
    ----> 2 testpath.new_version()
    File ~/daplapath/daplapath/path.py:805, in Path.new_version(self, timeout)
        803 time_should_be_at_least = pd.Timestamp.now() - pd.Timedelta(minutes=timeout)
        804 if timestamp[0] > time_should_be_at_least:
    --> 805     raise ValueError(
        806         f"Latest version of the file was updated {timestamp[0]}, which "
        807         f"is less than the timeout period of {timeout} minutes. "
        808         "Change the timeout argument, but be sure to not save new "
        809         "versions in a loop."
        810     )
        812 return highest_numbered.add_to_version_number(1)
    ValueError: Latest version of the file was updated 2024-08-28 15:09:47, which is less than the timeout period of 30 minutes. Change the timeout argument, but be sure to not save new versions in a loop.
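The traceback shows the guard behind this error: if the newest version was written more recently than `timeout` minutes ago, `new_version` refuses to hand out a new path. The standalone sketch below mirrors that comparison with the standard `datetime` module. The function name `ensure_not_recent` and the injectable `now` argument are hypothetical, added so the check can be run without a real file; the 30-minute default is taken from the error message.

```python
from datetime import datetime, timedelta
from typing import Optional


def ensure_not_recent(
    last_updated: datetime,
    timeout_minutes: float = 30,
    now: Optional[datetime] = None,
) -> None:
    """Raise ValueError if `last_updated` falls inside the timeout window."""
    now = now or datetime.now()
    earliest_allowed = now - timedelta(minutes=timeout_minutes)
    if last_updated > earliest_allowed:
        raise ValueError(
            f"Latest version of the file was updated {last_updated}, which "
            f"is less than the timeout period of {timeout_minutes} minutes."
        )


written = datetime(2024, 8, 28, 15, 9, 47)
five_seconds_later = datetime(2024, 8, 28, 15, 9, 52)

try:
    ensure_not_recent(written, now=five_seconds_later)
except ValueError as error:
    print(error)

# A timeout of 0.01 minutes (0.6 seconds) has already passed, so no error.
ensure_not_recent(written, timeout_minutes=0.01, now=five_seconds_later)
```

The guard is there to catch accidental version creation in a loop, so lowering the timeout is best reserved for deliberate cases such as the test that follows.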
```python
dp.write_pandas(df, testpath.new_version(timeout=0.01))
```
```python
print(testpath.versions())
```
    timestamp            mb (int)
    2024-08-28 15:09:47  0         ssb-areal-data-produkt-prod/arealstat/temp/test_df_p2023_v1.parquet
    2024-08-28 15:09:52  0         ssb-areal-data-produkt-prod/arealstat/temp/test_df_p2023_v2.parquet
    dtype: object
Raw data
{
"_id": null,
"home_page": null,
"name": "daplapath",
"maintainer": null,
"docs_url": null,
"requires_python": "<4,>=3.10",
"maintainer_email": null,
"keywords": null,
"author": "ort",
"author_email": "ort@ssb.no",
"download_url": "https://files.pythonhosted.org/packages/57/f0/e5316ef819751dc3e70ca5a354d2e2bd88bf24477470b0c126a8b98cffa3/daplapath-1.0.7.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A pathlib.Path class for dapla",
"version": "1.0.7",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b5e69a2cf496b83924f608bf6c2aadc2e68b26621e29f0894286fe92eb7b02a5",
"md5": "64b3cb52e8e9fcbb84a22a1ac0f7b897",
"sha256": "d99f21174fbf248949dbbfdd913bd052c84a2f9f782bac3d8034c09dd787b1bb"
},
"downloads": -1,
"filename": "daplapath-1.0.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "64b3cb52e8e9fcbb84a22a1ac0f7b897",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4,>=3.10",
"size": 15572,
"upload_time": "2024-10-04T18:47:17",
"upload_time_iso_8601": "2024-10-04T18:47:17.776253Z",
"url": "https://files.pythonhosted.org/packages/b5/e6/9a2cf496b83924f608bf6c2aadc2e68b26621e29f0894286fe92eb7b02a5/daplapath-1.0.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "57f0e5316ef819751dc3e70ca5a354d2e2bd88bf24477470b0c126a8b98cffa3",
"md5": "05863e2104d3bfec0e7c9822f8d39872",
"sha256": "792cbf3723e34a5a4abdaf7dfebfc5d753d221166efac66dac856593d8cdc589"
},
"downloads": -1,
"filename": "daplapath-1.0.7.tar.gz",
"has_sig": false,
"md5_digest": "05863e2104d3bfec0e7c9822f8d39872",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4,>=3.10",
"size": 17029,
"upload_time": "2024-10-04T18:47:18",
"upload_time_iso_8601": "2024-10-04T18:47:18.862619Z",
"url": "https://files.pythonhosted.org/packages/57/f0/e5316ef819751dc3e70ca5a354d2e2bd88bf24477470b0c126a8b98cffa3/daplapath-1.0.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-04 18:47:18",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "daplapath"
}