[token]: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token
[hdsr-mid]: https://github.com/hdsr-mid
[pypi]: https://pypi.org/project/hdsr_pygithub
[mit]: https://github.com/hdsr-mid/hdsr_pygithub/blob/main/LICENSE.txt
### Context
* Created: November 2021
* Author: Renier Kramer, renier.kramer@hdsr.nl
* Maintainer: Roger de Crook, roger.de.crook@hdsr.nl
* Python version: 3.7 <= x <= 3.12
### Description
A Python project for interacting with the GitHub API v3, e.g. to read or download directories and files from the
GitHub organisation [hdsr-mid]. Downloading is often not required, as files can be loaded into
memory (see 'Usage example 1' below). To interact with private repos you need to authenticate with a personal GitHub
access token (see 'Token' below).
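As a rough illustration of what "interaction with the GitHub API v3" means for a single file, the sketch below builds (but does not send) a request to GitHub's repository contents endpoint using only the standard library. The helper name `build_contents_request` is hypothetical and is not part of hdsr_pygithub; how the package talks to the API internally is not described here.

```python
from urllib.parse import quote
from urllib.request import Request

GITHUB_API = "https://api.github.com"


def build_contents_request(organisation, repo, path, branch, token=None):
    """Build (but do not send) a request for one file or directory listing.

    Hypothetical helper for illustration only, not the hdsr_pygithub API.
    """
    url = (
        f"{GITHUB_API}/repos/{organisation}/{repo}/contents/"
        f"{quote(path)}?ref={quote(branch, safe='')}"
    )
    headers = {"Accept": "application/vnd.github.v3+json"}
    if token:
        # private repos require authentication with a personal access token
        headers["Authorization"] = f"token {token}"
    return Request(url, headers=headers)


request = build_contents_request(
    organisation="hdsr-mid",
    repo="startenddate",
    path="data/output/results/mwm_peilschalen_short.csv",
    branch="main",
    token=None,  # pass your own token here for private repos
)
print(request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) is left out on purpose, so the sketch runs without network access or a token.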
### Token
A token (a long hash) has to be created once (and renewed when it expires). One token is enough: it is
tied to your GitHub user account, so you don't need a token per repo/organisation/etc.
You can [create a token yourself][token]. In short:
1. Log in to github.com with your account (user + password).
2. Ensure you have at least read-permission for the hdsr-mid repo(s) you want to interact with. To verify, browse to
   the specific repo. If you can open it, then you have at least read-permission. If not, please contact
   roger.de.crook@hdsr.nl to get access.
3. Create a token:
   1. On github.com, go to your profile settings (click your icon in the upper right corner and then 'Settings' in the dropdown).
   2. Click 'Developer settings' (lower left corner).
   3. Click 'Personal access tokens' and then 'Tokens (classic)'.
   4. Click 'Generate new token' and then 'Generate new token (classic)'.
   5. For scopes, select only 'repo' (Full control of private repositories). This automatically selects the related
      sub-selections, e.g. 'repo:status' (Access commit status).
4. We recommend setting an expiry date of max 1 year (for safety reasons).
5. You can use this token in two ways:
   1. Recommended: create a .env file, for example on your personal HDSR drive (e.g. 'G:/secrets.env'), and add one
      line: GITHUB_PERSONAL_ACCESS_TOKEN=<your_token>. Please do not share this file with others!
   2. Hard-code it in your code, see 'Usage example 1: simple'. In this case, be careful when sharing your code.
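
To make the .env option more concrete, here is a minimal sketch of reading the `GITHUB_PERSONAL_ACCESS_TOKEN=<your_token>` line from such a file with the standard library only. The function `read_token` is hypothetical; hdsr_pygithub may well parse the file differently. The demo writes a throw-away file with a fake token instead of touching a real 'G:/secrets.env'.

```python
import tempfile
from pathlib import Path

TOKEN_KEY = "GITHUB_PERSONAL_ACCESS_TOKEN"


def read_token(env_path: Path, key: str = TOKEN_KEY) -> str:
    """Return the value of `key` from a KEY=VALUE style .env file.

    Hypothetical helper for illustration only, not the hdsr_pygithub API.
    """
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments and malformed lines
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip("'\"")
    raise KeyError(f"{key} not found in {env_path}")


# demo with a temporary file and a fake token
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_env = Path(tmp_dir) / "secrets.env"
    demo_env.write_text(f"{TOKEN_KEY}=not_a_real_token\n", encoding="utf-8")
    token = read_token(demo_env)

print(len(token))
```

Keeping the token in a separate file like this means the code itself can be shared without exposing the secret.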
#### Installation
```
pip install hdsr-pygithub
# or
conda install hdsr-pygithub --channel hdsr-mid
```
### Usage example 1: simple (few arguments and hard-coded personal_access_token)
```
# Ensure you followed steps 1 to 4 of topic 'Token' above
import hdsr_pygithub
from pathlib import Path

github_downloader = hdsr_pygithub.GithubFileDownloader(
    repo_name="startenddate",  # ensure your github account has read-permission for this repo
    target_file=Path("data/output/results/mwm_peilschalen_short.csv"),  # this file must exist in the master branch
    personal_access_token=<your_personal_access_token>,  # see topic 'Token' 5.2 above
)

# download files to disk
download_directory = github_downloader.download_files(download_directory=<a_dir>)
downloaded_filepath = download_directory / "data/output/results/mwm_peilschalen_short.csv"
assert downloaded_filepath.exists()

# or read file in memory using e.g. pandas
import pandas as pd
url = github_downloader.get_download_url()
# in case filetype is a .csv (for other filetypes see https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html):
dataframe_file = pd.read_csv(filepath_or_buffer=url)
```
### Usage example 2: sophisticated (more arguments and personal_access_token in .env file)
```
# Ensure you followed steps 1 to 4 of topic 'Token' above
import hdsr_pygithub
from datetime import timedelta
from pathlib import Path

github_downloader = hdsr_pygithub.GithubDirDownloader(
    repo_name="startenddate",  # ensure your github account has repo read-permission
    branch_name="main",  # defaults to 'main' if not specified
    target_dir=Path("data/output/results/"),  # this dir must exist in the branch specified above
    allowed_period_no_updates=timedelta(weeks=10),  # defaults to 1 year if not specified
    repo_organisation="hdsr-mid",  # defaults to 'hdsr-mid'
    secrets_env_path=<your .env file path>,  # defaults to Path("G:/") / "secrets.env"
)

# download complete github directory (recursive) to disk
download_directory = github_downloader.download_files(download_directory=<a_dir>)
assert download_directory.is_dir()

# or download complete github directory (recursive) to your Temp directory (C:/Users/<user>/AppData/Local/Temp/..)
download_directory = github_downloader.download_files(use_tmp_dir=True)
assert download_directory.is_dir()
```
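
The name `allowed_period_no_updates` suggests a freshness check: the target should have been updated within the given period. That reading is an assumption, since the exact behaviour is not described above. Under that assumption, the core comparison could be sketched as follows, with a hypothetical helper `is_updated_recently` that is not part of hdsr_pygithub:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_updated_recently(
    last_update: datetime,
    allowed_period_no_updates: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if `last_update` lies within the allowed period before `now`.

    Hypothetical helper; assumes the parameter expresses a maximum age.
    """
    now = now or datetime.now(timezone.utc)
    return (now - last_update) <= allowed_period_no_updates


# fixed timestamps so the outcome does not depend on the current date
now = datetime(2024, 5, 3, tzinfo=timezone.utc)
recent_commit = datetime(2024, 4, 1, tzinfo=timezone.utc)
old_commit = datetime(2023, 1, 1, tzinfo=timezone.utc)

print(is_updated_recently(recent_commit, timedelta(weeks=10), now=now))  # True
print(is_updated_recently(old_commit, timedelta(weeks=10), now=now))  # False
```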
### License
[MIT][mit]
### Releases
[PyPI][pypi]
### Contributions
All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.
Issues are posted on: https://github.com/hdsr-mid/hdsr_pygithub/issues
### Test coverage
```
2024-05-03 (release 1.18)
---------- coverage: platform win32, python 3.12.0-final-0 -----------
Name                               Stmts   Miss  Cover
------------------------------------------------------
hdsr_pygithub\constants.py             7      0   100%
hdsr_pygithub\downloader\base.py     260     56    78%
hdsr_pygithub\downloader\dir.py       88      0   100%
hdsr_pygithub\downloader\file.py      40      1    98%
hdsr_pygithub\exceptions.py           21      2    90%
setup.py                              10     10     0%
------------------------------------------------------
TOTAL                                426     69    84%
```
### Conda general tips
#### Build conda environment (on Windows) from any directory using environment.yml:
Note 1: the prefix is not set in environment.yml, because conda does not handle that well.

Note 2: env_directory can be anywhere; it does not have to be inside your code project.
```
> conda env create --prefix <env_directory>/<env_name> --file <path_to_project>/environment.yml
# example: conda env create --prefix C:/Users/xxx/.conda/envs/project_xx --file C:/Users/code_projects/xx/environment.yml
> conda info --envs # verify that <env_name> (project_xx) is in this list
```
#### Start the application from any directory:
```
> conda activate <env_name>
At any location:
> (<env_name>) python <path_to_project>/main.py
```
#### Test the application:
```
> conda activate <env_name>
> cd <path_to_project>
> pytest # make sure pytest is installed (conda install pytest)
```
#### List all conda environments on your machine:
```
At any location:
> conda info --envs
```
#### Delete a conda environment:
```
Get the directory where the environment is located
> conda info --envs
Remove the environment
> conda env remove --name <env_name>
Finally, remove the left-over directory by hand
```
#### Write dependencies to environment.yml:
The goal is to keep the .yml as short as possible (without sub-dependencies), yet keep the environment
reproducible. Why? If you run 'conda install matplotlib', you also install sub-dependencies like pyqt, qt,
icu, and sip. You should not include these sub-dependencies in your .yml because:
- including sub-dependencies results in an unnecessarily strict environment (difficult to solve when conflicting)
- sub-dependencies are installed automatically when the dependencies are installed
```
> conda activate <conda_env_name>
Recommended:
> conda env export --from-history --no-builds | findstr /v "prefix" > <path_to_project>/environment_new.yml

Alternative:
> conda env export --no-builds | findstr /v "prefix" > <path_to_project>/environment_new.yml

--from-history:
    Only include packages that you have explicitly asked for, as opposed to including every package in the
    environment. This flag works regardless of how you created the environment (through CMD or Anaconda Navigator).
--no-builds:
    By default, the YAML includes platform-specific build constraints. If you transfer across platforms (e.g.
    win32 to win64), omit the build info with '--no-builds'.
```
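
The `findstr` step in the export commands drops every line that contains "prefix", so the machine-specific environment path does not end up in the .yml. For readers not on Windows, the same filter can be sketched in Python. The function `drop_prefix_lines` and the sample export text are made up for illustration:

```python
def drop_prefix_lines(exported_yaml: str) -> str:
    """Remove every line containing 'prefix', like findstr /v "prefix" does."""
    kept = [line for line in exported_yaml.splitlines() if "prefix" not in line]
    return "\n".join(kept) + "\n"


# made-up example of what 'conda env export' might print
exported = """name: project_xx
channels:
  - defaults
dependencies:
  - python=3.12
prefix: C:/Users/xxx/.conda/envs/project_xx
"""

cleaned = drop_prefix_lines(exported)
print(cleaned)
```

Note that, like the `findstr` filter, this also drops any other line that happens to contain the word "prefix", for example a package with "prefix" in its name.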
#### Pip and Conda:
If a package is not available on any conda channel, but is available as a pip package, you can install pip as a dependency.
Note that mixing packages from conda and pip is always a potential problem: conda calls pip, but pip does not know
how to satisfy missing dependencies with packages from Anaconda repositories.
```
> conda activate <env_name>
> conda install pip
> pip install <pip_package>
```
The environment.yml might look like:
```
channels:
  - defaults
dependencies:
  - <a conda package>=<version>
  - pip
  - pip:
    - <a pip package>==<version>
```
You can also write a requirements.txt file:
```
> pip list --format=freeze > <path_to_project>/requirements.txt
```
Raw data
{
"_id": null,
"home_page": "https://github.com/hdsr-mid/hdsr_pygithub",
"name": "hdsr-pygithub",
"maintainer": "Roger de Crook",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "roger.de.crook@hdsr.nl",
"keywords": "interface, interaction, github, files, hdsr",
"author": "Renier Kramer",
"author_email": "renier.kramer@hdsr.nl",
"download_url": "https://files.pythonhosted.org/packages/ce/aa/3efd80b7e70d4cd1574926ea0cc96c7358faddbf8572b5be6e3a75d1d0dc/hdsr_pygithub-1.19.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "An interface for interacting with hdsr github repos",
"version": "1.19",
"project_urls": {
"Download": "https://github.com/hdsr-mid/hdsr_pygithub/archive/v1.19.tar.gz",
"Homepage": "https://github.com/hdsr-mid/hdsr_pygithub"
},
"split_keywords": [
"interface",
" interaction",
" github",
" files",
" hdsr"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ceaa3efd80b7e70d4cd1574926ea0cc96c7358faddbf8572b5be6e3a75d1d0dc",
"md5": "329745b3b8fb7e3414a3f53b5ec21233",
"sha256": "e968e4fb2340dc4d44faa8780e73554687c415c37c2dec2995068aeaa50c3ba8"
},
"downloads": -1,
"filename": "hdsr_pygithub-1.19.tar.gz",
"has_sig": false,
"md5_digest": "329745b3b8fb7e3414a3f53b5ec21233",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 19174,
"upload_time": "2024-05-05T13:49:10",
"upload_time_iso_8601": "2024-05-05T13:49:10.022386Z",
"url": "https://files.pythonhosted.org/packages/ce/aa/3efd80b7e70d4cd1574926ea0cc96c7358faddbf8572b5be6e3a75d1d0dc/hdsr_pygithub-1.19.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-05 13:49:10",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "hdsr-mid",
"github_project": "hdsr_pygithub",
"github_not_found": true,
"lcname": "hdsr-pygithub"
}