Name | eBird2ABAP |
Version | 0.2.1 |
Summary | Create ABAP card from eBird data |
upload_time | 2024-09-11 19:11:56 |
author | Raphaël Nussbaumer |
[](https://pypi.org/project/eBird2ABAP/) [](https://github.com/Rafnuss/eBird2ABAP)
# eBird to ABAP
[View on GitHub](https://github.com/RaphaelNussbaumer/eBird2ABAP)
This Python package aims to produce a dataset equivalent to ABAP full protocol cards from the eBird EBD dataset.
In overview, the process finds and combines eBird checklists that together satisfy the full protocol requirements: same observers, within a single pentad, and at least 2 hours of reporting all species, spread over at most 5 days.
This process does not address "ad hoc" data (e.g., incomplete data in eBird).
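The "within a pentad" requirement can be illustrated with a small sketch. It assumes that a pentad is a 5' x 5' cell aligned on multiples of 5 arc-minutes, that the checklist location stands in for the checklist center, and that distances are in kilometres. The function names, the `correction` parameter and the flat-earth conversion constant are hypothetical and are not taken from the package.

```python
import math

PENTAD_DEG = 5 / 60        # a pentad is a 5' x 5' grid cell (assumption stated above)
KM_PER_DEG_LAT = 111.32    # approximate length of one degree of latitude


def km_to_nearest_pentad_edge(lat, lon):
    """Approximate distance (km) from a point to the closest edge of its pentad."""
    # Offset of the point inside its cell, in degrees (0 <= offset < PENTAD_DEG).
    off_lat = lat % PENTAD_DEG
    off_lon = lon % PENTAD_DEG
    d_lat = min(off_lat, PENTAD_DEG - off_lat) * KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    d_lon = (min(off_lon, PENTAD_DEG - off_lon)
             * KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return min(d_lat, d_lon)


def fits_in_pentad(lat, lon, distance_km, correction=1.0):
    """True if the travelled distance stays within the (scaled) edge distance.

    `correction` > 1 tolerates some overlap with neighbouring pentads.
    """
    return distance_km <= correction * km_to_nearest_pentad_edge(lat, lon)
```

A checklist starting at the center of a pentad is about 4.6 km from the nearest edge near the equator, so a 3 km checklist passes and a 6 km one fails unless the correction factor is raised.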
## Getting started
Install the package from PyPI:
```bash
pip install eBird2ABAP
```
Download EBD:
```bash
wget "https://ebird.org/data/download?p=prepackaged/ebd_AFR_relJul-2024.tar"
```
Run the function
```python
from eBird2ABAP import *
ebird2abap("data/eBird/ebd_AFR_relJul-2024/ebd_AFR_relJul-2024.txt.gz")
```
## Process
1. Construct the list of valid cards
   1. Group the raw EBD data into checklist-level information (merging shared checklists)
   2. Filter the checklists that could contribute to a valid card
      1. Keep only complete checklists
      2. Keep only checklists with the `Historical`, `Stationary`, `Traveling` or `Incidental` protocol
      3. Keep only checklists within the pentad, that is,
         1. Exclude checklists with the `Historical` protocol that don't have a distance.
         2. Exclude checklists whose distance is greater than the distance from the center of the checklist to the closest pentad limit (a correction factor accepts some overlap).
      4. Keep only checklists with a duration greater than 0.
   3. Group checklists by date (named `checkday` later on)
   4. Group checklists into `pentad_observer` groups, so that we only have to loop through the dates to find valid cards
   5. Apply a preliminary filter to eliminate every `pentad_observer` whose total duration over the entire period does not reach 2 h
   6. For each remaining `pentad_observer`, apply the function `checkday_pentad_observer()`, which:
      1. Computes the temporal distance between all checkdays and checks whether they are within 5 days.
      2. Loops through all checkdays:
         1. Compute the total duration of all checkdays within the temporal distance
            1. If valid, create the `card_id` and apply it to all these checkdays. Move on to the first checkday that was not within the temporal distance
            2. If invalid, move on to the next checkday
2. Create the card data
   1. For each valid card, aggregate all checklists that are (1) within the pentad, (2) from the same observer and (3) dated within the 5-day period. This includes more checklists than were used to construct the list of valid cards
3. Add species-level information to cards
   1. Add sequence information based on first occurrence on a checklist.
4. Export in JSON
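
The search for valid cards within one `pentad_observer` group (steps 1.5 and 1.6 above) can be sketched as follows. This is not the package's `checkday_pentad_observer()`: the function name `assign_cards`, the input shape (one `(date, minutes)` pair per checkday) and the reading of "within 5 days" as a window of 5 calendar days are assumptions made for illustration.

```python
from datetime import date

MIN_MINUTES = 120   # a full protocol card needs at least 2 h of effort
WINDOW_DAYS = 5     # assumed: a card covers at most 5 calendar days


def assign_cards(checkdays):
    """Map each checkday date to the start date of its card.

    `checkdays` is a list of (date, duration_minutes) pairs for a single
    pentad_observer group, one pair per date. Dates that do not end up in
    a valid card are absent from the returned dict.
    """
    checkdays = sorted(checkdays)
    # Preliminary filter: the whole period must reach 2 h at least once.
    if sum(minutes for _, minutes in checkdays) < MIN_MINUTES:
        return {}

    cards = {}
    i = 0
    while i < len(checkdays):
        start = checkdays[i][0]
        # Accumulate the duration of every checkday inside the window.
        j, total = i, 0
        while j < len(checkdays) and (checkdays[j][0] - start).days < WINDOW_DAYS:
            total += checkdays[j][1]
            j += 1
        if total >= MIN_MINUTES:
            for day, _ in checkdays[i:j]:
                cards[day] = start   # the start date stands in for the card id
            i = j                    # jump to the first checkday outside the window
        else:
            i += 1                   # try again from the next checkday
    return cards
```

For example, checkdays of 60 and 70 minutes three days apart form one card, while a single 30-minute checkday followed six days later by a 200-minute one yields a card for the second date only.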
## Matching entry
### Card level
| | **Example** | **eBird EBD** | **Comments** |
| ----------------- | --------------------------- | ------------------------------ | -------------------------------------------------------------- |
| **Protocol** | "F" | "F" | Only full cards are considered in this conversion process |
| **ObserverEmail** | "ipanshak@gmail.com" | "kenyabirdmap@naturekenya.org" | Using the KBM email address for now |
| **CardNo** | "0910c0725_050642_20230815" | _Pentad_ObserverNo_StartDate_ | Built from Pentad, ObserverNo and start date. |
| **StartDate** | "2023-08-14" | min(OBSERVATIONDATE) | |
| **EndDate** | "2023-08-17" | max(OBSERVATIONDATE) | |
| **StartTime** | "06:43" | min(OBSERVATIONDATETIME) | |
| **Pentad** | "0910c0725" | card_pentad | |
| **ObserverNo** | "050642" | "22829" | eBird account number is 22829 on ABAP. |
| **TotalHours** | "1:57" | sum(DURATIONMINUTES) | sum of the durations of all checklists |
| **Hour1** | 14 | "" | We don't record sequence in eBird |
| **Hour2** | 23 | "" | |
| **Hour3** | 0 | "" | |
| **Hour4** | 0 | "" | |
| **Hour5** | 0 | "" | |
| **Hour6** | 0 | "" | |
| **Hour7** | 0 | "" | |
| **Hour8** | 0 | "" | |
| **Hour9** | 0 | "" | |
| **Hour10** | 0 | "" | |
| **TotalSpp** | 23 | length(sp_list) | |
| **InclNight** | "0" | "0" | This conversion tool does not address nocturnal observations |
| **AllHabitats** | "0" | "0" | This conversion tool does not quantify the use of all habitats. |
| | | Checklists | List of checklists used |
| | | TotalDistance | sum of distances of all checklists |
| | | ObserverNoEbird | observer ID from eBird |
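
As an illustration of the card-level mapping, here is a minimal sketch that aggregates the checklists of one card into the fields of the table. The function `build_card` and the checklist keys (`id`, `start`, `duration_min`, `distance_km`, `species`) are hypothetical. Dropping the `obs` prefix of the eBird observer ID inside `CardNo` is inferred from the sample output further down, not from the package source.

```python
from datetime import datetime


def build_card(pentad, observer_id, checklists):
    """Aggregate the checklists of one card into card-level fields.

    Each checklist is a dict with the hypothetical keys 'id', 'start'
    (datetime), 'duration_min', 'distance_km' and 'species' (a set).
    """
    start = min(c["start"] for c in checklists)
    end = max(c["start"] for c in checklists)
    species = set().union(*(c["species"] for c in checklists))
    # Assumed from the sample output: "obsr1034990" appears as "r1034990".
    observer_no = observer_id[3:] if observer_id.startswith("obs") else observer_id
    return {
        "Protocol": "F",
        "CardNo": f"{pentad}_{observer_no}_{start:%Y%m%d}",
        "StartDate": start.strftime("%Y-%m-%d"),
        "EndDate": end.strftime("%Y-%m-%d"),
        "StartTime": start.strftime("%H:%M"),
        "Pentad": pentad,
        "TotalHours": sum(c["duration_min"] for c in checklists) / 60,
        "TotalSpp": len(species),
        "Checklists": [c["id"] for c in checklists],
        "TotalDistance": sum(c["distance_km"] for c in checklists),
        "ObserverNoEbird": observer_id,
    }
```

The remaining fields (`ObserverEmail`, `ObserverNo`, `Hour1` to `Hour10`, `InclNight`, `AllHabitats`) are constants in the table above and are left out of the sketch.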
### Record level
| | **Example** | **eBird EBD** | **Comments** |
| ---------------- | --------------------------- | ----------------------------- | --------------------------------------------------------------------------- |
| **Sequence** | 1 | _i_ | Taxonomic order |
| **Latitude** | 9.0910404 | checklist_latitude | Not recorded |
| **Longitude** | 7.4309485 | checklist_longitude | Not recorded |
| **Altitude** | 469.8 | "" | Not recorded |
| **CardNo** | "0910c0725_050642_20230815" | _Pentad_ObserverNo_StartDate_ | Same as the card to which the record belongs |
| **Spp** | 314 | ADU | ADU number match based on <https://github.com/A-Rocha-Kenya/Birds-of-Kenya> |
| **Accuracy** | 35.340999603271 | "" | Not recorded |
| **SightingTime** | "2023-08-15T05:36:33.834Z" | checklist_start | Use the checklist of the first occurrence |
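
The record-level mapping can be sketched in the same way. The table gives "Taxonomic order" for `Sequence`, while the Process section describes sequencing by first occurrence; the sketch below follows the first-occurrence reading and should be taken as one possible interpretation. The function `sequence_records` and the checklist keys (`start`, `latitude`, `longitude`, `species`) are hypothetical.

```python
def sequence_records(card_no, checklists):
    """Build one record per species, numbered by first occurrence.

    Each checklist is a dict with the hypothetical keys 'start' (ISO 8601
    string, all in the same time zone and format so that string order is
    chronological), 'latitude', 'longitude' and 'species' (a list of ADU
    numbers in the order they appear on the checklist).
    """
    records = []
    seen = set()
    for checklist in sorted(checklists, key=lambda c: c["start"]):
        for spp in checklist["species"]:
            if spp in seen:
                continue          # keep only the first occurrence of a species
            seen.add(spp)
            records.append({
                "Sequence": len(records) + 1,
                "Latitude": checklist["latitude"],
                "Longitude": checklist["longitude"],
                "Altitude": "",   # not recorded in eBird
                "CardNo": card_no,
                "Spp": spp,
                "Accuracy": "",   # not recorded in eBird
                "SightingTime": checklist["start"],
            })
    return records
```

Each record inherits the coordinates and start time of the checklist on which the species first appears, which matches the `SightingTime` comment in the table.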
## Sample output
```{js}
[
{
"Protocol":"F",
"ObserverEmail":"kenyabirdmap@naturekenya.org",
"CardNo":"0500b0220_r1034990_20180213",
"StartDate":"2018-02-13",
"EndDate":"2018-02-15",
"StartTime":"15:08",
"Pentad":"0500b0220",
"ObserverNo":"22829",
"TotalHours":2.0,
"Hour1":"",
"Hour2":"",
"Hour3":"",
"Hour4":"",
"Hour5":"",
"Hour6":"",
"Hour7":"",
"Hour8":"",
"Hour9":"",
"Hour10":"",
"TotalSpp":38,
"InclNight":"0",
"AllHabitats":"0",
"Checklists":[
"S43361123",
"S43361118"
],
"TotalDistance":0.0,
"ObserverNoEbird":"obsr1034990",
"records":[
{
"Sequence":1,
"Latitude":4.9640506,
"Longitude":-2.40952,
"Altitude":"",
"CardNo":"0500b0220_r1034990_20180213",
"Spp":1338,
"SourceSpp":,
"Accuracy":"",
"SightingTime":"2018-02-13T15:08:00Z"
},
...
]
},
...
]
```
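
An exported file of this shape can be sanity-checked with a few lines of Python. The sketch below is not part of the package: `check_card` and `check_export` are hypothetical helpers, and the rules they test (`CardNo` starts with `Pentad` and ends with `StartDate`, records carry the card's `CardNo`, and `Sequence` runs 1, 2, 3, ... in file order) are read off the sample above rather than from a specification.

```python
import json


def check_card(card):
    """Return a list of problems found in one card (empty if none)."""
    problems = []
    if not card["CardNo"].startswith(card["Pentad"] + "_"):
        problems.append("CardNo does not start with Pentad")
    if not card["CardNo"].endswith("_" + card["StartDate"].replace("-", "")):
        problems.append("CardNo does not end with StartDate")
    for position, record in enumerate(card.get("records", []), start=1):
        if record.get("CardNo") != card["CardNo"]:
            problems.append(f"record {position}: CardNo differs from its card")
        if record.get("Sequence") != position:
            problems.append(f"record {position}: unexpected Sequence")
    return problems


def check_export(path):
    """Load an exported JSON file and report the problems of each card."""
    with open(path, encoding="utf-8") as handle:
        cards = json.load(handle)
    return {card["CardNo"]: check_card(card) for card in cards}
```

Calling `check_export()` on the path of an exported file returns a dict keyed by `CardNo`; an empty list means that the card passed every check.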
## Discussion
- Support Ad-hoc?
What can eBird and eBird users do to maximize the number of cards?
- Both protocols require users to submit complete lists (not ad hoc or incomplete ones).
- Encourage birders to avoid using the Historical protocol
- Use the track to determine pentad overlap
- Let the user confirm that a checklist belongs to a single pentad, rather than relying on a distance-based check
- User input for `AllHabitats`.