# brioche
[![PyPI version](https://badge.fury.io/py/brioche.svg)](https://badge.fury.io/py/brioche)
[![DOI](https://zenodo.org/badge/305771587.svg)](https://doi.org/10.5281/zenodo.14207034)
Brioche is a Python library for performing a biomization analysis of pollen samples.
It implements the protocol defined by Prentice et al. (1996), using Pandas for efficient analysis of a large number of pollen samples across time or sites. It was implemented to support a study of eastern African mountain evolution by Githumbi et al. (submitted).
Brioche is best used in a Jupyter notebook, but the library also includes a command line tool.
The following sections give an overview of the biomization protocol as implemented in Brioche. The [example Jupyter notebook](examples/biomization.ipynb) shows how to use the library to process pollen samples from Excel files, and can be used both as documentation and as a starting point for your own analysis.
# Mapping biomes
To use Brioche you need to define a set of plant functional types (PFTs). As defined by Prentice et al., PFTs are "broad classes of plant defined by stature (e.g. tree/shrub), leaf form (e.g. broad-leaved/needle-leaved), phenology (e.g. evergreen/deciduous), and climatic adaptations". In brioche each PFT is identified by a number. These are the PFTs defined for the analysis in Githumbi et al. (submitted):
| PFT | Description |
|-----|-------------|
| 1 | Wet temperate evergreen tree (<15 °C, >1200 mm yr⁻¹) |
| 2 | Dry temperate evergreen tree (<15 °C, <1200 mm yr⁻¹) |
| 3 | Wet tropical evergreen tree (>15 °C, >1200 mm yr⁻¹) |
| 4 | Dry tropical evergreen tree (>15 °C, <1200 mm yr⁻¹) |
| 5 | Wet tropical raingreen tree (>1200 mm yr⁻¹) |
| 6 | Dry tropical raingreen tree |
| 7 | Tropical woody shrub |
| 8 | Temperate woody shrub |
| 9 | Frost tolerant woody shrub/tree |
| 10 | Tropical herb/forb |
| 11 | Temperate herb/forb |
| 12 | Frost tolerant herb/forb |
| 13 | Temperate sclerophyllous |
| 14 | Grass |
| 15 | Wetland taxa (sedges/herbs) |
| 16 | Fire tolerant temperate tree/shrub |
| 17 | Fire tolerant tropical tree/shrub |
| 18 | Succulent woody shrub/tree |
| 19 | Liana |
| 20 | Fern |
| 21 | Palm |
Next you need to define a **biome x PFT matrix** that maps your set of biomes to the PFTs that are dominant in each one. Working with a full matrix can be unwieldy, so brioche supports reading a simple table with one row for each biome, followed by a list of the PFTs that map to it:
| Biome | PFTs |
|-------|------|
| Grassland/afroalpine | 12 |
| Ericaceous scrub | 17 9 14 12 |
| Moorland | 13 14 15 |
Finally, the taxa in the samples must be mapped to one or more PFTs in a **taxon x PFT matrix**. (Prentice uses a *PFT x taxon matrix* instead, but the transposed structure is easier to work with since there are typically many more taxa than PFTs.) This is also most conveniently represented as a table with one row for each taxon, followed by a list of the PFTs that it maps to:
| Taxa | PFTs |
|------|------|
| Abrus | 7 10 |
| Abutilon | 1 7 |
| Acacia | 2 4 6 7 |
| Acalypha | 10 11 |
Brioche combines the two mappings to create a **taxon x biome matrix** (again, Prentice uses a *biome x taxon matrix*), where each cell holds a 1 or a 0 to indicate if the taxon maps to that biome. This matrix is only used during the calculation, but can be output to verify the mapping.
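To make the combination step concrete, here is a minimal sketch in plain Pandas. It is not Brioche's own code, only an illustration of how two unpivoted lists can be joined on PFT to give a 0/1 taxon x biome matrix. The biome rows come from the table above; the PFT assignments for Poaceae, Cyperaceae and Helichrysum are made up for illustration.

```python
import pandas as pd

# Biome -> PFT list, copied from the example table above.
biome_pfts = {
    "Grassland/afroalpine": [12],
    "Ericaceous scrub": [17, 9, 14, 12],
    "Moorland": [13, 14, 15],
}
# Taxon -> PFT list. Acacia is from the table above; the other three
# assignments are hypothetical and only serve the example.
taxon_pfts = {
    "Poaceae": [14],
    "Cyperaceae": [15],
    "Helichrysum": [9, 12],
    "Acacia": [2, 4, 6, 7],
}


def to_long(mapping, key):
    """Unpivot a {name: [pft, ...]} mapping into one row per (name, pft)."""
    rows = [(name, pft) for name, pfts in mapping.items() for pft in pfts]
    return pd.DataFrame(rows, columns=[key, "pft"])


biome_pft = to_long(biome_pfts, "biome")
taxon_pft = to_long(taxon_pfts, "taxon")

# A taxon maps to a biome if they share at least one PFT.
pairs = taxon_pft.merge(biome_pft, on="pft")[["taxon", "biome"]].drop_duplicates()

# Pivot the pairs into a 0/1 matrix, keeping taxa that matched no biome.
taxon_biome = pd.crosstab(pairs["taxon"], pairs["biome"]).reindex(
    index=list(taxon_pfts), columns=list(biome_pfts), fill_value=0
)

# Taxa whose row is all zeros are not mapped to any biome.
unmapped = taxon_biome.index[taxon_biome.sum(axis=1) == 0].tolist()
print(taxon_biome)
print("Unmapped taxa:", unmapped)
```

Printing the matrix like this is a quick way to verify a mapping by eye before running an analysis.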
The matrices are represented as unpivoted lists in Brioche using the classes `BiomePftList` and `TaxaPftList`. They are best constructed with the class methods `read_csv()`, `read_excel_sheet()` and `read_google_sheet()`.
Alternatively, if you prefer to work with a matrix where you map taxa and biomes to PFTs by entering 1 and 0 in each cell you can use the classes `BiomePftMatrix` and `TaxaPftMatrix`. Parts of the biome x PFT mapping above would then look like this:
| Biome | 12 | 13 | 14 | 15 |
|-----------------------|----|----|----|----|
| Grassland/afroalpine | 1 | 0 | 0 | 0 |
| Ericaceous scrub | 1 | 0 | 1 | 0 |
| Moorland | 0 | 1 | 1 | 1 |
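The relationship between the list form and the matrix form can be sketched in plain Pandas. This is an illustration only, not how `BiomePftList` or `BiomePftMatrix` are implemented; the column names `Biome` and `PFTs` simply follow the example table.

```python
import pandas as pd

# The list form: one row per biome, PFT numbers separated by spaces.
biome_list = pd.DataFrame(
    {
        "Biome": ["Grassland/afroalpine", "Ericaceous scrub", "Moorland"],
        "PFTs": ["12", "17 9 14 12", "13 14 15"],
    }
)

# Split the PFT string and unpivot to one row per (biome, PFT) pair.
long = biome_list.assign(PFT=biome_list["PFTs"].str.split()).explode("PFT")
long["PFT"] = long["PFT"].astype(int)

# Cross-tabulate into the 0/1 matrix form, one column per PFT in use.
matrix = pd.crosstab(long["Biome"], long["PFT"])
print(matrix)
```

Only PFTs that occur in at least one biome get a column, which is why the matrix excerpt above can omit most of the 21 PFTs.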
# Reading pollen samples
The classes `PollenCounts` and `PollenPercentages` (which both implement the abstract base class `PollenSamples`) hold pollen samples for the analysis as absolute counts or percentages, respectively. They can be loaded using the class methods `read_csv()` or `read_google_sheet()`. Excel files are most easily loaded with the Pandas function `pd.read_excel()`, instantiating the classes directly from the resulting data frames (see the example notebook).
The source must have at least one index column, which typically represents sample depth or age. Compound keys are also possible, where the sample data contains both depth and age. There should be one column per taxon. Partial example:
| Depth | Age | Cyperaceae | Poaceae | Helichrysum |
|-------|-----|------------|---------|-------------|
| 703 | 3329 | 102 | 103 | 11 |
| 753 | 3445 | 106 | 88 | 11 |
| 803 | 3562 | 99 | 70 | 8 |
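As a rough illustration of this layout, the sketch below loads the partial example into a plain Pandas data frame with a compound (depth, age) index and converts the counts to percentages. It uses only Pandas, not the Brioche classes, and the percentages are computed over the three taxa shown rather than a full pollen sum.

```python
import io

import pandas as pd

# The partial example above as CSV text.
csv_text = """Depth,Age,Cyperaceae,Poaceae,Helichrysum
703,3329,102,103,11
753,3445,106,88,11
803,3562,99,70,8
"""

# Depth and Age together form a compound index; every other column is a taxon.
counts = pd.read_csv(io.StringIO(csv_text), index_col=["Depth", "Age"])

# Percentage of each taxon within its sample (row).
percentages = counts.div(counts.sum(axis=1), axis=0) * 100
print(percentages.round(1))
```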
# Biome affinity analysis
The analysis is performed by instantiating the class `Biomization`, providing the `TaxaPftList` and `BiomePftList` objects from above. To make it easier to build up the mapping, the method `get_unmapped_taxas()` provides a list of taxa that are not mapped to any biome, either because they are not mapped to any PFT or because none of their PFTs is mapped to any biome.
The analysis is performed on stabilized sample values, which are calculated as the square root of the percentages to reduce the impact of dominant taxa. A threshold can be subtracted from the percentages before the square root is calculated, setting any negative numbers to 0, to ignore taxa with very low numbers from the analysis.
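The stabilization described above can be written in a few lines. This is a minimal sketch of the formula, sqrt(max(percentage - threshold, 0)), using NumPy and Pandas directly; the function name `stabilize` and its `threshold` parameter are illustrative and not part of the Brioche API.

```python
import numpy as np
import pandas as pd


def stabilize(percentages, threshold=0.0):
    """Subtract the threshold, set negative values to 0, take the square root."""
    return np.sqrt((percentages - threshold).clip(lower=0))


# One made-up sample, with values chosen so the arithmetic is easy to follow.
pct = pd.DataFrame({"Poaceae": [49.5], "Cyperaceae": [0.3], "Helichrysum": [16.5]})
stab = stabilize(pct, threshold=0.5)
print(stab)
```

With a threshold of 0.5, the taxon at 0.3 % drops out of the analysis entirely, while the dominant taxon is compressed from 49.5 to 7.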
`PollenCounts` and `PollenPercentages` provide the `get_stabilized()` method, which returns a `StabilizedPollenSamples` object. This is passed to the `Biomization.get_biome_affinity()` method, which returns a `BiomeAffinity` result object. The results are provided by two properties:
* `scores`: A `pd.DataFrame` with the calculated affinity scores for each biome for each sample. The score for one biome is calculated by adding the stabilized values for each taxon that is mapped to that biome via the PFTs.
* `biomes`: A `pd.Series` containing the name of the biome with the highest affinity score for each sample. In case of ties, the biome with the fewest mapped taxa will be chosen. The logic from Prentice is that biomes with more mapped taxa are identified by the presence of those additional taxa, and in their absence they should not be selected over a biome that does not include them.
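The scoring and tie-breaking rules can be sketched in plain Pandas as follows. This is an illustration under stated assumptions, not the code behind `get_biome_affinity()`: the function name `biome_affinity`, the taxa `A` to `C` and the biomes `X` and `Y` are all hypothetical.

```python
import pandas as pd


def biome_affinity(stabilized, taxon_biome):
    """Return (scores, biomes) for samples x taxa and a 0/1 taxon x biome matrix."""
    # Only taxa present in both the samples and the mapping contribute.
    taxa = stabilized.columns.intersection(taxon_biome.index)
    # Score = sum of the stabilized values of the taxa mapped to each biome.
    scores = stabilized[taxa] @ taxon_biome.loc[taxa]
    # idxmax returns the first maximum, so ordering the columns by ascending
    # number of mapped taxa resolves ties in favour of the smaller biome.
    order = taxon_biome.loc[taxa].sum(axis=0).sort_values(kind="stable").index
    biomes = scores[order].idxmax(axis=1)
    return scores, biomes


# Biome Y includes every taxon of biome X plus the additional taxon C.
taxon_biome = pd.DataFrame(
    {"Y": [1, 1, 1], "X": [1, 1, 0]}, index=["A", "B", "C"]
)
stabilized = pd.DataFrame(
    {"A": [2.0, 2.0], "B": [3.0, 3.0], "C": [0.0, 1.0]}, index=["s1", "s2"]
)

scores, biomes = biome_affinity(stabilized, taxon_biome)
print(scores)
print(biomes)
```

In sample `s1` taxon C is absent, so both biomes score the same and the biome with fewer mapped taxa is chosen. In `s2` the presence of C tips the result to the larger biome.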
For a complete example, see the example notebook. This has the additional step of explicitly calculating pollen percentages to a specific number of decimals, rather than going directly from pollen counts to stabilized values.
# Extending the Brioche analysis
The classes `PollenCounts`, `PollenPercentages`, `StabilizedPollenSamples` and `BiomeAffinity` all have an `apply()` method that allows the underlying dataframe to be processed by any Pandas code, and returns a new object containing the results. The example notebook demonstrates how to use this to bin affinity scores on age.
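As an idea of what such processing might look like, the sketch below bins a data frame of affinity scores on age using plain Pandas. The function `bin_on_age`, the bin width and the score values are all made up for illustration, and whether `apply()` accepts a function of exactly this shape is an assumption; the example notebook shows the actual usage.

```python
import pandas as pd


def bin_on_age(scores, width=500):
    """Average the scores of all samples that fall in the same age bin."""
    ages = scores.index.get_level_values("Age")
    # Label each sample with the lower edge of its bin, e.g. 3329 -> 3000.
    bin_start = (ages // width * width).to_numpy()
    binned = scores.groupby(bin_start).mean()
    binned.index.name = "Age bin"
    return binned


# Made-up affinity scores, indexed like the pollen sample example.
index = pd.MultiIndex.from_tuples(
    [(703, 3329), (753, 3445), (803, 3562)], names=["Depth", "Age"]
)
scores = pd.DataFrame(
    {"Moorland": [4.0, 6.0, 5.0], "Ericaceous scrub": [2.0, 4.0, 7.0]},
    index=index,
)

binned = bin_on_age(scores, width=500)
print(binned)
```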
# Processing the result
The `BiomeAffinity` properties can be written to Excel files, processed further in Python using Pandas, or plotted. Again, see the example notebook.
# References
Githumbi et al., submitted. Understanding Eastern African Montane Forest Evolution since the end of the Last Glacial Maxima.
Prentice, C., Guiot, J., Huntley, B., Jolly, D., Cheddadi, R., 1996. Reconstructing biomes from palaeoecological data: a general method and its application to European pollen data at 0 and 6 ka. Clim. Dyn. 12, 185–194.