# Chem Analysis
Your one-stop shop for analyzing chemistry data.
This package was developed for general chemistry data analysis, but special attention was paid to analyzing large data sets
(for example, analyzing 1000 NMR spectra at once).
Design Philosophy:
* Handle large data sets
  * Support analyzing 1000s of NMR spectra at once (like those generated from kinetic analysis).
* Modular
  * Turn processing or analysis methods on or off, or switch them out, with minimal code changes.
* Explicit
  * A lot of analytical software automatically performs data transforms and hides this from the user, making it hard to
    truly know what's going on with your data. Here everything needs to be called explicitly, but typical processing
    steps are suggested in several of the examples.
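To make the "modular" and "explicit" points concrete, here is a minimal, package-independent sketch. The `Processor` class and the two step functions are hypothetical names invented for this illustration; they are not the `chem_analysis` API. The idea shown is only that a signal is changed by the steps you add, in the order you add them, and by nothing else.

```python
from typing import Callable, List, Sequence

# A processing step takes a list of intensities and returns a new list.
Step = Callable[[List[float]], List[float]]


def subtract_minimum(y: Sequence[float]) -> List[float]:
    """Crude baseline step: shift the signal so its minimum is zero."""
    low = min(y)
    return [v - low for v in y]


def normalize_to_max(y: Sequence[float]) -> List[float]:
    """Scale the signal so its largest value is 1 (unchanged if the max is 0)."""
    high = max(y)
    return [v / high for v in y] if high else list(y)


class Processor:
    """Hypothetical container that applies only the steps it was given."""

    def __init__(self) -> None:
        self.steps: List[Step] = []

    def add(self, step: Step) -> None:
        self.steps.append(step)

    def run(self, y: Sequence[float]) -> List[float]:
        out = list(y)
        for step in self.steps:  # nothing runs unless it was added
            out = step(out)
        return out


raw = [2.0, 2.0, 6.0, 10.0, 6.0, 2.0]

processor = Processor()
processor.add(subtract_minimum)   # comment this line out to skip the step
processor.add(normalize_to_max)   # or swap in a different function here
processed = processor.run(raw)
print(processed)
```

Switching a method off or replacing it is a one-line change, and an empty `Processor` returns the data untouched.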
**Supported data types**:
* IR
* NMR (Bruker, Spinsolve) - 1D only
* SEC (GPC)
* GC and LC/HPLC/UPLC
* Mass Spec. and GC-MS
* UV-Vis (coming soon)
## Installation
[pypi page](https://pypi.org/project/chem-analysis/)
`pip install chem_analysis`
### Dependencies
#### Required
* [numpy](https://github.com/numpy/numpy) (for math)
* [scipy](https://github.com/scipy/scipy) (for math)
* [tabulate](https://github.com/astanin/python-tabulate) (for pretty table outputs)
* [bigsmiles](https://github.com/dylanwal/BigSMILES) (for chemistry (molar weights, SMILES))
#### Optional
* [matplotlib](https://github.com/matplotlib/matplotlib) (for plotting)
* [plotly](https://github.com/plotly/plotly.py) (for plotting)
* [pyqtgraph](https://github.com/pyqtgraph/pyqtgraph) (for plotting)
## Capabilities
### Processing Methods:
* Baseline correction
* Peak Picking
* Resampling
* Translations
* Smoothing
* Phase correction (NMR)
* Referencing (NMR)
* Chromatogram Library Matching (GC, LC/HPLC/UPLC)
* Mass Spec Library Matching (MS, GC-MS)
* And more ...
### Analysis Methods:
* Peak picking
* Integration
* Peak fitting
* Deconvolution (peaks and MS spectra)
* Multi-component analysis (MCA)
* And more ...
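As a rough illustration of what peak picking and integration involve, the sketch below finds the largest point in a signal, walks outward to where the signal returns to a threshold, and integrates that span with the trapezoid rule. It is plain Python with hypothetical function names and does not reproduce the algorithms implemented in this package (for example, the rolling-ball integration used in the example further down).

```python
from typing import Sequence, Tuple


def find_largest_peak(y: Sequence[float]) -> int:
    """Return the index of the largest value in the signal."""
    return max(range(len(y)), key=lambda i: y[i])


def peak_bounds(y: Sequence[float], apex: int, threshold: float) -> Tuple[int, int]:
    """Walk outward from the apex until the signal is at or below the threshold."""
    left = apex
    while left > 0 and y[left] > threshold:
        left -= 1
    right = apex
    while right < len(y) - 1 and y[right] > threshold:
        right += 1
    return left, right


def trapezoid_area(x: Sequence[float], y: Sequence[float], left: int, right: int) -> float:
    """Integrate y over x between two indices with the trapezoid rule."""
    area = 0.0
    for i in range(left, right):
        area += 0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
    return area


# synthetic, baseline-corrected signal with a single peak
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]

apex = find_largest_peak(y)
left, right = peak_bounds(y, apex, threshold=0.0)
area = trapezoid_area(x, y, left, right)
print(f"apex index={apex}, bounds=({left}, {right}), area={area}")
```

Real data needs a baseline correction first, since the bounds search assumes the signal actually returns to the threshold on both sides of the peak.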
## Plotting / GUI
* Matplotlib
  * Popular Python plotting library
* Plotly
  * Provides interactive plots (html)
  * Zoom in/out is a game changer
* PyQt
  * Uses GPU for rendering.
  * Good for lots of data!!
Plotting is hard and everyone has their preferences. The goal of plotting in this package is to
provide quick and useful views of the data with sensible defaults.
It does not strive for perfect looks or full customization.
## Quick Examples
See [Examples folder] for a large list.
```python
import pathlib

import plotly.graph_objs as go
import chem_analysis as ca


def main():
    # calibration curve: retention time -> molecular weight
    cal_RI = ca.sec.ConventionalCalibration(
        lambda time: 10 ** (-0.6 * time + 10.644),
        mw_bounds=(160, 1_090_000),
        name="RI calibration",
    )

    # loading data
    file_path = pathlib.Path("data/SEC.csv")
    sec_signal = ca.sec.SECSignal.from_csv(file_path)
    sec_signal.calibration = cal_RI

    # processing (nothing is applied unless it is added here)
    sec_signal.processor.add(
        ca.processing.baseline.ImprovedAsymmetricLeastSquared(lambda_=1e6, p=0.15)
    )

    # analysis
    peak = ca.analysis.peak_picking.find_peak_largest(
        sec_signal,
        mask=ca.processing.weigths.Spans((10, 12.2), invert=True),
    )
    result = ca.analysis.integration.rolling_ball(
        peak, n=45, min_height=0.05, n_points_with_pos_slope=1
    )

    # plotting
    fig = go.Figure(layout=ca.plotting.PlotlyConfig.plotly_layout())
    ca.plotting.signal(sec_signal, fig=fig)
    ca.plotting.calibration(cal_RI, fig=fig)
    ca.plotting.peaks(result, fig=fig)
    fig.data[0].line.color = "blue"  # customize colors
    fig.data[4].fillcolor = "gray"  # customize colors
    # fig.show()
    # write_html produces an HTML file, so use an .html extension
    fig.write_html("figs/sec_data_analysis.html")

    # print results
    print(result.stats_table().to_str())


if __name__ == '__main__':
    main()
```
| peak | area | ...** | mw_d | mw_n |
|--------|------|-------|--------|--------|
| 0 | 2.7 | ... | 1.218 | 7465 |
**Only showing 4 of 28 stats calculated for SEC peak
![sec_data_analysis.png](https://github.com/dylanwal/chem_analysis/tree/develop/dev/sec_data_analysis.png)
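
The calibration in the example above maps retention time to molecular weight through `10 ** (-0.6 * time + 10.644)`, i.e. log10(MW) is linear in time. The sketch below works through that arithmetic in plain Python: it inverts the curve to see which retention times the `mw_bounds` correspond to, and computes number- and weight-average molecular weights from a few synthetic slices. The helper names are hypothetical, the slice heights are made-up illustration data, and both the textbook averaging formulas and the reading of `mw_d` as dispersity (Mw/Mn) are assumptions about what the package reports.

```python
import math
from typing import Sequence, Tuple


def retention_time_to_mw(time: float) -> float:
    """Calibration curve from the example: log10(MW) = -0.6 * time + 10.644."""
    return 10 ** (-0.6 * time + 10.644)


def mw_to_retention_time(mw: float) -> float:
    """Inverse of the calibration curve."""
    return (10.644 - math.log10(mw)) / 0.6


def molecular_weight_averages(
    times: Sequence[float], heights: Sequence[float]
) -> Tuple[float, float, float]:
    """Textbook slice averages, assuming evenly spaced, baseline-corrected slices.

    Mn = sum(h) / sum(h / M), Mw = sum(h * M) / sum(h), dispersity = Mw / Mn
    """
    mws = [retention_time_to_mw(t) for t in times]
    total = sum(heights)
    mw_n = total / sum(h / m for h, m in zip(heights, mws))
    mw_w = sum(h * m for h, m in zip(heights, mws)) / total
    return mw_n, mw_w, mw_w / mw_n


# retention times that correspond to mw_bounds=(160, 1_090_000)
t_high_mw = mw_to_retention_time(1_090_000)  # large molecules elute first
t_low_mw = mw_to_retention_time(160)
print(f"calibrated window: {t_high_mw:.2f} to {t_low_mw:.2f}")

# synthetic slices (illustration only, not measured data)
times = [10.5, 11.0, 11.5]
heights = [1.0, 2.0, 1.0]
mw_n, mw_w, dispersity = molecular_weight_averages(times, heights)
print(f"Mn={mw_n:.0f}, Mw={mw_w:.0f}, dispersity={dispersity:.3f}")
```

Because the slope is negative, the upper molecular weight bound sets the earliest valid retention time and the lower bound sets the latest.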
## Contributing
Let's be honest, there are bugs in the code! Pointing them out is much appreciated, and solutions are doubly appreciated!
Got a new data type, new algorithm, analysis, etc.? New contributions are welcome!
Best practice is to open an issue with your idea, and I will let you know if it
is a good fit for the project. If you are interested in helping code the addition, please mention that as well.
Raw data
{
"_id": null,
"home_page": "https://github.com/dylanwal/chem_analysis",
"name": "chem-analysis",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "materials characterization, materials science, baseline, background, baseline correction, baseline subtraction, chemistry, spectroscopy, NMR, IR, UV-VIS, GC, LC, UPLC, HPLC, GC-MS, MS, GPC, SEC, gas chromatography, mass spectrometry, nuclear magnetic resonance, size exclusion chromatography, gel permeation chromatography",
"author": "Dylan Walsh",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/f6/c3/06cc303990da7925bcb958d25f93a81018a3221f3c121b93879fd4475970/chem_analysis-0.0.7.tar.gz",
"platform": "any",
"bugtrack_url": null,
"license": "BSD",
"summary": "Analyzing experimental chemistry data",
"version": "0.0.7",
"project_urls": {
"Homepage": "https://github.com/dylanwal/chem_analysis"
},
"split_keywords": [
"materials characterization",
" materials science",
" baseline",
" background",
" baseline correction",
" baseline subtraction",
" chemistry",
" spectroscopy",
" nmr",
" ir",
" uv-vis",
" gc",
" lc",
" uplc",
" hplc",
" gc-ms",
" ms",
" gpc",
" sec",
" gas chromatography",
" mass spectrometry",
" nuclear magnetic resonance",
" size exclusion chromatography",
" gel permeation chromatography"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "aa04c6e318263d8efe96d147ce8bf4a85aeb23402d7170f4477016647ace2ec2",
"md5": "ae494f2286e5c188649d8e3576d8b257",
"sha256": "171e284c97440e1c8d8f15d606d6ca00b5a2990e913fe5c09074a0c8cbb7dcd0"
},
"downloads": -1,
"filename": "chem_analysis-0.0.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ae494f2286e5c188649d8e3576d8b257",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 240078,
"upload_time": "2024-09-30T19:27:35",
"upload_time_iso_8601": "2024-09-30T19:27:35.007949Z",
"url": "https://files.pythonhosted.org/packages/aa/04/c6e318263d8efe96d147ce8bf4a85aeb23402d7170f4477016647ace2ec2/chem_analysis-0.0.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f6c306cc303990da7925bcb958d25f93a81018a3221f3c121b93879fd4475970",
"md5": "f9415f16403a7adefb668afb14d97d80",
"sha256": "12a12ab5d6d94e7cb6c7309eec36a6e8397819b959d1b437269683b3f4839de3"
},
"downloads": -1,
"filename": "chem_analysis-0.0.7.tar.gz",
"has_sig": false,
"md5_digest": "f9415f16403a7adefb668afb14d97d80",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 153496,
"upload_time": "2024-09-30T19:27:36",
"upload_time_iso_8601": "2024-09-30T19:27:36.497640Z",
"url": "https://files.pythonhosted.org/packages/f6/c3/06cc303990da7925bcb958d25f93a81018a3221f3c121b93879fd4475970/chem_analysis-0.0.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-30 19:27:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dylanwal",
"github_project": "chem_analysis",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "chem-analysis"
}