
# NAIS Processor
Code package to process [NAIS](https://www.airel.ee/products/nais/) (Neutral cluster and Air Ion Spectrometer, Airel Ltd.) data files.
## Installation
```shell
pip install nais-processor
```
You can find the package on [PyPI](https://pypi.org/project/nais-processor/).
## Documentation
See the [online documentation](https://jlpl.github.io/nais-processor/).
## Modules
### Processor
The `nais.processor` module processes the data into netCDF files and provides options for the following operations:
* Inlet loss correction (Gormley and Kennedy, 1948)
* Ion mode correction (Wagner et al., 2016)
* Conversion to standard conditions (293.15 K, 101325 Pa)
* Remove charger ion band from total particle data
* Use fill values in case of missing environmental sensor data
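
The conversion to standard conditions listed above can be illustrated with the ideal gas law: a number concentration measured at temperature T and pressure p is scaled by (p_std / p) · (T / T_std). The following is a minimal sketch of that scaling, not the package's own code; the function name `to_standard_conditions` is hypothetical.

```python
# Reference conditions quoted in the list above.
T_STD = 293.15    # K
P_STD = 101325.0  # Pa


def to_standard_conditions(concentration, temperature, pressure):
    """Scale a number concentration (cm-3) measured at `temperature` (K)
    and `pressure` (Pa) to the standard conditions above.

    Hypothetical helper for illustration; based on the ideal gas law only.
    """
    return concentration * (P_STD / pressure) * (temperature / T_STD)


# A sample measured at 273.15 K and 101325 Pa would occupy a larger volume
# at 293.15 K, so its concentration is scaled down.
scaled = to_standard_conditions(1000.0, 273.15, 101325.0)
print(f"{scaled:.1f} cm-3")
```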
### Utils
The `nais.utils` module contains functions that allow one to do operations on the NAIS data files.
### Checker
The `nais.checker` module contains a GUI application for visually inspecting the NAIS ion and aerosol size distributions together with their flags. Bad data can be identified by drawing a bounding box around it, and the box coordinates can be saved for later use.

(Tested with Qt version 5.15.2)
## Example usage
Use the `make_config_template()` function to create a configuration file template and fill in the necessary information. The configuration file is used when processing the data files.
```python
from nais.processor import make_config_template
make_config_template("/home/user/viikki.yml")
```
Running this creates a configuration file template at `/home/user/viikki.yml`. After filling in the information for our example measurement, the configuration file may look like this:
```yaml
measurement_location: Viikki, Helsinki, Finland
description: Agricultural site
longitude: 25.02
latitude: 60.23
data_folder:
- /home/user/data/2021
- /home/user/data/2022
processed_folder: /home/user/viikki
database_file: /home/user/viikki.json
start_date: 2022-09-28
end_date: 2022-09-30
inlet_length: 1.0
do_inlet_loss_correction: true
convert_to_standard_conditions: true
do_wagner_ion_mode_correction: true
remove_corona_ions: true
allow_reprocess: false
redo_database: false
fill_temperature: 273.15
fill_pressure: 101325.0
fill_flowrate: 54.0
dilution_on: false
file_format: block
resolution: 5min
```
Then process the data files by calling the `nais_processor()` function with the configuration file as its argument. In our example case:
```python
from nais.processor import nais_processor
nais_processor("/home/user/viikki.yml")
```
```
Building database...
Processing 20220928 (Viikki, Helsinki, Finland)
Processing 20220929 (Viikki, Helsinki, Finland)
Processing 20220930 (Viikki, Helsinki, Finland)
Done!
```
The code produces daily processed data files `NAIS_yyyymmdd.nc` (netCDF format). These files are saved in the destination given in the configuration file.
The locations of raw and processed files for each day are written in the JSON formatted `database_file`. This prevents reprocessing when `allow_reprocess: false`.
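
As a quick check after a run, the daily file names implied by the `NAIS_yyyymmdd.nc` pattern can be generated from the dates in the configuration file. This is a minimal sketch using only the standard library; `expected_output_files` is a hypothetical helper, not part of the package, and it assumes the end date is inclusive, as the log output above suggests.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath


def expected_output_files(processed_folder, start_date, end_date):
    """List the daily file paths implied by the NAIS_yyyymmdd.nc pattern.

    Hypothetical helper: it only builds names and does not touch the disk.
    """
    folder = PurePosixPath(processed_folder)
    n_days = (end_date - start_date).days + 1  # end date treated as inclusive
    return [
        folder / f"NAIS_{(start_date + timedelta(days=i)):%Y%m%d}.nc"
        for i in range(n_days)
    ]


# Values taken from the example configuration file.
files = expected_output_files(
    "/home/user/viikki", date(2022, 9, 28), date(2022, 9, 30)
)
for path in files:
    print(path)
```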
The netcdf files have the following structure:
| Fields | Dimensions | Data type | Units | Comments |
|--------------------|---------------|----------------|-------|------------------- |
| **Coordinates** | | | | |
| time | time | datetime64[ns] | | timezone: utc |
| diameter | diameter | float | m | particle diameter |
| flag | flag | string | | |
| **Data variables** | | | | |
| neg_ions | time,diameter | float | cm-3 | dN/dlogDp |
| pos_ions | time,diameter | float | cm-3 | dN/dlogDp |
| neg_particles | time,diameter | float | cm-3 | dN/dlogDp |
| pos_particles | time,diameter | float | cm-3 | dN/dlogDp |
| neg_ion_flags | time,flag | int | | flag=1, no flag=0 |
| pos_ion_flags | time,flag | int | | flag=1, no flag=0 |
| neg_particle_flags | time,flag | int | | flag=1, no flag=0 |
| pos_particle_flags | time,flag | int | | flag=1, no flag=0 |
| temperature | time | float | K | |
| pressure | time | float | Pa | |
| sample_flow | time | float | lpm | |
| **Attributes** | | | | |
| Measurement info | | dictionary | | |

Below are some examples of how to access the different variables in the netCDF file.
```python
import xarray as xr
import pandas as pd
# load the dataset
ds = xr.open_dataset("/home/user/viikki/NAIS_20220928.nc")
# Get negative ion number size distribution
df_neg_ions = ds.neg_ions.to_pandas()
# Get temperature
df_temperature = ds.temperature.to_pandas()
# Close the file
ds.close()
```
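Because the size distributions are stored as dN/dlogDp, a total number concentration over a size range is obtained by integrating over log10 of the diameter. The sketch below does this with the trapezoidal rule on plain Python lists; `total_concentration` is a hypothetical helper and the sample values are made up for illustration. With real data, one row of `df_neg_ions` and its column labels (the diameters) would take the place of the lists.

```python
import math


def total_concentration(diameters_m, dn_dlogdp):
    """Integrate dN/dlogDp (cm-3) over log10(diameter) with the trapezoidal rule."""
    log_dp = [math.log10(d) for d in diameters_m]
    total = 0.0
    for i in range(len(log_dp) - 1):
        mean_height = 0.5 * (dn_dlogdp[i] + dn_dlogdp[i + 1])
        total += mean_height * (log_dp[i + 1] - log_dp[i])
    return total


# Made-up example: a flat distribution of 100 cm-3 across one decade
# of diameters (1 nm to 10 nm) integrates to about 100 cm-3.
diameters = [1e-9, 2e-9, 5e-9, 1e-8]
distribution = [100.0, 100.0, 100.0, 100.0]
total = total_concentration(diameters, distribution)
print(total)
```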
Continuing with the data analysis, we next combine the previously created daily files into a single continuous dataset with a one-hour time resolution. A flag is raised only if at least 50% of the data points inside each one-hour window carry that flag. We then save the result as a netCDF file.
```python
from nais.utils import combine_data
import pandas as pd
import xarray as xr
data_source = "/home/user/viikki"
date_range = pd.date_range("2022-09-28","2022-09-30")
ds = combine_data(data_source, date_range, "1H", flag_sensitivity=0.5)
ds.to_netcdf("combined_nais_dataset.nc")
```
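The flag rule described above can be sketched independently of the package: within each averaging window, a flag is kept only if the flagged fraction of samples reaches the threshold. The sketch uses only the standard library and hourly windows; `aggregate_flags` is a hypothetical name, the samples are made up, and `combine_data` is not necessarily implemented this way.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_flags(times, flags, sensitivity=0.5):
    """Return {window start: 0 or 1} for one-hour windows.

    A window is flagged when at least `sensitivity` of its samples are flagged.
    """
    windows = defaultdict(list)
    for t, f in zip(times, flags):
        # Truncate each timestamp to the start of its hour.
        windows[t.replace(minute=0, second=0, microsecond=0)].append(f)
    return {
        start: int(sum(values) / len(values) >= sensitivity)
        for start, values in sorted(windows.items())
    }


# Two hours of made-up 5-minute samples: 6 of 12 flagged in the first hour,
# 5 of 12 flagged in the second.
start = datetime(2022, 9, 28, 0, 0)
times = [start + timedelta(minutes=5 * i) for i in range(24)]
flags = [1] * 6 + [0] * 6 + [1] * 5 + [0] * 7
hourly = aggregate_flags(times, flags)
```

With a threshold of 0.5, the first hour keeps its flag (exactly half of the samples are flagged) and the second hour does not.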
Next we launch the data checker with the combined data to identify bad data. Bounding boxes can be drawn around bad data in the size distributions: double left-click to create an adjustable box, and right-click a box to open the menu from which it can be removed. Clicking the save boundaries button saves the box coordinates to a netCDF file whose name is given by the second argument. Saved bounding boxes are reloaded when the checker is reopened with the same arguments, so save your work regularly in case the program crashes.
```python
from nais.checker import startNaisChecker
startNaisChecker("combined_nais_dataset.nc", "bad_data_bounds.nc")
```
We can set the bad data to `NaN` in our combined file and use the resulting dataset as the starting point for further analysis.
```python
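import xarray as xr
from nais.utils import remove_bad_data
# Load the combined dataset and the bounding boxes saved in the checker
ds = xr.open_dataset("combined_nais_dataset.nc")
bad_data = xr.open_dataset("bad_data_bounds.nc")
# Set the data inside the bounding boxes to NaN
ds = remove_bad_data(ds, bad_data)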
```
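To make concrete what setting a bounding box to `NaN` means, here is a minimal standard-library sketch on a small made-up grid. The function `mask_box` and the `(t_min, t_max, d_min, d_max)` box layout are assumptions for illustration only; the file written by the checker and the internals of `remove_bad_data` may be organized differently.

```python
import math


def mask_box(times, diameters, values, box):
    """Return a copy of `values` (rows = times, columns = diameters) with
    every cell inside `box` replaced by NaN.

    `box` is a hypothetical (t_min, t_max, d_min, d_max) tuple.
    """
    t_min, t_max, d_min, d_max = box
    return [
        [
            math.nan if (t_min <= t <= t_max and d_min <= d <= d_max) else v
            for d, v in zip(diameters, row)
        ]
        for t, row in zip(times, values)
    ]


# Made-up 3 x 3 example: hours since start, diameters in metres.
times = [0, 1, 2]
diameters = [1e-9, 5e-9, 2e-8]
values = [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0], [70.0, 80.0, 90.0]]
cleaned = mask_box(times, diameters, values, box=(1, 2, 4e-9, 3e-8))
```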
## License
This project is licensed under the terms of the GNU GPLv3.
## References
Gormley P. G. and Kennedy M., Diffusion from a Stream Flowing through a Cylindrical Tube, Proceedings of the Royal Irish Academy. Section A: Mathematical and Physical Sciences, 52, (1948-1950), pp. 163–169.

Wagner R., Manninen H.E., Franchin A., Lehtipalo K., Mirme S., Steiner G., Petäjä T. and Kulmala M., On the accuracy of ion measurements using a Neutral cluster and Air Ion Spectrometer, Boreal Environment Research, 21, (2016), pp. 230–241.