| Field | Value |
|--|--|
| Name | tesci |
| Version | 1.0.0 |
| Summary | A toolkit to aid in scientific mapping |
| upload_time | 2024-03-14 19:47:35 |
| requires_python | >=3.11 |
| keywords | scientific mapping, merging data sources |
| requirements | No requirements were recorded. |
# tesci
## An interactive toolkit for merging data from multiple citation databases
[![PyPI](https://img.shields.io/pypi/v/tesci.svg)](https://img.shields.io/pypi/v/tesci.svg)
[![License-1]( https://img.shields.io/badge/License-Apache-blue.svg)](https://img.shields.io/badge/License-Apache-blue.svg)
[![License-2]( https://img.shields.io/badge/License-MIT-green.svg)](https://img.shields.io/badge/License-MIT-green.svg)
[![Python-Versions](https://img.shields.io/pypi/pyversions/tesci.svg)](https://img.shields.io/pypi/pyversions/tesci.svg)
## Overview
`TeslaSCIToolkit` (abbrev. `tesci`) is a scientific mapping tool that comes with the following features:
1. Merging data from multiple citation databases
2. Restricting access to sensitive columns in data sources with aggregations
3. Exporting transformed data into other repositories
4. CI/CD integration, currently GitHub Actions
For examples and use cases, see the [examples](examples/) directory.
## Quickstart
### Aggregating data from a single database source
To aggregate [`simple.csv`](examples/simple/simple.csv) by average `salary` and `age`, use either of the following approaches.
#### 1. Interactive approach
```properties
tesci start -d simple.csv -o exported.csv
tesci aggregate avg -c salary -a avg_salary
tesci aggregate avg -c age -a avg_age
tesci apply
```
#### 2. Configuration approach
```yml
aggregate:
  - alias: avg_salary
    column: salary
    function: avg
  - alias: avg_age
    column: age
    function: avg
data:
  dest: exported.csv
  src: simple.csv
```
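The two approaches appear to describe the same work, so it can help to see how one maps onto the other. The sketch below is not part of `tesci`: `config_to_commands` is a hypothetical helper, and it assumes each `aggregate` entry corresponds to one `tesci aggregate <function> -c <column> -a <alias>` call, as the two examples above suggest.

```python
import shlex

# The YAML configuration shown above, written as a plain dict.
config = {
    "aggregate": [
        {"alias": "avg_salary", "column": "salary", "function": "avg"},
        {"alias": "avg_age", "column": "age", "function": "avg"},
    ],
    "data": {"dest": "exported.csv", "src": "simple.csv"},
}


def config_to_commands(cfg: dict) -> list[str]:
    """Hypothetical helper: render a config dict as interactive commands."""
    data = cfg["data"]
    # Assumption: `data.src` maps to -d and `data.dest` maps to -o.
    commands = [
        shlex.join(["tesci", "start", "-d", data["src"], "-o", data["dest"]])
    ]
    for agg in cfg["aggregate"]:
        commands.append(
            shlex.join(
                ["tesci", "aggregate", agg["function"],
                 "-c", agg["column"], "-a", agg["alias"]]
            )
        )
    commands.append("tesci apply")
    return commands


commands = config_to_commands(config)
print("\n".join(commands))
```

Under those assumptions, the printed commands match the interactive session in approach 1.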
The result is a transformation from `simple.csv` to `exported.csv`:
| | | |
|--|--|--|
|<table> <tr><th>id</th><th>name</th><th>email</th><th>phone-number</th><th>age</th><th>salary</th></tr><tr><td>1</td><td>John Doe</td><td>john@mail.com</td><td>1234567890</td><td>33</td><td>100000</td></tr><tr><td>2</td><td>Jane Doe</td><td>jane@mail.com</td><td>0987654321</td><td>44</td><td>200000</td></tr><tr><td>3</td><td>John Smith</td><td>smith@mail.com</td><td>1234509876</td><td>55</td><td>300000</td></tr><tr><td>4</td><td>Jane Williams</td><td>jwilliams@mail.com</td><td>1234509876</td><td>31</td><td>98000</td></tr><tr><td>5</td><td>Jack Miller</td><td></td><td>1234509876</td><td>33</td><td>79000</td></tr> </table>|→|<table> <tr><th>avg_salary</th><th>avg_age</th></tr><tr><td>155400.0</td><td>39.2</td></tr> </table>|
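To check the numbers in the table, the same averages can be reproduced with the Python standard library. This is only an illustration of what an `avg` aggregation computes over the sample rows; it does not use `tesci` and says nothing about how `tesci` implements it. The inline `SIMPLE_CSV` string is an assumed rendering of `simple.csv` built from the table above.

```python
import csv
import io
from statistics import fmean

# Assumed contents of simple.csv, transcribed from the table above.
SIMPLE_CSV = """id,name,email,phone-number,age,salary
1,John Doe,john@mail.com,1234567890,33,100000
2,Jane Doe,jane@mail.com,0987654321,44,200000
3,John Smith,smith@mail.com,1234509876,55,300000
4,Jane Williams,jwilliams@mail.com,1234509876,31,98000
5,Jack Miller,,1234509876,33,79000
"""

rows = list(csv.DictReader(io.StringIO(SIMPLE_CSV)))

# Average each numeric column and store it under the alias used above.
result = {
    "avg_salary": fmean([float(row["salary"]) for row in rows]),
    "avg_age": fmean([float(row["age"]) for row in rows]),
}
print(result)  # {'avg_salary': 155400.0, 'avg_age': 39.2}
```

Only the two aggregated values remain in the output; the identifying columns (`name`, `email`, `phone-number`) are not carried over, which is how aggregation restricts access to sensitive columns.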
### Merging data from multiple citation databases
After retrieving data sources from citation databases of your choice, place the databases in a directory of your choice.
Then, specify the configuration used for merging. An example of a configuration is [here](examples/bibliometric-study/config.yml).
After specifying your configuration choices, the merge can then be run with:
`tesci similarity merge --first-src PATH --second-src PATH --dest DIR`
where PATH and DIR refer to relative filesystem paths and directories.
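When the merge is driven from a script, the command can be assembled as an argument list. The sketch below uses only the flags shown above; `build_merge_command` and the file names `data/first.csv`, `data/second.csv`, and `merged` are hypothetical placeholders, not paths from the project.

```python
import shlex
from pathlib import Path


def build_merge_command(first_src: str, second_src: str, dest: str) -> list[str]:
    """Hypothetical helper: build the argument list for the merge command."""
    return [
        "tesci", "similarity", "merge",
        "--first-src", Path(first_src).as_posix(),
        "--second-src", Path(second_src).as_posix(),
        "--dest", Path(dest).as_posix(),
    ]


# Placeholder relative paths, chosen only for illustration.
argv = build_merge_command("data/first.csv", "data/second.csv", "merged")
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run(argv, check=True)` would then run the merge, assuming `tesci` is installed and on the `PATH`.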
## License
Licensed under either of Apache License, Version 2.0 or MIT license.
Raw data
{
"_id": null,
"home_page": "",
"name": "tesci",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "",
"keywords": "scientific mapping,merging data sources",
"author": "",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/d4/1d/eb57e2fe7f81ea5d79f8c3bfe9243f7e4032165546f79627d146df163ca5/tesci-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "A toolkit to aid in scientific mapping",
"version": "1.0.0",
"project_urls": null,
"split_keywords": [
"scientific mapping",
"merging data sources"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f66309b2c3bc9d24dc253d5ae61fdf95712428461f5ee5c181a3928fedea8f22",
"md5": "dc506ceb56fed0a9185a52cce5a4ef93",
"sha256": "4808f5dbb711cb83839123f85c23ea737754add73b3ab1f3892f6a0ed6ae6ddc"
},
"downloads": -1,
"filename": "tesci-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "dc506ceb56fed0a9185a52cce5a4ef93",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 20692,
"upload_time": "2024-03-14T19:47:33",
"upload_time_iso_8601": "2024-03-14T19:47:33.858801Z",
"url": "https://files.pythonhosted.org/packages/f6/63/09b2c3bc9d24dc253d5ae61fdf95712428461f5ee5c181a3928fedea8f22/tesci-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d41deb57e2fe7f81ea5d79f8c3bfe9243f7e4032165546f79627d146df163ca5",
"md5": "cfaddbdc198fb35fb3d5406386964615",
"sha256": "3b55e6f9a6f355d05bf3e3de6cb5382333b5db1652ffac4fc0b0fa1bbf5592a5"
},
"downloads": -1,
"filename": "tesci-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "cfaddbdc198fb35fb3d5406386964615",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 18158,
"upload_time": "2024-03-14T19:47:35",
"upload_time_iso_8601": "2024-03-14T19:47:35.347556Z",
"url": "https://files.pythonhosted.org/packages/d4/1d/eb57e2fe7f81ea5d79f8c3bfe9243f7e4032165546f79627d146df163ca5/tesci-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-14 19:47:35",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "tesci"
}