tesci


Name: tesci
Version: 1.0.0
Summary: A toolkit to aid in scientific mapping
Upload time: 2024-03-14 19:47:35
Requires Python: >=3.11
Keywords: scientific mapping, merging data sources
# tesci

## An interactive toolkit for merging data from multiple citation databases

[![PyPI](https://img.shields.io/pypi/v/tesci.svg)](https://img.shields.io/pypi/v/tesci.svg)
[![License-1](https://img.shields.io/badge/License-Apache-blue.svg)](https://img.shields.io/badge/License-Apache-blue.svg)
[![License-2](https://img.shields.io/badge/License-MIT-green.svg)](https://img.shields.io/badge/License-MIT-green.svg)
[![Python-Versions](https://img.shields.io/pypi/pyversions/tesci.svg)](https://img.shields.io/pypi/pyversions/tesci.svg)

## Overview

`TeslaSCIToolkit` (abbrev. `tesci`) is a scientific mapping tool that comes with the following features:

 1. Merging data from multiple citation databases
 2. Restricting access to sensitive columns in data sources with aggregations
 3. Exporting transformed data into other repositories
 4. CI/CD integration, currently GitHub Actions

For examples and use cases, see the [examples](examples/) directory.

## Quickstart

### Aggregating data from a single database source

To aggregate [`simple.csv`](examples/simple/simple.csv) into the average `salary` and the average `age`, use either of the following approaches.

#### 1. Interactive approach

```properties
tesci start -d simple.csv -o exported.csv
tesci aggregate avg -c salary -a avg_salary
tesci aggregate avg -c age -a avg_age
tesci apply
```

#### 2. Configuration approach

```yml
aggregate:
  - alias: avg_salary
    column: salary
    function: avg
  - alias: avg_age
    column: age
    function: avg
data:
  dest: exported.csv
  src: simple.csv
```

The result is a transformation from `simple.csv` to `exported.csv`:

| | | |
|--|--|--|
|<table> <tr><th>id</th><th>name</th><th>email</th><th>phone-number</th><th>age</th><th>salary</th></tr><tr><td>1</td><td>John Doe</td><td>john@mail.com</td><td>1234567890</td><td>33</td><td>100000</td></tr><tr><td>2</td><td>Jane Doe</td><td>jane@mail.com</td><td>0987654321</td><td>44</td><td>200000</td></tr><tr><td>3</td><td>John Smith</td><td>smith@mail.com</td><td>1234509876</td><td>55</td><td>300000</td></tr><tr><td>4</td><td>Jane Williams</td><td>jwilliams@mail.com</td><td>1234509876</td><td>31</td><td>98000</td></tr><tr><td>5</td><td>Jack Miller</td><td></td><td>1234509876</td><td>33</td><td>79000</td></tr> </table>|&rarr;|<table> <tr><th>avg_salary</th><th>avg_age</th></tr><tr><td>155400.0</td><td>39.2</td></tr> </table>|
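
To make the transformation concrete, the following is a minimal Python sketch that reproduces the numbers in the table using only the standard library. It is not how `tesci` is implemented. The CSV text is reconstructed from the table above, `AGGREGATIONS` mirrors the `aggregate` section of the configuration, and the `FUNCTIONS` lookup and the `aggregate` helper are hypothetical names introduced for illustration.

```python
import csv
import io

# Contents of simple.csv, reconstructed from the table above
# (the real file may differ in formatting; Jack Miller has no email).
SIMPLE_CSV = """id,name,email,phone-number,age,salary
1,John Doe,john@mail.com,1234567890,33,100000
2,Jane Doe,jane@mail.com,0987654321,44,200000
3,John Smith,smith@mail.com,1234509876,55,300000
4,Jane Williams,jwilliams@mail.com,1234509876,31,98000
5,Jack Miller,,1234509876,33,79000
"""

# Mirrors the `aggregate` section of the YAML configuration.
AGGREGATIONS = [
    {"alias": "avg_salary", "column": "salary", "function": "avg"},
    {"alias": "avg_age", "column": "age", "function": "avg"},
]

# Hypothetical function table; `avg` is the only function the README shows.
FUNCTIONS = {"avg": lambda values: sum(values) / len(values)}


def aggregate(rows, aggregations):
    """Reduce all rows to a single row holding one value per alias."""
    result = {}
    for spec in aggregations:
        values = [float(row[spec["column"]]) for row in rows]
        result[spec["alias"]] = FUNCTIONS[spec["function"]](values)
    return result


rows = list(csv.DictReader(io.StringIO(SIMPLE_CSV)))
summary = aggregate(rows, AGGREGATIONS)
print(summary)  # {'avg_salary': 155400.0, 'avg_age': 39.2}
```

Only the aliased aggregates appear in the output, so row-level columns such as `email` and `phone-number` never reach the exported data.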

### Merging data from multiple citation databases

After retrieving data sources from the citation databases of your choice, place them in a directory of your choice.
Then, specify the configuration used for merging. For an example, see [examples/bibliometric-study/config.yml](examples/bibliometric-study/config.yml).

After specifying your configuration choices, the merge can then be run with:

`tesci similarity merge --first-src PATH --second-src PATH --dest DIR`

where each `PATH` is a relative filesystem path to a data source and `DIR` is a relative path to the destination directory.
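
This README does not describe how `tesci` decides that two records match. As a rough illustration of the general idea behind a similarity-based merge, the sketch below folds records from two sources together when their titles are nearly identical. Everything in it is an assumption made for illustration: the `title` field, the `0.9` threshold, the helper names, the made-up sample records, and the use of `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(title.lower().split())


def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means the normalized titles are identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def merge(first, second, threshold=0.9):
    """Union two record lists, folding near-duplicate titles into one record."""
    merged = [dict(record) for record in first]
    for candidate in second:
        match = next(
            (r for r in merged
             if similarity(r["title"], candidate["title"]) >= threshold),
            None,
        )
        if match is None:
            merged.append(dict(candidate))
        else:
            # Keep existing values and fill in fields the first source lacked.
            for key, value in candidate.items():
                match.setdefault(key, value)
    return merged


# Made-up sample records, not taken from any real citation database.
first = [{"title": "An Example Study of Widgets", "year": 2019}]
second = [
    {"title": "an example  study of widgets", "source": "second-db"},
    {"title": "A Completely Different Paper", "year": 2021},
]
records = merge(first, second)
print(len(records))  # 2
```

The first record from `second` differs from its counterpart only in case and spacing, so it is folded into the existing record and contributes its `source` field. The other record is appended unchanged.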

## License

Licensed under either of Apache License, Version 2.0 or MIT license.

            
