difi

Name: difi
Version: 1.1
Home page: https://github.com/moeyensj/difi
Summary: Did I Find It?
Upload time: 2021-05-12 17:16:11
Docs URL: None
Author: Joachim Moeyens
Requires Python: >=3.7
License: BSD 3-Clause License
Keywords: astronomy, astrophysics, space, science, asteroids, comets, solar system
Requirements: No requirements were recorded.

# difi
Did I Find It?  
[![Build Status](https://dev.azure.com/moeyensj/difi/_apis/build/status/moeyensj.difi?branchName=master)](https://dev.azure.com/moeyensj/difi/_build/latest?definitionId=1&branchName=master)
[![Build Status](https://travis-ci.com/moeyensj/difi.svg?branch=master)](https://travis-ci.com/moeyensj/difi)
[![Coverage Status](https://coveralls.io/repos/github/moeyensj/difi/badge.svg?branch=master)](https://coveralls.io/github/moeyensj/difi?branch=master)
[![Docker Pulls](https://img.shields.io/docker/pulls/moeyensj/difi)](https://hub.docker.com/r/moeyensj/difi)  
[![Python 3.7+](https://img.shields.io/badge/Python-3.7%2B-blue)](https://img.shields.io/badge/Python-3.7%2B-blue)
[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![DOI](https://zenodo.org/badge/152989392.svg)](https://zenodo.org/badge/latestdoi/152989392)

## About
`difi` is a simple package that takes pre-formatted linkage information from software such as [MOPS](https://github.com/lsst/mops_daymops), [pytrax](https://github.com/pytrax/pytrax), or [THOR](https://github.com/moeyensj/thor) and analyzes which objects have been found given a set of known labels (or truths). A key performance criterion is speed: `difi` avoids Python for loops and instead relies on `pandas.DataFrame` manipulation. 

## Installation

The following installation paths are available:  
[Anaconda](#Anaconda)  
[PyPi](#PyPi)  
[Docker](#Docker)  
[Source](#Source)  

### Anaconda
`difi` can be installed directly from Anaconda:  
```conda install -c moeyensj difi```

Or, if preferred, installed into its own environment via:  
```conda create -n difi_py38 -c moeyensj difi python=3.8```

### PyPi
`difi` is also available from the Python package index:  
```pip install difi```

### Docker

A Docker container with the latest version of the code can be pulled using:  
```docker pull moeyensj/difi:latest```

To run the container:  
```docker run -it moeyensj/difi:latest```

The difi code is located in the `/projects` directory, and is by default also installed in the container's Python installation. 

### Source
Clone this repository using either `ssh` or `https`. Once cloned and downloaded, `cd` into the repository. 

To install difi in its own `conda` environment:  
```conda create -n difi_py38 -c defaults -c conda-forge --file requirements.txt python=3.8```  

Or, to install difi in a pre-existing `conda` environment called `difi_py38`:  
```conda activate difi_py38```  
```conda install -c defaults -c conda-forge --file requirements.txt```  

Or, to install pre-requisite software using `pip`:  
```pip install -r requirements.txt```

Once the pre-requisites have been installed using any one of the three options above, then:  
```python setup.py install```

Or, if you would like to make an editable install then:  
```python setup.py develop```

You should now be able to start Python and import difi. 

## Example

The example below can be found in greater detail in this [Jupyter Notebook](https://github.com/moeyensj/difi/tree/master/examples/tutorial.ipynb).

### Assumed Inputs
`difi` is designed to analyze a set of linkages made by external software where some of the underlying true linkages are known. It needs just two DataFrames:
- 1) a DataFrame containing observations, with a column for observation ID and a column for the underlying truth (don't worry! -- `difi` can handle false positives and unknown truths as well)  

![observations](docs/images/observations_noclasses.png "Observations")

- 2) a DataFrame describing the linkages that were found in the observations by the external software. This DataFrame needs just two columns, one with the linkage ID and the other with the observation IDs that form that linkage  

![linkage_members](docs/images/linkage_members.png "linkage_members")
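
As a rough illustration of the two inputs described above, here is a minimal sketch built with plain `pandas`. The column names (`obs_id`, `truth`, `linkage_id`), the long one-row-per-member layout, and all values are assumptions made for this example; consult the tutorial notebook for the names `difi` expects.

```python
import pandas as pd

# Observations: one row per observation, with its ID and the underlying truth.
# Column names and values are hypothetical, chosen only for illustration.
observations = pd.DataFrame({
    "obs_id": ["obs_01", "obs_02", "obs_03", "obs_04", "obs_05", "obs_06"],
    "truth": ["asteroid_a", "asteroid_a", "asteroid_a",
              "asteroid_b", "asteroid_b", "noise"],
})

# Linkage members: assumed here to hold one row per (linkage, observation)
# pair, so a linkage made of three observations occupies three rows.
linkage_members = pd.DataFrame({
    "linkage_id": ["link_1", "link_1", "link_1", "link_2", "link_2"],
    "obs_id": ["obs_01", "obs_02", "obs_03", "obs_04", "obs_06"],
})

# Sanity check: every observation referenced by a linkage exists in observations.
assert linkage_members["obs_id"].isin(observations["obs_id"]).all()
```

Keeping the linkage table in this long form means it can be joined to the observations on the observation ID column without any Python-level loops.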

### What Can I Find? 
In most cases, the user can determine which known truths in their observations DataFrame their linking algorithm should be able to find. `difi` provides two simple findability metrics: 

The 'min_obs' metric: any object with at least a minimum number of observations is considered findable.  
![analyzeObservations](docs/images/cifi_min_obs.png "min_obs")
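
The idea behind a minimum-observations criterion can be sketched in a few lines of plain `pandas`. This is not `difi`'s implementation; the column names, the threshold value, and the data are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical observations: truth "a" has 5 observations, "b" has 2,
# and "noise" has 1.
observations = pd.DataFrame({
    "obs_id": [f"obs_{i:02d}" for i in range(1, 9)],
    "truth": ["a", "a", "a", "a", "a", "b", "b", "noise"],
})

min_obs = 5  # hypothetical threshold

# Count unique observations per truth with a groupby instead of a Python loop.
num_obs = observations.groupby("truth")["obs_id"].nunique()

# Truths that meet or exceed the threshold are considered findable.
findable = num_obs[num_obs >= min_obs].index.tolist()
print(findable)  # ['a']
```

In practice, labels that stand for false positives or unknown truths would likely be excluded before counting; here they simply fall below the threshold.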

The 'nightly_linkages' metric: any object with within-night linkages (tracklets) on at least a minimum number of nights is considered findable.  
![analyzeObservations](docs/images/cifi_nightly_linkages.png "nightly_linkages")

Which objects are findable?  
![all_truths](docs/images/cifi_all_truths.png "all_truths")

What observations made each object findable?  
![findable_observations](docs/images/cifi_findable_observations.png "findable_observations")

A summary of what kinds of objects are findable might be useful.  
![summary](docs/images/cifi_summary_min_obs.png "summary")

### Did I Find It? 
Now let's see what the external linking software actually found. 

![analyzeLinkages](docs/images/difi.png "analyzeLinkages.png")

`difi` distinguishes three types of linkages:
- 'pure': all observations in a linkage belong to a single truth
- 'partial': up to a certain percentage of observations may belong to other truths, so long as one truth has at least the minimum required number of unique observations
- 'mixed': a linkage containing observations that belong to different truths. We avoid using the word 'false' for these linkages as they may contain unknown truths depending on the use case. We leave interpretation up to the user. 
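
The three definitions above can be expressed as a loop-free `pandas` computation. The sketch below is an illustration only, not `difi`'s actual code: the column names, the thresholds `min_obs` and `contamination_percentage`, and the sample data are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: obs_01..obs_06 belong to truth "a", obs_07..obs_09 to "b",
# and obs_10..obs_12 to "c".
observations = pd.DataFrame({
    "obs_id": [f"obs_{i:02d}" for i in range(1, 13)],
    "truth": ["a"] * 6 + ["b"] * 3 + ["c"] * 3,
})
linkage_members = pd.DataFrame({
    "linkage_id": ["pure_1"] * 3 + ["partial_1"] * 6 + ["mixed_1"] * 4,
    "obs_id": [
        "obs_01", "obs_02", "obs_03",                                # three "a"
        "obs_01", "obs_02", "obs_03", "obs_04", "obs_05", "obs_07",  # five "a", one "b"
        "obs_06", "obs_07", "obs_10", "obs_11",                      # "a", "b", "c", "c"
    ],
})

min_obs = 5                      # hypothetical minimum for the dominant truth
contamination_percentage = 20.0  # hypothetical maximum share of other truths

# Attach the truth of every member observation to its linkage.
members = linkage_members.merge(observations, on="obs_id", how="left")

# Number of observations per (linkage, truth) pair.
counts = (
    members.groupby(["linkage_id", "truth"])
    .size()
    .rename("num_obs")
    .reset_index()
)

# Per linkage: total size, number of distinct truths, size of the dominant truth.
summary = counts.groupby("linkage_id").agg(
    total_obs=("num_obs", "sum"),
    num_truths=("truth", "nunique"),
    dominant_obs=("num_obs", "max"),
)
summary["contamination"] = 100.0 * (1.0 - summary["dominant_obs"] / summary["total_obs"])

is_pure = summary["num_truths"] == 1
is_partial = (
    ~is_pure
    & (summary["contamination"] <= contamination_percentage)
    & (summary["dominant_obs"] >= min_obs)
)
# Anything that is neither pure nor partial is labelled mixed.
summary["linkage_type"] = np.select(
    [is_pure, is_partial], ["pure", "partial"], default="mixed"
)
print(summary[["total_obs", "contamination", "linkage_type"]])
```

Working from per-pair counts keeps the whole classification inside `groupby` and boolean masks, which is the kind of DataFrame manipulation the package favors over explicit loops.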

Thanks to the power of `pandas` it can be super easy to isolate the different linkage types and analyze them separately.
Selecting 'pure' linkages:

![all_linkages_pure](docs/images/pure_linkages.png "pure_linkages")

Selecting 'partial' linkages:

![all_linkages_partial](docs/images/partial_linkages.png "all_linkages_partial")

Selecting 'mixed' linkages:

![all_linkages_mixed](docs/images/mixed_linkages.png "all_linkages_mixed")


Understanding the specifics behind each linkage is one thing, but how did the linking algorithm perform on an object-by-object basis? 
![allTruths](docs/images/difi_all_truths.png "all_truths")
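
An object-by-object view can be derived from per-linkage results with a `groupby`. The following is a hedged sketch, not `difi`'s output format: the table layout, the column names, and the choice to count an object as found when it has at least one pure or partial linkage are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical per-linkage results: each linkage's type and the truth it is
# attributed to (None for mixed linkages).
all_linkages = pd.DataFrame({
    "linkage_id": ["l1", "l2", "l3", "l4"],
    "linkage_type": ["pure", "pure", "partial", "mixed"],
    "linked_truth": ["a", "a", "b", None],
})

# Hypothetical set of objects that were deemed findable.
findable_truths = pd.Index(["a", "b", "c"], name="truth")

# Keep only linkages that can be attributed to a single truth.
attributed = all_linkages[all_linkages["linkage_type"].isin(["pure", "partial"])]

# Count pure and partial linkages per truth; findable truths with no
# linkages get a row of zeros.
per_truth = (
    attributed.groupby(["linked_truth", "linkage_type"])
    .size()
    .unstack(fill_value=0)
    .reindex(findable_truths, fill_value=0)
)
per_truth["found"] = per_truth.sum(axis=1) > 0

# Completeness: percentage of findable objects that were found.
completeness = 100.0 * per_truth["found"].mean()
print(per_truth)
print(f"completeness: {completeness:.1f}%")
```

Reindexing against the findable objects is what makes missed objects visible: they appear with zero linkages rather than silently dropping out of the summary.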

### Tutorial
A detailed tutorial on `difi` functionality can be found [here](https://github.com/moeyensj/difi/tree/master/examples).



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/moeyensj/difi",
    "name": "difi",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "astronomy,astrophysics,space,science,asteroids,comets,solar system",
    "author": "Joachim Moeyens",
    "author_email": "moeyensj@uw.edu",
    "download_url": "https://files.pythonhosted.org/packages/1f/81/9153c501ec872b150af0c12c254be655713824a2c5dfc5c959f04b3c67ca/difi-1.1.tar.gz",
    "platform": "",
    "bugtrack_url": null,
    "license": "BSD 3-Clause License",
    "summary": "Did I Find It?",
    "version": "1.1",
    "project_urls": {
        "Homepage": "https://github.com/moeyensj/difi"
    },
    "split_keywords": [
        "astronomy",
        "astrophysics",
        "space",
        "science",
        "asteroids",
        "comets",
        "solar system"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3b5ca66ca5332f76a623525372fa40815ea04d908146b85402fe0f86bc501d25",
                "md5": "79ea61df83b130764d3a308d9a03c932",
                "sha256": "7d391c9cf7cb33eaf14c0b06051ca7e8af65c1d9a5cea320165aa173447cc0d9"
            },
            "downloads": -1,
            "filename": "difi-1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "79ea61df83b130764d3a308d9a03c932",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 18574,
            "upload_time": "2021-05-12T17:16:10",
            "upload_time_iso_8601": "2021-05-12T17:16:10.721051Z",
            "url": "https://files.pythonhosted.org/packages/3b/5c/a66ca5332f76a623525372fa40815ea04d908146b85402fe0f86bc501d25/difi-1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1f819153c501ec872b150af0c12c254be655713824a2c5dfc5c959f04b3c67ca",
                "md5": "1d499ad91591f72253e83acfa44b11bc",
                "sha256": "95bfc0bbb506b5eb00b7267f5d63b37636b1bdd869a69f29d900f0f7cbcef047"
            },
            "downloads": -1,
            "filename": "difi-1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "1d499ad91591f72253e83acfa44b11bc",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 4749672,
            "upload_time": "2021-05-12T17:16:11",
            "upload_time_iso_8601": "2021-05-12T17:16:11.707933Z",
            "url": "https://files.pythonhosted.org/packages/1f/81/9153c501ec872b150af0c12c254be655713824a2c5dfc5c959f04b3c67ca/difi-1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2021-05-12 17:16:11",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "moeyensj",
    "github_project": "difi",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [],
    "lcname": "difi"
}
        