# Nullval
This repository contains a package that implements several mathematical approaches based on different numerical techniques.
Under construction! Not ready for use yet! Currently experimenting and planning!
Developed by Mukul namagiri
+ This repository contains different methods for the treatment of null values and outliers, using various numerical techniques to find ideal replacement values in your dataframe
## Accepted format
+ This module takes **xml, json, csv and excel** files and pandas dataframes as input
+ Automatically identifies the locations of null values and outliers
+ Suggests ideal values for data imputation
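
The format handling listed above could be sketched roughly as follows. This is only an illustration that assumes the input format can be inferred from the file extension; `load_any` and `READERS` are hypothetical names, not the package's `loader` API, and the sketch relies only on the standard pandas readers.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to a standard pandas reader.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xml": pd.read_xml,     # needs pandas >= 1.3 and an XML parser such as lxml
    ".xlsx": pd.read_excel,  # needs an Excel engine such as openpyxl
    ".xls": pd.read_excel,
}


def load_any(source):
    """Return a DataFrame from a file path, or pass an existing DataFrame through."""
    if isinstance(source, pd.DataFrame):
        return source
    suffix = Path(source).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"unsupported format: {suffix!r}")
    return READERS[suffix](source)
```

Dispatching on the extension keeps the sketch short; a real loader may also need to inspect the file content when the extension is missing or misleading.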
## Directory structure of the repository
```
nullvalue/
│
├── .gitignore
│
├── nullval/
│   ├── __init__.py
│   ├── cubic_spline_interpolation.py
│   ├── linear_interpolation.py
│   ├── loader.py
│   ├── polynomial_interpolation.py
│   ├── splines_interpolation.py
│   ├── trigonometric_interpolation.py
│   └── auto.py
│
├── tests/
│   ├── init.py
│   ├── test_lagrange_interpolation.py
│   ├── test_linear_interpolation.py
│   ├── test_polynomial_interpolation.py
│   ├── test_spline_interpolation.py
│   └── test_trigonometric_interpolation.py
│
├── api_reference.md
│
├── pyproject.toml
│
├── README.rst
│
└── README.md
```
## Requirements for the package
These are already declared in `pyproject.toml`, but they are listed here in case you need to install them manually:
```
pandas==1.3.3
numpy==1.21.4
tqdm
scikit-learn==0.24.2
seaborn==0.11.2
matplotlib==3.5.1
statsmodels==0.13.0
tensorflow==2.8.0
plotly==5.5.0
```
## Installation
```
pip install nullval
```
# Usage guide
**loader loads and formats the data and auto finds the ideal solution**
## Step - 1
```python
from nullval import loader
path = "<enter the default path according to the environment>"
# converts to dataframe
data = loader.auto(path)
# returns the index of the nulls and the outliers
loader.nulls_and_outs(data)
```
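
To give an idea of what locating nulls and outliers can involve, here is a minimal pandas sketch. It is an assumption-laden illustration, not the implementation behind `loader.nulls_and_outs`: the function name `find_nulls_and_outliers` is hypothetical, and the 1.5 × IQR rule is just one common outlier criterion.

```python
import pandas as pd


def find_nulls_and_outliers(df, k=1.5):
    """Report, per numeric column, the index labels of nulls and IQR outliers."""
    report = {}
    for col in df.select_dtypes(include="number").columns:
        s = df[col]
        q1, q3 = s.quantile(0.25), s.quantile(0.75)  # quantiles skip NaN
        iqr = q3 - q1
        # Comparisons with NaN are False, so nulls are never flagged as outliers.
        outlier_mask = (s < q1 - k * iqr) | (s > q3 + k * iqr)
        report[col] = {
            "nulls": s.index[s.isna()].tolist(),
            "outliers": s.index[outlier_mask].tolist(),
        }
    return report


prices = pd.DataFrame({"price": [10.0, 11.0, None, 12.0, 11.5, 250.0, 10.5]})
print(find_nulls_and_outliers(prices))
```

In this toy frame the missing entry sits at index 2 and the value 250.0 at index 5 falls outside the IQR fences.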
# Advantages and disadvantages of each method
### Linear interpolation
#### Advantages
+ Easy to implement, with low computational requirements
+ Quick to compute and effective for larger data sets with many missing values
+ Offers more local control, is less sensitive to outliers, works well with noisy data, and handles discontinuous data well
#### Disadvantages
> Not good for complex patterns; produces sharp corners at the data points; performs poorly on smooth functions; unsuitable when higher-order derivatives are required
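
As a rough illustration of linear interpolation applied to missing values, the sketch below fills `NaN` entries with `numpy.interp`. The helper name `fill_linear` is hypothetical, and the sketch assumes the values are equally spaced in their index.

```python
import numpy as np


def fill_linear(values):
    """Fill NaN entries by linear interpolation between the nearest known neighbours."""
    y = np.asarray(values, dtype=float)
    idx = np.arange(len(y))
    known = ~np.isnan(y)
    filled = y.copy()
    # np.interp evaluates the piecewise-linear function through the known points.
    filled[~known] = np.interp(idx[~known], idx[known], y[known])
    return filled


print(fill_linear([1.0, np.nan, 3.0, np.nan, np.nan, 6.0]))
```

Note that `numpy.interp` does not extrapolate: leading or trailing gaps are filled with the nearest known value.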
### Lagrange interpolation
#### Advantages
+ Straightforward to construct; the polynomial passes through every data point
+ Works for both equidistant and non-equidistant points; no need to solve linear systems
#### Disadvantages
> **Runge's phenomenon** for high-degree polynomials, especially on equally spaced points: oscillations occur at the edges of the interval, leading to poor approximation.
> Higher computational cost, higher storage requirements, and a poor fit for dynamic datasets, since adding a point means recomputing the basis polynomials.
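
The following sketch shows the Lagrange form and hints at Runge's phenomenon. It uses plain Python; the function names are hypothetical and the test function 1 / (1 + 25x²) is the classic textbook example, chosen here as an assumption rather than taken from the package.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)  # i-th basis polynomial, built factor by factor
        total += yi * basis
    return total


def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)


# 11 equally spaced nodes on [-1, 1] give a degree-10 polynomial.
nodes = [-1.0 + 0.2 * i for i in range(11)]
values = [runge(x) for x in nodes]

center_error = abs(lagrange_interpolate(nodes, values, 0.1) - runge(0.1))
edge_error = abs(lagrange_interpolate(nodes, values, 0.95) - runge(0.95))
print(f"error near the center: {center_error:.4f}")
print(f"error near the edge:   {edge_error:.4f}")
```

The error near the edge of the interval comes out much larger than the error near the center, which is the oscillation described above.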
### Splines interpolation
#### Advantages
+ Gives more local control by breaking the domain into smaller segments, which allows more precise interpolation
+ Smoother interpolation with reduced oscillations; the result is piecewise defined, continuous and differentiable
#### Disadvantages
> More computational effort and higher memory usage; appropriate boundaries are hard to choose; could lead to overfitting; unreliable beyond the range of the data
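
A natural cubic spline can be sketched with NumPy alone, as below. This is an illustrative version under stated assumptions (strictly increasing `x`, natural boundary conditions with zero second derivative at both ends, a dense solve instead of a tridiagonal one); `natural_cubic_spline` is a hypothetical name and not necessarily how `cubic_spline_interpolation.py` works.

```python
import numpy as np


def natural_cubic_spline(x, y, x_new):
    """Evaluate the natural cubic spline through (x, y) at the points x_new."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    n = len(x) - 1
    h = np.diff(x)
    # Linear system for the second derivatives m at the knots (m[0] = m[n] = 0).
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0
    for i in range(1, n):
        A[i, i - 1], A[i, i], A[i, i + 1] = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    m = np.linalg.solve(A, rhs)
    # Pick the segment containing each query point, then evaluate its cubic.
    k = np.clip(np.searchsorted(x, x_new) - 1, 0, n - 1)
    left, right, hk = x_new - x[k], x[k + 1] - x_new, h[k]
    return ((m[k] * right**3 + m[k + 1] * left**3) / (6.0 * hk)
            + (y[k] / hk - m[k] * hk / 6.0) * right
            + (y[k + 1] / hk - m[k + 1] * hk / 6.0) * left)


x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, np.nan, 0.1, -0.8, -1.0])
known = ~np.isnan(y)
y_filled = y.copy()
y_filled[~known] = natural_cubic_spline(x[known], y[known], x[~known])
print(y_filled)
```

Query points outside the knot range are evaluated with the first or last segment, which is exactly the kind of extrapolation that should be treated with caution.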
### Polynomial interpolation
#### Advantages
+ Gives an exact fit through the data points and provides an analytical expression for further theoretical analysis
+ Allows flexibility in choosing the polynomial basis
#### Disadvantages
> Same as those of Lagrange interpolation
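
To make the "analytical expression" point concrete, this small sketch solves for the coefficients in the monomial basis using a Vandermonde matrix. The three data points are made up for illustration, and the monomial basis is only one possible choice.

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 11.0])

# Solve V @ c = y, where V[i, j] = x[i] ** j (monomial basis).
V = np.vander(x, increasing=True)
coeffs = np.linalg.solve(V, y)  # c0 + c1*x + c2*x**2

print(coeffs)                   # approximately [ 1. -1.  3.]
print(P.polyval(1.5, coeffs))   # approximately 6.25
```

The resulting polynomial is 1 - x + 3x², which can be differentiated or integrated symbolically. For many points the Vandermonde system becomes ill-conditioned, which is one reason other bases are often preferred.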
### Trigonometric interpolation
#### Advantages
+ The most natural fit for periodic data; captures harmonics well and gives high precision for smooth functions
+ Avoids Runge's phenomenon; fast to compute with the FFT and basis functions
#### Disadvantages
> Problems with non-periodic data, boundary effects at discontinuities, and a global nature (every sample affects the whole interpolant)
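
A minimal FFT-based sketch of trigonometric interpolation is given below. It assumes an odd number of complete, equally spaced samples over one period starting at 0; `trig_interpolate` is a hypothetical name, and handling of missing or unevenly spaced samples is deliberately left out.

```python
import numpy as np


def trig_interpolate(samples, t_new, period=2.0 * np.pi):
    """Evaluate the trigonometric interpolant of equally spaced periodic samples."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of samples")
    coeffs = np.fft.fft(samples) / n   # Fourier coefficients c_k
    k = np.fft.fftfreq(n, d=1.0 / n)   # integer frequencies -(n-1)/2 .. (n-1)/2
    t_new = np.atleast_1d(np.asarray(t_new, dtype=float))
    # Sum c_k * exp(2*pi*i*k*t/period) over all frequencies for every query point.
    phase = np.exp(2j * np.pi * np.outer(t_new, k) / period)
    return (phase @ coeffs).real


n = 9
t = 2.0 * np.pi * np.arange(n) / n
samples = np.sin(t) + 0.5 * np.cos(2.0 * t)
estimate = trig_interpolate(samples, 0.3)
print(estimate[0], np.sin(0.3) + 0.5 * np.cos(0.6))
```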
Raw data
{
"_id": null,
"home_page": "https://github.com/Mukullight/nullval",
"name": "nullval",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0.0,>=3.9.19",
"maintainer_email": null,
"keywords": "finance, dataloader, outliers_finder, Nullvalue_finder, ",
"author": "Mukul namagiri",
"author_email": "mukulnamagiri1@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/36/4a/35ee261bb544940bd10155afd4fb7552d13804511238c4bdb5ef95820cd2/nullval-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A package for the treatment of nullvalues and outliers in your data set using various mathematical approaches ",
"version": "0.0.2",
"project_urls": {
"Homepage": "https://github.com/Mukullight/nullval",
"Repository": "https://github.com/Mukullight/nullval"
},
"split_keywords": [
"finance",
" dataloader",
" outliers_finder",
" nullvalue_finder",
" "
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f9bb3445dc3568fc135eb19d0d3aa52316148b5304a97ddbf03cbc362ae5500c",
"md5": "05b2f65e4fd63a5e117b5f736ef17cc9",
"sha256": "5f39ad202e3f2f8bacaea748dde1cee7225c9ed9f9ff809a471e7bbcc441eb5c"
},
"downloads": -1,
"filename": "nullval-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "05b2f65e4fd63a5e117b5f736ef17cc9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0.0,>=3.9.19",
"size": 13338,
"upload_time": "2024-07-23T10:42:03",
"upload_time_iso_8601": "2024-07-23T10:42:03.492172Z",
"url": "https://files.pythonhosted.org/packages/f9/bb/3445dc3568fc135eb19d0d3aa52316148b5304a97ddbf03cbc362ae5500c/nullval-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "364a35ee261bb544940bd10155afd4fb7552d13804511238c4bdb5ef95820cd2",
"md5": "123f8632377fd8f7c11615cd9b7a1fac",
"sha256": "8f77a446b072af6ebf054d01ae4a529d7621282193624e86fb9f08bf0622ca5c"
},
"downloads": -1,
"filename": "nullval-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "123f8632377fd8f7c11615cd9b7a1fac",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0.0,>=3.9.19",
"size": 9640,
"upload_time": "2024-07-23T10:42:05",
"upload_time_iso_8601": "2024-07-23T10:42:05.569884Z",
"url": "https://files.pythonhosted.org/packages/36/4a/35ee261bb544940bd10155afd4fb7552d13804511238c4bdb5ef95820cd2/nullval-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-07-23 10:42:05",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Mukullight",
"github_project": "nullval",
"github_not_found": true,
"lcname": "nullval"
}