lppinv


Namelppinv JSON
Version 0.4.2 PyPI version JSON
download
home_pagehttps://github.com/econcz/lppinv
SummaryA non-iterated general implementation of the LPLS estimator for cOLS, TM, and custom cases
upload_time2024-02-17 18:08:12
maintainer
docs_urlNone
authoreconcz
requires_python
licenseMIT
keywords estimator linear programming least squares ols constrained in values transaction matrix custom pseudoinverse singular value decomposition numpy scipy statsmodels
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # A non-iterated general implementation of the LPLS estimator for cOLS, TM, and custom cases

The program implements the **LPLS** (linear programming through least squares) estimator with the help of the Moore-Penrose inverse (pseudoinverse), calculated using *singular value decomposition (SVD)*, with emphasis on the estimation of OLS constrained in values (**cOLS**), Transaction Matrix (**TM**), and **custom** (user-defined) cases. The pseudoinverse offers a
unique minimum-norm least-squares solution, which is the best linear unbiased estimator (BLUE); see Albert (1972, Chapter VI). (Over)determined problems are accompanied by *regression* analysis, which is feasible in their case. For such and especially all remaining cases, a Monte Carlo-based
*ttest* of mean **NRMSE** (normalized by the standard deviation of the RHS) is performed, the sample being drawn from a uniform or user-provided distribution (via a *Python function*).

OLS constrained in values (**cOLS**) is an estimation based on constraints in the model and/or data but not in parameters. Typically, such models are of size ≤ **kN**, where **N** is the number of observations, since the number of their constraints may vary in the LHS (e.g., level, derivatives, etc.).

**Example of a cOLS problem:**  
*Estimate the trend and the cyclical component of a country's GDP given the textbook or any other definition of its peaks, troughs, and saddles. For a pre-LPLS approach to this problem, see (Bolotov, 2014).*

Transaction Matrix (**TM**) of size (**M x N**) is a formal model of interaction (allocation, assignment, etc.) between **M** and **N** elements in any imaginable system, such as intercompany transactions (netting tables), industries within/between economies (input-output tables), cross-border trade/investment (trade/investment matrices), etc., where **row** and **column sums** are known, but **individual elements** of the TM may not be:  

- a netting table is a type of **TM** where **M = N** and the elements are subsidiaries of a MNC;
- an input-output table (IOT) is a type of **TM** where **M = N** and the elements are industries;
- a matrix of trade/investment is a type of **TM** where **M = N** and the elements are countries or (macro)regions, where diagonal elements may be equal to zero;
- a country-product matrix is a type of **TM** where **M ≠ N** and the elements are of different types;  
...

**Example of a TM problem:**  
*Estimate the matrix of trade/investment with/without zero diagonal elements, the country shares in which are unknown. For a pre-LPLS approach to this problem, see (Bolotov, 2015).*

## Methods and formulas:
The LP problem in the **LPLS** estimator is a matrix equation **`a @ x = b`**, loosely based on the structure of the Simplex tableau, where **`a`** consists of coefficients for CONSTRAINTS, LP-type CHARACTERISTIC and/or SPECIFIC, and for SLACK/SURPLUS VARIABLES (the upper part) as well as for the MODEL (the lower part), as illustrated in Figure 1. Each part of **`a`** can be omitted to accommodate a particular case:  

- **cOLS** problems require SPECIFIC CONSTRAINTS, no LP-type CHARACTERISTIC CONSTRAINTS, and a MODEL;  
- **TM** requires LP-type CHARACTERISTIC CONSTRAINTS, no SPECIFIC CONSTRAINTS, and an optional MODEL;  
- SLACK/SURPLUS VARIABLES are included only for inequality constraints and should be set to **1** or **-1**;  
...

**Figure 1: Matrix equation `a @ x = b`**

| `a` || `b` |
| :-: |:-:| :-: |
| CONSTRAINTS: CHARACTERISTIC/SPECIFIC | SL/SU VARIABLES | CONSTRAINTS |
| MODEL || MODEL |

Source: self-prepared

The solution to the equation, **`x = pinv(a) @ b`**, is estimated with the help of SVD and is a **minimum-norm least-squares generalized solution** if the rank of **`a`** is not full. To check if **`a`** is within the computational limits, its (maximum) dimensions can be calculated using the formulas:  

- **(k \* N) x (K + K\*)** **cOLS** - without slack/surplus variables;
- **(k \* N) x (K + K\* + l)** **cOLS** - with slack/surplus variables;
- **(M + N) x (M \* N)** **TM** - without slack/surplus variables;
- **(M + N) x (M \* N + l**) **TM** - with slack/surplus variables;
- **M x N**  **custom** - without slack/surplus variables;
- **M x (N + l)** **custom** - with slack/surplus variables;

where, in **cOLS** problems, **K** is the number of independent variables in the model (including the constant), **K\*** is the number of eventual extra variables in CONSTRAINTS, and **N** is the number of observations; in **TM**, **M** and **N** are the dimensions of the matrix; and in **custom** cases, **M** and **N** or **M x (N + l)** are the dimensions of **`a`**.

## Parameters:
**Type of the LP problem**  

- `cols=False` : *bool*, OLS constrained in values (**cOLS**)  
- `tm=False` : *bool*, Transaction Matrix (**TM**), options **cols** and **tm** are mutually exclusive (not specifying any of them equals **custom**)  
- `rhs=np.empty((0,0))` : *array_like*, right-hand side of the problem (rowsums first for TM problems)

**Constructing the LHS**  

- `model=np.empty((0,0))` : *array_like*, the MODEL part of **`a`**, including an eventual user-provided constant as a matrix column in **cOLS** (see Methods and formulas)  
- `constraints=np.empty((0,0))` : *array_like*, the CONSTRAINTS part of **`a`**  
- `slackvars=np.empty((0,0))` : *array_like*, the SLACK/SURPLUS VARIABLES part of **`a`**  
- `zero_diagonal=False` : *bool*, set all the diagonal elements of **`a`** to 0  

**SVD-based estimation**  

- `tolerance=None` : *float*, scipy.linalg.lstsq's cutoff for `small' singular values; used to determine effective rank of a. Singular values smaller than cond \* largest_singular_value are considered zero
- `level=95` : *int*, confidence level (by default: **95**)

**Monte-Carlo-based t-test**  

- `seed=123456789` : *int*, random-number seed, # is any number between 0 and 2^31-1 (or 2,147,483,647) (by default: **123456789**) 
- `iterate=300` : *int*, number of iterations, # must be divisible by 50 (by default: **300**)  
- `distribution=runiform`: *fucntion*, random-variable generating function, name of an earlier declared Python object returning an *array_like* **(r x c)** with two arguments, real scalars **r** and **c** (by default: **lppinv.runiform**, see Examples on how to pass np.random.uniform to **lppinv**)  
- `mc=True` : *bool*, skip the Monte Carlo-based t-test  
- `trace=True`: *bool*, hide any output with the exception of dots

## Results:

**`class lppinv.LPpinvResult(`**  

- `OLSResults` : *statsmodels.regression.linear_model.OLSResults* or *None*, statsmodels' results class for for an OLS model  
- `TtestResult` : *(scipy.stats._result_classes.TtestResult, scipy.stats._result_classes.TtestResult, scipy.stats._result_classes.TtestResult)* or *(None, None, None)*, scipy's result of a t-test  
- `solution` : *array_like*, the solution obtained from the **LPLS** estimator  
- `a` : *array_like*, the **`a`** matrix  
- `nrmse` : *float*, root mean square error normalized by the standard deviation of **`b'**
- `r2_c` : *float* or *np.nan*, R-squared for CONSTRAINTS in TM  

**)**

## Errors:

- `LPpinvError` : *lppinv.LPpinvError*, specific Python-exception-derived object raised by lppinv functions  
- `LinAlgError` : *scipy.linalg.LinAlgError*, generic Python-exception-derived object raised by linalg functions  

## Examples:
```
import numpy  as np
import lppinv as lp

# cOLS problem
print('\n', lp.solve(
    cols=True,
    rhs=np.array([[0, 0, 0, 0, -1, 0.2, 0.9, 2.1]]).T,
    constraints=np.array([np.ones(4), [0, 1, 2, 3], [0, 5, 2, 8]]).T,
    model=np.array([np.ones(4), [0, 1, 2, 3], [0, 5, 2, 8]]).T,
    slackvars=np.array([[-1, 1, -1, 0]]).T,
    mc=False
))

# TM problem
def rnormal(r=None, c=None):
    return (np.random.normal(loc=0.0, scale=1.0, size=(r, c)))

print('\n', lp.solve(
    tm=True,
    rhs=np.array([[4, 5, 3, 4, 6], [1, 2, 0, np.nan, np.nan]]).T,
    zero_diagonal=True,
    distribution=rnormal
))

# custom problem
print('\n', lp.solve(
    rhs=np.array([[2, 3, 9], [5, 7, 9]]).T,
    model=np.vstack([[0, 1, 1], [1, 0, 1], [1, 1, 0]]),
    tolerance=10**-10,
    trace=False
))
```

## References:

1. Albert, A., 1972. *Regression And The Moore-Penrose Pseudoinverse.* New York: Academic Press.  

2. Bolotov, I. 2014. *Modeling of Time Series Cyclical Component on a Defined Set of Stationary Points and its Application on the US Business Cycle. [Paper presentation]. The 8th International Days of Statistics and Economics: Prague.* https://msed.vse.cz/msed_2014/article/348-Bolotov-Ilya-paper.pdf  

3. Bolotov, I. 2015. *Modeling Bilateral Flows in Economics by Means of Exact Mathematical Methods.* [Paper presentation]. The 9th International Days of Statistics and Economics: Prague. https://msed.vse.cz/msed_2015/article/111-Bolotov-Ilya-paper.pdf  

**PS** Please also check the Web of Science (WoS) for new research on LPLS.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/econcz/lppinv",
    "name": "lppinv",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "estimator,linear programming,least squares,OLS constrained in values,transaction matrix,custom,pseudoinverse,singular value decomposition,numpy,scipy,statsmodels",
    "author": "econcz",
    "author_email": "29724411+econcz@users.noreply.github.com",
    "download_url": "https://files.pythonhosted.org/packages/ba/72/088878855e240f7117edf1e3c258fdbd6ef4e4047bf3cb2881b7af649926/lppinv-0.4.2.tar.gz",
    "platform": null,
    "description": "# A non-iterated general implementation of the LPLS estimator for cOLS, TM, and custom cases\n\nThe program implements the **LPLS** (linear programming through least squares) estimator with the help of the Moore-Penrose inverse (pseudoinverse), calculated using *singular value decomposition (SVD)*, with emphasis on the estimation of OLS constrained in values (**cOLS**), Transaction Matrix (**TM**), and **custom** (user-defined) cases. The pseudoinverse offers a\nunique minimum-norm least-squares solution, which is the best linear unbiased estimator (BLUE); see Albert (1972, Chapter VI). (Over)determined problems are accompanied by *regression* analysis, which is feasible in their case. For such and especially all remaining cases, a Monte Carlo-based\n*ttest* of mean **NRMSE** (normalized by the standard deviation of the RHS) is performed, the sample being drawn from a uniform or user-provided distribution (via a *Python function*).\n\nOLS constrained in values (**cOLS**) is an estimation based on constraints in the model and/or data but not in parameters. Typically, such models are of size \u2264 **kN**, where **N** is the number of observations, since the number of their constraints may vary in the LHS (e.g., level, derivatives, etc.).\n\n**Example of a cOLS problem:**  \n*Estimate the trend and the cyclical component of a country's GDP given the textbook or any other definition of its peaks, troughs, and saddles. For a pre-LPLS approach to this problem, see (Bolotov, 2014).*\n\nTransaction Matrix (**TM**) of size (**M x N**) is a formal model of interaction (allocation, assignment, etc.) between **M** and **N** elements in any imaginable system, such as intercompany transactions (netting tables), industries within/between economies (input-output tables), cross-border trade/investment (trade/investment matrices), etc., where **row** and **column sums** are known, but **individual elements** of the TM may not be:  \n\n- a netting table is a type of **TM** where **M = N** and the elements are subsidiaries of a MNC;\n- an input-output table (IOT) is a type of **TM** where **M = N** and the elements are industries;\n- a matrix of trade/investment is a type of **TM** where **M = N** and the elements are countries or (macro)regions, where diagonal elements may be equal to zero;\n- a country-product matrix is a type of **TM** where **M \u2260 N** and the elements are of different types;  \n...\n\n**Example of a TM problem:**  \n*Estimate the matrix of trade/investment with/without zero diagonal elements, the country shares in which are unknown. For a pre-LPLS approach to this problem, see (Bolotov, 2015).*\n\n## Methods and formulas:\nThe LP problem in the **LPLS** estimator is a matrix equation **`a @ x = b`**, loosely based on the structure of the Simplex tableau, where **`a`** consists of coefficients for CONSTRAINTS, LP-type CHARACTERISTIC and/or SPECIFIC, and for SLACK/SURPLUS VARIABLES (the upper part) as well as for the MODEL (the lower part), as illustrated in Figure 1. Each part of **`a`** can be omitted to accommodate a particular case:  \n\n- **cOLS** problems require SPECIFIC CONSTRAINTS, no LP-type CHARACTERISTIC CONSTRAINTS, and a MODEL;  \n- **TM** requires LP-type CHARACTERISTIC CONSTRAINTS, no SPECIFIC CONSTRAINTS, and an optional MODEL;  \n- SLACK/SURPLUS VARIABLES are included only for inequality constraints and should be set to **1** or **-1**;  \n...\n\n**Figure 1: Matrix equation `a @ x = b`**\n\n| `a` || `b` |\n| :-: |:-:| :-: |\n| CONSTRAINTS: CHARACTERISTIC/SPECIFIC | SL/SU VARIABLES | CONSTRAINTS |\n| MODEL || MODEL |\n\nSource: self-prepared\n\nThe solution to the equation, **`x = pinv(a) @ b`**, is estimated with the help of SVD and is a **minimum-norm least-squares generalized solution** if the rank of **`a`** is not full. To check if **`a`** is within the computational limits, its (maximum) dimensions can be calculated using the formulas:  \n\n- **(k \\* N) x (K + K\\*)** **cOLS** - without slack/surplus variables;\n- **(k \\* N) x (K + K\\* + l)** **cOLS** - with slack/surplus variables;\n- **(M + N) x (M \\* N)** **TM** - without slack/surplus variables;\n- **(M + N) x (M \\* N + l**) **TM** - with slack/surplus variables;\n- **M x N**  **custom** - without slack/surplus variables;\n- **M x (N + l)** **custom** - with slack/surplus variables;\n\nwhere, in **cOLS** problems, **K** is the number of independent variables in the model (including the constant), **K\\*** is the number of eventual extra variables in CONSTRAINTS, and **N** is the number of observations; in **TM**, **M** and **N** are the dimensions of the matrix; and in **custom** cases, **M** and **N** or **M x (N + l)** are the dimensions of **`a`**.\n\n## Parameters:\n**Type of the LP problem**  \n\n- `cols=False` : *bool*, OLS constrained in values (**cOLS**)  \n- `tm=False` : *bool*, Transaction Matrix (**TM**), options **cols** and **tm** are mutually exclusive (not specifying any of them equals **custom**)  \n- `rhs=np.empty((0,0))` : *array_like*, right-hand side of the problem (rowsums first for TM problems)\n\n**Constructing the LHS**  \n\n- `model=np.empty((0,0))` : *array_like*, the MODEL part of **`a`**, including an eventual user-provided constant as a matrix column in **cOLS** (see Methods and formulas)  \n- `constraints=np.empty((0,0))` : *array_like*, the CONSTRAINTS part of **`a`**  \n- `slackvars=np.empty((0,0))` : *array_like*, the SLACK/SURPLUS VARIABLES part of **`a`**  \n- `zero_diagonal=False` : *bool*, set all the diagonal elements of **`a`** to 0  \n\n**SVD-based estimation**  \n\n- `tolerance=None` : *float*, scipy.linalg.lstsq's cutoff for `small' singular values; used to determine effective rank of a. Singular values smaller than cond \\* largest_singular_value are considered zero\n- `level=95` : *int*, confidence level (by default: **95**)\n\n**Monte-Carlo-based t-test**  \n\n- `seed=123456789` : *int*, random-number seed, # is any number between 0 and 2^31-1 (or 2,147,483,647) (by default: **123456789**) \n- `iterate=300` : *int*, number of iterations, # must be divisible by 50 (by default: **300**)  \n- `distribution=runiform`: *fucntion*, random-variable generating function, name of an earlier declared Python object returning an *array_like* **(r x c)** with two arguments, real scalars **r** and **c** (by default: **lppinv.runiform**, see Examples on how to pass np.random.uniform to **lppinv**)  \n- `mc=True` : *bool*, skip the Monte Carlo-based t-test  \n- `trace=True`: *bool*, hide any output with the exception of dots\n\n## Results:\n\n**`class lppinv.LPpinvResult(`**  \n\n- `OLSResults` : *statsmodels.regression.linear_model.OLSResults* or *None*, statsmodels' results class for for an OLS model  \n- `TtestResult` : *(scipy.stats._result_classes.TtestResult, scipy.stats._result_classes.TtestResult, scipy.stats._result_classes.TtestResult)* or *(None, None, None)*, scipy's result of a t-test  \n- `solution` : *array_like*, the solution obtained from the **LPLS** estimator  \n- `a` : *array_like*, the **`a`** matrix  \n- `nrmse` : *float*, root mean square error normalized by the standard deviation of **`b'**\n- `r2_c` : *float* or *np.nan*, R-squared for CONSTRAINTS in TM  \n\n**)**\n\n## Errors:\n\n- `LPpinvError` : *lppinv.LPpinvError*, specific Python-exception-derived object raised by lppinv functions  \n- `LinAlgError` : *scipy.linalg.LinAlgError*, generic Python-exception-derived object raised by linalg functions  \n\n## Examples:\n```\nimport numpy  as np\nimport lppinv as lp\n\n# cOLS problem\nprint('\\n', lp.solve(\n    cols=True,\n    rhs=np.array([[0, 0, 0, 0, -1, 0.2, 0.9, 2.1]]).T,\n    constraints=np.array([np.ones(4), [0, 1, 2, 3], [0, 5, 2, 8]]).T,\n    model=np.array([np.ones(4), [0, 1, 2, 3], [0, 5, 2, 8]]).T,\n    slackvars=np.array([[-1, 1, -1, 0]]).T,\n    mc=False\n))\n\n# TM problem\ndef rnormal(r=None, c=None):\n    return (np.random.normal(loc=0.0, scale=1.0, size=(r, c)))\n\nprint('\\n', lp.solve(\n    tm=True,\n    rhs=np.array([[4, 5, 3, 4, 6], [1, 2, 0, np.nan, np.nan]]).T,\n    zero_diagonal=True,\n    distribution=rnormal\n))\n\n# custom problem\nprint('\\n', lp.solve(\n    rhs=np.array([[2, 3, 9], [5, 7, 9]]).T,\n    model=np.vstack([[0, 1, 1], [1, 0, 1], [1, 1, 0]]),\n    tolerance=10**-10,\n    trace=False\n))\n```\n\n## References:\n\n1. Albert, A., 1972. *Regression And The Moore-Penrose Pseudoinverse.* New York: Academic Press.  \n\n2. Bolotov, I. 2014. *Modeling of Time Series Cyclical Component on a Defined Set of Stationary Points and its Application on the US Business Cycle. [Paper presentation]. The 8th International Days of Statistics and Economics: Prague.* https://msed.vse.cz/msed_2014/article/348-Bolotov-Ilya-paper.pdf  \n\n3. Bolotov, I. 2015. *Modeling Bilateral Flows in Economics by Means of Exact Mathematical Methods.* [Paper presentation]. The 9th International Days of Statistics and Economics: Prague. https://msed.vse.cz/msed_2015/article/111-Bolotov-Ilya-paper.pdf  \n\n**PS** Please also check the Web of Science (WoS) for new research on LPLS.\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A non-iterated general implementation of the LPLS estimator for cOLS, TM, and custom cases",
    "version": "0.4.2",
    "project_urls": {
        "Download": "https://github.com/econcz/lppinv/archive/pypi-0_4_2.tar.gz",
        "Homepage": "https://github.com/econcz/lppinv"
    },
    "split_keywords": [
        "estimator",
        "linear programming",
        "least squares",
        "ols constrained in values",
        "transaction matrix",
        "custom",
        "pseudoinverse",
        "singular value decomposition",
        "numpy",
        "scipy",
        "statsmodels"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ba72088878855e240f7117edf1e3c258fdbd6ef4e4047bf3cb2881b7af649926",
                "md5": "fcc76a80fb8b47489fc3d878114c482c",
                "sha256": "131988b281f59844f62efcbba5c722113f74c3874c8fce9af76e8252296b1d48"
            },
            "downloads": -1,
            "filename": "lppinv-0.4.2.tar.gz",
            "has_sig": false,
            "md5_digest": "fcc76a80fb8b47489fc3d878114c482c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 8889,
            "upload_time": "2024-02-17T18:08:12",
            "upload_time_iso_8601": "2024-02-17T18:08:12.117911Z",
            "url": "https://files.pythonhosted.org/packages/ba/72/088878855e240f7117edf1e3c258fdbd6ef4e4047bf3cb2881b7af649926/lppinv-0.4.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-17 18:08:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "econcz",
    "github_project": "lppinv",
    "github_not_found": true,
    "lcname": "lppinv"
}
        
Elapsed time: 0.19685s