# CLSP — Convex Least Squares Programming
The **Convex Least Squares Programming (CLSP)** estimator is a two-step method for solving underdetermined, ill-posed, or structurally constrained least-squares problems. It combines pseudoinverse-based estimation with convex-programming correction (e.g., Lasso, Ridge, Elastic Net) to ensure numerical stability, structural coherence, and enhanced interpretability.
## Installation
```bash
pip install pyclsp
```
## Quick Example
```python
import numpy as np
import matplotlib.pyplot as plt
from clsp import CLSP
# Example allocation problem: row sums (first 5) + col sums (last 5)
b = np.array([22, 23, 26, 27, 21, 28, 24, 22, 24, 21], dtype=float)
# Initialize estimator
model = CLSP()
# Solve the system (allocation problem)
result = model.solve(problem='ap', b=b, m=5, p=5, final=True)
print("Estimated matrix:")
print(result.x)
# Access diagnostics
print("NRMSE:", model.nrmse)
# Correlogram: RMSA sensitivity by constraint
corr = model.corr()
plt.figure(figsize=(8, 4))
plt.grid(True, linestyle="--", alpha=0.6)
plt.bar(range(len(corr["rmsa_i"])), corr["rmsa_i"])
plt.xlabel("Constraint index")
plt.ylabel("RMSA (row deletion effect)")
plt.title("CLSP Correlogram")
plt.tight_layout()
plt.show()
# Hypothesis test
print("t-test on NRMSE:", model.ttest())
```
## User Reference
For comprehensive information on the estimator's capabilities, advanced configuration options, and implementation details, refer to the docstrings in the individual `.py` source files. They contain complete descriptions of the available methods, their parameters, expected input formats, and output structures.
### The `CLSP` Class
```python
self.__init__()
```
Stores the solution, goodness-of-fit statistics, and ancillary parameters.
The class has three core methods: `solve()`, `corr()`, and `ttest()`.
**Selected attributes:**
`self.A` : *np.ndarray*
design matrix `A` = [`C` | `S`; `M` | `Q`], where `Q` is either a zero matrix or *S_residual*.
`self.b` : *np.ndarray*
vector of the right-hand side.
`self.zhat` : *np.ndarray*
vector of the first-step estimate.
`self.r` : *int*
number of refinement iterations performed in the first step.
`self.z` : *np.ndarray*
vector of the final solution. If the second step is disabled, it equals `self.zhat`.
`self.x` : *np.ndarray*
`m` x `p` matrix or vector containing the variable component of `z`.
`self.y` : *np.ndarray*
vector containing the slack component of `z`.
`self.kappaC` : *float*
spectral κ() for *C_canon*.
`self.kappaB` : *float*
spectral κ() for *B* = *C_canon^+*`A`.
`self.kappaA` : *float*
spectral κ() for `A`.
`self.rmsa` : *float*
total root mean square alignment (RMSA).
`self.r2_partial` : *float*
R^2 for the `M` block in `A`.
`self.nrmse` : *float*
root mean square error calculated from `A` and normalized by the standard deviation (NRMSE).
`self.nrmse_partial` : *float*
root mean square error calculated from the `M` block in `A` and normalized by the standard deviation (NRMSE).
`self.z_lower` : *np.ndarray*
lower bound of the diagnostic interval (confidence band) based on κ(`A`).
`self.z_upper` : *np.ndarray*
upper bound of the diagnostic interval (confidence band) based on κ(`A`).
`self.x_lower` : *np.ndarray*
lower bound of the diagnostic interval (confidence band) based on κ(`A`).
`self.x_upper` : *np.ndarray*
upper bound of the diagnostic interval (confidence band) based on κ(`A`).
`self.y_lower` : *np.ndarray*
lower bound of the diagnostic interval (confidence band) based on κ(`A`).
`self.y_upper` : *np.ndarray*
upper bound of the diagnostic interval (confidence band) based on κ(`A`).
### Solver Method: `solve()`
```python
self.solve(problem, C, S, M, b, m, p, i, j, zero_diagonal, r, Z, tolerance, iteration_limit, final, alpha, *args, **kwargs)
```
Solves the Convex Least Squares Programming (CLSP) problem.
This method performs a two-step estimation:
(1) a pseudoinverse-based solution using either the Moore–Penrose or Bott–Duffin inverse, optionally iterated for refinement;
(2) a convex-programming correction using Lasso, Ridge, or Elastic Net regularization (if enabled).
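As an illustration only, the sketch below reproduces the idea of these two steps with NumPy and CVXPY for the Moore–Penrose case (`Z` = identity). It is not the package's internal implementation and omits iterative refinement, the Bott–Duffin projector, and slack handling; the `alpha` weighting follows the convention documented below.
```python
# Conceptual sketch only -- not the package's implementation
import numpy as np
import cvxpy as cp

def clsp_sketch(A, b, alpha=1.0):
    # Step 1: pseudoinverse-based first-step estimate
    zhat = np.linalg.pinv(A) @ b
    # Step 2: convex correction -- minimize a weighted L1/L2 norm
    # around zhat subject to A z = b (alpha = 0: L1, alpha = 1: L2)
    z = cp.Variable(A.shape[1])
    penalty = alpha * cp.sum_squares(z - zhat) + (1 - alpha) * cp.norm1(z - zhat)
    cp.Problem(cp.Minimize(penalty), [A @ z == b]).solve()
    return zhat, z.value
```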
**Parameters:**
`problem` : *str*, optional
Structural template for matrix construction. One of:
- *'ap'* or *'tm'* : allocation (transaction) matrix problem (AP).
- *'cmls'* or *'rp'* : constrained-model least squares (regression) problem.
- anything else: general CLSP problem (user-defined `C` and/or `M`).
`C`, `S`, `M` : *np.ndarray* or *None*
Blocks of the design matrix `A` = [`C` | `S`; `M` | `Q`]. If `C` and/or `M` are provided, the matrix `A` is constructed accordingly (please note that for AP, `C` is constructed automatically and known values are specified in `M`).
`b` : *np.ndarray* or *None*
Right-hand side vector. Must have as many rows as `A` (please note that for AP, it should start with row sums). Required.
`m`, `p` : *int* or *None*
Dimensions of X ∈ ℝ^{m×p}, relevant for AP.
`i`, `j` : *int*, default = *1*
Grouping sizes for row and column sum constraints in AP.
`zero_diagonal` : *bool*, default = *False*
If *True*, enforces structural zero diagonals.
`r` : *int*, default = *1*
Number of refinement iterations for the pseudoinverse-based estimator.
`Z` : *np.ndarray* or *None*
A symmetric idempotent matrix (projector) defining the subspace for Bott–Duffin pseudoinversion. If *None*, the identity matrix is used, reducing the Bott–Duffin inverse to the Moore–Penrose case.
`tolerance` : *float*, default = *square root of machine epsilon*
Convergence tolerance for NRMSE change between refinement iterations.
`iteration_limit` : *int*, default = *50*
Maximum number of iterations allowed in the refinement loop.
`final` : *bool*, default = *True*
If *True*, a convex programming problem is solved to refine `zhat`. The resulting solution `z` minimizes a weighted L1/L2 norm around `zhat` subject to `Az` = `b`.
`alpha` : *float*, default = *1.0*
Regularization parameter (weight) in the final convex program:
- `α = 0`: Lasso (L1 norm)
- `α = 1`: Tikhonov Regularization/Ridge (L2 norm)
- `0 < α < 1`: Elastic Net
`*args`, `**kwargs` : optional
Additional arguments passed through to the CVXPY solver.
**Returns:**
*self*
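A usage sketch combining several of the documented options on the same allocation problem as in the Quick Example; it assumes that keyword arguments such as CVXPY's `verbose` are forwarded unchanged to the CVXPY solver:
```python
import numpy as np
from clsp import CLSP

b = np.array([22, 23, 26, 27, 21, 28, 24, 22, 24, 21], dtype=float)

model = CLSP()
result = model.solve(
    problem='ap',     # allocation (transaction) matrix template
    b=b, m=5, p=5,    # 5 x 5 target matrix X
    r=5,              # refinement iterations for the first-step estimator
    final=True,       # enable the convex-programming correction
    alpha=0.5,        # Elastic Net weighting between L1 and L2
    verbose=False,    # assumed to be forwarded to the CVXPY solver
)
print(result.x)
print("NRMSE:", result.nrmse)
```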
### Correlogram Method: `corr()`
```python
self.corr(reset, threshold)
```
Computes the structural correlogram of the CLSP constraint part.
This method performs a row-deletion sensitivity analysis on the canonical constraint matrix [`C` | `S`], denoted as *C_canon*, and evaluates the marginal effect of each constraint row on numerical stability, angular alignment, and estimator sensitivity.
For each row `i` in `C_canon`, it computes:
- The Root Mean Square Alignment (`RMSA_i`) with all other rows `j` ≠ `i`.
- The change in condition numbers κ(`C`), κ(`B`), and κ(`A`) when row `i` is deleted.
- The effect on estimation quality: changes in `nrmse`, `zhat`, `z`, and `x` when row `i` is deleted.
Additionally, it computes the total `rmsa` statistic across all rows, summarizing the overall angular alignment of *C_canon*.
**Parameters:**
`reset` : *bool*, default = *False*
If *True*, forces recomputation of all diagnostic values; otherwise, results cached from a previous call are reused.
`threshold` : *float*, default = *0*
If positive, limits the output to constraints with `RMSA_i` ≥ `threshold`.
**Returns:**
*dict* of *list*
A dictionary containing per-row diagnostic values:
{
`"constraint"` : `[1, 2, ..., k]`, # 1-based indices
`"rmsa_i"` : list of `RMSA_i` values,
`"rmsa_dkappaC"` : list of Δκ(`C`) after deleting row `i`,
`"rmsa_dkappaB"` : list of Δκ(`B`) after deleting row `i`,
`"rmsa_dkappaA"` : list of Δκ(`A`) after deleting row `i`,
`"rmsa_dnrmse"` : list of Δ`nrmse` after deleting row `i`,
`"rmsa_dzhat"` : list of Δ`zhat` after deleting row `i`,
`"rmsa_dz"` : list of Δ`z` after deleting row `i`,
`"rmsa_dx"` : list of Δ`x` after deleting row `i`,
}
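A short sketch of how the returned dictionary can be post-processed, e.g., to rank constraints by their effect on `nrmse` (same allocation problem as in the Quick Example):
```python
import numpy as np
from clsp import CLSP

b = np.array([22, 23, 26, 27, 21, 28, 24, 22, 24, 21], dtype=float)
model = CLSP().solve(problem='ap', b=b, m=5, p=5)

# Rank constraints by the absolute change in NRMSE after row deletion
corr = model.corr()
ranked = sorted(
    zip(corr["constraint"], corr["rmsa_i"], corr["rmsa_dnrmse"]),
    key=lambda row: abs(row[2]),
    reverse=True,
)
for constraint, rmsa_i, dnrmse in ranked[:3]:
    print(f"constraint {constraint}: RMSA_i = {rmsa_i:.4f}, dNRMSE = {dnrmse:.4g}")
```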
### T-Test Method: `ttest()`
```python
self.ttest(reset, sample_size, seed, distribution)
```
Performs a Monte Carlo-based one- or two-sided t-test on the NRMSE statistic.
This method simulates right-hand side vectors `b` from a user-defined or default distribution, recomputes the estimator for each simulated `b`, and tests whether the observed NRMSE deviates significantly from the resulting null distribution (under H₀) of simulated NRMSE values. The quality of the test depends on the size of the simulated sample.
**Parameters:**
`reset` : *bool*, default = *False*
If *True*, forces recomputation of the NRMSE null distribution (under H₀); otherwise, results cached from a previous call are reused.
`sample_size` : *int*, default = *50*
Size of the Monte Carlo simulated sample under H₀.
`seed` : *int* or *None*, optional
Optional random seed to override the default.
`distribution` : *str* or *None*, default = *'normal'*
Distribution for generating simulated `b` vectors. One of the standard distributions: *'normal'*, *'uniform'*, or *'laplace'*.
**Returns:**
*dict*
Dictionary with test results and null distribution statistics:
{
`'p_one_left'` : P(nrmse ≤ null mean),
`'p_one_right'` : P(nrmse ≥ null mean),
`'p_two_sided'` : 2-sided t-test p-value,
`'nrmse'` : observed value,
`'mean_null'` : mean of the null distribution (under H₀),
`'std_null'` : standard deviation of the null distribution (under H₀)
}
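A sketch of a typical call and a simple reading of the returned p-values (same allocation problem as in the Quick Example):
```python
import numpy as np
from clsp import CLSP

b = np.array([22, 23, 26, 27, 21, 28, 24, 22, 24, 21], dtype=float)
model = CLSP().solve(problem='ap', b=b, m=5, p=5)

# Monte Carlo t-test on the observed NRMSE
test = model.ttest(sample_size=100, seed=42, distribution='normal')
print("observed NRMSE:", test['nrmse'])
print("null mean / sd:", test['mean_null'], "/", test['std_null'])
if test['p_two_sided'] < 0.05:
    print("NRMSE deviates significantly from the simulated null distribution.")
else:
    print("no significant deviation from the simulated null distribution.")
```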
## Bibliography
To be added.
## License
MIT License — see the [LICENSE](LICENSE) file.