doe-toolbox

Name	doe-toolbox JSON
Version	1.3 JSON
	download
home_page
Summary	Design of experiments toolbox.
upload_time	2023-06-17 17:44:06
maintainer
docs_url	None
author	miltos_90
requires_python	>=3.10
license	GNU General Public License v3.0
keywords	python statistics doe
VCS
bugtrack_url
requirements	No requirements were recorded.

# DOE Toolbox

## Description

A simple Design of Experiments (DoE) toolbox written in Python, which provides tools and functions to facilitate the planning of experimental designs. It is intended to assist researchers, engineers, and analysts in efficiently exploring and optimizing systems, processes, and products by systematically varying factors and analyzing their effects on the response variable.

It includes:
1. [Factorial designs](#factorial)
    * *Generic full-factorial* (`fullfact`): A design in which every setting of every factor appears with every setting of every other factor, for all possible combinations.
    * *2-level full-factorial* (`ff2n`): Same as above, but the factors are constrained to have two levels each (high/low or +1 and -1).
    * *2-level fractional factorial* (`fracfact`): Similar to a full factorial DoE, but a subset of factor combinations is selected, based on specific rules to ensure that important main effects and interactions can still be estimated with a reduced number of experimental runs.
    * *2-level fractional factorial generator* (`fracfactgen`): Convenient design generator to control how the fraction (or subset of runs) is selected from the full set of runs in a fractional factorial design.
2. [Response surface designs](#rsm)
    * *Box-Wilson Central Composite Designs (CCD)* (`ccdesign`): CCDs start with a factorial or fractional factorial design (with center points) and add "star" points so that curvature, and hence a quadratic model, can be estimated.
    * *Box-Behnken* (`bbdesign`): An alternative to CCD, being an independent quadratic design in that it does not contain an embedded factorial or fractional factorial design. For three factors, the Box-Behnken design has the advantage of requiring fewer runs than a CCD. However, for four or more factors, this advantage disappears.
3. [Latin Hypercube sampling (LHS)](#lhs): An experimental design in which the range of each factor is divided into equal intervals or bins. Within each bin, one and only one sample point is selected randomly. The selection process ensures that the samples are evenly distributed across the parameter space and that each bin of each factor is sampled exactly once. It is especially useful for, and commonly employed in, simulation studies, sensitivity analysis, and optimization problems.

## Installation

The package can be easily installed with pip via a DOS or Unix command shell:

```bash
pip install doe-toolbox
```

### Requirements
The following packages are required:
* numpy >= 1.24.2,
* scipy >= 1.10.1.

See the `requirements.txt` file.


## Usage

### Factorial Design of Experiments <a name="factorial"></a>

#### fullfact

##### Description

Full factorial designs consist of two or more factors, each with discrete possible values or "levels", and whose experimental units take on all possible combinations of these levels across all such factors.
Such designs can be generated using the `fullfact` function, which outputs factor settings for a full factorial design with *n* factors, where the number of levels for each factor is given by the vector `levels` of length *n*. 

The output is an *m*-by-*n* numpy array, where *m* is the number of treatments in the full-factorial design. 
Each row corresponds to a single treatment, and each column contains the settings for a single factor, with floating point scalars ranging from -1 to +1.

##### Example
The following generates a ten-run full-factorial design with five levels for the first factor and two levels for the second factor:

```python
>>> import doe_box as dbox
>>> dbox.fullfact(levels = [5, 2])
array([[-1. , -1. ],
       [-1. ,  1. ],
       [-0.5, -1. ],
       [-0.5,  1. ],
       [ 0. , -1. ],
       [ 0. ,  1. ],
       [ 0.5, -1. ],
       [ 0.5,  1. ],
       [ 1. , -1. ],
       [ 1. ,  1. ]])
```
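The construction described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the function name `full_factorial_sketch` is hypothetical, and it assumes that each factor's levels are equally spaced between -1 and +1, which is consistent with the example output above.

```python
from itertools import product


def full_factorial_sketch(levels):
    """Return every combination of factor settings, each scaled to [-1, 1].

    `levels` lists the number of levels per factor (hypothetical helper,
    assuming equally spaced levels).
    """
    axes = []
    for k in levels:
        if k < 2:
            raise ValueError("each factor needs at least two levels")
        # k equally spaced settings from -1 to +1
        axes.append([-1.0 + 2.0 * i / (k - 1) for i in range(k)])
    # The Cartesian product varies the last factor fastest
    return [list(run) for run in product(*axes)]


design = full_factorial_sketch([5, 2])
print(len(design))  # 10
print(design[2])    # [-0.5, -1.0]
```

The number of runs is the product of the entries of `levels`, which is why full factorial designs grow quickly as factors are added.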

#### fracfact

##### Description

Fractional factorial designs are experimental designs consisting of a carefully chosen subset (fraction) of the experimental runs of a full factorial design. 
The subset is chosen so as to exploit the sparsity-of-effects principle to expose information about the most important features of the problem studied, while using a fraction of the effort of a full factorial design in terms of experimental runs and resources. 
In simple terms, it makes use of the fact that many experiments in full factorial design are often redundant, giving little or no new information about the system.

Such designs can be generated using the `fracfact` function, which creates the two-level fractional factorial designs defined by the generator `gen`. 
The latter is a case-sensitive string listing the factors in the design, formed from the 52 letters *a*-*z* and *A*-*Z*, separated by spaces.
By convention, *a*-*z* are used for the first 26 factors and, if necessary, *A*-*Z* for the remaining factors. 
A valid example would be: `gen = 'a b c abc'`.

As with `fullfact`, the output is an *m*-by-*n* numpy array, where *m* is the number of treatments in the fractional-factorial design. 
Each row corresponds to a single treatment, and each column contains the settings for a single factor, each equal to either -1 or +1.

##### Example
The following generates an eight-run fractional factorial design for four factors, in which the fourth factor is the product of the first three:

```python
>>> gen = 'a b c abc'
>>> dbox.fracfact(gen)

array([[-1, -1, -1, -1],
       [-1, -1,  1,  1],
       [-1,  1, -1,  1],
       [-1,  1,  1, -1],
       [ 1, -1, -1,  1],
       [ 1, -1,  1, -1],
       [ 1,  1, -1, -1],
       [ 1,  1,  1,  1]])
```

Note that more sophisticated generator strings can be created using the "+" and "-" operators. The "-" operator swaps the levels of the corresponding column:

```python
>>> gen = 'a b c -abc'
>>> dbox.fracfact(gen)

array([[-1, -1, -1,  1],
       [-1, -1,  1, -1],
       [-1,  1, -1, -1],
       [-1,  1,  1,  1],
       [ 1, -1, -1, -1],
       [ 1, -1,  1,  1],
       [ 1,  1, -1,  1],
       [ 1,  1,  1, -1]])
```
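How a generator string maps to design columns can be illustrated with a short sketch. It is an assumption-laden illustration rather than the package's code: the name `fracfact_sketch` is hypothetical, single-letter words are taken as the base factors, and every multi-letter word is computed as the product of its base columns, with a leading "-" flipping the sign.

```python
from itertools import product
from math import prod


def fracfact_sketch(gen):
    """Build a two-level design from a generator string such as 'a b c -abc'."""
    words = gen.split()
    letters = [w.lstrip("+-") for w in words]
    # Single-letter words define the base (independent) factors
    base = [w for w in letters if len(w) == 1]
    design = []
    # Full two-level factorial in the base factors
    for settings in product((-1, 1), repeat=len(base)):
        value = dict(zip(base, settings))
        row = []
        for word, factors in zip(words, letters):
            sign = -1 if word.startswith("-") else 1
            # A generated column is the product of the base columns it names
            row.append(sign * prod(value[f] for f in factors))
        design.append(row)
    return design


print(fracfact_sketch("a b c abc")[0])   # [-1, -1, -1, -1]
print(fracfact_sketch("a b c -abc")[0])  # [-1, -1, -1, 1]
```

The sketch does not validate its input, so a word that refers to a letter without a matching single-letter word raises a `KeyError`.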

#### fracfactgen

##### Description

The `fracfactgen` function uses the Franklin-Bailey algorithm to find generators for the smallest two-level fractional-factorial design.

It accepts two inputs:
* `terms`: A string of terms formed from the 52 case-sensitive letters *a*-*z* and *A*-*Z*, separated by spaces.
By convention, 'a'-'z' are used for the first 26 factors and, if necessary, 'A'-'Z' for the remaining factors. 
A valid example would be: `terms = 'a b c ab ac'`. 
Single-letter terms indicate the main effects to be estimated, whereas multiple-letter terms indicate the interactions to be estimated. 
You can pass the output generators of `fracfactgen` to `fracfact`, in order to produce the corresponding fractional-factorial design.
* `resolution`: An integer indicating the required resolution of the design. A design of resolution *R* is one in which no *n*-factor interaction is confounded with any other effect containing fewer than *R - n* factors. Thus, a resolution *III* design does not confound main effects with one another but may confound them with two-way interactions, while a resolution *IV* design does not confound main effects with one another or with two-way interactions, but may confound two-way interactions with each other. It is an optional argument, with the default value being equal to 3.

If `fracfactgen` is unable to find a design at the requested resolution, it tries to find a lower-resolution design sufficient to calibrate the model. If it is successful, it returns the generators for the lower-resolution design along with a warning. If it fails, an error is raised.

##### Example
The following determines the effects of four two-level factors, for which there may be two-way interactions. A full-factorial design would require 2<sup>4</sup> = 16 runs. The `fracfactgen` function finds generators for a resolution *IV* (separating main effects) fractional-factorial design that requires only 2<sup>3</sup> = 8 runs:


```python
>>> dbox.fracfactgen(terms = 'a b c d', resolution = 4)

'a b c abc'
```
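The resolution of a regular two-level fractional design equals the length of the shortest word in its defining relation. In the example above the fourth factor is *d = abc*, so the defining relation is *I = abcd* and the design has resolution IV. The sketch below computes this for a list of defining words; the function name `design_resolution` and its input format are hypothetical and are not part of the package.

```python
from itertools import combinations


def design_resolution(defining_words):
    """Length of the shortest word in the defining relation.

    `defining_words` holds one word per generated factor, e.g. ['abcd']
    for d = abc (hypothetical input format).
    """
    generators = [frozenset(word) for word in defining_words]
    shortest = None
    # The defining relation contains every product of the generator words;
    # squared letters cancel, which is a symmetric difference of letter sets.
    for size in range(1, len(generators) + 1):
        for subset in combinations(generators, size):
            word = frozenset()
            for g in subset:
                word = word ^ g
            if shortest is None or len(word) < shortest:
                shortest = len(word)
    return shortest


print(design_resolution(["abcd"]))        # 4
print(design_resolution(["abd", "ace"]))  # 3
```

In the second call the two generators multiply to *bcde*, but the three-letter words *abd* and *ace* already fix the resolution at III.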

### Response Surface Designs <a name="rsm"></a>

#### ccdesign

##### Description

Central Composite Designs (CCDs) are a type of experimental design useful in response surface methodology, for building a second order (quadratic) model for the response variable without needing to use a complete three-level factorial experiment. Such designs can be generated using the `ccdesign` function. 
It needs the following input arguments:

* `numFactors`: Number of factors in the design. Must be an integer at least equal to *2*.

The following optional arguments can be set:
*  `fraction`: Integer indicating the fraction of full-factorial cube, expressed as an exponent of 1/2. If not set by the user, the default values are the following:
    * 0, i.e. full factorial design, when `numFactors` &le; 4
    * 1, i.e. a 1/2 fraction design, when 4 <  `numFactors` &le;  7 or `numFactors` > 11
    * 2, i.e. a 1/4 fraction design, when 7 < `numFactors` &le;  9 
    * 3, i.e. a 1/8 fraction design, when `numFactors` = 10 
    * 4, i.e. a 1/16 fraction design, when `numFactors` = 11
* `centerPoints`: Number of center points to be added in the factorial and axial parts of the design.
    Can be one of:
    * 'orthogonal' (default): Number of center points will be computed so that an orthogonal design will be provided.
    * 'uniform'   : Number of center points will be computed so that uniform precision will be achieved.
    * A strictly positive integer, specifying the number of center points directly.
* `designType`: It defines the type of the CCD. 
    Can be one of:
    * 'circumscribed' (default): It is the original type of CCD, where axial points are located at distance *a* from the center point.
    * 'inscribed' : In an inscribed CCD, the axial points are located at factor levels *−1* and *1*, while the factorial points are brought into the interior of the design space and are located at distance *1/a* from the center point.
    * 'faced': In a face-centered CCD, the axial points are located at a distance equal to *1* from the center point, i.e. at the face of the design cube if the design involves three experimental factors.

For *n* factors, the output DoE matrix has dimensions *m* by *n*, with *m* being the number of runs in the design. 
Each row represents one run, with settings for all factors represented in the corresponding columns. The resulting factor values are normalized, so that the cube points take values between *-1* and *1*.

##### Example
The following generates a two-factor, full-factorial, circumscribed, orthogonal CCD:

```python
>>> dbox.ccdesign(numFactors = 2)

array([[-1.        , -1.        ],
       [-1.        ,  1.        ],
       [ 1.        , -1.        ],
       [ 1.        ,  1.        ],
       [-1.41421356, -0.        ],
       [ 1.41421356,  0.        ],
       [-0.        , -1.41421356],
       [ 0.        ,  1.41421356],
       [ 0.        ,  0.        ],
       [ 0.        ,  0.        ],
       [ 0.        ,  0.        ],
       [ 0.        ,  0.        ],
       [ 0.        ,  0.        ],
       [ 0.        ,  0.        ],
       [ 0.        ,  0.        ],
       [ 0.        ,  0.        ]])
```
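The three groups of points in a circumscribed CCD (cube, axial and center points) can be assembled as in the following sketch. It is only an illustration: `ccd_sketch` is a hypothetical name, and the axial distance `alpha` and the number of center points are passed explicitly, because the rules `ccdesign` uses to derive them for the 'orthogonal' and 'uniform' options are not reproduced here.

```python
from itertools import product


def ccd_sketch(num_factors, alpha, num_center):
    """Circumscribed CCD from a full two-level factorial (hypothetical helper)."""
    # Cube points: every combination of -1 and +1
    cube = [list(p) for p in product((-1.0, 1.0), repeat=num_factors)]
    # Axial ("star") points: one factor at -alpha or +alpha, the rest at 0
    star = []
    for i in range(num_factors):
        for a in (-alpha, alpha):
            point = [0.0] * num_factors
            point[i] = a
            star.append(point)
    # Replicated center points
    center = [[0.0] * num_factors for _ in range(num_center)]
    return cube + star + center


design = ccd_sketch(num_factors=2, alpha=2 ** 0.5, num_center=8)
print(len(design))  # 16
```

With `alpha` set to the square root of 2 and eight center points, the sketch yields the same 16 points as the two-factor example above.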

#### bbdesign

##### Description

Box-Behnken designs are experimental designs also used in response surface methodology, for the estimation of second order (quadratic) models for the response variable.
Box-Behnken designs are sometimes regarded as more efficient than CCDs, although they provide poor coverage of the corners of the design space.

This type of design can be generated using the `bbdesign` function, which takes the following input arguments:

* `numFactors`: Number of factors in the design. Must be an integer at least equal to *3*.
* `numCenter` : Number of centerpoints in the design. It is an optional argument, and if no input is provided, a pre-determined number of points are automatically included, whose number depends on the value of `numFactors`.

For *n > 2* factors, the output DoE matrix has dimensions *m* by *n*, with *m* being the number of runs in the design. 
Each row represents one run, with settings for all factors represented in the corresponding columns. The resulting factor values are normalized, so that the cube points take values between *-1* and *1*.


##### Example
The following generates a three-factor Box-Behnken design:

```python
>>> dbox.bbdesign(numFactors = 3)

array([[-1, -1,  0],
       [-1,  1,  0],
       [ 1, -1,  0],
       [ 1,  1,  0],
       [-1,  0, -1],
       [-1,  0,  1],
       [ 1,  0, -1],
       [ 1,  0,  1],
       [ 0, -1, -1],
       [ 0, -1,  1],
       [ 0,  1, -1],
       [ 0,  1,  1],
       [ 0,  0,  0],
       [ 0,  0,  0],
       [ 0,  0,  0]])
```
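The structure of the three-factor design above can be reproduced with a pairwise construction: each pair of factors is run through a 2<sup>2</sup> factorial while the remaining factors are held at 0, and center points are appended. The sketch below is an illustration under that assumption; `box_behnken_sketch` is a hypothetical name, and classical Box-Behnken designs for some larger factor counts use different blocks than simple pairs.

```python
from itertools import combinations, product


def box_behnken_sketch(num_factors, num_center):
    """Pairwise Box-Behnken-style construction (hypothetical helper)."""
    if num_factors < 3:
        raise ValueError("at least three factors are required")
    runs = []
    for i, j in combinations(range(num_factors), 2):
        # Two-level factorial in factors i and j, all other factors at 0
        for a, b in product((-1, 1), repeat=2):
            run = [0] * num_factors
            run[i], run[j] = a, b
            runs.append(run)
    # Replicated center points
    runs.extend([0] * num_factors for _ in range(num_center))
    return runs


design = box_behnken_sketch(num_factors=3, num_center=3)
print(len(design))  # 15
print(design[4])    # [-1, 0, -1]
```

No run places all factors at their extremes simultaneously, which is why the corners of the cube are not covered.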

### Latin Hypercube Sampling <a name="lhs"></a>

#### lhs

##### Description
Latin Hypercube Sampling (LHS) is a statistical sampling technique used to efficiently explore the parameter space of a system or model. It is commonly employed in simulation studies, sensitivity analysis, and optimization problems.

In LHS, the range of each input variable or factor is divided into equally spaced intervals. Within each interval, a single sample point is randomly selected. The key feature of LHS is that it ensures that each level or bin of each factor occurs exactly once in the sampled dataset, providing a representative and evenly distributed coverage of the parameter space.

To generate a design of this type, the `lhs` function can be used.
Its input arguments include:
* `numSamples`  : Number of samples to be generated, specified as a positive integer.
* `numVariables`: Number of variables in the design, specified as a positive integer.

Additional optional arguments include:
* `criterion` : Criterion to be used to evaluate the improvement of the  design over each iteration. It can be one of:
    * 'maxdist' (default): Maximizes the minimum sample-to-sample distance.
    * 'mincorr': Minimize the sum of between-column squared correlations.
* `smooth`: Boolean indicating whether the points are sampled randomly within each interval.
    * If `smooth = True` (default): One point is sampled from each of the
    intervals (0, 1/*n*), (1/*n*, 2/*n*), ..., (1-1/*n*, 1), followed by a random permutation.
    * If `smooth = False`:
    The points are produced by sampling only at the midpoints of the intervals, i.e. at  
    .5/*n*, 1.5/*n*, ..., 1 - .5/*n*,
    with *n* being the value of `numSamples` selected.
    The de

For *n* `numSamples` and *m* `numVariables`, the function returns a Latin hypercube sample matrix of size *n*-by-*m*. In each column of the output matrix, the *n* values appear in random order, with exactly one value taken from each of the intervals defined according to the value of `smooth`.

##### Example

The following generates a Latin hypercube sample of ten rows (samples) and three columns (variables):

```python
>>> import numpy as np
>>> np.random.seed(10) # Set the seed for reproducibility
>>> dbox.lhs(numSamples = 10, numVariables = 3)

array([[0.75248678, 0.9707202 , 0.69357489],
       [0.60211809, 0.16602922, 0.45049514],
       [0.10229193, 0.35592262, 0.26817272],
       [0.8480203 , 0.24218636, 0.51460662],
       [0.49319027, 0.75354692, 0.32180509],
       [0.02813972, 0.8413978 , 0.99629056],
       [0.96493436, 0.44368093, 0.87002701],
       [0.54876658, 0.63265331, 0.08408063],
       [0.39495223, 0.06621841, 0.18919362],
       [0.28210972, 0.51141729, 0.7634635 ]])
```

Each column of the output matrix contains one random number in each interval [0,0.1], [0.1,0.2], [0.2,0.3], [0.3,0.4], [0.4,0.5], [0.5,0.6], [0.6,0.7], [0.7,0.8], [0.8,0.9], and [0.9,1].

To obtain a discrete design, the default value of `smooth` should be overridden and set to `False`:

```python
>>> import numpy as np
>>> np.random.seed(10) # Set the seed for reproducibility
>>> x = dbox.lhs(numSamples = 10, numVariables = 3, smooth = False)
>>> x 

array([[0.75, 0.85, 0.15],
       [0.05, 0.95, 0.55],
       [0.65, 0.55, 0.05],
       [0.35, 0.15, 0.45],
       [0.15, 0.25, 0.25],
       [0.25, 0.75, 0.65],
       [0.45, 0.05, 0.75],
       [0.95, 0.45, 0.95],
       [0.55, 0.35, 0.85],
       [0.85, 0.65, 0.35]])
```
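The basic sampling step behind both examples can be sketched with the standard library. This is a simplified illustration, not the package's implementation: `lhs_sketch` and its `rng` argument are hypothetical, and the iterative improvement controlled by `criterion` is left out.

```python
import random


def lhs_sketch(num_samples, num_variables, smooth=True, rng=None):
    """Plain Latin hypercube sample on the unit cube (hypothetical helper)."""
    rng = rng or random.Random()
    columns = []
    for _ in range(num_variables):
        # Each column visits every bin exactly once, in random order
        bins = list(range(num_samples))
        rng.shuffle(bins)
        if smooth:
            # Random position inside each bin
            column = [(b + rng.random()) / num_samples for b in bins]
        else:
            # Midpoint of each bin
            column = [(b + 0.5) / num_samples for b in bins]
        columns.append(column)
    # Transpose so that rows are samples and columns are variables
    return [list(row) for row in zip(*columns)]


sample = lhs_sketch(10, 3, smooth=False, rng=random.Random(0))
print(sorted(row[0] for row in sample))
```

With `smooth=False`, every column is a permutation of the ten bin midpoints 0.05, 0.15, ..., 0.95, as in the discrete example above.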

To see the effect of changing the default `criterion`, first evaluate the sum of the squared correlations between the columns of the *x* matrix above:

```python
>>> corr  = np.corrcoef(x, rowvar = False) 
>>> (sum(corr.flatten() ** 2) - 3)/2

0.07004591368227708
```

Subsequently, generating a new matrix with `criterion = 'mincorr'`, the same evaluation results in a squared-correlation sum of:

```python
>>> np.random.seed(10)
>>> x = dbox.lhs(numSamples = 10, numVariables = 3, smooth = False, criterion = 'mincorr')
>>> corr  = np.corrcoef(x, rowvar = False) 
>>> (sum(corr.flatten() ** 2) - 3)/2

0.027731864095500214
```

As can be seen, minimizing the correlations results in a design with a much lower sum of squared correlations.

## References

Good starting points for additional information on each experimental design type can be found at:

* [Factorial designs](https://en.wikipedia.org/wiki/Factorial_experiment)
* [Box-Behnken designs](https://en.wikipedia.org/wiki/Box%E2%80%93Behnken_design)
* [Central composite designs](https://en.wikipedia.org/wiki/Central_composite_design)
* [Latin-Hypercube designs](https://en.wikipedia.org/wiki/Latin_hypercube_sampling)

In addition, a wealth of information about DoE can be found on the [NIST](https://www.itl.nist.gov/div898/handbook/pri/pri.htm) website, including discussion on how to choose and analyze various DoEs, as well as several case studies.

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "doe-toolbox",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "",
    "keywords": "python,statistics,doe",
    "author": "miltos_90",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/f2/3d/950c5f8b77695d397c8d8269bcc5cf408a7b80c0e17a086e91840a82ba63/doe_toolbox-1.3.tar.gz",
    "platform": null,
    "description": "# DOE Toolbox\r\n\r\n## Description\r\n\r\nA simple Design of Experiments (DoE) toolbox written in python, which provides a range of tools and functions to facilitate the planning of experimental designs. It is intended to assist researchers, engineers, and analysts in efficiently exploring and optimizing systems, processes, and products by systematically varying factors and analyzing their effects on the response variable.\r\n\r\nIt includes:\r\n1. [Factorial designs](#factorial)\r\n    * *Generic full-factorial* (`fullfact`): A design in which every setting of every factor appears with every setting of every other factor, for all possible combinations.\r\n    * *2-level full-factorial* (`ff2n`): Same as above, but the factors are constrained to have two levels each (high/low or +1 and -1).\r\n    * *2-level fractional factorial* (`fracfact`): Similar to a full factorial DoE, but a subset of factor combinations is selected, based on specific rules to ensure that important main effects and interactions can still be estimated with a reduced number of experimental runs.\r\n    * *2-level fractional factorial generator* (`fracfactgen`): Convenient design generator to control how the fraction (or subset of runs) iwill selected from the full set of runs in a fractional factorial design.\r\n2. [Response surface designs](#rsm)\r\n    * *Box-Wilson Central Composite Designs (CCD)* (`ccdesign`): CCD designs start with a factorial or fractional factorial design (with center points) and add \"star\" points to estimate curvature for the estimation of quadratic models.\r\n    * *Box-Behnken* (`bbdesign`): An alternative to CCD, being an independent quadratic design in that it does not contain an embedded factorial or fractional factorial design. For three factors, the Box-Behnken design offers some advantage in requiring a fewer number of runs than a CCD. However, for four or more factors, this advantage disappears.\r\n3. 
[Latin Hypercube sampling (LHS)](#lhs): An experimental design in which the range of each factor is divided into equal intervals or bins. Within each bin, one and only one sample point is selected randomly. The selection process ensures that the samples are evenly distributed across the parameter space and that each combination of factor levels occurs exactly once. It is especially useful for, and commonly employed in, simulation studies, sensitivity analysis, and optimization problems.\r\n\r\n## Installation\r\n\r\nThe package can be easily installed with pip via a DOS or Unix command shell:\r\n\r\n```bash\r\npip install doe-toolbox\r\n```\r\n\r\n### Requirements\r\nThe following packages are required:\r\n* numpy >= 1.24.2,\r\n* scipy >= 1.10.1.\r\n\r\nSee `requirements.txt` file\r\n\r\n\r\n## Usage\r\n\r\n### Factorial Design of Experiments <a name=\"factorial\"></a>\r\n\r\n#### fullfact\r\n\r\n##### Description\r\n\r\nFull factorial designs consist of two or more factors, each with discrete possible values or \"levels\", and whose experimental units take on all possible combinations of these levels across all such factors.\r\nSuch designs can be generated using the `fullfact` function  outputs factor settings for a full factorial design with *n* factors, where the number of levels for each factor is given by the vector `levels` of length *n*. \r\n\r\nThe output is an *m*-by-*n* numpy array, where *m* is the number of treatments in the full-factorial design. \r\nEach row corresponds to a single treatment, and each column contains the settings for a single factor, with floating point scalars ranging from -1 to +1.\r\n\r\n##### Example\r\nThe following generates a ten-run full-factorial design with five levels for the first factor and two levels for the second factor:\r\n\r\n```python\r\n>>> import doe_box as dbox\r\n>>> dbox.fullfact(levels = [5, 2])\r\narray([[-1. , -1. ],\r\n       [-1. ,  1. ],\r\n       [-0.5, -1. ],\r\n       [-0.5,  1. ],\r\n       [ 0. 
, -1. ],\r\n       [ 0. ,  1. ],\r\n       [ 0.5, -1. ],\r\n       [ 0.5,  1. ],\r\n       [ 1. , -1. ],\r\n       [ 1. ,  1. ]])\r\n```\r\n\r\n#### fracfact\r\n\r\n##### Description\r\n\r\nFractional factorial designs are experimental designs consisting of a carefully chosen subset (fraction) of the experimental runs of a full factorial design. \r\nThe subset is chosen so as to exploit the sparsity-of-effects principle to expose information about the most important features of the problem studied, while using a fraction of the effort of a full factorial design in terms of experimental runs and resources. \r\nIn simple terms, it makes use of the fact that many experiments in full factorial design are often redundant, giving little or no new information about the system.\r\n\r\nSuch designs can be generated using the `fracfact` function, which creates the two-level fractional factorial designs defined by the generator `gen`. \r\nThe latter is a (case-sensitive) string listing the factors in the design, formed from the 52 case-sensitive letters *a*-*Z*, separated by spaces.\r\nStandard convention notation indicates to use *a*-*z* for the first 26 factors, and, if necessary, *A*-*Z* for the remaining factors. \r\nA valid example would be: `gen = 'a b c abc'`.\r\n\r\nSimilar to the previous, the output is an *m*-by-*n* numpy array, where *m* is the number of treatments in the fractional-factorial design. 
\r\nEach row corresponds to a single treatment, and each column contains the settings for a single factor, with floating point scalars ranging from -1 to +1.\r\n\r\n##### Example\r\nThe following generates an eight-run fractional factorial design for four factors, in which the fourth factor is the product of the first three:\r\n\r\n```python\r\n>>> gen = 'a b c abc'\r\n>>> dbox.fracfact(gen)\r\n\r\narray([[-1, -1, -1, -1],\r\n       [-1, -1,  1,  1],\r\n       [-1,  1, -1,  1],\r\n       [-1,  1,  1, -1],\r\n       [ 1, -1, -1,  1],\r\n       [ 1, -1,  1, -1],\r\n       [ 1,  1, -1, -1],\r\n       [ 1,  1,  1,  1]])\r\n```\r\n\r\nNote that more sophisticated generator strings can be created using the +\" and \"-\" operators. The \"-\" operator will swap the column levels:\r\n\r\n```python\r\n>>> gen = 'a b c -abc'\r\n>>> dbox.fracfact(gen)\r\n\r\narray([[-1, -1, -1,  1],\r\n       [-1, -1,  1, -1],\r\n       [-1,  1, -1, -1],\r\n       [-1,  1,  1,  1],\r\n       [ 1, -1, -1, -1],\r\n       [ 1, -1,  1,  1],\r\n       [ 1,  1, -1,  1],\r\n       [ 1,  1,  1, -1]])\r\n```\r\n\r\n#### fracfactgen\r\n\r\n##### Description\r\n\r\nThe `fracfactgen` function uses the Franklin-Bailey algorithm to find generators for the smallest two-level fractional-factorial design.\r\n\r\nIt requires two inputs:\r\n* `terms`: Is a string of factors formed formed from the 52 case-sensitive letters *a*-*Z*, separated by spaces.\r\nStandard convention notation indicates to use 'a'-'z' for the first 26 factors, and, if necessary, 'A'-'Z' for the remaining factors. \r\nA valid example would be: `terms = 'a b c ab ac'`. \r\nSingle-letter factors indicate the main effects to be estimated, whereas multiple-letter factors indicate the interactions to be estimated. \r\nYou can pass the output generators of `fracfactgen` to `fracfact`, in order to produce the corresponding fractional-factorial design.\r\n* `resolution`: Is an integer indicating the required resolution of the design. 
A design of resolution *R* is one in which no *n*-factor interaction is confounded with any other effect containing less than *R \u00e2\u20ac\u201c n* factors. Thus, a resolution *III* design does not confound main effects with one another but may confound them with two-way interactions, while a resolution *IV* design does not confound either main effects or two-way interactions but may confound two-way interactions with each other. It is an optional argument, with the default value being equal to 3.\r\n\r\nIf `fracfactgen` is unable to find a design at the requested resolution, it tries to find a lower-resolution design sufficient to calibrate the model. If it is successful, it returns the generators for the lower-resolution design along with a warning. If it fails, an error is raised.\r\n\r\n##### Example\r\nThe following will determine the effects of four two-level factors, for which there may be two-way interactions. A full-factorial design would require 2<sup>4</sup> = 16 runs. The `fracfactgen` function will generators for a resolution *IV* (separating main effects) fractional-factorial design that requires only 2<sup>3</sup> = 8 runs:\r\n\r\n\r\n```python\r\n>>> dbox.fracfactgen(terms = 'a b c d', resolution = 4)\r\n\r\n'a b c abc'\r\n```\r\n\r\n### Response Surface Designs <a name=\"rsm\"></a>\r\n\r\n#### ccdesign\r\n\r\n##### Description\r\n\r\nCentral Composite Designs (CCDs) are a type of experimental design useful in response surface methodology, for building a second order (quadratic) model for the response variable without needing to use a complete three-level factorial experiment. Such designs can be generated using the `ccdesign` function. \r\nIt needs the following input arguments:\r\n\r\n* `numFactors`: Number of factors in the design. Must be an integer at least equal to *2*.\r\n\r\nThe following optional arguments can be set:\r\n*  `fraction`: Integer indicating the fraction of full-factorial cube, expressed as an exponent of 1/2. 
If not set by the user, the default values are the following:\r\n    * 0, i.e. full factorial design, when `numFactors` &le; 4\r\n    * 1, i.e. a 1/2 fraction design, when 4 <  `numFactors` &le;  7 or `numFactors` > 11\r\n    * 2, i.e. a 1/4 fraction design, when 7 < `numFactors` &le;  9 \r\n    * 3, i.e. a 1/8 fraction design, when `numFactors` = 10 \r\n    * 4, i.e. a 1/16 fraction design, when `numFactors` = 11\r\n* `centerPoints`: Number of center points to be added in the factorial and axial parts of the design.\r\n    Can be one of:\r\n    * 'orthogonal' (default): Number of center points will be computed so that an orthogonal design will be provided.\r\n    * 'uniform'   : Number of center points will be computed so that uniform precision will be achieved.\r\n    * A strictly positive integer, specifying the number of center points directly.\r\n* `designType`: It defines the type of the CCD. \r\n    Can be one of:\r\n    * 'circumscribed' (default): It is the original type of CCD, where axial points are located at distance *a* from the center point.\r\n    * 'inscribed' : The inscribed CCD is characterized by that axial points are \r\n                    located at factor levels *\u00e2\u02c6\u20191* and *1*, while the factorial points are brought into the interior of the design space and are located at distance *1/a* from the center point.\r\n    * 'faced': In a face-centered CCD, the axial points are located at a distance equal to *1* from the center point, i.e. at the face of the design cube if the design involves three experimental factors.\r\n\r\nFor *n > 2* factors, the output DoE matrix has dimensions *m* by *n*, with *m* being the number of runs in the design. \r\nEach row represents one run, with settings for all factors represented in the corresponding columns. 
The resulting factor values are normalized, so that the cube points take values between *-1* and *1*.\r\n\r\n##### Example\r\nThe following generates a two-factor, full-factorial, circumscribed, orthogonal CCD:\r\n\r\n```python\r\n>>> dbox.ccdesign(numFactors = 2)\r\n\r\narray([[-1.        , -1.        ],\r\n       [-1.        ,  1.        ],\r\n       [ 1.        , -1.        ],\r\n       [ 1.        ,  1.        ],\r\n       [-1.41421356, -0.        ],\r\n       [ 1.41421356,  0.        ],\r\n       [-0.        , -1.41421356],\r\n       [ 0.        ,  1.41421356],\r\n       [ 0.        ,  0.        ],\r\n       [ 0.        ,  0.        ],\r\n       [ 0.        ,  0.        ],\r\n       [ 0.        ,  0.        ],\r\n       [ 0.        ,  0.        ],\r\n       [ 0.        ,  0.        ],\r\n       [ 0.        ,  0.        ],\r\n       [ 0.        ,  0.        ]])\r\n```\r\n\r\n#### bbdesign\r\n\r\n##### Description\r\n\r\nBox\u00e2\u20ac\u201cBehnken designs are experimental designs also used in response surface methodology, for the estimation of second order (quadratic) models for the response variable.\r\nBox-Behnken designs are considered to be more proficient and most powerful than CCD designs, despite their poor coverage of the corners of nonlinear design spaces.\r\n\r\nThis type of design can be generated using the `bbdesign` function, which takes the following input arguments:\r\n\r\n* `numFactors`: Number of factors in the design. Must be an integer at least equal to *3*.\r\n* `numCenter` : Number of centerpoints in the design. It is an optional argument, and if no input is provided, a pre-determined number of points are automatically included, whose number depends on the value of `numFactors`.\r\n\r\nFor *n > 2* factors, the output DoE matrix has dimensions *m* by *n*, with *m* being the number of runs in the design. \r\nEach row represents one run, with settings for all factors represented in the corresponding columns. 
The resulting factor values are normalized, so that the cube points take values between *-1* and *1*.\r\n\r\n##### Example\r\nThe following generates a three-factor Box-Behnken design:\r\n\r\n```python\r\n>>> dbox.bbdesign(numFactors = 3)\r\n\r\narray([[-1, -1,  0],\r\n       [-1,  1,  0],\r\n       [ 1, -1,  0],\r\n       [ 1,  1,  0],\r\n       [-1,  0, -1],\r\n       [-1,  0,  1],\r\n       [ 1,  0, -1],\r\n       [ 1,  0,  1],\r\n       [ 0, -1, -1],\r\n       [ 0, -1,  1],\r\n       [ 0,  1, -1],\r\n       [ 0,  1,  1],\r\n       [ 0,  0,  0],\r\n       [ 0,  0,  0],\r\n       [ 0,  0,  0]])\r\n```\r\n\r\n### Latin Hypercube Sampling <a name=\"lhs\"></a>\r\n\r\n#### lhs\r\n\r\n##### Description\r\nLatin Hypercube Sampling (LHS) is a statistical sampling technique used to efficiently explore the parameter space of a system or model. It is commonly employed in simulation studies, sensitivity analysis, and optimization problems.\r\n\r\nIn LHS, the range of each input variable or factor is divided into equally spaced intervals. Within each interval, a single sample point is randomly selected. 
The key feature of LHS is that each level or bin of each factor occurs exactly once in the sampled dataset, providing a representative and evenly distributed coverage of the parameter space.\r\n\r\nTo generate a design of this type, the `lhs` function can be used.\r\nIts input arguments include:\r\n* `numSamples`  : Number of samples to be generated, specified as a positive integer.\r\n* `numVariables`: Number of variables in the design, specified as a positive integer.\r\n\r\nAdditional optional arguments include:\r\n* `criterion` : Criterion used to evaluate the improvement of the design over each iteration. It can be one of:\r\n    * 'maxdist' (default): Maximizes the minimum sample-to-sample distance.\r\n    * 'mincorr': Minimizes the sum of between-column squared correlations.\r\n* `smooth`: Boolean flag indicating whether the points are sampled randomly within each interval.\r\n    * If `smooth = True` (default): One point is sampled from each of the\r\n    intervals (0, 1/*n*), (1/*n*, 2/*n*), ...,\r\n    (1 - 1/*n*, 1), with a subsequent random permutation.\r\n    * If `smooth = False`:\r\n    The points are produced by sampling only at the midpoints of the intervals, i.e. at\r\n    0.5/*n*, 1.5/*n*, ..., 1 - 0.5/*n*,\r\n    with *n* being the value of `numSamples` selected.\r\n\r\nFor *n* samples (`numSamples`) and *m* variables (`numVariables`), the function returns a Latin hypercube sample matrix of size *n*-by-*m*. 
In each column of the output matrix, the *n* values appear in random order, with one value drawn from each of the intervals defined according to the value of `smooth`.\r\n\r\n##### Example\r\n\r\nThe following generates a Latin hypercube sample of ten rows (samples) and three columns (variables):\r\n\r\n```python\r\n>>> import numpy as np\r\n>>> np.random.seed(10) # Set the seed for reproducibility\r\n>>> dbox.lhs(numSamples = 10, numVariables = 3)\r\n\r\narray([[0.75248678, 0.9707202 , 0.69357489],\r\n       [0.60211809, 0.16602922, 0.45049514],\r\n       [0.10229193, 0.35592262, 0.26817272],\r\n       [0.8480203 , 0.24218636, 0.51460662],\r\n       [0.49319027, 0.75354692, 0.32180509],\r\n       [0.02813972, 0.8413978 , 0.99629056],\r\n       [0.96493436, 0.44368093, 0.87002701],\r\n       [0.54876658, 0.63265331, 0.08408063],\r\n       [0.39495223, 0.06621841, 0.18919362],\r\n       [0.28210972, 0.51141729, 0.7634635 ]])\r\n```\r\n\r\nEach column of the output matrix contains one random number in each interval [0,0.1], [0.1,0.2], [0.2,0.3], [0.3,0.4], [0.4,0.5], [0.5,0.6], [0.6,0.7], [0.7,0.8], [0.8,0.9], and [0.9,1].\r\n\r\nTo obtain a discrete design, the default value of `smooth` should be overridden and set to `False`:\r\n\r\n```python\r\n>>> import numpy as np\r\n>>> np.random.seed(10) # Set the seed for reproducibility\r\n>>> x = dbox.lhs(numSamples = 10, numVariables = 3, smooth = False)\r\n>>> x \r\n\r\narray([[0.75, 0.85, 0.15],\r\n       [0.05, 0.95, 0.55],\r\n       [0.65, 0.55, 0.05],\r\n       [0.35, 0.15, 0.45],\r\n       [0.15, 0.25, 0.25],\r\n       [0.25, 0.75, 0.65],\r\n       [0.45, 0.05, 0.75],\r\n       [0.95, 0.45, 0.95],\r\n       [0.55, 0.35, 0.85],\r\n       [0.85, 0.65, 0.35]])\r\n```\r\n\r\nTo see the effect of changing the default `criterion`, first evaluate the sum of the squared between-column correlations of the *x* matrix above:\r\n\r\n```python\r\n>>> corr  = np.corrcoef(x, rowvar = False) \r\n>>> (sum(corr.flatten() ** 2) - 
3)/2\r\n\r\n0.07004591368227708\r\n```\r\n\r\nGenerating a new matrix with `criterion = 'mincorr'` and repeating the same evaluation gives a squared-correlation sum of:\r\n\r\n```python\r\n>>> np.random.seed(10)\r\n>>> x = dbox.lhs(numSamples = 10, numVariables = 3, smooth = False, criterion = 'mincorr')\r\n>>> corr  = np.corrcoef(x, rowvar = False) \r\n>>> (sum(corr.flatten() ** 2) - 3)/2\r\n\r\n0.027731864095500214\r\n```\r\n\r\nAs can be seen, minimizing the correlations results in a design with a much lower sum of squared correlations.\r\n\r\n## References\r\n\r\nGood starting points for additional information on each experimental design type can be found at:\r\n\r\n* [Factorial designs](https://en.wikipedia.org/wiki/Factorial_experiment)\r\n* [Box-Behnken designs](https://en.wikipedia.org/wiki/Box%E2%80%93Behnken_design)\r\n* [Central composite designs](https://en.wikipedia.org/wiki/Central_composite_design)\r\n* [Latin-Hypercube designs](https://en.wikipedia.org/wiki/Latin_hypercube_sampling)\r\n\r\nIn addition, a wealth of information about DoE can be found on the [NIST](https://www.itl.nist.gov/div898/handbook/pri/pri.htm) website, including discussion on how to choose and analyze various DoEs, as well as several case studies.\r\n",
    "bugtrack_url": null,
    "license": "GNU General Public License v3.0",
    "summary": "Design of experiments toolbox.",
    "version": "1.3",
    "project_urls": null,
    "split_keywords": [
        "python",
        "statistics",
        "doe"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3bccccaf887d3cdfa63891b33da2a9b5c0b31320ad9e9da0a0093fefe8f5d020",
                "md5": "10a37cf009d3aeba2c290d6c5bfd214f",
                "sha256": "ffd59ab39b6a0d599fdd46dca6d65ff392ed35a216c475b39acca703522f1e1d"
            },
            "downloads": -1,
            "filename": "doe_toolbox-1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "10a37cf009d3aeba2c290d6c5bfd214f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 35174,
            "upload_time": "2023-06-17T17:44:05",
            "upload_time_iso_8601": "2023-06-17T17:44:05.124603Z",
            "url": "https://files.pythonhosted.org/packages/3b/cc/ccaf887d3cdfa63891b33da2a9b5c0b31320ad9e9da0a0093fefe8f5d020/doe_toolbox-1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f23d950c5f8b77695d397c8d8269bcc5cf408a7b80c0e17a086e91840a82ba63",
                "md5": "260186408e5760ea313ef92d6ef01aa5",
                "sha256": "f878029563c75e8aa1f07b891bfc2dc81d5fa2c1d46623e8c7ae9be05f90c5db"
            },
            "downloads": -1,
            "filename": "doe_toolbox-1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "260186408e5760ea313ef92d6ef01aa5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 38528,
            "upload_time": "2023-06-17T17:44:06",
            "upload_time_iso_8601": "2023-06-17T17:44:06.925425Z",
            "url": "https://files.pythonhosted.org/packages/f2/3d/950c5f8b77695d397c8d8269bcc5cf408a7b80c0e17a086e91840a82ba63/doe_toolbox-1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-17 17:44:06",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "doe-toolbox"
}

miltos_90