DESIGN OF EXPERIMENT with NEURO PEX



NEURO PEX is the first software in the world dedicated to the design of experiments for nonlinear models and neural networks (version 1 in 2004, version 2 in 2005). This section explains how Neuro Pex, combined with Neuro Shop, handles the famous example of Box and Lucas (1959). The example is a chemical reaction A => B => C whose kinetics are described by a knowledge-based model. From an educational standpoint, the model is interesting because it has 1 factor (also referred to as an input) and 2 parameters (also referred to as coefficients), so the output with its confidence intervals and the confidence areas around the coefficients can be displayed in simple 2D graphs.
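For reference, the Box and Lucas (1959) example is usually written as the concentration of B in two consecutive first-order reactions. The sketch below assumes that standard form. The names k1 and k2 and the assignment k1 = 0.7, k2 = 0.2 are assumptions made here (the tutorial names the parameters w(0), w(1) or b1, b2); they are not read from the Box-Lucas.mgl file.

```python
import math

def box_lucas(t, k1, k2):
    """Concentration of B at time t for A -> B -> C (two first-order steps).

    k1: rate constant of A -> B, k2: rate constant of B -> C (k1 != k2).
    """
    return k1 / (k1 - k2) * (math.exp(-k2 * t) - math.exp(-k1 * t))

# Assumed mapping of the tutorial's parameter values onto the rate constants.
K1, K2 = 0.7, 0.2

# B is absent at t = 0, rises to a peak, then decays as it turns into C.
t_peak = math.log(K1 / K2) / (K1 - K2)
for t in (0.0, 1.2, t_peak, 6.9):
    print(f"t = {t:4.2f} h  ->  y = {box_lucas(t, K1, K2):.4f}")
```

With a single input (time) and two parameters, the response curve and the parameter pairs can each be drawn in a plane, which is why this example is convenient for teaching.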


0. NETRAL Model Dispatcher, Neuro Shop and Neuro Pex

All NETRAL software can be started from the Model Dispatcher. Here, start Neuro Shop.

1. Neuro Shop software

In Neuro Shop, click the Compiler icon (a window appears in the lower right corner), then open the Box-Lucas.mgl model.

2. Displaying the mgl model

Any algebraic equation can be represented by an mgl model, which contains the following information:
- The model’s name
- A list of the inputs and their variation domain [min..max]
- The output
- A list of parameters (coefficients), their expected values (D-optimality) and/or their variation domains [min..max] (X-optimality)
- The model’s algebraic equation.

3. Displaying the nml model

Click the Compile button. The nml model is displayed in the central window. The nml format, common to all NETRAL software, treats any analytic model as a succession of elementary operations (addition, subtraction, multiplication, division, logarithm, exponential, sine, arctangent, etc.) forming a directed graph. From this graph, NETRAL's software can calculate the transfer function, the first-order and second-order derivatives with respect to the parameters, the first-order and second-order derivatives with respect to the inputs, and the second-order mixed derivatives with respect to the inputs and the parameters.
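To illustrate why a chain of elementary operations is enough to obtain derivatives, here is a minimal forward-mode differentiation sketch. It is not the nml format or NETRAL's implementation: the Dual class, the function names and the Box-Lucas expression (standard form, with assumed values k1 = 0.7, k2 = 0.2) are all illustrative, and only first-order derivatives with respect to the parameters are covered.

```python
import math

class Dual:
    """A value together with its derivative with respect to one chosen parameter."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __sub__(self, o):
        o = Dual.lift(o)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, o):
        return Dual.lift(o) - self

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))

def exp(x):
    x = Dual.lift(x)
    e = math.exp(x.val)
    return Dual(e, e * x.der)

def model(t, k1, k2):
    # Box-Lucas response written as a chain of elementary operations.
    return k1 / (k1 - k2) * (exp(-k2 * t) - exp(-k1 * t))

def gradient(t, k1, k2):
    """First-order derivatives of the output with respect to (k1, k2)."""
    d1 = model(t, Dual(k1, 1.0), Dual(k2, 0.0)).der  # seed k1
    d2 = model(t, Dual(k1, 0.0), Dual(k2, 1.0)).der  # seed k2
    return d1, d2

print(gradient(1.2, 0.7, 0.2))
```

Each operation only needs to know its own local derivative rule; the graph structure does the rest. Second-order and mixed derivatives follow the same idea with more bookkeeping.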

4. Display the analytic expression

"Right click + Properties visible + Computation Formula" displays the analytic form of the model after it has been recalculated from the graph. It must be the same as the expression in the mgl model. The nml model is saved automatically in the nml model directory. Quit Neuro Shop.

5. Displaying mgl and nml format

NETRAL uses open formats. The mgl and nml files can be read (and modified!) in a simple text editor. The nml format is XML. The min and max values of the b1 parameter of the Box-Lucas model stand out in dark blue.

6. Displaying mgl and nml format (continuation)

In dark blue, the response y of the Box-Lucas model.

7. Starting Neuro Pex

In NETRAL Model Dispatcher, load Box-Lucas.nml then start Neuro Pex.

8. Loading the model

Neuro Pex reads models created in the following formats:
1. nml: Neuro One and Neuro Shop standard format
2. mgl: text (ASCII) file of the analytic equation
3. dlm: dll to interface with external computer codes in a master/slave software logic (Neuro Pex/your computer code)
"Right click + Properties visible + Computation Formula" displays the analytic form of the model, recalculated from the graph.

9. Experimental and parametric field

Min and max values of x=time factor (input).
In this example, there is no constraint on the experimental field.
Assumed values for the parameters (coefficients) w(0) and w(1). They are the values we want to re-estimate.
Min and max values for w(0) and w(1) are not necessary for the D-optimality criterion, but they are essential for the X-optimality criterion. Neuro Pex displays -infinity and +infinity.

10. Impact of noise on the response

Knowing the experimental noise (or measurement variability) is essential in the design of experiments. Four cases are covered:
1. The experimental noise is known and identical over the whole experimental space.
2. The experimental noise is estimated from a response sample. Here, 23% of the average response.
3. The experimental noise varies across the experimental space in proportion to the response. This is often the case for exponential phenomena in which the response varies over several orders of magnitude. A minimum threshold is proposed to avoid zero noise for a zero response (division by 0).
4. A model of the experimental noise is available.
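The first three noise cases can be sketched as small functions that return a noise standard deviation. The function names, the reading of case 2 as "a fraction of the mean of a response sample", and the numeric values in the example are assumptions for illustration, not Neuro Pex settings.

```python
def sigma_constant(y, sigma):
    """Case 1: the same noise standard deviation everywhere."""
    return sigma

def sigma_from_sample(responses, ratio):
    """Case 2: noise taken as a fraction of the average of a response sample."""
    return ratio * sum(responses) / len(responses)

def sigma_proportional(y, ratio, floor):
    """Case 3: noise proportional to the response, never below `floor`."""
    return max(abs(y) * ratio, floor)

# Hypothetical values: 10 % proportional noise with a floor of 0.005.
for y in (0.0, 0.01, 0.5, 50.0):
    print(y, sigma_proportional(y, ratio=0.10, floor=0.005))

# Hypothetical response sample, with the 23 % ratio quoted above.
print(sigma_from_sample([0.2, 0.4, 0.6], ratio=0.23))
```

The floor matters whenever the noise is later used as a divisor, for example to weight residuals: without it, a zero response would give a zero standard deviation.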

11. Loading data / measurement already available

If some experimental data are already available, they can be loaded either as proposed but not protected points, which the calculated designs are free to leave out, or as proposed and protected points, which are kept in every calculated design.

12. Candidates generation

The candidate points Neuro Pex generates by default form a regular grid over the experimental space. Classical values for the number of points are 2, 3, 4, 5, 9, 11, 21, 51, 101, 1001.
When this number becomes too high, which is often the case for high-dimensional hypercubes, you may request a lower number of points, which will be drawn at random (without replacement) from the nodes of the hypercube.
If the experimental space is not a hypercube, you should load a custom-made external table.
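A minimal sketch of this candidate generation, assuming the classical values are numbers of levels per input. The function names are illustrative, and the [0, 10] hour range for the time input is an assumption consistent with 101 candidates spaced 0.1 hour apart.

```python
import itertools
import random

def regular_grid(bounds, levels):
    """All nodes of a regular grid: `levels` values per input between min and max."""
    axes = [[lo + (hi - lo) * i / (levels - 1) for i in range(levels)]
            for lo, hi in bounds]
    return list(itertools.product(*axes))

def draw_candidates(bounds, levels, n_max, seed=0):
    """Whole grid if it is small enough, else n_max nodes drawn without replacement."""
    total = levels ** len(bounds)
    if total <= n_max:
        return regular_grid(bounds, levels)
    rng = random.Random(seed)
    picked = rng.sample(range(total), n_max)   # distinct node indices
    nodes = []
    for idx in picked:
        point = []
        for lo, hi in bounds:
            idx, i = divmod(idx, levels)       # decode one coordinate per input
            point.append(lo + (hi - lo) * i / (levels - 1))
        nodes.append(tuple(point))
    return nodes

# One input (time), 101 levels: one candidate every 0.1 hour.
time_grid = regular_grid([(0.0, 10.0)], 101)
print(len(time_grid), time_grid[:3])

# Six inputs with 11 levels each would give 11**6 nodes; keep 500 of them.
subset = draw_candidates([(0.0, 1.0)] * 6, 11, 500)
print(len(subset))
```

Sampling node indices rather than building the full grid first keeps memory use low when the hypercube has many dimensions.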

13. Data tables

Neuro Pex displays the table of the candidate points.
1. Here, there is only one input for the Box-Lucas model, with one candidate every 0.1 hour = 6 minutes.
2. Before the design is carried out, the measurements have not yet been made, so the responses are not filled in. NAN = Not a Number.
3. The selection code is 3, which indicates that the measurement point is proposed but not protected.

14. Calculation option and displaying of the graphs

Calculating the D-efficiency (a few minutes) and the Hamilton confidence area (a few hours) is time-consuming and not always necessary.
Neuro Pex allows the user to choose the graphs which will be automatically displayed.

15. Choosing the type of the design to be calculated

Neuro Pex is the first software in the world to calculate D- and X-optimal designs in a user-friendly way.
D-optimality maximizes the determinant of the information matrix, which is equivalent to minimizing the determinant of the dispersion matrix and the volume of the confidence ellipsoid around the parameters (coefficients) of the model. This confidence ellipsoid is only a first-order approximation of the exact confidence area around the parameters.
X-optimality minimizes, at a finite distance (i.e. for the chosen number of experiments), the expected volume of the exact confidence area around the parameters.
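As a rough illustration of the D-optimality criterion, the sketch below evaluates the determinant of the information matrix for the Box-Lucas model in its standard form and searches all pairs of candidate times by brute force. This is not Neuro Pex's algorithm. The parameter mapping k1 = 0.7, k2 = 0.2, the [0, 10] hour candidate range and the unit noise variance are assumptions.

```python
import math
from itertools import combinations

K1, K2 = 0.7, 0.2   # assumed mapping of the tutorial's parameter values

def jac_row(t, k1=K1, k2=K2):
    """Derivatives of the Box-Lucas response with respect to k1 and k2 at time t."""
    e1, e2 = math.exp(-k1 * t), math.exp(-k2 * t)
    d = k1 - k2
    dk1 = -k2 / d**2 * (e2 - e1) + k1 / d * t * e1
    dk2 = k1 / d**2 * (e2 - e1) - k1 / d * t * e2
    return dk1, dk2

def info_det(times):
    """Determinant of the 2x2 information matrix M = sum of f f^T over the design."""
    m11 = m12 = m22 = 0.0
    for t in times:
        a, b = jac_row(t)
        m11 += a * a
        m12 += a * b
        m22 += b * b
    return m11 * m22 - m12 * m12

candidates = [i / 10 for i in range(101)]            # one point every 0.1 hour
best = max(combinations(candidates, 2), key=info_det)
print("best pair of times:", best, "determinant:", round(info_det(best), 4))
```

The derivatives are evaluated at the assumed parameter values, which is why a D-optimal design for a nonlinear model depends on the initial guess of the parameters. The pair found here can be compared with the design times Neuro Pex reports for the 2-point design.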

16. Options associated to the calculation of D-optimal designs

At least 2 fields have to be filled in:
1. The maximum number of experiments the user is ready to do.
2a. The possibility to repeat support points (this option was chosen here);
2b. The obligation to repeat on p support points, where p is the number of parameters of the model;
2c. The ban on repetitions, which is recommended for computer designs.

17. Calculation of D-optimal designs: synthesis of the results

Neuro Pex needs a few seconds or minutes to calculate all the D-optimal designs, from the minimum number of points (usually p) to the maximum number of points / trials requested in the previous screen.
When these calculations are done, Neuro Pex presents all the individual results in two series of tabs: design synthesis and diagnosis.
The first tab is a synthesis of the results which contains the following information:
- Calculation number
- Date and time
- Number of experiments in the calculated design
- Maximal parametric curvature
- Mean parametric curvature
- Maximal intrinsic curvature

18. Synthesis of the results (continuation)

Other columns of the synthesis of the results:
- Mean intrinsic curvature
- Reference curvature for the size of the design
- D-efficiency rating
- Normed determinant value
- Trace of the information matrix
- Greatest eigenvalue of the information matrix
- Condition number of the information matrix (from its eigenvalues)
- Maximal correlation calculated from the correlation matrix of the least-squares estimator of the parameters
- Mean correlation calculated from the correlation matrix of the least-squares estimator of the parameters
- Mean standard deviation on the prediction error
- Maximal standard deviation on the prediction error

19. D-efficiency garland

The D-efficiency garland can be regular every p points, or irregular. Here it is regular every p = 2 points.
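One way to see why the garland repeats every p = 2 points: assume, as a simplification, that every exact design only replicates the two support points of the continuous design, splitting the N trials as evenly as possible. With two support points and two parameters, the determinant of the information matrix is proportional to n1 * n2, so the D-efficiency relative to the continuous design (weights 1/2 and 1/2) has a closed form. The function name is illustrative, and the simplifying assumption may not hold for odd N in Neuro Pex's own search.

```python
def d_efficiency(n_points, p=2):
    """D-efficiency of an exact design that splits n_points as evenly as
    possible between the two support points of the continuous design."""
    n1 = n_points // 2
    n2 = n_points - n1
    # Per-observation determinant is proportional to (n1/N) * (n2/N);
    # the continuous design reaches (1/2) * (1/2).
    ratio = (n1 / n_points) * (n2 / n_points) / 0.25
    return ratio ** (1 / p)

for n in range(2, 15):
    print(n, round(d_efficiency(n), 4))
```

Under this assumption the efficiency equals 1 for every even N and dips below 1 for odd N, with dips that shrink as N grows, which gives the scalloped "garland" shape.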

20. Standard deviation on the prediction error

This graph is very important.
It indicates the number of trials / experiments to carry out to achieve a given level of accuracy (mean or maximal) on the predictions. The estimated errors displayed here are calculated from the table of the candidate points.
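A first-order version of this calculation can be sketched as follows: the standard deviation of the predicted response at a candidate is sigma * sqrt(f^T M^-1 f), where f holds the derivatives of the model with respect to the parameters and M is the information matrix of the design. The Box-Lucas form, the parameter mapping, the noise level SIGMA and the candidate range are assumptions; Neuro Pex's exact formulas may differ.

```python
import math

K1, K2 = 0.7, 0.2   # assumed mapping of the tutorial's parameter values
SIGMA = 0.02        # hypothetical noise standard deviation on the response

def jac_row(t, k1=K1, k2=K2):
    """Derivatives of the Box-Lucas response with respect to k1 and k2."""
    e1, e2 = math.exp(-k1 * t), math.exp(-k2 * t)
    d = k1 - k2
    return (-k2 / d**2 * (e2 - e1) + k1 / d * t * e1,
            k1 / d**2 * (e2 - e1) - k1 / d * t * e2)

def prediction_sd(design, candidates, sigma=SIGMA):
    """First-order standard deviation of the predicted response at each candidate."""
    m11 = m12 = m22 = 0.0
    for t in design:
        a, b = jac_row(t)
        m11 += a * a
        m12 += a * b
        m22 += b * b
    det = m11 * m22 - m12 * m12
    sds = []
    for t in candidates:
        a, b = jac_row(t)
        # f^T M^-1 f, using the closed-form inverse of a 2x2 matrix.
        var = (m22 * a * a - 2.0 * m12 * a * b + m11 * b * b) / det
        sds.append(sigma * math.sqrt(max(var, 0.0)))
    return sds

candidates = [i / 10 for i in range(101)]
for n in range(2, 15, 2):
    sds = prediction_sd([1.2, 6.9] * (n // 2), candidates)
    print(f"N = {n:2d}  mean sd = {sum(sds) / len(sds):.4f}  max sd = {max(sds):.4f}")
```

Repeating the same support points r times divides these standard deviations by sqrt(r), so the curve flattens quickly: quadrupling the number of trials only halves the prediction error.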

21. Determinant of the information matrix

The graph represents the determinant of the information matrix of the least-squares estimator of the parameters as a function of the design size.

22. Trace of the information matrix

The graph represents the trace of the information matrix of the least-squares estimator of the parameters as a function of the design size.

23. Diagnostic Report

In the diagnostic page, the first tab is the diagnostic report, which gathers the information associated with the design under consideration, here the design with p = 2 points / trials. This information is:
0. Title and p-point design calculated
1. Summary of the parametric estimations
2. Variance-covariance matrix of the least-squares estimator of the parameters
3. Correlation matrix of the least-squares estimator of the parameters

24. Diagnostic Report (continuation)

Other information shown in the diagnostic report:
3. Correlation matrix of the least-squares estimator of the parameters
4. Measurement of the D-efficiency of the design
5. Measurement of non-linearity, which uses the second-order derivatives of the model
6. Predicted performance of the model
7. Volume of the Hamilton confidence area (not calculated here)

25. 2-point D-optimal design

The second tab displays the D-optimal design that has the fewest points, only 2 points here.
With the initial parameter values w(0) = 0.2 and w(1) = 0.7 (screenshots 3 and 9 above), the two new tests have to be carried out at x = time = 1.2 hours (1 hour and 12 minutes) for the first one and at x = time = 6.9 hours (6 hours and 54 minutes) for the second one.

26. Continuous design

The continuous design is calculated with Torsney's algorithm. It assigns to each candidate point (screenshot 13) a probability of appearing, or "weight". The weights sum to one. The continuous design is used to calculate the D-efficiency and the garland.
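A minimal sketch of a multiplicative weight update of the kind commonly associated with Torsney's name: each weight is multiplied by d_i / p, where d_i = f_i^T M(w)^-1 f_i and M(w) is the weighted information matrix. Whether Neuro Pex uses this exact variant is an assumption, as are the Box-Lucas form, the parameter mapping, the candidate range and the iteration count.

```python
import math

K1, K2 = 0.7, 0.2   # assumed mapping of the tutorial's parameter values
P = 2               # number of parameters

def jac_row(t, k1=K1, k2=K2):
    """Derivatives of the Box-Lucas response with respect to k1 and k2."""
    e1, e2 = math.exp(-k1 * t), math.exp(-k2 * t)
    d = k1 - k2
    return (-k2 / d**2 * (e2 - e1) + k1 / d * t * e1,
            k1 / d**2 * (e2 - e1) - k1 / d * t * e2)

def variance_function(weights, rows):
    """d_i = f_i^T M(w)^-1 f_i for every candidate, with M(w) = sum w_i f_i f_i^T."""
    m11 = sum(w * a * a for w, (a, b) in zip(weights, rows))
    m12 = sum(w * a * b for w, (a, b) in zip(weights, rows))
    m22 = sum(w * b * b for w, (a, b) in zip(weights, rows))
    det = m11 * m22 - m12 * m12
    return [(m22 * a * a - 2.0 * m12 * a * b + m11 * b * b) / det for a, b in rows]

def continuous_design(rows, n_iter=2000):
    """Start from uniform weights and apply w_i <- w_i * d_i / p repeatedly."""
    w = [1.0 / len(rows)] * len(rows)
    for _ in range(n_iter):
        d = variance_function(w, rows)
        w = [wi * di / P for wi, di in zip(w, d)]
        total = sum(w)
        w = [wi / total for wi in w]   # guard against rounding drift
    return w

times = [i / 10 for i in range(101)]
rows = [jac_row(t) for t in times]
weights = continuous_design(rows)
for t, w in zip(times, weights):
    if w > 0.01:
        print(f"t = {t:.1f} h  weight = {w:.3f}")
```

Because the weighted sum of the d_i always equals p, the update keeps the weights summing to one. Candidates whose d_i stays below p lose weight at every iteration, so the mass concentrates on the support points.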

27. Performance appraisal / Prediction

For a given design size, here 2 points / trials, Neuro Pex calculates for each candidate the mean response, the anticipated bias, the error on the prediction and a confidence interval (min/max values) for the response.
Since the 2-point design has as many points as the model has coefficients, there are no degrees of freedom left to calculate confidence intervals.

28. 14-experiment design

Double-click the 14-experiment line of the results synthesis table to load the design with 14 points / trials / experiments and the related diagnostics automatically.

29. Diagnostic report

The 14-point design is the 2-point design repeated 7 times (regular garland).
There are now enough degrees of freedom to calculate the confidence intervals (IC_Min, IC_Max) on the parameters and on the response (screenshot 32).

30. Diagnostic report (continuation)

Continuation of the diagnostic report for the 14-point design.

31. 14-point D-optimal design

The 14-point D-optimal design is the 2-point design repeated 7 times.

32. Performance appraisal / Prediction

There are enough degrees of freedom to calculate the confidence intervals (IC_Min, IC_Max) on the parameters and on the response, here at the candidate points (screenshot 29).
Since the Box-Lucas model has 1 input and 1 output, "right click + Field/ Field view + y + y/time" displays the response of the model for each candidate point as a function of the "time" input.

33. Prediction on the y response mean value

Neuro Pex displays the y response of the model for each candidate point as a function of the "time" input.
The optimal points, repeated or not (black lines), are at x = time = 1.2 hours (1 hour 12 minutes) and at x = time = 6.9 hours (6 hours 54 minutes).
"Right click + Plot choice" allows you to display several plots, among which the confidence intervals on the response.

34. Confidence intervals

Copy the IC_Min and IC_Max fields (lower and upper bounds of the confidence interval) from the Field list menu to the main plots menu.

35. Confidence intervals (continuation)

The confidence intervals IC_Min and IC_Max are a first-order approximation of the confidence area around the y response. They are located symmetrically around the y response.
For non-linear models, first-order information may prove dramatically wrong and is not sufficient.

36. Monte-Carlo simulation

Neuro Pex offers Monte-Carlo simulations to estimate precisely the possible values of the parameters and of the response, as well as their respective confidence areas.
- Simulations number: from 100 to 30,000. Efron recommends at least 5000 simulations.
- Initialisations number: the learning algorithm is started from several different parameter values (central values +/- the standard deviation calculated at the first order for the design under consideration, here 14 points, cf. screenshot 29), and Neuro Pex keeps the result of the best learning run.
- Epoch number: the maximal number of iterations performed by the learning algorithm to re-estimate the coefficients.
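The three settings above can be sketched as a small simulation loop: simulate noisy responses at the design points, refit the parameters from a few starting values, keep the best fit, and repeat. Everything specific here is an assumption for illustration: the Box-Lucas form and parameter mapping, the constant noise SIGMA, the +/- 10 % spread of the starting values, and the Gauss-Newton fit with step halving, which stands in for Neuro Pex's learning algorithm.

```python
import math
import random
import statistics

K1, K2 = 0.7, 0.2          # central parameter values (assumed mapping)
SIGMA = 0.02               # hypothetical constant noise standard deviation
DESIGN = [1.2, 6.9] * 7    # the 14-point design: 2 points repeated 7 times

def model(t, k1, k2):
    return k1 / (k1 - k2) * (math.exp(-k2 * t) - math.exp(-k1 * t))

def jac_row(t, k1, k2):
    e1, e2 = math.exp(-k1 * t), math.exp(-k2 * t)
    d = k1 - k2
    return (-k2 / d**2 * (e2 - e1) + k1 / d * t * e1,
            k1 / d**2 * (e2 - e1) - k1 / d * t * e2)

def sse(ts, ys, k1, k2):
    if not k1 > k2 > 0:            # stay where this form of the model is well defined
        return math.inf
    return sum((y - model(t, k1, k2)) ** 2 for t, y in zip(ts, ys))

def fit(ts, ys, k1, k2, epochs):
    """Gauss-Newton with step halving; at most `epochs` iterations."""
    cur = sse(ts, ys, k1, k2)
    for _ in range(epochs):
        m11 = m12 = m22 = g1 = g2 = 0.0
        for t, y in zip(ts, ys):
            a, b = jac_row(t, k1, k2)
            r = y - model(t, k1, k2)
            m11 += a * a; m12 += a * b; m22 += b * b
            g1 += a * r; g2 += b * r
        det = m11 * m22 - m12 * m12
        if det <= 0.0:
            break
        s1 = (m22 * g1 - m12 * g2) / det       # solve M s = J^T r
        s2 = (m11 * g2 - m12 * g1) / det
        step = 1.0
        while step > 1e-6 and sse(ts, ys, k1 + step * s1, k2 + step * s2) >= cur:
            step /= 2
        if step <= 1e-6:
            break                              # no further improvement
        k1, k2 = k1 + step * s1, k2 + step * s2
        cur = sse(ts, ys, k1, k2)
    return k1, k2, cur

def monte_carlo(n_sim=300, n_init=2, epochs=30, seed=1):
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sim):
        ys = [model(t, K1, K2) + rng.gauss(0.0, SIGMA) for t in DESIGN]
        starts = [(K1 * rng.uniform(0.9, 1.1), K2 * rng.uniform(0.9, 1.1))
                  for _ in range(n_init)]
        best = min((fit(DESIGN, ys, a, b, epochs) for a, b in starts),
                   key=lambda res: res[2])     # keep the best learning run
        estimates.append(best[:2])
    return estimates

est = monte_carlo()
k1s = [e[0] for e in est]
k2s = [e[1] for e in est]
print("k1: mean %.4f  sd %.4f" % (statistics.mean(k1s), statistics.stdev(k1s)))
print("k2: mean %.4f  sd %.4f" % (statistics.mean(k2s), statistics.stdev(k2s)))
cuts = statistics.quantiles(k1s, n=40)         # cut points every 2.5 %
print("k1 95%% percentile interval: [%.4f, %.4f]" % (cuts[0], cuts[-1]))
```

The percentile interval at the end is one common way to turn the simulated estimates into a confidence interval; unlike a first-order interval, it does not have to be symmetric around the central value. The 300 simulations used here keep the example fast and are well below the numbers quoted above.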

37. Diagnostic reports of the Monte-Carlo simulations

Neuro Pex completes the diagnostic report of the 14-point design (screenshots 29 and 30) with the results of the Monte-Carlo simulations. With a large number of simulations, these estimations are more precise than the first-order ones.

38. Appraisal of the Monte-Carlo performance

Click the Edit menu, then Performance appraisal (Monte-Carlo).

39. Appraisal of the Monte-Carlo performance (continuation)

From the Monte-Carlo simulations and the values calculated after re-learning the models, here 1000 models, Neuro Pex displays for each candidate point the mean value of the response, the anticipated bias, the error on the prediction and a confidence interval (min/max values) for the response.

40. Monte-Carlo confidence Intervals

"Right click + Plot choice + IC_Min + IC_Max" displays the confidence intervals IC_Min and IC_Max estimated by a Monte-Carlo simulation. Depending on the number of simulations, they can be more accurate than the confidence intervals calculated at the first order (screenshot 35). They are not necessarily symmetric around the y response.

41. Monte-Carlo predictions

"Edit + Predictions (Monte-Carlo)" displays the table of the [1000 simulations x 101 candidate points] estimated from the learning of the [1000 designs x 14 points]. The design points are simulated by the computer according to the model law and the noise on the response chosen in screenshot 10.

42. Monte-Carlo predictions (continuation)

"Right click + Histogram + Data(69)" displays in a histogram the 1000 predicted responses at the D-optimal point with time = 6.9 hours, from the 1000 optimal 14-point designs.

43. Histogram of the Monte-Carlo predictions

Neuro Pex displays in a histogram the 1000 predicted responses at the D-optimal point with time = 6.9 hours, from the 1000 optimal 14-point designs. There is a slight asymmetry toward large values.

44. Parametric Monte-Carlo estimation

"Edit + Estimations (Monte-Carlo)" displays the table of the [1000 x 2] estimated parameters, obtained from the learning of the [1000 designs x 14 points]. The design points are simulated by the computer according to the model law and the noise on the response chosen in screenshot 10.

45. Parametric Monte-Carlo estimation (continuation)

"Right click + See Field/Field + b2 + b2/b1" displays the 1000 pairs [b1, b2].

46. Parametric Monte-Carlo 14-point estimation

Since the Box-Lucas model has 2 parameters, it is interesting to display the graph b2/b1, here the 1000 pairs [b1, b2] estimated for [1000 designs x 14 points].

47. Parametric Monte-Carlo 6-point estimation

The same view built from [1000 designs x 6 points].

48. Parametric Monte-Carlo 2-point estimation

The same view built from [1000 designs x 2 points].

49. Monte-Carlo 6-point confidence intervals

The view of the IC_Min and IC_Max confidence intervals, estimated by a Monte-Carlo simulation for a 6-point design. Compare with screenshot 40.

50. Saving the D-optimal design

"File + save design" allows you to save the 14-point design of experiments. Neuro Pex is the first software in the world to offer, in such a user-friendly way, the calculation of designs of experiments for non-linear knowledge-based models and neural networks, followed by the display of the results.




Please contact us if you want to know more about Neuro Pex.









© Netral - June 2007