For more information about the problem itself, we refer to "Global Optimization of Deceptive Functions With Sparse Sampling" by Forrester and Jones.
We wanted to try out several surrogate model types from the SUMO toolbox and compare them with the Kriging implementation of Forrester. Additionally, we implemented Blind Kriging based on R code obtained from Joseph Roshan (Georgia Tech); see "Blind Kriging: A New Method for Developing Metamodels" by Roshan et al.
After constructing each surrogate model, the Mean Squared Error (MSE) was calculated on a validation set of 100 samples. We then plotted these errors against the number of samples used to build the surrogate model.
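
As a minimal sketch of the error measure used here (not the actual SUMO toolbox code; the function name and the toy numbers are our own assumptions for illustration), the MSE over a validation set can be computed like this:

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between validation outputs and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("the validation set must not be empty")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy values; in the experiment the validation set has 100 samples.
validation_outputs = [1.0, 2.0, 3.0]
surrogate_predictions = [1.0, 2.0, 5.0]
mse = mean_squared_error(validation_outputs, surrogate_predictions)
```

Repeating this for every sample size gives one error value per surrogate model per sample size, which is what ends up in the plot.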

Note that the hyperparameters of all surrogate models, with the exception of Blind Kriging and Kriging (Forrester), are optimized using 5-fold cross-validation. Those two use a log-likelihood to determine the hyperparameters instead.
As you can see, the preliminary results do not conclusively show which surrogate model type is best. Surprisingly, the Kriging model of the DACE toolbox does well with a low number of samples but gets considerably worse at the larger sample sizes. The RBF model follows the same trend.
More interesting is the fact that the other Kriging implementations perform about the same. Also noteworthy is that the second custom Kriging run had more time to find the right parameters, which is why it performs a bit better at the end.
Blind Kriging is quite interesting, as the implementation is still very rough at the moment. The hyperparameters can be optimized better, and a better search in the variable selection phase is possible, so we expect to improve its results at least slightly. Looking at the QQ-plot of Blind Kriging (plotting the predictions against the validation set), we can see that the prediction error is quite uniform over the whole range, which is nice.
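
To illustrate the cross-validation side of this, here is a rough sketch of picking a hyperparameter by 5-fold cross-validation. It is not the SUMO toolbox implementation: the simple Gaussian-kernel smoother stands in for a real surrogate model, and all names (`kfold_indices`, `kernel_predict`, `cv_score`, `select_theta`) and the candidate grid are assumptions made for this example.

```python
import math


def kfold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds (fold i gets i, i+k, ...)."""
    return [list(range(i, n, k)) for i in range(k)]


def kernel_predict(x_train, y_train, x, theta):
    """Stand-in model: Gaussian-kernel weighted average with width parameter theta."""
    weights = [math.exp(-theta * (x - xi) ** 2) for xi in x_train]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the plain mean of the training outputs.
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total


def cv_score(xs, ys, theta, k=5):
    """Mean squared prediction error over all held-out points of a k-fold split."""
    squared_error = 0.0
    for fold in kfold_indices(len(xs), k):
        held_out = set(fold)
        x_train = [x for i, x in enumerate(xs) if i not in held_out]
        y_train = [y for i, y in enumerate(ys) if i not in held_out]
        for i in fold:
            prediction = kernel_predict(x_train, y_train, xs[i], theta)
            squared_error += (ys[i] - prediction) ** 2
    return squared_error / len(xs)


def select_theta(xs, ys, candidates, k=5):
    """Return the candidate hyperparameter with the lowest cross-validation score."""
    return min(candidates, key=lambda theta: cv_score(xs, ys, theta, k))


# Hypothetical 1D data, only to exercise the functions.
xs = [i / 19 for i in range(20)]
ys = [math.sin(6.0 * x) for x in xs]
best_theta = select_theta(xs, ys, [0.1, 1.0, 10.0, 100.0, 1000.0])
```

The likelihood-based models skip this resampling loop and score each hyperparameter setting on the full data set through the log-likelihood instead.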

The author would like to thank Forrester et al. for kindly providing their results and the dataset.
UPDATE:
To analyse the performance of Blind Kriging, a plot was made of the Bayesian variable selection phase. The leave-one-out cross-validation score (CV) is plotted against the number of terms in the regression part. The lowest CV score determines the number of terms chosen for the final Blind Kriging model; this minimum is denoted by a star.
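
The selection rule itself is simple enough to sketch in a few lines. This is only an illustration under our own assumptions: `cv_scores[i]` is taken to be the leave-one-out CV score of the model with `i` terms added to the regression part, and the numbers below are made up, not the values from the plot.

```python
def best_term_count(cv_scores):
    """Return (number_of_terms, score) for the lowest leave-one-out CV score.

    cv_scores[i] is assumed to be the CV score after adding i terms to the
    regression part; ties are resolved in favour of the smaller model.
    """
    if not cv_scores:
        raise ValueError("cv_scores must not be empty")
    best_index = min(range(len(cv_scores)), key=lambda i: cv_scores[i])
    return best_index, cv_scores[best_index]


# Hypothetical scores for 0, 1, 2, ... added terms.
example_scores = [0.9, 0.5, 0.3, 0.4, 0.45]
n_terms, score = best_term_count(example_scores)
```

With these made-up scores the star in the plot would sit at 2 added terms.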

Ivo
Dirk
