Tuesday, August 10, 2010

SUMO-Toolbox 7.0.2 Released + Publication

Hello,

We have just released version 7.0.2 of the SUMO-Toolbox. This is the second incremental update since the toolbox was released under an open source license.

This coincides with a new summary publication that should be used whenever referring to the toolbox:

A Surrogate Modeling and Adaptive Sampling Toolbox for Computer Based Design
D. Gorissen, K. Crombecq, I. Couckuyt, T. Dhaene, P. Demeester,
Journal of Machine Learning Research,
Vol. 11, pp. 2051-2055, July 2010.

More information can be found on the SUMO Toolbox wiki.

--Dirk

Thursday, February 18, 2010

SUMO-Toolbox 7.0 Released - Open Source

We are very proud to announce the 7.0 release of the SUrrogate MOdeling (SUMO) Toolbox. The main novelty of this release is that from now on the SUMO Toolbox will be available under an open source license (AGPLv3) for non-commercial use.

Details are available on the SUMO website, see: http://www.sumowiki.intec.ugent.be/index.php/License_terms

Besides the adoption of an open source license for non-commercial use, this release has seen many bug fixes and improvements in the Kriging and SampleEvaluator components.

[*] Download instructions can be found here:

http://www.sumo.intec.ugent.be/?q=SUMO_toolbox#download


[*] The full changelog and release history are available here:

http://www.sumowiki.intec.ugent.be/index.php/Whats_new
http://www.sumowiki.intec.ugent.be/index.php/Changelog

All users are strongly advised to upgrade (remember to delete old versions first). Upgrade instructions can be found here:

http://www.sumowiki.intec.ugent.be/index.php/FAQ#Upgrading


If you encounter any problems when downloading or using the toolbox please let us know here:

http://www.sumowiki.intec.ugent.be/index.php/Contact


Enjoy! :)

--Dirk

Wednesday, November 11, 2009

3D Surface Modeling

In an earlier post I wrote about how the SUMO framework could be extended and applied to classification problems. While doing that it struck me that the SUMO Toolbox could be similarly used for 3D surface modeling.

By 3D surface modeling I mean the fitting of 3D geometric data in order to reproduce the shapes of various 3D objects like cubes, spheres, chairs, tables, dragons, etc. Ideally this leads to a closed analytic expression that fully describes the object.

Work to this end has already been done of course, using RBF models (FastRBF) and neural gas models (I can't seem to find the link right now). However, since it was quite straightforward to implement (only a new example was needed, the modeling code did not have to change), I thought I would quickly add an example and associated demo file. This will be available in 6.3.

What is needed is a triangular surface mesh (in standard Matlab format) and the surface normals for each triangle (these can be easily calculated). Given such a mesh, I added a Matlab function that can decide whether any given point is inside or outside the mesh (using the excellent InPolyhedron function by Luigi Giaccari). Putting these two together in a new 3DModel example then allows any of the SUMO Toolbox model types or sample selection algorithms to be used for fitting the object.

But how does it work? Well, it's very simple. The idea is to regard the problem as a classification (or regression) problem: (1) fit a model on the 3D points, using -1 and 1 as output values to indicate whether a point is inside or outside the object, (2) plot the isosurface of the final model at isovalue = 0.
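To make the two steps concrete, here is a minimal sketch in plain Python rather than in the SUMO Toolbox itself. Everything in it is an assumption made for illustration: the object is a unit sphere, the training points lie on a regular grid, -1 is used for inside and +1 for outside, and a simple Gaussian-kernel weighted average stands in for the SVM models used in the post. Instead of plotting the isosurface, it locates one point of the zero level set by bisection along a ray from the origin.

```python
import math

def make_training_set(step=0.25, half_width=1.5, radius=1.0):
    """Label regular grid points: -1 inside the sphere, +1 outside."""
    n = int(round(2 * half_width / step)) + 1
    axis = [-half_width + i * step for i in range(n)]
    data = []
    for x in axis:
        for y in axis:
            for z in axis:
                inside = x * x + y * y + z * z <= radius * radius
                data.append(((x, y, z), -1.0 if inside else 1.0))
    return data

def predict(data, p, h=0.25):
    """Gaussian-kernel weighted average of the labels around point p."""
    num = den = 0.0
    for (x, y, z), label in data:
        d2 = (x - p[0]) ** 2 + (y - p[1]) ** 2 + (z - p[2]) ** 2
        w = math.exp(-d2 / (2.0 * h * h))
        num += w * label
        den += w
    return num / den

def surface_radius(data, direction, r_max=1.5, iters=30):
    """Bisect along a ray from the origin for the point where the model is 0."""
    lo, hi = 0.0, r_max  # model is negative at lo (inside), positive at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = tuple(mid * c for c in direction)
        if predict(data, p) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

data = make_training_set()
r = surface_radius(data, (1.0, 0.0, 0.0))
print("recovered radius along +x: %.3f" % r)
```

The recovered radius lands close to, but not exactly on, the true value of 1: the smoother blurs the boundary over roughly one grid step, which is the same trade-off between data density and detail mentioned below.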

The object should magically appear. Of course, the more complicated the model the more data will be needed to get the details right and keep things smooth. As a proof of concept example I used SVM models and a simple sphere:


Not the fanciest and shiniest example, nor the nicest visualization, but it proves the point :) Adding more complicated models (teapots, dragons, sculptures, ...) is now trivial, limited only by available computer memory and processing power...

Of course many improvements can be made to the straightforward approach described here. But that is just a matter of some spare time and motivation :)

What would be interesting (and easy) though is to couple this with the LOLA sample selection algorithm, since it should seek out the boundary automatically, potentially saving a lot of time...

--Dirk

PS: naturally the models need to be closed for this to work

Saturday, October 24, 2009

SUMO Lab YouTube Channel

We always had a collection of videos available. Putting these online as a collection of avi links is a bit too Web 1.0 :)

Therefore we started a YouTube channel where we hope to add some stuff from time to time:


http://www.youtube.com/sumolab


Feel free to make suggestions or leave comments.

--Dirk

Surrogate Models for Classification

From our FAQ:

Question: Does the SUMO Toolbox support classification problems?
Short Answer: Yes, now it does
Long Answer: see below

At the SUMO Lab we spend most of our time on the problem of generating an accurate surrogate model (metamodel) for a given data set or simulation code with a minimum number of data points (= adaptive sampling, sequential design, active learning). The goal is to make this process as efficient and pain-free as possible.

To aid this work we developed the Matlab SUMO Toolbox which implements a number of frameworks and abstractions to facilitate:
  • model selection
  • model complexity selection (= hyperparameter optimization)
  • adaptive sampling (= active learning)
  • Design of Experiments (DoE)
  • data visualization
  • data interfacing
  • distributed execution of simulations
The work always revolved around regression/function approximation type problems. However, many of the algorithms and sub problems we encountered are equally applicable to classification. So wouldn't it be possible to leverage the SUMO framework for classification problems as well? Encouraged by some comments of one of the toolbox users I looked into this.

It turned out that with only 30 minutes of work I had a first demo ready. Since some of the model types inside SUMO already support classification internally (e.g., the SVM models), I just needed to add some extra options and tweak the model plotting code somewhat.
The result is that now you can use the SUMO plugins for hyperparameter optimization, model selection, adaptive sampling, etc. and apply them to classification problems. The code will become available in version 6.3. If you want to play around with it earlier, just let me know.

As a proof of principle example I took the classical two spiral problem and configured SUMO to use SVM models (parameters optimized with DIRECT) and the density based sample selection algorithm. The resulting movie generated by SUMO is given below:
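For readers who want to experiment outside the toolbox, the sketch below generates a two-spiral data set and adds samples one at a time where the existing samples are sparsest. It is only an illustration under stated assumptions: `two_spirals` and `pick_sparsest` are hypothetical helper names, the spiral parametrisation is one common choice, and the farthest-from-nearest-neighbour rule is just one simple reading of "density based" selection. The algorithm actually shipped in SUMO may well differ, and no SVM or DIRECT optimisation is included here.

```python
import math

def two_spirals(n_per_class=100, turns=2.0):
    """Return [((x, y), label)] for two interleaved spirals with labels +1 / -1."""
    points = []
    for i in range(1, n_per_class + 1):
        t = i / n_per_class                 # radius grows from near 0 to 1
        angle = turns * 2.0 * math.pi * t
        x, y = t * math.cos(angle), t * math.sin(angle)
        points.append(((x, y), 1))
        points.append(((-x, -y), -1))       # second spiral: rotated by 180 degrees
    return points

def pick_sparsest(selected, candidates):
    """Index of the candidate whose nearest selected sample is farthest away."""
    def nearest_dist(c):
        return min(math.dist(c[0], s[0]) for s in selected)
    return max(range(len(candidates)), key=lambda i: nearest_dist(candidates[i]))

pool = two_spirals()
selected = [pool.pop(0)]          # start from a single seed sample
for _ in range(19):               # add 19 more samples, one at a time
    selected.append(pool.pop(pick_sparsest(selected, pool)))

print(len(selected), "samples selected,", len(pool), "left in the pool")
```

A classifier would then be refit on `selected` after every addition, which is the loop the movie visualises.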



Note that while the basic support for classification is there, our focus remains on the classic surrogate modeling problem (regression). So don't expect major developments in this area anytime soon. Rather, the purpose of this was just to show that it can be done quite easily. The basic support is there, and now it's up to an interested somebody to pick it up and improve or extend it as needed :)

Note also that exactly the same could be done for time series prediction. If I turn out to have a spare hour here or there I might do a similar post on that as well :)

--Dirk

Tuesday, October 20, 2009

SUMO-Toolbox v6.2.1 released!

After a lot of hard work and testing I am happy to announce the availability of version 6.2.1 of the SUMO-Toolbox. This is a bugfix release, hot on the heels of our 6.2 release.

This version has seen a lot of internal cleanups and feature improvements. You can find information about new features here.
The full list of changes is available here.


Note that post-release bugs may have been found so remember to check the known bugs page. For how to upgrade see the Upgrading FAQ entry here.



--Dirk

Tuesday, August 25, 2009

Scalability of Delaunayn

As part of our work on a new model selection metric (Linear Reference Model (LRM) selection) we needed an idea of how Matlab's triangulation routine (delaunayn, based on qhull) scales with the number of points and dimensionality.

The two plots below show the results of a simple test we did on a high-end desktop machine. It turns out that the main limitation is memory usage, which in turn is very closely linked with the dimensionality. Beyond 6 dimensions with a couple of thousand points we would just get a malloc memory error. For those cases we are looking at an iterative or approximate implementation. The Hull code by Clarkson also seems promising.
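The link between memory and dimensionality is what the theory would lead one to expect: in the worst case, the number of simplices in a Delaunay triangulation of n points in d dimensions grows on the order of n^ceil(d/2). The small Python sketch below just tabulates that order of growth for 2000 points. It is not a measurement and not the test we ran: constants are omitted, and typical point sets usually produce far fewer simplices than the worst case.

```python
import math

def worst_case_simplex_order(n_points, dim):
    """Order of growth n^ceil(d/2) of the worst-case simplex count (constants omitted)."""
    return n_points ** math.ceil(dim / 2)

for dim in range(2, 9):
    print(dim, worst_case_simplex_order(2000, dim))
```

The exponent increases by one for every two extra dimensions, so even a modest number of points can exhaust memory once the dimension gets past a handful.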






--Dirk