Friday, May 1, 2009

FAQ#3: J'accuse!

Q: The PHM challenge should also allow researchers from the model-free camp to participate, which means that supervised learning techniques based on pattern recognition should be usable. Since the data is not labeled, it is not possible to run experiments on classification performance. The authors of the challenge should provide two different data sets: one labeled for learning and one unlabeled for testing. Currently, only model-based fault diagnosis can be done, which excludes a complete branch of research.

A: Any format will favor some group of researchers. Last year, the competition was exactly what you asked for: essentially a homework problem, with neatly packaged labeled data that gave machine learning researchers a huge advantage. Next year, some other group will assuredly have an advantage.

However, we have made every effort to level the playing field. We have provided background on domain fundamentals, MATLAB code for algorithms to extract features from the data, and links to excellent papers on the analysis of this type of data. Moreover, we believe that the problem is difficult enough that very innovative approaches may be required to solve it: researchers who are looking at the problem from a "fresh perspective" may actually have an advantage...

1 comment:

  1. I don't see why it has to be one or the other. You could easily do what you've done and also release some labeled data. That way, people who have their own methods can use them on the data, and pattern recognition researchers can use the released functions. As you've pointed out, this is a very difficult problem, and a solution will most likely have to blend both approaches in practice.
