H6429: Computational Intelligence, Methods and Applications

Computational Intelligence (CI) draws inspiration from statistics, pattern recognition, neural networks, machine learning, fuzzy logic, evolutionary computing, scientific visualization and other sources. This course covers basic CI theory, visualization methods, the use of two software packages implementing many CI algorithms, Yale/WEKA and GhostMiner, and examples of practical applications of CI methods to data in technical, medical and bioinformatics domains. Please download these packages, read their instructions and play with them.

Wlodzislaw Duch

Time and place:

The edveNTUre version of the H6429 course is here. It contains the same presentations as the web version, plus audio recordings of the lectures. You may want to use it to send emails to the whole group. If you experience problems connecting to this server, please ask our e-learning experts, for example Ms Josephine Goh Lay Kian (ASLKGoh at NTU).

The exam will be of the restricted open-book type. You will be allowed to bring a single book of your choice: a real printed book, not your own notebook or lecture notes, and with no additional material, such as scribbles in the book, please.
Open-book exams are rare at NTU, but I am convinced that they motivate students to read the textbook carefully, as there is no time during the examination to learn from the book or to search in it for long.
Time: 30/11/2006, Thursday, 9:30-12:30
Location: Examination Hall 7

Course outline:

  1. CI overview, types of adaptive systems, learning and applications. (2 h)
  2. Visualization and exploratory data analysis: few variables, parallel coordinates and other direct multivariate visualization algorithms, Principal Component Analysis (PCA), Self-Organized Mappings (SOM) and Multidimensional Scaling (MDS). (9 h)
  3. Theory: overview of statistical approaches to learning, bias-variance decomposition, expectation maximization algorithm, model selection, evaluation of results, ROC curves. (5 h)
  4. CI packages in action: WEKA and GhostMiner, presentation of algorithms available in these packages. (5 h)
  5. Statistical algorithms: discriminant analysis - linear (LDA), Fisher (FDA), regularized (RDA), probabilistic data modeling, kernel methods. (4 h)
  6. Density estimation and rule induction, separability criteria. (5 h)
  7. Similarity based methods, generation of prototypes, similarity functions. (2 h)
  8. Improving CI models: boosting, stacking, ensemble learning, meta-learning, information theory for selection of features. (6 h)

Total number of lecture hours: 38 hours, including the review hour; note that we have holidays on Tuesday 9.08 (and also 1.11 and 3.11).

Reference texts: the 3 best books, giving you a solid foundation:

  1. R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification (2nd Edition), J. Wiley 2000
  2. T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning. Springer 2001
  3. A. Webb, Statistical Pattern Recognition (2nd Edition), Wiley 2002

Other useful books:

  1. D. Hand, H. Mannila, P. Smyth, Principles of Data Mining, MIT Press 2001
  2. V. Kecman, Learning and soft computing, MIT Press 2001
  3. Amit Konar, Computational Intelligence. Principles, Techniques and Applications. Springer 2005
  4. I.H. Witten, E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann 1999 (the WEKA book). Some WEKA models are described here, with examples for the Gamma-ray burst analysis.
  5. J. P. Marques De Sa, Pattern Recognition: Concepts, Methods, and Applications. Springer Verlag, 2001

Questions and answers.
Some links relevant to Computational Intelligence.

Course slides are in the PDF format:

Note that these slides may change just before the lecture and even a week after, if some corrections/additions are needed or if some questions are asked.

Recordings of all lectures in WMA format (it has a nice voice codec) are posted in edveNTUre; PPT versions of the slides are also stored there.

General Introduction: 3 hours

  1. Lecture 1: Organization, what this is all about | 10.08
  2. Lecture 2: CI - problems and inspirations | 10.08
  3. Lecture 3: more inspirations and some probability theory concepts | 10.08

    What is Computational Intelligence and what could it become?

Visualization of multidimensional data: 8 hours

  1. Lecture 4: Direct visualization of multidimensional data | 17.08
  2. Lecture 5: Exploratory data analysis and linear projections of data | 17.08
  3. Lecture 6: Principal Component Analysis | 17.08/31.08

  4. Lecture 7: Discriminant Components | 31.08
  5. Lecture 8: Projection pursuit and ICA | 31.08
  6. Lecture 9: Self Organized Mapping - intro | 31.08/7.09

  7. Lecture 10: SOM and Growing Cell Structures | 7.09
  8. Lecture 11/12: Multidimensional Scaling | 7.09

Some links to visualization | Links to SOM | Interactive histogram demo | PCA Java demo | My Matlab programs for FDA projections | Just for fun: 4D Rubik's cube applet.
Further reading: see the visualization techniques review paper, and my papers: Visualization of hidden node activity in neural networks, and Coloring black boxes: visualization of neural network decisions.
Search for interesting methods for non-linear dimensionality reduction: isometric mapping, multidimensional scaling, stochastic proximity embedding, reducing the dimensionality of data with neural networks, etc.


Assignment no. 1 info is here


Theory of adaptive systems: 3 hours

  1. Lecture 12: Bayesian decisions: foundation of learning | 21.09
  2. Lecture 13: Bayesian risks and Naive Bayes approach | 21.09
  3. Lecture 14: Bias-variance tradeoff | 21/28.09

CI packages in action: WEKA/Yale and GhostMiner, decision trees: 6 hours.
Recess week! But we have a make-up lecture at the usual time!

  1. Lecture 15: Evaluation of results, model selection | 28.09
  2. Lecture 16: ROC and Introduction to WEKA/Yale | 28.09
  3. Lecture 17: Knowledge extraction from simplest decision trees | 28.09/12.10

Unfortunately, on 5.10 I shall be in HK at a conference, so the next lecture will be on 12.10.

  1. Lecture 18: Decision trees in WEKA/Yale and GM | 12.10
  2. Lecture 19: Pruning of decision trees | 12.10
  3. Lecture 20: SSV decision trees and other GM models | 12.10

Statistical, SVM and kernel algorithms: 5 hours.

  1. Lecture 21: Discriminant analysis (DA) - linear machines | 19.10
  2. Lecture 22: Linear discrimination - variants: Fisher DA, Regularized DA (RDA) | 19.10
  3. Lecture 23: Logistic DA, and linear SVM | 19.10

  4. Lecture 24: SVM in the non-linear case, kernels | 19.10
  5. Lecture 25: Kernel PCA and FDA | 26.10

Probability density estimation: 3 hours.

  1. Lecture 26: Density estimation, expectation maximization | 26.10

  2. Lecture 27: Expectation maximization and density modeling | 26.10
  3. Lecture 28: Non-parametric density estimation | 26.10/2.11


Assignment no. 2 info is here


Fuzzy rule induction: 2 hours.

  1. Lecture 29: Approximation theory, RBF and FSM networks | 2.11
  2. Lecture 30: Neurofuzzy system FSM and covering algorithms | 2.11

  3. Lecture 31: Combinatorial reasoning, or learning from partial observations | 2.11

Similarity-based methods and improvement of CI model accuracy: 3+3=6 hours.

  1. Lecture 32: Nearest neighbor methods | 9.11

  2. Lecture 33: Information theory | 9.11
  3. Lecture 34: Applications of information theory, visualization and selection of information | 9.11

Improving CI models by meta-learning: 2 hours.

  1. Lecture 35: Feature selection and discretization methods | 16.11
  2. Lecture 36: Meta-learning: committees, sampling and bootstrap | 16.11

This in fact is 37 hours (lectures 10-11 + assignment intro took 3 hours) ... perhaps we need to cut some stuff a bit ...
Meeting/revision on 23.11 for 2 hours before exam?

The examination will be open book, but please remember that no single book covers it all, and that you may bring only ONE book of your choice (a real book, not your own notebook), with no other notes, printed materials or books full of scribbles, please. I am obliged to go through your book and check it.

The old examination papers from 2003-2005 should be at this link:
http://exampapers.ntu.edu.sg.ezlibproxy1.ntu.edu.sg/.
Select S1 04, 05 or 06, click on the left side "Masters of Philosophy (SCE)" and select H6429 - Special Advanced Topic 2 - Computational Intelligence: Methods and Applications.

Links to some free books that you may print and bring are below:



Wlodzislaw Duch