Courses
Getting Started With Data Mining

This training session is the place to start if you are new to data mining and have little or no background in statistics or machine learning.

CART: Data Mining with Decision Trees

Discover the power of tree-structured data mining during this popular introductory seminar, geared toward statisticians and IT audiences who want to understand the conceptual basis of decision tree technology: what it is, why it works, how it has been used, and how it can help you make better business decisions.

Advanced CART & Hybrid Modeling Techniques

Sharpen your decision tree expertise during this one-day advanced course, geared toward analysts and modelers with prior knowledge of tree algorithms.

Predictive Modeling with MARS Automated Non-linear Regression

What is MARS? Why does it work? How can it be used? How can it help you develop more accurate regression models for problems such as predicting credit card holder balances, insurance claim losses, customer catalog orders, and cell phone use?

TreeNet/MART

TreeNet stochastic gradient boosting is Stanford University Professor Jerome Friedman's latest advance in data mining methodology. In TreeNet, classification and regression models are built up gradually through a potentially large collection of small trees, each of which improves on its predecessors through an error-correcting strategy. Although each tree may have only one split, the full model can be extraordinarily accurate. The final model takes the form of a series expansion in which every term is a (small) tree.
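The error-correcting strategy described above can be illustrated with a minimal sketch. It is not TreeNet itself: it assumes a single numeric predictor, squared-error loss, and one-split trees, and it omits the random row subsampling that makes the method "stochastic". The names `fit_stump`, `boost`, `predict`, `n_trees`, and `learning_rate` are hypothetical and chosen only for this illustration.

```python
def fit_stump(xs, residuals):
    """Find the one-split tree on a single predictor that best fits the residuals."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def boost(xs, ys, n_trees=200, learning_rate=0.1):
    """Build the model gradually: each stump is fit to the current errors."""
    base = sum(ys) / len(ys)          # start from the overall mean
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Add a damped copy of the new stump to the running prediction.
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    """Evaluate the series expansion: the base value plus one term per stump."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lm if x <= t else rm)
                      for t, lm, rm in stumps)


# Illustrative data only: a smooth curve that no single split can capture.
xs = [i / 10 for i in range(50)]
ys = [x * x for x in xs]
model = boost(xs, ys)
print(predict(model, 2.0))
```

Each stump on its own is a crude step function, yet the sum of many damped stumps can follow the curve closely, which is the sense in which the final model is a series expansion of small trees.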

RandomForests

RandomForests®, created by Leo Breiman and Adele Cutler, is based on learning ensembles of CART trees. By judiciously injecting randomness into the tree-building process and then combining hundreds of these trees, RandomForests (RF) is able to deliver high-performance predictive models and a variety of novel exploratory data analysis results. RF also incorporates new metric-free cluster analyses that automatically select the variables used to define each cluster, with potentially different variables defining each cluster.
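The idea of injecting randomness and then combining many trees can be sketched as follows. This is a simplified illustration under stated assumptions, not the RandomForests product: it assumes numeric predictors, Gini impurity for splitting, bootstrap resampling of rows, a random subset of predictors at each split, and majority voting. It does not cover the cluster analyses mentioned above. The names `grow_tree`, `fit_forest`, `predict_forest`, `n_try`, and `max_depth` are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def grow_tree(rows, labels, rng, n_try, max_depth):
    """Grow a small classification tree, trying only n_try random predictors per split."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority                       # leaf: predict the majority class
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2                 # midpoint threshold
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:                          # chosen predictors were all constant
        return majority
    _, f, t = best
    left_idx = [i for i, r in enumerate(rows) if r[f] <= t]
    right_idx = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow_tree([rows[i] for i in left_idx], [labels[i] for i in left_idx],
                          rng, n_try, max_depth - 1),
        "right": grow_tree([rows[i] for i in right_idx], [labels[i] for i in right_idx],
                           rng, n_try, max_depth - 1),
    }


def fit_forest(rows, labels, n_trees=100, n_try=1, max_depth=3, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                rng, n_try, max_depth))
    return forest


def predict_tree(node, row):
    """Follow splits until reaching a leaf label."""
    while isinstance(node, dict):
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node


def predict_forest(forest, row):
    """Combine the trees by majority vote."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]
```

A forest would then be trained with something like `fit_forest(rows, labels, n_trees=100)` and queried with `predict_forest(forest, new_row)`. Individual trees vary because each sees a different resample and a different subset of predictors, and the vote averages out much of that variation.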