Interpretable AI Series (2) White-box models

This post mainly covers chapter 2 of "Interpretable AI".

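White-box models are machine learning models whose inner workings are simple enough to read directly. This chapter covers three of them: linear regression, decision trees, and GAMs.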

To be concrete, a linear regression model assumes that the label (also known as the response variable, or dependent variable) is a linear combination of the input features. The sign of each feature's coefficient gives the direction of its influence (positive or negative), and, when the features are on comparable scales, the absolute value gives its relative strength.

For example, a company plans an advertising budget across TV and newspapers. What's the best ratio between these two channels? The answer depends on what the regression model predicts. Let's say the model attributes one-third of annual revenue to newspapers and two-thirds to TV; then a natural starting point is to put two-thirds of the budget into TV and one-third into newspapers. The main limitation of the linear model is that most real relationships between input and output aren't linear, so it gives only a comparatively rough prediction.
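
To make the coefficient reading concrete, here is a minimal sketch using NumPy's least-squares solver. The data is synthetic and noise-free, and the "true" weights (2 for TV, 1 for newspaper) are my own assumption chosen to reproduce the two-thirds / one-third split above; they are not numbers from the book.

```python
import numpy as np

# Hypothetical, noise-free data: spend per channel (same unit) and resulting sales.
rng = np.random.default_rng(0)
tv = rng.uniform(0, 100, size=50)
newspaper = rng.uniform(0, 100, size=50)
sales = 5.0 + 2.0 * tv + 1.0 * newspaper  # assumed ground truth, for illustration only

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones_like(tv), tv, newspaper])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
intercept, w_tv, w_news = coef

# Both features share the same unit, so coefficient magnitudes are comparable.
total = abs(w_tv) + abs(w_news)
share_tv = abs(w_tv) / total
share_news = abs(w_news) / total
print(f"TV: {w_tv:.2f} ({share_tv:.0%}), newspaper: {w_news:.2f} ({share_news:.0%})")
```

If the two features were measured in different units, the raw coefficients would not be directly comparable and the features would need to be standardized first.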

The next white-box model is the decision tree (DT for short). It also matches human intuition: it is a series of "if-then-else" rules. Since a DT is nonlinear, there are no coefficients to indicate the importance of features. However, we can calculate the importance of a feature as the summed importance of the nodes that belong to it, divided by the summed importance of all nodes.

This raises two questions:

1. How do we determine which feature a node belongs to? Every internal node in a DT has a splitting rule: which feature to split on, and at which value. We say "the node belongs to feature A" if the node's splitting feature is A.

2. How do we calculate the importance of a node? It is the decrease in the cost function or impurity measure achieved by the node's split, typically the node's impurity minus the sample-weighted impurity of its children. See the formula on page 38.
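
The two answers above can be sketched in a few lines of plain Python. This uses the common weighted-impurity-decrease definition with Gini impurity (the one scikit-learn also uses); it may differ in detail from the book's formula on page 38. The tree itself, including the feature names and class counts, is hypothetical.

```python
from collections import defaultdict

def gini(counts):
    """Gini impurity of a node given its per-class sample counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

# Hypothetical fitted tree: each internal node records its splitting feature
# and the class counts of itself and of its two children.
nodes = [
    {"feature": "A", "counts": [50, 50], "left": [40, 10], "right": [10, 40]},
    {"feature": "B", "counts": [40, 10], "left": [40, 0], "right": [0, 10]},
    {"feature": "A", "counts": [10, 40], "left": [10, 5], "right": [0, 35]},
]
n_total = sum(nodes[0]["counts"])  # samples reaching the root

def node_importance(node):
    """Sample-weighted impurity decrease produced by this node's split."""
    n, nl, nr = sum(node["counts"]), sum(node["left"]), sum(node["right"])
    decrease = n * gini(node["counts"]) - nl * gini(node["left"]) - nr * gini(node["right"])
    return decrease / n_total

# A node "belongs to" its splitting feature, so sum node importances per feature.
raw = defaultdict(float)
for node in nodes:
    raw[node["feature"]] += node_importance(node)

# Normalize so the feature importances sum to 1.
total = sum(raw.values())
feature_importance = {f: v / total for f, v in raw.items()}
print(feature_importance)
```

Feature A splits two nodes here, so its importance is the sum of both nodes' impurity decreases before normalization.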

The last part of this chapter discusses the GAM (Generalized Additive Model) and how to interpret it. Here the author introduces the "partial dependence plot". By marginalizing over the remaining features, a partial dependence plot visualizes how the model's prediction (the dependent variable) changes as the feature under investigation changes. (The exact meaning of this sentence needs to be studied carefully with real data, which I haven't done yet.) See figure 2.21 for a detailed illustration.
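
As a first step toward checking that with data, here is a minimal sketch of how the values behind a partial dependence plot can be computed: force the feature of interest to a fixed value for every row, average the model's predictions, and repeat over a grid. The `predict` function is a hypothetical stand-in for a fitted model, not the GAM from the book, and the data is random.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value               # overwrite the feature under investigation
        averages.append(predict(X_mod).mean())  # average over the other features' observed values
    return np.array(averages)

# Hypothetical stand-in for a fitted model: additive in its two features.
def predict(X):
    return X[:, 0] ** 2 + 3.0 * X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
grid = np.linspace(-2.0, 2.0, 5)
pd_values = partial_dependence(predict, X, feature=0, grid=grid)
print(dict(zip(grid.tolist(), pd_values.round(3).tolist())))
```

For an additive model like this one, the curve for feature 0 is just its own term (here a parabola) shifted by the average contribution of the other feature, which is why partial dependence plots suit GAMs so well.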
