Explainable AI with Shapley values

This is an introduction to explaining machine learning models with Shapley values. Shapley values are a widely used approach from cooperative game theory that comes with desirable properties. This tutorial is designed to help build a solid understanding of how to compute and interpret Shapley-based explanations of machine learning models. We will take a practical, hands-on approach, using the shap Python package to explain progressively more complex models. This is a living document that also serves as an introduction to the shap Python package, so if you have feedback or contributions please open an issue or pull request to make this tutorial better!

Note this document depends on a new API for SHAP that may change slightly in the coming weeks.

Outline

  • Explaining a linear regression model

  • Explaining a generalized additive regression model

  • Explaining a gradient boosted decision tree regression model

  • Explaining a logistic regression model

  • Explaining an XGBoost logistic regression model

  • Dealing with correlated input features

  • Explaining a transformers NLP model

Explaining a linear regression model

Before using Shapley values to explain complicated models, it is helpful to understand how they work for simple models. One of the simplest model types is standard linear regression, so below we train a linear regression model on the classic Boston housing dataset. This dataset consists of 506 neighborhood regions around Boston in 1978, where our goal is to predict the median home price (in thousands) in each neighborhood from 13 different features (the last attribute listed below, MEDV, is the target we are predicting):

  1. CRIM - per capita crime rate by town

  2. ZN - proportion of residential land zoned for lots over 25,000 sq.ft.

  3. INDUS - proportion of non-retail business acres per town.

  4. CHAS - Charles River dummy variable (1 if tract bounds river; 0 otherwise)

  5. NOX - nitric oxides concentration (parts per 10 million)

  6. RM - average number of rooms per dwelling

  7. AGE - proportion of owner-occupied units built prior to 1940

  8. DIS - weighted distances to five Boston employment centres

  9. RAD - index of accessibility to radial highways

  10. TAX - full-value property-tax rate per $10,000

  11. PTRATIO - pupil-teacher ratio by town

  12. B - 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town

  13. LSTAT - % lower status of the population

  14. MEDV - Median value of owner-occupied homes in $1000’s

[1]:
import shap
import sklearn

# a classic housing price dataset
X,y = shap.datasets.boston()
X100 = shap.utils.sample(X, 100)

# a simple linear model
model = sklearn.linear_model.LinearRegression()
model.fit(X, y)
[1]:
LinearRegression()

Examining the model coefficients

The most common way of understanding a linear model is to examine the coefficients learned for each feature. These coefficients tell us how much the model output changes for a one-unit change in each input feature:

[2]:
print("Model coefficients:\n")
for i in range(X.shape[1]):
    print(X.columns[i], "=", model.coef_[i].round(4))
Model coefficients:

CRIM = -0.108
ZN = 0.0464
INDUS = 0.0206
CHAS = 2.6867
NOX = -17.7666
RM = 3.8099
AGE = 0.0007
DIS = -1.4756
RAD = 0.306
TAX = -0.0123
PTRATIO = -0.9527
B = 0.0093
LSTAT = -0.5248

While coefficients are great for telling us what will happen when we change the value of an input feature, by themselves they are not a great way to measure the overall importance of a feature. This is because the value of each coefficient depends on the scale of the input features. If for example we were to measure the age of a home in minutes instead of years, then the coefficient for the AGE feature would shrink to \(0.0007 / (365 \times 24 \times 60) \approx 1.3 \times 10^{-9}\). Clearly the age of a house is no less important just because we measure it in minutes, yet its coefficient is now much smaller. This means that the magnitude of a coefficient is not necessarily a good measure of a feature’s importance in a linear model.
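
To see this scale dependence concretely, here is a small added sketch (not part of the original notebook) that refits the model after rescaling AGE; the coefficient changes by exactly the rescaling factor while the model's predictions stay the same:

[ ]:
# rescaling a feature rescales its learned coefficient, but leaves predictions unchanged
X_minutes = X.copy()
X_minutes["AGE"] = X_minutes["AGE"] * 365 * 24 * 60  # express AGE in "minutes"

model_minutes = sklearn.linear_model.LinearRegression().fit(X_minutes, y)
age_index = list(X.columns).index("AGE")
print("AGE coefficient (original units):", model.coef_[age_index])
print("AGE coefficient (rescaled units):", model_minutes.coef_[age_index])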

A more complete picture using partial dependence plots

To understand a feature’s importance in a model it is necessary to understand both how changing that feature impacts the model’s output, and also the distribution of that feature’s values. To visualize this for a linear model we can build a classical partial dependence plot and show the distribution of feature values as a histogram on the x-axis:

[3]:
shap.plots.partial_dependence("RM", model.predict, X100, ice=False, model_expected_value=True, feature_expected_value=True)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_8_0.png

The gray horizontal line in the plot above represents the expected value of the model when applied to the Boston housing dataset. The vertical gray line represents the average value of the RM feature. Note that the blue partial dependence plot line (which is the average value of the model output when we fix the RM feature to a given value) always passes through the intersection of the two gray expected value lines. We can consider this intersection point as the “center” of the partial dependence plot with respect to the data distribution. The impact of this centering will become clear when we turn to Shapley values next.

Reading SHAP values from partial dependence plots

The core idea behind Shapley value based explanations of machine learning models is to use fair allocation results from cooperative game theory to allocate credit for a model’s output \(f(x)\) among its input features. In order to connect game theory with machine learning models it is necessary to both match a model’s input features with players in a game, and also match the model function with the rules of the game. Since in game theory a player can join or not join a game, we need a way for a feature to “join” or “not join” a model. The most common way to define what it means for a feature to “join” a model is to say that a feature has “joined a model” when we know the value of that feature, and it has not joined a model when we don’t know the value of that feature. To evaluate an existing model \(f\) when only a subset \(S\) of features are part of the model, we integrate out the other features using a conditional expectation formulation. This formulation can take two forms:

\[E[f(X) \mid X_S = x_S]\]

or

\[E[f(X) \mid do(X_S = x_S)]\]

In the first form we know the values of the features in S because we observe them. In the second form we know the values of the features in S because we set them. In general, the second form is usually preferable, both because it tells us how the model would behave if we were to intervene and change its inputs, and also because it is much easier to compute. In this tutorial we will focus entirely on the second formulation. We will also use the more specific term SHAP values to refer to Shapley values applied to a conditional expectation function of a machine learning model.
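
To make the interventional form concrete, here is a small added sketch (not part of the original notebook; the function name is just for illustration) that approximates \(E[f(X) \mid do(X_S = x_S)]\) by fixing the features in \(S\) and averaging the model output over a background sample:

[ ]:
# approximate E[f(X) | do(X_S = x_S)]: fix the features in S to their values in x
# and average the model output over a background dataset for the other features
def interventional_expectation(f, x, S, background):
    X_tmp = background.copy()
    for name in S:
        X_tmp[name] = x[name]  # intervene: set feature `name` to its value in x
    return f(X_tmp).mean()     # integrate out the remaining features

# e.g. the expected home price when we only "know" the value of RM for the first row
print(interventional_expectation(model.predict, X.iloc[0], ["RM"], X100))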

SHAP values can be very complicated to compute (they are NP-hard in general), but linear models are so simple that we can read the SHAP values right off a partial dependence plot. When we are explaining a prediction \(f(x)\), the SHAP value for a specific feature \(i\) is just the difference between the expected model output and the partial dependence plot at the feature’s value \(x_i\):

[4]:
# compute the SHAP values for the linear model
background = shap.maskers.Independent(X, max_samples=1000)
explainer = shap.Explainer(model.predict, background)
shap_values = explainer(X)

# make a standard partial dependence plot
sample_ind = 18
fig,ax = shap.partial_dependence_plot(
    "RM", model.predict, X, model_expected_value=True,
    feature_expected_value=True, show=False, ice=False,
    shap_values=shap_values[sample_ind:sample_ind+1,:],
    shap_value_features=X.iloc[sample_ind:sample_ind+1,:]
)

Permutation explainer: 507it [00:14, 34.26it/s]
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_11_1.png
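
For a linear model this relationship can also be written down explicitly: the SHAP value of feature \(i\) is \(\beta_i(x_i - E[x_i])\), where the expectation is taken over the background data. A small added check (not part of the original notebook):

[ ]:
import numpy as np

# for a linear model the SHAP value of feature i is coef_i * (x_i - E[x_i]);
# here the background is the full dataset, so E[x_i] is just the column mean
manual_shap = model.coef_ * (X.iloc[sample_ind].values - X.mean().values)
print(np.round(manual_shap, 3))
print(np.round(shap_values[sample_ind].values, 3))  # should match closely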

The close correspondence between the classic partial dependence plot and SHAP values means that if we plot the SHAP value for a specific feature across a whole dataset we will exactly trace out a mean centered version of the partial dependence plot for that feature:

[5]:
shap.plots.scatter(shap_values[:,"RM"])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_13_0.png

The additive nature of Shapley values

One of the fundamental properties of Shapley values is that they always sum up to the difference between the game outcome when all players are present and the game outcome when no players are present. For machine learning models this means that the SHAP values of all the input features will always sum up to the difference between the baseline (expected) model output and the current model output for the prediction being explained. The easiest way to see this is through a waterfall plot that starts at our background prior expectation for a home price \(E[f(X)]\), and then adds features one at a time until we reach the current model output \(f(x)\):

[6]:
# the waterfall_plot shows how we get from shap_values.base_values to model.predict(X)[sample_ind]
shap.plots.waterfall(shap_values[sample_ind], max_display=14)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_15_0.png
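
We can check this additivity property directly; a small added check (not part of the original notebook):

[ ]:
# the SHAP values plus the base value should reproduce the model output for this sample
print(shap_values[sample_ind].values.sum() + shap_values[sample_ind].base_values)
print(model.predict(X)[sample_ind])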

Explaining an additive regression model

The reason the partial dependence plots of linear models have such a close connection to SHAP values is that each feature in the model is handled independently of every other feature (the effects are just added together). We can keep this additive nature while relaxing the linear requirement of straight lines. This results in the well-known class of generalized additive models (GAMs). While there are many ways to train these types of models (such as restricting an XGBoost model to trees of depth 1, as sketched below), we will use InterpretML's explainable boosting machines, which are specifically designed for this.
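
As an aside, here is a small sketch (not part of the original notebook) of the depth-1 XGBoost alternative mentioned above; because every tree makes a single split on a single feature, the resulting ensemble is additive in its inputs:

[ ]:
import xgboost

# an additive model built from depth-1 ("stump") trees; each tree uses only one
# feature, so the learned feature effects simply add together, just like a GAM
model_gam_xgb = xgboost.XGBRegressor(n_estimators=500, max_depth=1, learning_rate=0.1)
model_gam_xgb.fit(X, y)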

[7]:
# fit a GAM model to the data
import interpret.glassbox
model_ebm = interpret.glassbox.ExplainableBoostingRegressor()
model_ebm.fit(X, y)

# explain the GAM model with SHAP
explainer_ebm = shap.Explainer(model_ebm.predict, background)
shap_values_ebm = explainer_ebm(X)

# make a standard partial dependence plot with a single SHAP value overlaid
fig,ax = shap.partial_dependence_plot(
    "RM", model_ebm.predict, X, model_expected_value=True,
    feature_expected_value=True, show=False, ice=False,
    shap_values=shap_values_ebm[sample_ind:sample_ind+1,:],
    shap_value_features=X.iloc[sample_ind:sample_ind+1,:]
)
Permutation explainer: 507it [00:37, 13.64it/s]
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_17_1.png
[8]:
shap.plots.scatter(shap_values_ebm[:,"RM"])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_18_0.png
[10]:
# the waterfall_plot shows how we get from shap_values_ebm.base_values to model_ebm.predict(X)[sample_ind]
shap.plots.waterfall(shap_values_ebm[sample_ind], max_display=14)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_19_0.png
[11]:
# the beeswarm plot summarizes the distribution of SHAP values for each feature across all samples
shap.plots.beeswarm(shap_values_ebm, max_display=14)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_20_0.png

Explaining a non-additive boosted tree model

[12]:
# train XGBoost model
import xgboost
model_xgb = xgboost.XGBRegressor(n_estimators=100, max_depth=2).fit(X, y)

# explain the XGBoost model with SHAP
explainer_xgb = shap.Explainer(model_xgb, background)
shap_values_xgb = explainer_xgb(X)

# make a standard partial dependence plot with a single SHAP value overlaid
fig,ax = shap.partial_dependence_plot(
    "RM", model_xgb.predict, X, model_expected_value=True,
    feature_expected_value=True, show=False, ice=False,
    shap_values=shap_values_xgb[sample_ind:sample_ind+1,:],
    shap_value_features=X.iloc[sample_ind:sample_ind+1,:]
)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_22_0.png
[13]:
shap.plots.scatter(shap_values_xgb[:,"RM"])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_23_0.png
[14]:
shap.plots.scatter(shap_values_xgb[:,"RM"], color=shap_values)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_24_0.png

Explaining a linear logistic regression model

[15]:
# a classic adult census dataset
X_adult,y_adult = shap.datasets.adult()

# a simple linear logistic model
model_adult = sklearn.linear_model.LogisticRegression(max_iter=10000)
model_adult.fit(X_adult, y_adult)

def model_adult_proba(x):
    return model_adult.predict_proba(x)[:,1]
def model_adult_log_odds(x):
    p = model_adult.predict_log_proba(x)
    return p[:,1] - p[:,0]
[15]:
LogisticRegression(max_iter=10000)

Note that the probability output of a linear logistic regression model is not linear in the inputs.
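
To make this concrete, here is a small added check (not part of the original notebook): the model's raw decision score is linear in the inputs, while the predicted probability passes that score through the logistic sigmoid and so is not:

[ ]:
import numpy as np
import scipy.special

# the decision score is the linear function X @ coef + intercept, while the
# predicted probability is the (nonlinear) logistic sigmoid of that score
linear_score = X_adult.values @ model_adult.coef_[0] + model_adult.intercept_[0]
print(np.allclose(model_adult.decision_function(X_adult), linear_score))
print(np.allclose(model_adult_proba(X_adult), scipy.special.expit(linear_score)))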

[51]:
# make a standard partial dependence plot
sample_ind = 18
fig,ax = shap.partial_dependence_plot(
    "Capital Gain", model_adult_proba, X_adult, model_expected_value=True,
    feature_expected_value=True, show=False, ice=False
)

../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_28_0.png

If we use SHAP to explain the probability of a linear logistic regression model we see strong interaction effects. This is because a linear logistic regression model is NOT additive in the probability space.

[52]:
# compute the SHAP values for the linear model
background_adult = shap.maskers.Independent(X_adult, max_samples=1000)
explainer = shap.Explainer(model_adult_proba, background_adult)
shap_values_adult = explainer(X_adult[:1000])
Permutation explainer: 1001it [00:32, 30.99it/s]
[41]:
shap.plots.scatter(shap_values_adult[:,"Age"])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_31_0.png

If we instead explain the log-odds output of the model we see a perfect linear relationship between the model’s inputs and the model’s output. It is important to remember what the units of the model output you are explaining are, since explaining different model outputs can lead to very different views of the model’s behavior.

[53]:
# compute the SHAP values for the linear model
explainer_log_odds = shap.Explainer(model_adult_log_odds, background_adult)
shap_values_adult_log_odds = explainer_log_odds(X_adult[:1000])
divide by zero encountered in log
Permutation explainer: 1001it [00:33, 30.02it/s]
[54]:
shap.plots.scatter(shap_values_adult_log_odds[:,"Age"])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_34_0.png
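
As with the linear regression model earlier, in log-odds units each SHAP value of this linear model is approximately \(\beta_i(x_i - E[x_i])\). A small added check (not part of the original notebook); the explainer used a 1000-row background sample while this check uses the full-data mean, so the match is only approximate:

[ ]:
import numpy as np

# in log-odds units the model is linear, so each SHAP value is approximately
# coef_i * (x_i - E[x_i]); differences come from subsampling the background
manual = model_adult.coef_[0] * (X_adult.iloc[0].values - X_adult.mean().values)
print(np.round(manual, 3))
print(np.round(shap_values_adult_log_odds[0].values, 3))
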
[38]:
# make a standard partial dependence plot
sample_ind = 18
fig,ax = shap.partial_dependence_plot(
    "Age", model_adult_log_odds, X_adult, model_expected_value=True,
    feature_expected_value=True, show=False, ice=False,
    #shap_values=shap_values[sample_ind:sample_ind+1,:],
    #shap_value_features=X.iloc[sample_ind:sample_ind+1,:]
)

../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_35_0.png

Explaining an XGBoost logistic regression model

[56]:
# train XGBoost model
X,y = shap.datasets.adult()
model = xgboost.XGBClassifier(n_estimators=100, max_depth=2).fit(X, y)

# compute SHAP values
explainer = shap.Explainer(model, X)
shap_values = explainer(X)

# set a display version of the data to use for plotting (has string values)
shap_values.display_data = shap.datasets.adult(display=True)[0].values
 94%|=================== | 30731/32561 [00:11<00:00]

By default a SHAP bar plot will take the mean absolute value of each feature over all the instances (rows) of the dataset.
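
A small added check (not part of the original notebook) of what those bar heights are:

[ ]:
import numpy as np

# the default bar plot height for each feature is mean(|SHAP value|) over the dataset
mean_abs_shap = np.abs(shap_values.values).mean(axis=0)
for name, value in sorted(zip(X.columns, mean_abs_shap), key=lambda pair: -pair[1]):
    print(f"{name}: {value:.3f}")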

[60]:
shap.plots.bar(shap_values)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_39_0.png

But the mean absolute value is not the only way to create a global measure of feature importance; we can use any number of transforms. Here we show how using the max absolute value highlights the Capital Gain and Capital Loss features, since they have infrequent but high-magnitude effects.

[61]:
shap.plots.bar(shap_values.abs.max(0))
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_41_0.png

If we are willing to deal with a bit more complexity we can use a beeswarm plot to summarize the entire distribution of SHAP values for each feature.

[62]:
shap.plots.beeswarm(shap_values)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_43_0.png

By taking the absolute value and using a solid color we get a compromise between the complexity of the bar plot and the full beeswarm plot. Note that the bar plots above are just summary statistics from the values shown in the beeswarm plots.

[63]:
shap.plots.beeswarm(shap_values.abs, color="shap_red")
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_45_0.png
[65]:
shap.plots.heatmap(shap_values[:1000])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_46_0.png
[66]:
shap.plots.scatter(shap_values[:,"Age"])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_47_0.png
[67]:
shap.plots.scatter(shap_values[:,"Age"], color=shap_values)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_48_0.png
[69]:
shap.plots.scatter(shap_values[:,"Age"], color=shap_values[:,"Capital Gain"])
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_49_0.png
[75]:
shap.plots.scatter(shap_values[:,"Relationship"], color=shap_values)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_50_0.png

Dealing with correlated features

[77]:
clustering = shap.utils.hclust(X_adult, y_adult)
[78]:
shap.plots.bar(shap_values, clustering=clustering)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_53_0.png
[83]:
shap.plots.bar(shap_values, clustering=clustering, cluster_threshold=0.8)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_54_0.png
[82]:
shap.plots.bar(shap_values, clustering=clustering, cluster_threshold=1.8)
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_55_0.png

Explaining a transformers NLP model

This demonstrates how SHAP can be effectively applied to complex model types with highly structured inputs.

[87]:
import transformers
import nlp
import torch
import numpy as np
import scipy as sp

# load a BERT sentiment analysis model
tokenizer = transformers.DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = transformers.DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
).cuda()

# define a prediction function
def f(x):
    tv = torch.tensor([tokenizer.encode(v, pad_to_max_length=True, max_length=500) for v in x]).cuda()
    outputs = model(tv)[0].detach().cpu().numpy()
    scores = (np.exp(outputs).T / np.exp(outputs).sum(-1)).T
    val = sp.special.logit(scores[:,1]) # use one vs rest logit units
    return val

# build an explainer using a token masker
explainer = shap.Explainer(f, tokenizer)

# explain the model's predictions on IMDB reviews
imdb_train = nlp.load_dataset("imdb")["train"]
shap_values = explainer(imdb_train[:10])
explainers.Partition is still in an alpha state, so use with caution...
[88]:
# plot the explanations for the first three reviews
shap.plots.text(shap_values[:3])

[Interactive shap.plots.text output for the first three IMDB reviews: each review is shown with its base value, its model output f(x), and the SHAP value attributed to each span of tokens, indicating which phrases push the sentiment prediction up or down.]
[90]:
shap.plots.bar(shap_values.abs.mean(0))
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_59_0.png
[91]:
shap.plots.bar(shap_values.abs.sum(0))
../../_images/example_notebooks_general_Explainable_AI_with_Shapley_Values_60_0.png