The Definitions of Interpretability and Learning of Interpretable Models

05/29/2021
by   Weishen Pan, et al.
0

As machine learning algorithms getting adopted in an ever-increasing number of applications, interpretation has emerged as a crucial desideratum. In this paper, we propose a mathematical definition for the human-interpretable model. In particular, we define interpretability between two information process systems. If a prediction model is interpretable by a human recognition system based on the above interpretability definition, the prediction model is defined as a completely human-interpretable model. We further design a practical framework to train a completely human-interpretable model by user interactions. Experiments on image datasets show the advantages of our proposed model in two aspects: 1) The completely human-interpretable model can provide an entire decision-making process that is human-understandable; 2) The completely human-interpretable model is more robust against adversarial attacks.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro