Interpretability and explainability of machine learning models | PeWe

Once a machine learning algorithm is trained, it can be difficult to understand why it gives a particular response to a given set of inputs. This is a disadvantage: when it is not clear how the algorithm makes its decisions, people tend not to trust it. The objective of interpreting and explaining a machine learning algorithm is to be able to say: “The algorithm makes this decision because these features are the most important.”
In our work, we divide interpretation into two parts: feature selection methods and explanation methods. Feature selection methods choose the features that are most important for training the model and making predictions, which makes the model easier to understand. Explanation methods explain the predictions of the classifier by indicating which features were important for a prediction. In addition to interpreting and explaining, we also focus on classifier performance. We demonstrate that feature selection can improve model performance and that it can distinguish randomly generated data from real data.
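The two parts described above can be illustrated with a minimal sketch. It is not the method used in this work. It assumes synthetic data with one informative feature and one randomly generated feature, uses the absolute Pearson correlation with the label as a stand-in feature selection score, and uses permutation importance on a simple nearest-centroid classifier as a stand-in explanation method. All function names (`make_data`, `correlation_scores`, `permutation_importance`, and so on) are hypothetical and chosen for illustration only.

```python
import random
from statistics import mean, pstdev


def make_data(n=200, seed=0):
    """Synthetic data (assumption): feature 0 depends on the label, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = label + rng.gauss(0.0, 0.5)
        noise = rng.gauss(0.0, 1.0)
        X.append([informative, noise])
        y.append(label)
    return X, y


def correlation_scores(X, y):
    """Filter-style feature selection score: |Pearson correlation| with the label."""
    y_mean, y_std = mean(y), pstdev(y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean, c_std = mean(col), pstdev(col)
        cov = sum((c - c_mean) * (t - y_mean) for c, t in zip(col, y)) / len(y)
        scores.append(abs(cov / (c_std * y_std)))
    return scores


def fit_centroids(X, y):
    """Nearest-centroid classifier: store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [row for row, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def accuracy(centroids, X, y):
    hits = sum(1 for row, t in zip(X, y) if predict(centroids, row) == t)
    return hits / len(y)


def permutation_importance(centroids, X, y, seed=1):
    """Accuracy drop when one feature column is shuffled; larger means more important."""
    rng = random.Random(seed)
    base = accuracy(centroids, X, y)
    drops = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        rng.shuffle(col)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        drops.append(base - accuracy(centroids, X_perm, y))
    return drops


X, y = make_data()
scores = correlation_scores(X, y)          # feature selection view
model = fit_centroids(X, y)
drops = permutation_importance(model, X, y)  # explanation view
print("selection scores:", scores)
print("permutation importance:", drops)
```

For brevity the sketch scores the classifier on the same data it was fitted on; a held-out set would normally be used. Under these assumptions, both views should rank the informative feature above the randomly generated one, which mirrors the idea that feature selection can separate random data from real signal.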