|Attack detection via randonm forests.ipynb|
Artificial intelligence (AI) is used for many purposes that are critical to human life. To forestall an algorithm-based authoritarian society, decisions based on machine learning ought to inspire trust by being explainable. Explainability is not only desirable for ethical reasons; it is also a legal requirement enshrined in the European Union's General Data Protection Regulation. Additionally, explainability is valuable to the scientists and experts who depend on AI methods. To achieve AI explainability, it must be feasible to obtain explanations systematically and automatically. A common way to explain decisions made by a deep learning model (a.k.a. black-box model) is to build a surrogate model based on a simpler, more understandable decision algorithm. In this work, we focus on explaining the behavior of black-box models trained via federated learning. Federated learning is a decentralized machine learning technique that aggregates partial models, each trained by a peer on its own private data, into a global model. In our approach, we use random forests containing a fixed number of decision trees of restricted depth as surrogates of the federated black-box model. Our aim is to determine the causes underlying misclassification by the federated model, which may reflect manipulations introduced into the model by some peers or by the model manager itself. To this end, we leverage partial decision trees in the forest to compute the importance of the features involved in the wrong decisions. We have applied our approach to detect security and privacy attacks that malicious peers or the model manager may orchestrate in federated learning scenarios. Empirical results show that we are able to detect attacks with high accuracy and, in contrast with other attack detection mechanisms, to explain how such attacks operate.
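The surrogate idea described above can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. Everything specific here is hypothetical and chosen for illustration only: the synthetic data, the `black_box_predict` function standing in for the federated model, the toy manipulation on feature 3, and the forest size and depth. The importance score is a simple path-frequency proxy (how often each feature is tested on the decision paths of misclassified samples) and is not necessarily the measure used in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (assumption): 5 features, true label depends on features 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)


def black_box_predict(X):
    """Hypothetical stand-in for the federated black-box model.

    Toy manipulation: any sample whose feature 3 exceeds 1.5 is forced
    into class 1, regardless of its true label.
    """
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    y[X[:, 3] > 1.5] = 1
    return y


y_bb = black_box_predict(X)

# Surrogate: a fixed number of trees of restricted depth, fitted to the
# black-box outputs (not to the true labels).
surrogate = RandomForestClassifier(n_estimators=20, max_depth=3, random_state=0)
surrogate.fit(X, y_bb)
fidelity = surrogate.score(X, y_bb)  # agreement between surrogate and black box

# Samples the black box gets wrong with respect to the true labels.
wrong = y_bb != y_true
X_wrong = X[wrong]

# Count how often each feature is tested along the decision paths that the
# misclassified samples follow in every tree of the forest.
counts = np.zeros(X.shape[1])
for tree in surrogate.estimators_:
    visits = np.asarray(tree.decision_path(X_wrong).sum(axis=0)).ravel()
    split_feature = tree.tree_.feature  # negative values mark leaf nodes
    internal = split_feature >= 0
    np.add.at(counts, split_feature[internal], visits[internal])

importance = counts / counts.sum()

print(f"surrogate fidelity: {fidelity:.3f}")
print(f"misclassified samples: {int(wrong.sum())}")
print("path-frequency importance per feature:", np.round(importance, 3))
```

A feature that appears unusually often on the paths of wrong decisions, relative to its role in the true labelling rule, is a candidate explanation for the misclassification. In a real setting the black-box outputs would come from the trained federated model rather than from a hand-written function.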