Machine Learning Without Black Boxes

Mar 18, 2020 by Dominik Matula, Data Scientist

With the growth in computing power over recent years, almost anyone can now run a sophisticated machine learning model such as a neural network. And with modern frameworks such as TensorFlow, it is both easy and popular to do everything with such a cool tool.

But over the years, we have often heard from our clients that black box solutions are unacceptable to them or are not allowed under current legislation. So the question is: do you have any alternatives to black box solutions in your data science toolbox? Of course we do! We would like to devote this article to one possible solution to this problem, a simple yet equally fascinating machine learning model: Bayesian networks.

Is a black box model suitable for your problem?

A simplified version of Moore’s Law says that computing power is increasing exponentially. The increase in performance allows for considerable progress in the field of data science. What seemed incalculable to statisticians 30 years ago can now be calculated within a few seconds on a standard laptop. The massive progress in computer technology also makes it possible to train increasingly sophisticated machine learning models. And I am not only talking about deep neural networks. In predictive modelling, so-called ensemble models are prevalent, for example. Ensemble models are created by combining sub-models which can be very complex themselves.

But before we decide to use such an ML tool in practice, we should ask ourselves three key questions:

Do we have a vast quantity of high-quality historical data available?
Do we have sufficient computing power to process this data?

(As we have mentioned before, this is usually not the most pressing problem.)

Are we willing to pay for the model's performance with a loss of interpretability?

The last criterion points to a well-known fact: complex ML tools are black box models.

It is very difficult, if not impossible, to extract a causal statement from them, for instance: the client did X, therefore the model assigned them value Y. Business owners are then left with no choice but to trust that everything inside works as it should.

White box models

Please do not misunderstand me; I do not want to detract from the value of modern ML frameworks. On the contrary! From a professional point of view, they are truly fascinating! And if you give me a problem, I will be happy to design and build a high-performance solution based on them.

However, it should be remembered that such a solution is far from realistic for every problem. Moreover, in machine learning for finance and banking, a black box approach is often not allowed. (The same can apply to its results: are you sure, for example, that your model is not discriminating against any ethnic group?) And this is where the white box comes into play.

A white box is not the best name to describe how it works. A transparent box would be a better name, as the key advantage of this kind of machine learning model is the ability to get an in-depth look at the algorithm.

It is a solution that statisticians came up with back when computers were still in their infancy. At the time, all calculations were done by hand, so a black box solution was out of the question. But there is no need to limit ourselves to regression models, decision trees and other classical tools, especially as they are often criticised for running out of breath compared to modern approaches.

Bayesian networks – an ideal counterpart to black box ML

And here we come back to Bayesian networks, our favourite white-box model for developing end-to-end solutions. They perform excellently, and they address more than just the transparency issue. Bayesian networks answer all three of the questions raised above about black box machine learning models:

Bayesian networks can be built on much smaller training sets. Furthermore, the complexity of the network can be controlled and increased as the number of samples grows.
Bayesian network training is straightforward. I will jump ahead and say that it is enough to evaluate the contribution of each node of the network and put these values together.
The output from Bayesian networks is easy to interpret. Every decision of the model can be traced and checked.
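To make the last two points a little more concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not a production implementation: the three-node structure (late_payment and high_debt as parents of default), the toy records and the helper names fit_cpts and joint_probability are all made up for this example. Training is done by counting the observed frequencies for each node separately, and the probability of a case is the product of the per-node contributions, each of which can be inspected.

```python
from collections import Counter

# Hypothetical network structure: each node maps to the tuple of its parents.
STRUCTURE = {
    "late_payment": (),
    "high_debt": (),
    "default": ("late_payment", "high_debt"),
}

# Made-up toy records, purely for illustration.
RECORDS = [
    {"late_payment": True,  "high_debt": True,  "default": True},
    {"late_payment": True,  "high_debt": True,  "default": True},
    {"late_payment": True,  "high_debt": False, "default": False},
    {"late_payment": True,  "high_debt": False, "default": True},
    {"late_payment": False, "high_debt": True,  "default": False},
    {"late_payment": False, "high_debt": True,  "default": False},
    {"late_payment": False, "high_debt": False, "default": False},
    {"late_payment": False, "high_debt": False, "default": False},
]


def fit_cpts(records, structure):
    """Estimate each node's conditional probability table by counting."""
    cpts = {}
    for node, parents in structure.items():
        joint = Counter()   # counts of (parent values, node value)
        margin = Counter()  # counts of parent values alone
        for row in records:
            key = tuple(row[p] for p in parents)
            joint[(key, row[node])] += 1
            margin[key] += 1
        # Relative frequency of the node value given its parents' values.
        cpts[node] = {k: n / margin[k[0]] for k, n in joint.items()}
    return cpts


def joint_probability(row, cpts, structure):
    """Multiply the contribution of every node; return the product and the factors."""
    factors = {}
    prob = 1.0
    for node, parents in structure.items():
        key = tuple(row[p] for p in parents)
        # Combinations never seen in training get probability 0 in this sketch.
        factors[node] = cpts[node].get((key, row[node]), 0.0)
        prob *= factors[node]
    return prob, factors


cpts = fit_cpts(RECORDS, STRUCTURE)
case = {"late_payment": True, "high_debt": False, "default": True}
prob, factors = joint_probability(case, cpts, STRUCTURE)
print(factors)  # the contribution of each node, open to inspection
print(prob)     # 0.125
```

The factors dictionary is what makes the result traceable: for the case above, each of the three nodes contributes 0.5, and the joint probability is simply their product. A real project would also have to handle unseen combinations (for example by smoothing the counts) and choose the network structure, which this sketch takes as given.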

Have you ever heard about Bayesian networks? They are not a fancy new tool in the world of data science. Professor Judea Pearl came up with them 30 years ago, and they have since earned a reputation in various fields, such as medicine. But why not adopt this approach in the field of finance?

In the next article, we will describe the basic concepts behind Bayesian reasoning. You will try this technique out on a small example, and then we will move on to a real Bayesian network. So stay tuned!