# Bayesian networks: From zero to working model

by Dominik Matula, Data Scientist

Over the years, we have often heard from our clients that black box machine learning solutions are unacceptable. In the previous article, we went over three key questions to ask before using a black box model, and we proposed Bayesian networks as an ideal substitute if black box models are not an option.

This article is intended to be an introduction to the world of Bayesian networks. We will start with basic Bayesian reasoning explained using a simple example. Next, we will tell you how to construct a naïve Bayes classifier using these elementary blocks. Finally, we will combine the previous steps in a full-stack Bayesian network.

## Step 1: Bayesian reasoning units

Well, this chapter is a little bit technical. But don’t worry, all you need is high school math and one equation from your Statistics 101 course: Bayes’ rule, which you probably already know. We will use it in the so-called odds form:

Posterior odds = Prior odds × Likelihood ratio

Those words sound strange, but their meaning is straightforward:

• Prior odds – odds of an event BEFORE taking any data into account. This is something your business expert gives you.
• Posterior odds – odds of the event AFTER taking data into account. This is what you want: a more precise version of the prior estimate.
• Likelihood ratio – how many times more (or less) likely the evidence is when the event is true than when it is not. It tells you how strongly the data should shift your beliefs.
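To make the odds form concrete, here is a minimal Python sketch of a single update. The function names and the numbers (a 20% prior probability and a likelihood ratio of 4) are illustrative assumptions, not values from real data.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability (0 <= p < 1) into odds in favour of the event."""
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of the event back into a probability."""
    return odds / (1.0 + odds)


def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' rule: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


# Hypothetical numbers, chosen only for illustration.
prior_odds = probability_to_odds(0.2)            # 1 : 4
posterior_odds = update_odds(prior_odds, 4.0)    # evidence is 4x more likely if the event is true
posterior_probability = odds_to_probability(posterior_odds)
print(prior_odds, posterior_odds, posterior_probability)
```

A likelihood ratio above 1 pushes the odds up, a ratio below 1 pushes them down, and a ratio of exactly 1 means the evidence tells you nothing.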

And that’s it. So far, so good. So let’s use it in this simple situation:

A client of our bank sent a transaction to an unknown account with the note “refrigerator”. We would like to find out whether it was a loan payment, so we can consider offering to refinance their loan.

So, here is what we have:

1. The question: What are the odds that the transaction is a loan payment?
2. The evidence: The transaction note contains the word “refrigerator”.
3. You get the prior odds directly from your data. That’s the ratio of loan payments to all non-loan-payment transactions your clients make.
4. The posterior odds are what you want to know: How did the evidence influence your belief that the transaction is a loan payment?
5. The likelihood ratio is the tricky one, and we won’t go for the mathematical explanation. Still, you will get it quite easily. Just compute the share of transactions containing the word “refrigerator” among cases (loan payment transactions) and among non-cases (other transactions). Then divide the first share by the second. Note that you divide shares, not raw counts: the two groups usually differ a lot in size.

And voilà, you’ve constructed your first Bayes reasoning unit (also called a Bayesian updating step). Congratulations!
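The steps above can be sketched in a few lines of Python. All counts below are hypothetical and exist only to show the arithmetic; in practice they would come from your own transaction data.

```python
# Hypothetical counts of historical transactions (illustration only).
loan_total = 1_000        # loan payment transactions (cases)
other_total = 99_000      # all other transactions (non-cases)
loan_with_word = 50       # cases whose note contains "refrigerator"
other_with_word = 450     # non-cases whose note contains "refrigerator"

# Prior odds: loan payments versus everything else.
prior_odds = loan_total / other_total

# Likelihood ratio: share of cases with the word divided by
# the share of non-cases with the word.
likelihood_ratio = (loan_with_word / loan_total) / (other_with_word / other_total)

# Bayes' rule in odds form.
posterior_odds = prior_odds * likelihood_ratio

print(f"prior odds       = {prior_odds:.4f}")
print(f"likelihood ratio = {likelihood_ratio:.2f}")
print(f"posterior odds   = {posterior_odds:.4f}")
```

As a sanity check, the posterior odds here equal `loan_with_word / other_with_word`: among the transactions that mention the word, that is exactly the ratio of loan payments to everything else.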

## Step 2: Naïve Bayes classifiers

Of course, one piece of evidence is usually not enough. There are many attributes of a transaction that can be measured—the amount, currency, timestamp, transaction symbols, etc. Thus, you need a way to combine them.
Indeed, those Bayes reasoning units can be linked together. Then, the whole process is described in these 3 steps:

1. Start with the prior odds, just as with the simple Bayes reasoning unit.
2. For each piece of evidence:
   • Use the odds resulting from the previous step as the prior odds.
   • Process the piece of evidence the way we’ve described for the simple Bayes reasoning unit.
3. The final posterior odds are your desired outcome.

That’s it. It feels just like a string of beads.
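The chaining described above amounts to multiplying the prior odds by one likelihood ratio per piece of evidence. Here is a minimal sketch; the function name and the example numbers are assumptions made for illustration.

```python
from typing import Iterable


def chain_bayes_units(prior_odds: float, likelihood_ratios: Iterable[float]) -> float:
    """Apply one Bayes reasoning unit per piece of evidence, in sequence."""
    odds = prior_odds
    for ratio in likelihood_ratios:
        # The posterior odds of the previous unit become the prior odds of the next.
        odds = odds * ratio
    return odds


# Hypothetical likelihood ratios for three pieces of evidence,
# e.g. a word in the note, the currency, and the day of the month.
final_odds = chain_bayes_units(prior_odds=0.01, likelihood_ratios=[11.0, 0.5, 3.0])
print(f"final posterior odds = {final_odds:.3f}")
```

Because multiplication is commutative, the order in which the pieces of evidence are processed does not change the result.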

This simple yet powerful machine learning model is called a naïve Bayes classifier. As you can see, it can be computed very quickly: just multiply several numbers and you’re done. Great!

Maybe you haven’t noticed, but you use this tool daily. For example, it sits at the heart of your inbox spam filter. The pieces of evidence are words that are much more prevalent in spam than in regular emails (e.g., ‘F r e e’, ‘\$\$\$’). And their prevalence gives you the likelihood ratios we need.
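As a rough illustration of the spam filter idea, the sketch below estimates word likelihood ratios from a toy set of emails and scores a new message. The toy emails, the function names, the add-one smoothing and the use of log-odds are all assumptions of this sketch, not a description of any real spam filter. For simplicity it only uses the words that are present in the message.

```python
import math


def word_likelihood_ratios(spam_docs, ham_docs, vocabulary):
    """Estimate P(word in email | spam) / P(word in email | regular) per word."""
    ratios = {}
    for word in vocabulary:
        spam_hits = sum(word in doc for doc in spam_docs)
        ham_hits = sum(word in doc for doc in ham_docs)
        # Add-one smoothing (an assumption of this sketch) avoids division by zero.
        p_spam = (spam_hits + 1) / (len(spam_docs) + 2)
        p_ham = (ham_hits + 1) / (len(ham_docs) + 2)
        ratios[word] = p_spam / p_ham
    return ratios


def spam_log_odds(message_words, prior_odds, ratios):
    """Sum log likelihood ratios of known words; equivalent to multiplying the odds."""
    log_odds = math.log(prior_odds)
    for word in set(message_words):
        if word in ratios:
            log_odds += math.log(ratios[word])
    return log_odds


# Toy data: each email is represented as a set of words.
spam_docs = [{"free", "money"}, {"free", "offer"}, {"win", "money"}]
ham_docs = [{"meeting", "tomorrow"}, {"free", "lunch"},
            {"project", "offer"}, {"meeting", "notes"}]

ratios = word_likelihood_ratios(spam_docs, ham_docs, {"free", "money", "meeting"})
prior_odds = len(spam_docs) / len(ham_docs)
log_odds = spam_log_odds({"free", "money"}, prior_odds, ratios)
print(ratios)
print("spam" if log_odds > 0 else "not spam")
```

Working with logarithms turns the product into a sum, which is a common way to keep the numbers stable when many words are involved.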

This is not the only use case for Bayes classifiers. They are used in a variety of fields, such as medicine, law (listen to this excellent talk from ScienceCafe.cz about Bayesian reasoning and forensic genetics), and sports betting. So let’s start using them in finance and banking!

## Step 3: Bayesian networks

Ok, let’s step back a bit. The word “naïve” in the name hints at an assumption we have glossed over so far. To combine Bayes reasoning units this way, you have to ensure that:

• All pieces of evidence are (conditionally) independent.

That may sound odd. Again, we won’t get into the mathematical details here, but what it means is:

• All the likelihood ratios we used to adjust our prior knowledge need to remain the same regardless of the evidence already processed.

Ok, but what does that mean?!

• Let’s say you have two pieces of evidence:
   • The transaction contains “refrigerator” in the note.
   • The transaction amount is greater than 3 600 €.
• Now, there is a problem:
   • The likelihood ratio for the second piece of evidence is surely not the same once we know about the first piece of evidence. (Just imagine a refrigerator for that price!)
   • Thus, you are not allowed to combine this information this way!
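The problem can be checked directly on data. The sketch below uses hypothetical counts, invented for this illustration, and compares the likelihood ratio of a large amount computed on all transactions with the same ratio computed only on transactions that mention “refrigerator”.

```python
# Hypothetical counts: (is_loan, has_fridge, is_large) -> number of transactions.
COUNTS = {
    (True, True, True): 2,
    (True, True, False): 48,
    (True, False, True): 300,
    (True, False, False): 650,
    (False, True, True): 9,
    (False, True, False): 441,
    (False, False, True): 4_000,
    (False, False, False): 94_550,
}


def share_large(is_loan, has_fridge=None):
    """Share of large-amount transactions within the selected group."""
    selected = {
        key: n for key, n in COUNTS.items()
        if key[0] == is_loan and (has_fridge is None or key[1] == has_fridge)
    }
    total = sum(selected.values())
    large = sum(n for key, n in selected.items() if key[2])
    return large / total


# Likelihood ratio of "amount is large", ignoring the note.
lr_overall = share_large(True) / share_large(False)

# The same likelihood ratio, restricted to notes containing "refrigerator".
lr_given_fridge = share_large(True, has_fridge=True) / share_large(False, has_fridge=True)

print(f"LR(large)                = {lr_overall:.2f}")
print(f"LR(large | refrigerator) = {lr_given_fridge:.2f}")
```

With these made-up counts the two ratios differ widely, so multiplying the overall ratio into odds that already account for the note would overstate the evidence.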

And this is where Bayesian networks come into play. You can split the reasoning into branches that depend on the first piece of evidence. Let’s say we take the second piece of evidence into account only if the first one is not present:

Of course, you have to build this orange Bayes reasoning unit on the corresponding subset of data, here the transactions without “refrigerator” in the note. But the logic remains the same. So it’s not a big deal.
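A minimal sketch of this two-unit network follows. The counts, function names and parameter names are hypothetical. The first unit handles the word in the note and is estimated on all data. The second unit handles the amount and is estimated only on transactions without the word. One assumption beyond the text: the first unit also applies a likelihood ratio when the word is absent, so that the result stays consistent with the counts.

```python
# Hypothetical counts: (is_loan, has_fridge, is_large) -> number of transactions.
COUNTS = {
    (True, True, True): 2, (True, True, False): 48,
    (True, False, True): 300, (True, False, False): 650,
    (False, True, True): 9, (False, True, False): 441,
    (False, False, True): 4_000, (False, False, False): 94_550,
}


def count(is_loan, has_fridge=None, is_large=None):
    """Number of transactions matching the given attribute values (None = any)."""
    return sum(
        n for (loan, fridge, large), n in COUNTS.items()
        if loan == is_loan
        and (has_fridge is None or fridge == has_fridge)
        and (is_large is None or large == is_large)
    )


def likelihood_ratio(evidence, subset):
    """Likelihood ratio of `evidence`, estimated on transactions matching `subset`."""
    share_loan = count(True, **subset, **evidence) / count(True, **subset)
    share_other = count(False, **subset, **evidence) / count(False, **subset)
    return share_loan / share_other


def loan_odds(has_fridge, is_large):
    """Posterior odds that a transaction is a loan payment."""
    odds = count(True) / count(False)  # prior odds
    # First unit: the word in the note, estimated on all transactions.
    odds *= likelihood_ratio({"has_fridge": has_fridge}, subset={})
    if not has_fridge:
        # Second unit: the amount, built only on transactions without the word.
        odds *= likelihood_ratio({"is_large": is_large}, subset={"has_fridge": False})
    return odds


print(f"word present:         {loan_odds(True, False):.4f}")
print(f"no word, large sum:   {loan_odds(False, True):.4f}")
print(f"no word, smaller sum: {loan_odds(False, False):.4f}")
```

In this toy setting the results match the ratios read directly from the counts, for example 300 loan payments to 4 000 other transactions for a large amount without the word.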

And voilà, you’ve created your very first Bayesian network! Congratulations!