Zambian Business Times

Probabilistic Predictive Analytics

- By Derek Naminda

CONCEPTS of probability theory are the backbone of many important ideas in data science, from inferential statistics to Bayes’ theorem. It would not be wrong to say that the journey of mastering statistics, and indeed data science, is grounded in probability.

Let’s start by explaining the basics of probability in a simple manner, to avoid an overload of mathematical concepts. I will then explain the mean (also known as the average), the standard deviation and the central limit theorem. If that sounds scary right now, just hang on with me for a few paragraphs. Humor me, I will make it make sense. As usual, each concept will be explained with an example.

Probability is a measure of how likely an event is. If there is a 60% chance that it will rain today, the probability of rain today is 0.6. We use it on a daily basis without necessarily realizing that we are speaking and employing probability. We don’t know the outcome of a particular situation until it happens. Will my favorite team win the toss? Will I get a promotion in the next six months? These are examples of uncertainty.

In an uncertain world, it can be of immense help to know and understand the chances of various events. When you know the chances of an event, you can plan accordingly. Within that lies the concept of predictive analytics. Predictive analytics is an area of statistics that deals with extracting information from data and using it to predict trends and behavior patterns. (We will get back to predictive analytics later.) If it’s likely to rain, you would carry an umbrella. If I am likely to have diabetes on the basis of my food habits, I would get myself tested. If my customer is unlikely to pay a renewal premium without a reminder (that is, likely to churn), I had better remind them about it. Knowing the likelihood is very beneficial. In fact, these are the sort of solutions I create for my company. I run code over a database that calculates likelihoods of this kind, and the algorithm spits out names and the appropriate action for the sales and marketing departments of my company to act on.

Suppose you are anything like my friend who mixes red and white wine: feeling all happy and lucky, and all pumped up to play the game of spinning wheel. There are two colors evenly spread on the wheel, red and black. If you land on red, you lose; if you land on black, you win. So, what happens when you spin the wheel? You either win or you lose, right? Right! There is no third outcome in this case. If the wheel is fair, there is a 50% chance of winning and a 50% chance of losing. Next, suppose the organizer decides to increase the prize money and reduce the black area. Now only 1/4 of the area is black and 3/4 is red. How likely are you to win now? Well, only a 25% chance! The probability of winning is 0.25.
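For readers who like to check such things for themselves, here is a minimal sketch of the wheel in Python. The function name `simulate_wheel`, the number of spins and the seed are my own choices for illustration, not part of any real game. It simply spins an imaginary wheel many times and counts how often it lands on black.

```python
import random

def simulate_wheel(black_fraction, spins, seed=0):
    """Return the share of spins that land on black (a win).

    black_fraction is the part of the wheel painted black, e.g. 0.25.
    The seed is fixed only so that the run is repeatable.
    """
    rng = random.Random(seed)
    wins = sum(1 for _ in range(spins) if rng.random() < black_fraction)
    return wins / spins

# Hypothetical experiment: 100,000 spins of each wheel.
fair_wheel = simulate_wheel(0.5, 100_000)
small_black = simulate_wheel(0.25, 100_000)

print(f"Half-black wheel win rate:    {fair_wheel:.3f}")
print(f"Quarter-black wheel win rate: {small_black:.3f}")
```

Over many spins the win rates should settle near 0.5 and 0.25, which is all that a probability of 0.5 or 0.25 really promises.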

The total probability of all possible outcomes is 1, or 100%. Strictly speaking, a probability is expressed as a number between 0 and 1. Intuitively, an event that will not happen has a probability of 0, and an event that is guaranteed to happen has a probability of 1.

Suppose you are a company that markets and sells medical insurance to senior citizens, and you want to find out retirees’ opinion of your service and the average income of retirees in that population. In statistics, we usually take a sample out of the entire population because it is often too expensive to ask everyone. We look for the mean or the proportion. We use the proportion when the outcome we are looking for is qualitative. For example, here we will get the proportion of retirees that have a favorable opinion of the company. We use the mean when the outcome is quantitative, such as the average income of retirees. So, for my red and white wine mixing friend, his average expenditure on wines is K450 a month.

The sample size is a key factor for a researcher who wants an outcome that is as close to that of the entire population as possible. Put another way, the smaller the sample, the more variance you will get. The larger the sample, the closer you will get to the actual number in the population. The beauty of statistics is that the sample size doesn’t have to be enormous.
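The effect of sample size can be sketched in a few lines of Python. Everything here is made up for illustration: the population of 50,000 retiree incomes, its average of roughly K3,000, and the helper name `spread_of_sample_means`. The sketch draws many random samples of three different sizes and measures how much the sample averages bounce around.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 50,000 monthly incomes, skewed, averaging about K3,000.
population = [rng.expovariate(1 / 3000) for _ in range(50_000)]

def spread_of_sample_means(sample_size, repeats=500):
    """Draw `repeats` random samples and return the standard deviation of their means."""
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.pstdev(means)

spreads = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}

for n, spread in spreads.items():
    print(f"sample size {n:>4}: sample means vary by about K{spread:,.0f}")
```

The spread shrinks as the sample grows, yet even the largest sample here is only 2% of the population.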

This is where the central limit theorem comes in. Thanks to the central limit theorem (CLT), a sample size over 30 will usually give you a good outcome, as long as the sample is truly random. All that the CLT means is that as you take more samples, especially large ones, your graph of the sample means will look more and more like a normal distribution. A normal distribution plotted on a graph looks like a bell, with the mean at the center.

An essential component of the central limit theorem is that the average of your sample means will approach the population mean. In other words, add up the means from all of your samples, find the average, and that average will be very close to your actual population mean. Similarly, if you average the standard deviations of your samples, you will land close to the actual standard deviation of your population. It’s a pretty useful phenomenon that can help accurately predict characteristics of a population.
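A small experiment makes this concrete. The sketch below is my own illustration, using rolls of a fair die as the population because its true mean, 3.5, is known; the sample size of 30 and the 5,000 repeats are arbitrary choices. A single die roll is flat, not bell-shaped, yet the means of samples of 30 rolls cluster around 3.5 in a bell-like way.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the run is repeatable

def sample_mean(n=30):
    """Mean of n rolls of a fair six-sided die."""
    return statistics.fmean(rng.randint(1, 6) for _ in range(n))

# Take 5,000 samples of 30 rolls each and keep each sample's mean.
sample_means = [sample_mean() for _ in range(5_000)]

mean_of_means = statistics.fmean(sample_means)
spread = statistics.pstdev(sample_means)

# In a bell curve, roughly two thirds of values sit within one
# standard deviation of the center.
share_within_one_sd = sum(
    1 for m in sample_means if abs(m - mean_of_means) <= spread
) / len(sample_means)

print(f"average of the sample means: {mean_of_means:.3f} (true mean is 3.5)")
print(f"share within one standard deviation: {share_within_one_sd:.2f}")
```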

Allow me to explain standard deviation. Small-talking with the sommelier at my red and white wine mixing friend’s winery, I found that the average bottle costs around ZMW150. There are obviously customers that pay above and below the average. They may pay ZMW100 or may pay ZMW250. That spread above or below the average, or ‘standard’, is the deviation from that average. Hence the name standard deviation. It is simply a measure of how far values typically sit from the average, above or below. It is expressed in the same units as the data, and the distance of any one value from the mean can then be counted in standard deviations, which is a unit-free number. In a normal distribution, nearly all values fall within 3 standard deviations of the mean. The standard deviation is an arithmetical calculation, but I promised to spare you the math. For now, just keep the intuitive concept above in your back pocket. We will come back to it.
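For those who do want a peek at the arithmetic, here is a minimal sketch. The six bottle prices are invented for illustration so that they average ZMW150; they are not the winery’s real sales. The helper `distance_in_sds` is likewise my own name for what statisticians call a z-score.

```python
import math

# Hypothetical prices paid for six bottles, in ZMW (chosen to average 150).
prices = [100, 120, 150, 150, 130, 250]

mean_price = sum(prices) / len(prices)

# Standard deviation: the square root of the average squared distance from the mean.
variance = sum((p - mean_price) ** 2 for p in prices) / len(prices)
std_dev = math.sqrt(variance)

def distance_in_sds(price):
    """How many standard deviations a price sits above (+) or below (-) the mean."""
    return (price - mean_price) / std_dev

print(f"mean price: ZMW{mean_price:.0f}")
print(f"standard deviation: ZMW{std_dev:.2f}")
print(f"a ZMW250 bottle is {distance_in_sds(250):.2f} standard deviations above the mean")
```

The standard deviation comes out in kwacha, like the prices themselves, while the distance of a single bottle from the average is a plain number.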

Predictive analytics gets its predictive powers from, among others, probability and the central limit theorem. Armed with this, I can predict which customers in my database are likely to move to the competitors and put preventive measures in place before it happens. The winery could use predictive analytics to look at the types of wines my red and white wine mixing friend likes. That means it would need to look at his standard deviation. Does he stick with a certain type of wine, or does he just pick randomly, with no pattern to his wine ordering? That would mean he doesn’t have a ‘mean’; he strays from his usual. In other words, his standard deviation is large. However, given all that information, I can tailor my marketing to him accordingly, given that he seems to order based on the attractiveness of the labels. That alone calls for a little wine education: given other variables, my friend may appreciate a richer knowledge of wines. This can be delivered cheaply over email or, at the very least, the company should have his WhatsApp contact.

At the center of predictive analytics is randomness. Samples should be picked at random, otherwise bias is introduced. A statistician must understand randomness deeply to be able to draw meaningful inferences from their statistics. In sports, and especially basketball, there is a belief that a player who has just made a shot (a ‘hot hand’) will make the next shot too. There is a psychological effect to this. The team gets pumped up and so do the supporters. The player gets the ball and the supporters are cheering him on to shoot; he shoots and makes the shot. The next time the ball gets to him, everyone, including himself, believes he will make the shot because, heck, he just made the one before. This is known as the hot hand fallacy. The hot hand fallacy can lead people to form incorrect assumptions regarding random events.

In coin toss studies, respondents expected even short sequences of tosses to be approximately 50% heads and 50% tails. Two biases are created by this kind of thought pattern. It could lead an individual to believe that the probability of heads or tails changes after a long sequence of either has occurred. Or it could cause an individual to reject randomness due to a belief that a streak of either outcome is not representative of a random sample. It is random. So even a skilled basketball player like LeBron James can miss the 8th shot after making the last 7 three-pointers. Those 7 shots are unrelated.
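Both points about coin tosses can be checked with a short simulation. This is only a sketch under my own assumptions: a perfectly fair coin, 200,000 simulated tosses, a streak length of three and sequences of ten tosses. It asks two questions: after three heads in a row, how often is the next toss heads, and how often does a sequence of ten tosses split exactly 5 and 5?

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# 200,000 tosses of a fair coin; True means heads.
tosses = [rng.random() < 0.5 for _ in range(200_000)]

# Every toss that comes straight after three heads in a row.
after_streak = [
    tosses[i] for i in range(3, len(tosses)) if all(tosses[i - 3:i])
]
heads_after_streak = sum(after_streak) / len(after_streak)

# Cut the same tosses into 20,000 short sequences of ten tosses each
# and count how many split exactly 5 heads and 5 tails.
sequences = [tosses[i:i + 10] for i in range(0, len(tosses), 10)]
exact_split = sum(1 for seq in sequences if sum(seq) == 5) / len(sequences)

print(f"heads straight after three heads: {heads_after_streak:.3f}")
print(f"ten-toss sequences with exactly 5 heads: {exact_split:.3f}")
```

The coin has no memory, so the first number should stay near one half. The second should sit near one quarter, because the exact chance is 252 out of 1,024, which shows that a short sequence is usually not a neat 50/50 split.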

This understanding of randomness helps us use predictive analytics appropriately. Most of the election predictions in the USA during the 2016 presidential election were biased. They lacked the necessary randomness. As such, most of them got the prediction wrong. There were some that correctly predicted that Trump would win. That is how bias was demonstrated, in my opinion, on a grand stage.

There are absolute masterpieces in this world that this author appreciates deeply. Take the Sistine Chapel ceiling paintings. Originally known as the Cappella Magna, the chapel takes its name from Pope Sixtus IV, who restored it between 1477 and 1480. Central to the ceiling decoration are nine scenes from the Book of Genesis, of which the Creation of Adam is the best known, showing the hands of God and Adam almost touching or touching. Almost touching or touching, because the moment you look, you will make your own interpretation. Then there are Einstein’s general theory of relativity, the invention of the bicycle and, as I recently learnt, the Desiderata by Max Ehrmann. An appreciation of the brilliance of all these may require a long apprenticeship.

These are brilliantly beautiful things. Although it doesn’t measure up to the brilliance of the masterpieces above, the central limit theorem is one beautiful thing in probability theory. The things you do are the same things over a long period of time. You regress towards the mean. You buy almost the same package of groceries every time. You often buy clothes from the same apparel maker. You want those clothes to fit the same way all the time. You create patterns over time, regardless of the endeavor. That is what has allowed predictive analytics to flourish in recent years. We all regress towards our own mean!

The author is a data scientist in NC, USA. He is a member of the Chartered Institute of Marketing – UK and can be reached at dereknaminda@gmail.com.
