Use Octave for Your Statistical Tasks
This thirteenth article in the mathematical journey through open source gives a glimpse of the statistical capabilities of Octave.
Statistics is all about data, probabilities, averages, deviations and random numbers. Octave shows you how to compute these easily.
Data moments
'Moments' here loosely means the various summary measures of a data set: averages, medians, modes, etc. To understand them, let's look at a random data set, which may be generated using any of the various distributions. Let's examine the most 'natural' one, the normal distribution: yes, the bell-shaped one. Figure 1 shows one centred around 0 (the mean) with a spread of 3 (the standard deviation), generated using normpdf(), as listed below. Note that Octave provides a whole family of such functions for the other distributions as well.
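As a minimal sketch of the bell curve just described (mean 0, standard deviation 3), the density can be written out by hand and plotted. The variable names, the grid from -12 to 12, the sample size of 1,000 and the fixed seed are all illustrative assumptions, not taken from the original listing. Where normpdf() is available, normpdf(x, mu, sigma) should give the same values; in recent Octave releases it may need the statistics package to be loaded.

```octave
% Bell curve centred at 0 (the mean) with a spread of 3 (the standard deviation).
mu = 0;
sigma = 3;
x = linspace(-12, 12, 241);     % evaluation grid covering +/- 4 sigma (assumed range)

% Normal probability density written out explicitly.
y = exp(-(x - mu) .^ 2 / (2 * sigma ^ 2)) / (sigma * sqrt(2 * pi));

plot(x, y);
xlabel("x");
ylabel("probability density");

% A random data set following the same distribution, built from randn().
randn("state", 42);             % fixed seed (assumed) so that reruns match
data = mu + sigma * randn(1, 1000);
printf("sample mean = %g, sample std = %g\n", mean(data), std(data));
```

The sample mean and standard deviation printed at the end should land close to 0 and 3, though not exactly on them, since the data set is finite.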
Among the many moments, the four common ones are: 1) median() gets the middlemost element in the sorted arrangement of data; 2) mode() gets the most frequently occurring data point; 3) cov() gives the variance between two sets of data points, i.e., the covariance; and 4) cor() gives the relation between two sets of data points, i.e., the correlation, ranging from -1 to 1 (newer Octave releases provide this function as corr()). A correlation of 1 indicates that the two sets are completely (linearly) related, 0 indicates no linear relation between them, and -1 indicates that they are completely related, but inversely.
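The four measures above can be tried on a small hand-made data set, where the answers are easy to check by eye. This is only a sketch: the vectors a, b and c are made up for illustration, and corr() is used in place of cor() on the assumption of a recent Octave release. The two-argument form cov(a, b) appears to return a different shape in different Octave versions, so the sketch passes a two-column matrix and reads the off-diagonal entry instead.

```octave
% Small hand-made data sets (illustrative values only).
a = [2 4 4 5 7 9];
b = 2 * a + 1;          % rises exactly in step with a
c = -a;                 % falls exactly as a rises

md = median(a);         % middle of the sorted data: (4 + 5) / 2
mo = mode(a);           % most frequent value: 4 occurs twice

% Covariance matrix of the columns [a, b]; entry (1, 2) is cov of a with b.
C = cov([a(:), b(:)]);
cab = C(1, 2);

% Correlation of column vectors: +1 for b, -1 for c.
rab = corr(a(:), b(:));
rac = corr(a(:), c(:));

printf("median = %g, mode = %g\n", md, mo);
printf("cov(a, b) = %g\n", cab);
printf("corr(a, b) = %g, corr(a, c) = %g\n", rab, rac);
```

Because b is an exact linear function of a, the covariance here is simply twice the variance of a, and the two correlations sit at the ends of the -1 to 1 range.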
Visualising the probabilities
Random numbers are a beautiful example of probabilities. They occur as per their probabilities, decided by the probability density function (PDF) they follow. Let's visualise that, using a live example. As in the above example, let's look at some random numbers following the normal distribution. This time, though, we should take a lot more points, say 100,000, so that we can actually see them following the bell shape. Even so, we do not have infinitely many points to give the continuous bell. So, what we have to do is collect the random points into pre-designated buckets of fixed ranges and count them. That is what a histogram is. histc() does exactly that, returning the number of points in each of the buckets, which we can then plot to see our bell. Another function, hist(), does the bucketing and the plotting in one go, and more attractively. Figure 2 shows the two plots, for which the code is as follows:
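A hedged sketch of the bucketing idea, not necessarily the listing that produced Figure 2: the points are drawn with randn() scaled to a standard deviation of 3, and the bucket edges (width 0.5, from -12 to 12), the 50 bins passed to hist() and the fixed seed are all assumed values chosen for illustration.

```octave
randn("state", 1);              % fixed seed (assumed) so that reruns match
n = 100000;
pts = 3 * randn(1, n);          % normal points: mean 0, standard deviation 3

% Bucket edges; histc() counts the points with edges(k) <= x < edges(k+1).
% Points outside the overall range are not counted.
edges = -12:0.5:12;
counts = histc(pts, edges);

subplot(1, 2, 1);
plot(edges, counts);            % the bell emerges from the bucket counts
title("histc() and plot()");

subplot(1, 2, 2);
hist(pts, 50);                  % hist() does the bucketing and draws the bars itself
title("hist()");
```

With narrower buckets the outline follows the bell more finely but gets noisier, as each bucket then holds fewer points; wider buckets give a smoother but coarser shape.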