Perhaps no-one has done more for the cause of data-driven decision-making in the minds of the public than Nate Silver. His book, The Signal and the Noise, explains how statistical modelling improves our predictions about everything from the weather to sports to the stock market. Data science is the hottest field to be in right now, and Silver is its poster child.

But for most people, the gulf between recognising the importance of data and actually beginning to analyse it is massive. How do those without extensive training in statistics equip themselves with the skills necessary to thrive (or even just survive) in our age of ‘big data’?

Last month I had the chance to put that ques­tion to Sil­ver, and his an­swers may sur­prise you. Far from coun­selling that ev­ery­one must ma­jor in sta­tis­tics, in the edited con­ver­sa­tion be­low he ad­vises stu­dents and ex­ec­u­tives alike to roll up their sleeves – no mat­ter their sta­tis­ti­cal lit­er­acy – and get their hands dirty with data.

I think the best train­ing is al­most al­ways go­ing to be hands-on train­ing. In some ways the book is fairly ab­stract, partly be­cause you’re try­ing to look at a lot of dif­fer­ent fields. You’re try­ing not to make crazy gen­er­al­i­sa­tions across too many spheres.

But my ex­pe­ri­ence is all work­ing with base­ball data, or learn­ing game the­ory be­cause you want to be bet­ter at poker, right? Or [you] want to build bet­ter elec­tion mod­els be­cause you’re cu­ri­ous and you think the cur­rent prod­ucts out there aren’t as strong as they could be. So, get­ting your hands dirty with the data set is, I think, far and away bet­ter than spend­ing too much time do­ing read­ing and so forth.

Again, I think the ap­plied ex­pe­ri­ence is a lot more im­por­tant than the aca­demic ex­pe­ri­ence. It prob­a­bly can’t hurt to take a stats class in univer­sity.

But it re­ally is some­thing that re­quires a lot of dif­fer­ent parts of your brain. I mean the thing that’s tough­est to teach is the in­tu­ition for what are big ques­tions to ask – that in­tel­lec­tual cu­rios­ity. That bullsh*t de­tec­tor, for lack of a bet­ter term, where you see a data set and you have at least a first ap­proach on how much sig­nal there is there. That can help to make you a lot more ef­fi­cient.

That stuff is kind of hard to teach through book learning. So it’s by experience. I would be an advocate if you’re going to have an education, then have it be a pretty diverse education so you’re flexing lots of different muscles.

You can learn the tech­ni­cal skills later on, and you’ll be more mo­ti­vated to learn more of the tech­ni­cal skills when you have some prob­lem you’re try­ing to solve or some fi­nan­cial in­cen­tive to do so. So, I think not spe­cial­is­ing too early is im­por­tant.

I mean my path has been kind of sui generis in some ways, right? Prob­a­bly an on­line course could work, but I think ac­tu­ally when peo­ple are self-taught with oc­ca­sional guid­ance, with oc­ca­sional pushes here and there, that could work well.

An ideal sit­u­a­tion is when you’re study­ing on your own and maybe you have some type of men­tor who you talk to now and then. You should be alert that you’re go­ing to make some dumb mis­takes at first. And some will take a one-time cor­rec­tion. Oth­ers will take a life­time to learn. But yes, peo­ple who are mo­ti­vated on their own, I think, are al­ways go­ing to do bet­ter than peo­ple who are fed a diet of things.

Say an or­gan­i­sa­tion brings in a bunch of ‘stat heads’, to use your ter­mi­nol­ogy. Do you silo them in their own depart­ment that serves the rest of the com­pany? Or is it im­por­tant to make sure that ev­ery team has some­one who has the ana-

