BCX has a re­cruit­ment prob­lem and the so­lu­tion is data sci­ence

In 2011, the international management consulting firm McKinsey & Company estimated that the US would face a shortage of almost 200 000 workers with deep analytical skills by 2018. That same report also predicted a shortage of 1.5-million managers and analysts who know how to employ big data effectively. By 2015, the US had seven million job listings that required some computer programming or general coding skills, and that job market is growing 12% faster than any other. International market analyst firm Gartner puts the growth in demand for data scientists at about three times that of statisticians and business-intelligence analysts.

When the Explore Data Science Academy (EDSA) opened its doors at the end of 2017, South Africa was already behind the curve. The global data science skills shortfall was estimated at two million jobs. A R50-million war chest, courtesy of BCX (Telkom’s enterprise arm), will fund 300 candidates over three years. Three hundred doesn’t really sound like a particularly high number of candidates when considering those shortfall numbers, but it definitely makes for great employment forecasting.


If you weren’t aware of how important data science has become to society, the recent Facebook controversy probably wised you up to this field of study. When Cambridge Analytica gained access to 50 million Facebook users’ personal data, it took the diligent work of a few dedicated data scientists to identify the targets, which could then be attacked using various strategies, all to advance the Trump campaign’s agenda. Opposition politicians were discredited through real-world bribery methods such as staging a trap where they could engage in online affairs. On a lower level, the campaign could then directly target potential voters with more tailored messaging.

All it took was for about 270 000 people to interact with an application known as thisisyourdigitallife, and this breach was possible through a loophole in data-access policies.

“It’s no longer good enough for an educated person to say: ‘I didn’t think of it,’” says EDSA student Fortunate. She came to the programme by way of an industrial-engineering degree, a year’s worth of industry experience and then a year of teaching English in Thailand. Her commentary is aimed at the data industry, as well as the professional responsibility that comes with it.

“Un­less you have the money to cut your­self off and live on a small is­land, and not carry a smart­phone or have so­cial me­dia ac­counts, you can’t stop your govern­ment or your com­pany from tracking you. Even if they don’t know your name or per­sonal de­tails, they still have data about some­one like you. They can still iden­tify you with­out know­ing your name. I don’t think that’s healthy, although the rea­son I got into in­dus­trial engi­neer­ing and data sci­ence is to find out more about how ma­chines or sys­tems re­late to hu­mans. A lot of the big tech sto­ries have been ex­am­ples of where an­other hu­man sim­ply wasn’t con­sid­ered in the big pic­ture of things. Now you can build some­thing for Cape Town and it can scale to reach a place like Kathu, but in the same breath, a lot of peo­ple who built that so­lu­tion will say that they never thought that it would get that far. They don’t think about the peo­ple in­volved in us­ing the sys­tems.”

She men­tions Kathu specif­i­cally, be­cause that’s where a fel­low stu­dent named An­dries is orig­i­nally from. His EDSA path started with a master’s de­gree in bioin­for­mat­ics. “I’ve been work­ing with data, but in a com­pletely dif­fer­ent sense al­most, from a bi­o­log­i­cal point of view, but data is data,” he ex­plains of his ca­reer tra­jec­tory. “At the mo­ment, we’re work­ing in a com­pletely dif­fer­ent field to what I’m trained in.

“I’m hoping that in the future I can go back into biology. I think the skills are so general that you can go anywhere and apply your knowledge to it. That’s also why I’m here. I think my passion is more the computer-science part – biology was like a nice side project that I comfortably slid into,” he says. “I started with a degree in genetics because it’s kind of the programming of biology. My passion for computer science only grew during that. There were very few courses to choose from when I started out, but now it’s kind of blooming.”

EDSA was launched in Oc­to­ber of 2017 and the class of 2018 couldn’t do more to chal­lenge the per­cep­tions of the South African IT in­dus­try. Vis­i­tors to the academy will be shocked to find a sus­pi­cious lack of the stereo­typ­i­cal white male com­puter nerd. BCX put up the R50 mil­lion for 300 can­di­dates to com­plete a free year-long course and in­tern­ship over three years. Sim­ple arith­metic dic­tates that the class of 2018 is 100 strong. Eighty-six stu­dents are non-white, 42 are fe­male, 54 are un­der the age of 25 and 42 only have a ma­tric cer­tifi­cate.

Tick­ing all of those afore­men­tioned boxes is Sharné, a 2017 ma­tric­u­lant from Mitchell’s Plain. “I was part of an or­gan­i­sa­tion that teaches cod­ing for girls, called Code for Cape Town. I did that since grade 10,” she says. “When they find out about tech-re­lated things, they tell us. They sent me a link to ap­ply and it sounded like some­thing that I wanted to do. At first it was very scary be­cause all the other peo­ple had de­grees and stuff, but I’m get­ting it now. I had to learn ev­ery­thing.”

CODE4CT is a widely ac­claimed ini­tia­tive that in­tro­duces girls to ba­sic web build­ing skills and ex­poses them to op­por­tu­ni­ties in the ICT (in­for­ma­tion and com­mu­ni­ca­tion tech­nolo­gies) in­dus­try. The or­gan­i­sa­tion is cur­rently an­tic­i­pat­ing 75 000 cod­ing-re­lated jobs to come on­line in South Africa by 2020, with the bulk of those jobs pro­jected to come in the coun­try’s fastest-grow­ing eco­nomic sec­tors.

Data science, however, is far removed from the JavaScript Sharné has become used to dealing with while a CODE4CT candidate. But she still holds her own in a room filled with degree holders because of her digital literacy, a solid foundation for any young professional looking for a foothold to enter the modern workplace.

“Before I actually went to high school, I knew I wanted to do something with computers, but I didn’t know what. When I got to high school, I wanted to be a programmer. But, ever since being here, where you’re doing more than just programming, I’ve found that there are more aspects to it. I love solving problems, and this course gives me the ability to come up with an idea and build it myself.”

A big factor in the rising demand for data analysis is the current internet of things (IoT) boom. Wide deployment of sensors that constantly feed back information to central servers makes for mountains of data that need to be sorted through. But there aren’t enough humans on the planet to do that kind of slog work, so the role of the data scientist then turns to informing the systems and processes that can easily deal with those tasks.

Andries hails from iron-ore country out in the Northern Cape and feels hopeful about the future application of technology in his home town. “We are in a very fortunate position in the world right now with everything being connected. Kathu is a mining town, mining iron ore, so there are very practical applications for data science, such as optimising ore extraction or transport routes. But, for the general population, there’s a trend for most people carrying smartphones.

“I don’t know if you’re fa­mil­iar with Strava,” he con­tin­ues. “They re­lease heat maps ev­ery year. With ev­ery­one car­ry­ing a smart­phone, ev­ery­one has the po­ten­tial to be mon­i­tored all the time and with that comes po­ten­tial ben­e­fits like be­ing able to track your health. That’s where we tie into it. Now we can write a sim­ple app, mon­i­tor­ing how some­one sleeps or how their heart is func­tion­ing, and put all that data to­gether and help peo­ple.

“There are other ma­chine-learn­ing tech­niques that can be im­ple­mented, like you take a photo of your­self over time and get a med­i­cal opin­ion. I think there isn’t this bar­rier any­more where res­i­dents of a small town far away can’t get ac­cess to tech­nol­ogy. What ap­plies to some­one in Cape Town ver­sus what ap­plies to some­one liv­ing in a ru­ral area is now the same.”

The first task EDSA students were exposed to was helping the Western Cape government deal with the city’s crippling water shortage. As Popular Mechanics has covered in the past, a big part of COCT’s (City of Cape Town) messaging was the quality of their reference data. Fortunate gives some insight into how the task teams dealt with the analysis:

“With the last project we had – the Cape Town water crisis – we were discussing how institutions choose to share data that is available. Some people were saying that you can’t do this because it’s too political, but as the world moves toward being more connected, each company and data scientist has a responsibility to understand what the implications are in saying, ‘This community is using too much water,’ or, ‘This person is doing this.’”

That project is now complete, and EDSA was due to present its findings to the City as this magazine went to print. Key findings were that COCT doesn’t report well on its data and that was a drawback for the project. Dealing with annual consumption data forced too many assumptions. On the upside, COCT’s model was very pessimistic, with Day Zero probably following about a month after the dates that were specified – this is a good safety net for planning purposes.

An­dries’ group (stu­dents split up into task teams of about eight in a group) found that wa­ter isn’t reach­ing the dams ef­fi­ciently and at­trib­uted that to a va­ri­ety of rea­sons, in­clud­ing drought and in­va­sive alien plants.

For­tu­nate iden­ti­fied that the data re­port­ing omit­ted dec­i­mals, which can mas­sively af­fect the real read­ings. She was sat­is­fied with the re­duc­tion in leaks and how the city upped its game to bring down those losses to around 20%, but less happy with the fact that just the 20% wa­ter loss through leaks trans­lates to 10 litres of wa­ter per per­son per day.
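To get a feel for why dropped decimals matter, here is a minimal sketch in Python. The readings are made-up monthly household figures in kilolitres, not EDSA or City data, and the helper name `truncation_loss` is purely illustrative. It assumes the reporting simply chops off everything after the decimal point, which may not be exactly how the City's figures were produced.

```python
import math


def truncation_loss(readings):
    """Compare the true total with the total after decimals are dropped.

    Returns (true_total, reported_total, shortfall).
    """
    true_total = sum(readings)
    # Assumption: the report keeps only the whole-number part of each reading.
    reported_total = sum(math.floor(r) for r in readings)
    return true_total, reported_total, true_total - reported_total


# Hypothetical monthly household readings in kilolitres (illustrative only).
readings = [5.9, 6.4, 3.7, 8.2, 4.8]

true_total, reported_total, shortfall = truncation_loss(readings)
print(f"true total: {true_total:.1f} kl")
print(f"reported total: {reported_total} kl")
print(f"unreported: {shortfall:.1f} kl ({shortfall / true_total:.0%} of real use)")
```

Each reading loses less than one kilolitre, but the losses all point the same way, so across thousands of households the gap between reported and real consumption keeps growing.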

“It’s those kind of in­sights that are just sort of sit­ting there, which you can’t re­ally un­earth un­less you have the skills that we are learn­ing now,” says For­tu­nate. “The only in­for­ma­tion you have out­side of that is that your sub­urb is us­ing this much wa­ter and that num­ber is pretty much mean­ing­less to you, but with ev­ery­thing else like the sup­ply and de­mand fac­tors com­ing to­gether, it tells a very com­pelling story.”

Sharné points to an­other anom­aly in the way sub­urbs were re­ported. “Some years, the City had ex­tra sub­urbs, and it makes it dif­fi­cult to com­pare an­nual data, be­cause some­times the sub­urbs are added and other times not.”

The stu­dents are ma­ture enough to un­der­stand the grav­ity of the things they are un­earthing. They are aware of oth­ers who also pos­sess th­ese skills and the broader im­pli­ca­tions of the dam­age that could be caused by a rogue agent con­duct­ing sim­i­lar re­search with ma­li­cious in­tent.

“We need to be aware of the re­spon­si­bil­i­ties we have,” says For­tu­nate. “The Western Cape govern­ment has a re­spon­si­bil­ity to be as pes­simistic as pos­si­ble, be­cause the peo­ple have elected them to look after the best in­ter­ests of the prov­ince and the only thing they can do is scare them, and cur­rently that’s the model be­ing used. But, hav­ing done that, they need to fol­low through with vis­i­ble of­fi­ci­at­ing, such as me­ter read­ings, or else it un­der­mines the scare tac­tics be­cause now more than ever, you can get the num­bers and dis­agree with them. For the peo­ple who have the data, that re­spon­si­bil­ity comes back to talk­ing to peo­ple like adults.”

And that may be the biggest impact the proliferation of data-analysis skills will have: it will empower more people to digest data more intelligently. Having knowledge of a situation is all we could ever ask for as citizens, because that knowledge will inform our decisions. Without that knowledge, you are a passenger in the process; you can only listen to what is told to you.

One of the sug­ges­tions that EDSA came up with when work­ing on this project is to crowd­source data. “If you, as a home owner, could do your own read­ings and sub­mit it on an app, then the City has bet­ter data on wa­ter us­age and then a clearer pic­ture of where the wa­ter is go­ing,” An­dries ex­plains.

We’ve heard a lot about how self-absorbed millennials are, but spending time with younger South Africans and even baiting them into making snap judgements about the way the older generations went about things reveals a very thoughtful youth.

“It’s dif­fi­cult to com­pare to­day’s gen­er­a­tion and their eth­i­cal re­spon­si­bil­i­ties with those of our par­ents. It sounds corny, but with great power comes great re­spon­si­bil­ity. Our lives are op­ti­mised in a way our par­ents and grand­par­ents could never have imag­ined,” says An­dries when probed about the fail­ings of his fore­fa­thers. “I can go on my phone and see traf­fic is bad and it would take me longer to drive to work than to walk. That was never an op­tion for my par­ents.

“Per­son­ally, I’m will­ing to make some sac­ri­fices to pay for all of th­ese op­ti­mi­sa­tions. You have to be care­ful to not ac­knowl­edge the fact that times are chang­ing. In the Dark Ages, you couldn’t walk out­side your cas­tle with­out some­one com­ing up to kill you; it’s just a com­pletely dif­fer­ent way of be­ing killed with your pri­vacy be­ing com­pro­mised. We sim­ply need to ad­just.”

The team behind EDSA is a group of former actuaries, programmers and business analysts who want to give back. Shaun Dippnall is the de facto figurehead, and he comes to the project by way of the University of KZN, where he was an actuarial lecturer, and an impressive CV that lists Old Mutual as one of his more senior positions.

“It’s going brilliantly. We had 100 at the start and only two have left for good jobs. The energy has been good. Our KPI (key performance indicator) is how many kids are in the lab on a Friday night; it’s a big marker of engagement. And it’s pretty much full every Friday after 5 pm,” says Shaun. “The Cape water-crisis project was delivered and it looks really good, with great outcomes and insights.”

Twenty teams presented findings on the water-crisis project and the management team then aggregated those into a single presentation that will be given to the City. The next hurdle for the students is to get to grips with machine learning, and then dive straight into a six-week practical project from the end of July. After that come three months of work experience.

“We’re busy talk­ing to cor­po­rate South Africa to spon­sor us a project where they can take ten stu­dents to solve the prob­lem or do the project in their work en­vi­ron­ment. That way, there’s strong hope that the stu­dents can im­press and get a job.”

The tricky part is en­sur­ing that the em­ploy­ment op­por­tu­nity fore­casts match the ac­tual skills needs felt by cor­po­rate South Africa. And that has been the tightrope that all boom­ing in­dus­tries needed to walk.

But that’s year one done. Year two starts re­cruit­ing in July and the hype ma­chine is in full swing. “We get emails daily ask­ing when ap­pli­ca­tions start for 2019,” says Shaun. EDSA an­tic­i­pates in ex­cess of 10 000 ap­pli­ca­tions and the cur­rent idea is to get more com­pa­nies to spon­sor more stu­dents, with a hope to get an­other cou­ple of hun­dred and ex­pand the pro­gramme to Joburg.

“We were de­lib­er­ately very tough and most peo­ple have felt over­whelmed,” he ex­plains about the cur­ricu­lum. “Within that, we’ve had to stream a co­hort and give them ex­tra pro­gram­ming be­cause that can be dif­fi­cult. There are about 20 stu­dents be­ing sup­ple­mented with more pro­gram­ming stuff for about a month. Other changes were to make the block pe­ri­ods four weeks long in­stead of three, so that they don’t burn out.”

EDSA is an in­tel­li­gently de­signed pro­gramme with tan­gi­ble out­comes in one of the sex­i­est job mar­kets right now. But what ex­actly do the stu­dents ac­tu­ally want to gain from it, and how do they see their ca­reers be­yond the pro­gramme?

Sharné has two op­tions in mind after EDSA. “Ei­ther I go to var­sity and study, or I’m work­ing as a data sci­en­tist for a com­pany. But that’s for me to de­cide dur­ing this course, be­cause it all de­pends on which com­pany I get placed with, and why. I also want to build things be­cause it re­ally helps peo­ple. I was part of a project two years ago that helped ru­ral preg­nant peo­ple track when they would most likely give birth.”

Andries has his eye on something particularly interesting: a position at Google DeepMind. “I would love to go into proper machine learning, and do algorithms or development of the algorithms. At the moment, we’re still on the user side of those algorithms and understand it, but I want to be on the other side and build them.”

For­tu­nate wants to help out with grass­roots-level food pro­duc­tion and com­mu­nity de­vel­op­ment.

The hope here, of course, is that all th­ese bets will pay off and that South Africa can emerge as a new breed­ing ground for in­flu­en­tial data sci­en­tists.

Left to right: EDSA students Sharné, Andries and Fortunate

Of the 100 stu­dents, 86 are non-white, 52 are fe­male and 54 are un­der the age of 25.

Shaun Dipp­nall is the de facto head of the academy and a sea­soned ac­tu­ary and ac­tu­ar­ial-sciences lec­turer.


