Data science gaps
But there’s a big problem. We have a global talent shortage, and the rapidly growing demand for data scientists far outpaces the anemic growth in supply. Dan Meyer of the Analytics Association of the Philippines estimates that there are only a few hundred data scientists in the country today. Others are less optimistic, saying that there are only 10 of them. This dearth is made starker by the fact that the rest of the world is competing for our local talent by offering dollar-salaried jobs overseas.
There is also a lack of understanding of what a data scientist is. According to data scientist recruiter Patrick Visouthivong of NewGate Talent Solutions, a lot of organizations say they require data scientists, only to realize later that what they require are analysts. “I place a data scientist in a company, but the data scientist is tasked to perform data gathering, data cleansing, and visualization. He or she eventually gets bored and leaves the company,” he said.
To shed light on this, authors O’Neil and Schutt describe the data scientist’s duties in their book “Doing Data Science”:
“More generally, a data scientist is someone who knows how to extract meaning from and interpret data, which requires both tools and methods from statistics and machine learning, as well as being human. She spends a lot of time in the process of collecting, cleaning, and munging data, because data is never clean. This process requires persistence, statistics, and software engineering skills—skills that are also necessary for understanding biases in the data, and for debugging logging output from code.”
“Once she gets the data into shape, a crucial part is exploratory data analysis, which combines visualization and data sense. She’ll find patterns, build models, and algorithms—some with the intention of understanding product usage and the overall health of the product, and others to serve as prototypes that ultimately get baked back into the product. She may design experiments, and she is a critical part of data-driven decision making. She’ll communicate with team members, engineers, and leadership in clear language and with data visualizations so that even if her colleagues are not immersed in the data themselves, they will understand the implications.”
The data scientist, therefore, should possess skills and competencies in statistical and visualization tools, machine learning, and model-building. But more important is the skill and practice of forming hypotheses, testing these, and determining causality.
So, there are two gaps that need to be addressed. One is on the demand side, that is, how well organizations understand what a data scientist is; the other is on the supply side, that is, the number of data scientists available.
On the demand side, it is incumbent upon hiring managers and recruiters to learn about the exact requirements of the organization and how a data scientist can add value. It’s laudable that industry players are helping to educate recruiters. One example is Asia Select, which recently held RecruiTech 2017, an event attended by recruiters from various companies, where I spoke about the growing data science requirements of Philippine organizations. More such events and fora should be held to further educate key decision makers in companies.
On the other hand, the supply-side gap can be addressed through multi-sectoral efforts in building data science skills. One approach being taken by educational institutions is to offer courses on data science, which are now available at the Asian Institute of Management. The newly created Analytics Association of the Philippines also seeks to educate the public through various events and training sessions on data science and analytics.
Another approach is to tap into the PhD pool of universities. The critical skill that a data scientist needs is scientific methodology, a frame of mind for which PhDs are trained. Their skills can be upgraded with computer science concepts in machine learning and programming, which is an easier task than teaching a way of thinking. This approach can be adopted and sponsored by hiring companies and by organizations that can offer such skills upgrades to PhDs.
There is much to be gained as we address these gaps. The competitiveness of our industries can be greatly improved if companies put data at the center of their strategy.