The Borneo Post

Becoming a nation of data professionals

By Dr Yakub Sebastian

ACCORDING to Forbes magazine, by 2025 most of the world’s major companies will collectively generate approximately 180 zettabytes of data. To put this into perspective, one zettabyte is enough to store 36 million years’ worth of high-definition video. Data has become the new oil, and companies increasingly monetise it as their main source of revenue.
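
As a rough sanity check on that comparison, the video bitrate it implies can be worked out with simple arithmetic. The sketch below assumes a decimal zettabyte (10^21 bytes) and a 365.25-day year; both are assumptions of this example rather than figures taken from Forbes.

```python
# Assumptions of this sketch (not stated in the column): a decimal
# zettabyte of 10**21 bytes and a year of 365.25 days.
ZETTABYTE_BYTES = 10**21
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
VIDEO_YEARS = 36_000_000  # the "36 million years" quoted above


def implied_bitrate_mbps(total_bytes, years):
    """Average bitrate, in megabits per second, that fills total_bytes over the given years."""
    seconds = years * SECONDS_PER_YEAR
    return total_bytes * 8 / seconds / 1_000_000


# Works out to roughly 7 megabits per second.
print(round(implied_bitrate_mbps(ZETTABYTE_BYTES, VIDEO_YEARS), 1))
```

A rate of about 7 megabits per second is plausible for high-definition video, so the comparison holds together as an order-of-magnitude illustration.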

Enter a new breed of professionals: the data scientists. A data scientist employs a range of statistical and computational skills to analyse and interpret complex data in order to assist businesses in their decision-making.

In 2012, a Harvard Business Review article called data scientist the “sexiest job of the 21st century”. Meanwhile, a survey released in Singapore last year revealed that the average junior data scientist earned up to RM130,000 per annum.

Despite its increasing popularity, data scientist is only one of a growing number of data science-related jobs that we may collectively call data professionals. They include data modellers, data analysts, and data engineers. According to the Malaysia Digital Economy Corporation (MDEC), Malaysia is one of a few countries in the world that prioritise data science excellence as part of their national strategy. By 2020, the country aims to produce 20,000 data professionals. This column offers several thoughts in relation to this national goal.

Cultivate data-driven thinking

Given the anticipated future demand for data professionals, we should start cultivating a data-driven mindset early among children. Data science curricula should begin making their way into primary and secondary school classrooms. There is already evidence of how this could be done.

Aspiring Minds (www.aspiringminds.com), an India-based employability assessment company, recently piloted a data science education project among fifth through ninth graders (10- to 15-year-olds) in India and the United States, where students were given half-day, hands-on tutorials on how to perform a full-cycle data science task.

The project adopted a data science pedagogical design that aims to maximise student engagement while minimising prerequisite knowledge. The students were fully engaged because they were given highly relatable problem statements, such as predicting whether a particular kid is ‘friend-worthy’. To do this, they learned to construct a friendship dataset from scratch and build a predictive model from the collected data.
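
To give a feel for such an exercise, here is a minimal sketch of what a ‘friend-worthy’ predictor could look like. It is not the Aspiring Minds material: the survey rows, the two traits and the 50 per cent cut-off are all invented for illustration, and the function names are hypothetical.

```python
# Invented, hand-made survey data for illustration only; it is not taken
# from the Aspiring Minds project.
# Each row: (shares_snacks, likes_same_games, is_friend)
SURVEY = [
    (True, True, True),
    (True, True, True),
    (True, False, True),
    (True, False, False),
    (False, True, True),
    (False, True, False),
    (False, False, False),
    (False, False, False),
]


def friend_percentage(rows, shares_snacks, likes_same_games):
    """Percentage of surveyed kids with these traits who were marked as friends."""
    matching = [r for r in rows if r[0] == shares_snacks and r[1] == likes_same_games]
    if not matching:
        return None  # no comparable kids in the survey, so nothing to go on
    friends = sum(1 for r in matching if r[2])
    return 100 * friends / len(matching)


def is_friend_worthy(rows, shares_snacks, likes_same_games, threshold=50):
    """Predict 'friend-worthy' when at least `threshold` per cent of similar kids were friends."""
    pct = friend_percentage(rows, shares_snacks, likes_same_games)
    return pct is not None and pct >= threshold


print(friend_percentage(SURVEY, True, False))  # 50.0
print(is_friend_worthy(SURVEY, True, True))    # True
print(is_friend_worthy(SURVEY, False, False))  # False
```

The whole prediction rests on collecting rows, counting matches, turning the count into a percentage and comparing it with a cut-off.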

All that is required is a basic knowledge of counting, addition, percentages and comparisons, plus basic computer skills. The children’s responses were overwhelmingly positive. A similar approach could be adopted in our schools.

Not just for computer geeks

People from diverse educational backgrounds should be encouraged to explore careers as data professionals, not only those with computer- or statistics-related degrees. In fact, some of the most impactful data professionals in history had no computer experience. It is insatiable curiosity, a relentless drive to solve problems, and communication prowess that often make a great data scientist.

Florence Nightingale is widely regarded as the founder of modern nursing. But many of us probably do not realise that Nightingale was also a prodigious statistician and a true pioneer in data visualisation techniques. At the height of the Crimean War in the 19th century, Nightingale set about analysing soldier mortality data from various British military camps. Her findings were more than revealing. Her analysis showed that more British soldiers had died in these camps from wound infections than were killed on the battlefield. Employing a pie chart-like visualisation known as the ‘Coxcomb Diagram’, Nightingale showed a strong correlation between soldiers’ mortality rate and the camps’ hygiene level. Subsequent improvements to the camps’ sanitary system reduced the death rate from 42 per cent to merely 2 per cent, prompting a nationwide sanitary reform by the British government.

Industry-academia-government synergies

More intensive industry-academia synergies are needed to train truly market-ready data professionals. There are already some good examples of such synergies. The European Union-funded Edison project (www.edison-project.eu) recently released the Edison Data Science Framework. The framework provides a comprehensive set of model data science curricula that can be adopted by universities worldwide. Importantly, the Edison project serves as an excellent venue for academia-industry dialogues towards creating more industry-aligned data science curricula.

Closer to home, the newly established Asean Data Analytics Exchange (Adax) (www.adax.asia) in Kuala Lumpur aims to become a regional collaborative hub for businesses, academia, governments and start-ups that wish to rapidly adopt data science solutions as an integral part of their operations.

Moreover, its brand-new Data Science Finishing School for Graduates initiative gives our fresh university graduates the opportunity to take part in a six-month paid data science internship programme with various industry partners.

In short, the need for more data professionals in Malaysia is real. But if we are serious about growing highly capable data professionals for the future, it is important to put effort into training our young to develop data-centric thinking, encouraging multidisciplinary interest in the profession, and creating stronger synergies between industry, academia and government.

Dr Yakub Sebastian is a lecturer with the Faculty of Engineering, Computing and Science at Swinburne University of Technology Sarawak Campus.
