Houston Chronicle

UH endeavor into data science begins by defining the field

- By Lindsay Ellis

Data science in Houston is a loaded term.

It conjures a scuttled University of Texas System project, attractive skills on a job listing and, perhaps, an endorsement by Mayor Sylvester Turner, calling for collaborations between universities to bring Houston into the burgeoning field.

On Thursday, experts in a quiet, sunlit University of Houston classroom made the vague field concrete by giving examples of data science in action. The campus’s new institute hosted roughly 40 people, who listened to field leaders define the term, show how data science can be used in political science and demonstrate a model of coding for large-scale analysis.

UH announced its center in October, but the summer series beginning this month will be the first public events from the institute, meant to fortify the city’s entry into a field that has attracted significant interest around Texas and nationwide.

“This is our maiden voyage,” said Andrea Prosperetti, a UH mechanical engineering professor who directs UH’s institute.

The University of Houston initially said that certificate programs and potentially graduate degrees would begin as early as fall 2018. And in February, Amr Elnashai, UH’s vice president for research and technology transfer, said the university planned to offer “an extensive set” of master’s degrees to integrate data science with engineering, arts, music and other fields.

The institute has not yet moved to create those programs. The Texas Higher Education Coordinating Board has received no applications from UH for any data science certificates or degrees since October, a spokeswoman for the board said.

Bridging disciplines

Still, introductory courses this fall and the lecture series this summer will put UH on its way, Prosperetti said. UH will also aim to offer an unofficial certificate program in the coming semester, he said, one that gives those who complete it recognition for their knowledge but no state-approved certificate.

The goal is still to offer coordinating board-recognized programs, he said, adding that the internal approval process at UH would begin in the fall.

“This is not going to happen overnight,” Prosperetti said, adding that the university has talked to Vanderbilt University data science leaders about mutual interest in the field, as well as to companies about potential internships.

UH’s first hurdle may well be teaching Houston what data science is.

Andrey Skripnikov knows he’s a data scientist. But ask him what “data scientist” means, and it’ll take longer to explain. Perhaps it’s easier, he said at UH’s Thursday event, to explain what a data scientist is not.

A data scientist isn’t a data specialist, he said. Data specialists take in data and organize it, using programming chops to clean up the databases to make analysis easier.

“They’re the mechanics to the car you’re trying to drive,” he added.

A data scientist also isn’t a data analyst, he added, who takes the cleaned-up data from specialists and summarizes it with basic statistical concepts and a pretty visualization.

Data scientists, he said, ask good questions. They have intuition and a firm knowledge base to know what calculations to do and how to work through large masses of information. And with experience in industry, data scientists think critically about their results.

“Demand for these jobs is growing all over the place,” he said, noting that data scientists work in information technology, education, science, consulting and financial services, bringing the ability to critically analyze large amounts of information to teams in various sectors. “It’s the sauce to any dish, the sauce to any meal: the dish, the meal being your majors.”

That approach, introducing data science courses to majors in various fields, is exactly the strategy UH plans to take for its new institute.

UH does not plan to hire faculty specifically for the institute. Instead, Prosperetti said, the institute will funnel money to various academic departments to pay lecturers’ salaries.

“A lot of the activity will reside in the existing colleges, in the existing departments,” he said. “That is how we will reach the entire university.”

In addition, graduate students will do final-project internships with companies around Houston to grow bigger research projects.

The institute will receive about $1 million from UH’s budget in the next fiscal year, Prosperetti said, along with $2 million from the Cullen Foundation.

Large Texas push

UH is one of many universities in Texas working to bolster data science offerings.

Rice University said it has hired nine faculty for its own data science initiative. In 2015, officials said the university would spend $43 million on the program, including new faculty and staff. Rice will host a conference on data science in October.

In fall 2014, the University of Texas at Austin rebranded its statistics and scientific computation division as a statistics and data sciences department. It offers more than 100 undergraduate and graduate courses to students who study topics as wide-ranging as nursing, education and liberal arts.

And the University of Houston-Downtown offers a bachelor’s degree in data science, which it says is the first of its kind in Texas.

Broad implications

Not every effort has succeeded. A task force convened by the University of Texas System considered using hundreds of acres of land in Houston for a data science institute that would delve deep into health, energy and education. But political headwinds proved too steep. Pushback from regents and local lawmakers forced Chancellor William McRaven to call off the project, admitting he misstepped in buying the land.

UH announced its own institute in the fall. University officials said they first considered creating the program months before UT’s plans were made public.

Prosperetti said he wants UH to make a name for itself in the field by integrating courses with many majors.

“We will reach the entire university,” he said. “The system should be a synonym for data science.”

Partially explaining the field’s rapid growth is its potential to influence various disciplines, and on Thursday, that applicability was on display.

Francisco Cantu, a UH political science professor, explained how he collected and analyzed data from photos of voting tallies to determine the extent of election fraud in Mexico’s 1988 presidential election.

The model Cantu created identified fraudulent tallies by examining the layers of data in each photo. The model could, for example, distinguish when someone made a mistake (changing 100 votes for a candidate to 101 votes) from when someone inflated results by hundreds of votes to manipulate the outcome.

Arnie McAdams, attending the lecture, watched with interest.

McAdams, an accountant who just finished his second year at UH’s law school, heard about the data science lecture via email.

And while he watched, he wondered about possible implications of data science for the justice system, knowing that bias influences judicial decisions. Data science, he realized, may be able to measure the scope of bias.

“If you can better understand that,” he said, “you can make a different justice system.”
