UNIVERSITY PARK, Pa. — The Penn State Institute for Computational and Data Sciences’ 2020 virtual symposium brought together researchers, computational experts, government officials and industry specialists who are working to build interdisciplinary communities that can transform this deluge of data into a stream of solutions for the world’s most pressing scientific and societal challenges.
“We are here today because of data: more specifically, a deluge of data,” said Jenni Evans, professor of meteorology and atmospheric science and director of the Institute for Computational and Data Sciences. “This deluge is especially evident in interdisciplinary research, where we are using a diversity of data to integrate multiple perspectives to address complex problems, often involving a diversity of data and the challenges that brings.”
She added that finding ways to use this data deluge to our advantage would also require support from industries outside of academia.
“ICDS has been partnering with industry to support new research collaborations and explore new technologies. The challenges of the data deluge have been a core driver of these collaborations,” she told the 420 people who attended the symposium virtually on Oct. 21 and 22.
Providing computational support
Computational power is the fuel that drives data-based innovations and solutions, according to Nick Jones, executive vice president and provost at Penn State, who also spoke at the symposium. He said the University is committed to providing researchers with the tools and expertise to fully explore the power of big data and to manage the data deluge while tackling scientific and societal challenges.
“Data science is particularly well-suited to tasks that are data intensive and require many complex computations to be executed quickly,” said Jones. “Using HPC techniques, researchers like you can perform tasks in only hours that previously could take weeks or even months to complete. This makes big data enormously useful in many areas of research, from the physical and life sciences, information sciences and engineering to studies in education, business and the liberal arts, among others. At Penn State, we believe strongly in the power of computational research to create positive social impacts, and we are actively and broadly engaged in it.”
He listed a few of the critical projects in which Penn State researchers are involved, including using data science to help discover treatments for COVID-19, improving economic development and humanitarian efforts, and battling discrimination and threats to social justice.
Lora Weiss, senior vice president for research, added that the Penn State research community is a worldwide community of excellence.
"The participants gathered here can collaborate with more than 5,000 researchers who work across the globe," said Weiss. “We have tremendous talent here at Penn State. We are broad and deep. The National Science Foundation each year looks at research expenditures — not just NSF funds, but across the board — and Penn State has 18 research fields ranked in the top ten. This is more than any other university and it’s a testament to our excellence, our breadth and our depth.”
The symposium featured four panel sessions aimed at reviewing the problems and opportunities of data science, as well as diving into associated tools and technologies, such as machine learning and artificial intelligence.
Several of the panel members indicated that because AI and ML are technologies with massive potential for impact, both positive and negative, on society and science, they require collaboration not just across disciplines, but also across academia, industry and government.
Soundar Kumara, the Allen E. Pearce and Allen M. Pearce Professor of Industrial Engineering, who moderated the panel on "AI and the Manufacturing Industry," said that interest in artificial intelligence and machine learning has increased across industry over the past several years, and that this interest is increasingly important for manufacturing operations.
"Today, we see that AI and machine learning is commonplace — and that every industry and every manufacturing industry is trying to embrace AI and machine learning," he said.
Big data in agriculture
The panel, "Big Data, Agriculture and the Food Supply," explored the challenges and opportunities surrounding the use of technology in the agricultural industry.
David P. Hughes, associate professor of entomology and biology, told participants that PlantVillage, a project that he is leading, can put the power of AI literally into the hands of smallholder farmers through their smartphones. The project could have far-reaching and immediate benefits for these farmers, who already face numerous challenges and are particularly vulnerable to the effects of climate change.
"The reach of the system that we have, of our University system, of the collection of land grants, can be global, can be immediate and can impact hundreds of millions of people in the immediate future, as well as the long-term future wherever people have to cope with the challenges of pests and climate change," said Hughes.
The data deluge and democracy
The panel, "Social Engineering with Data: Disinformation and Destabilization of Geo-Political Order," sought to foster collaborations between data science and government to raise awareness of the role that the data deluge plays in significant challenges to democracy.
“It’s important that we know what social engineering is and how our own data enables our own manipulation,” said Anne Toomey McKenna, affiliate faculty member of ICDS. “We are surrounded by omnivorous data collectors, taking in and using whatever data is available. Our own devices (we’re on Zoom right now, we all have phones) are collecting all kinds of data. Our cities collect data. We have sensors everywhere.”
Thomas J. Ridge, the first U.S. secretary of homeland security and Pennsylvania’s 43rd governor, who served as one of the panelists, said that big data has massive potential to do good as well as to cause harm.
“The bottom line is, in today’s world, this data-centric world, where data can be used in such a positive way — to improve our lives, to improve education, to improve agriculture, to improve healthcare — but that data also has geopolitical implications when it is improperly used,” said Ridge. “And, clearly, it has been used and will continue to be used as a matter of statecraft by our enemies.”
Data and DNA
Working with health data requires extreme sensitivity on the part of data scientists, according to the "Data, Genetics, and DNA – Value, Ethics, and Risks" panelists. Data scientists must consider both health data privacy laws and ethical approaches to privacy.
Daniel Susser, assistant professor of information sciences and technology and philosophy and technology research associate in the Rock Ethics Institute, pointed out that managing genetic data of study participants who consented may not go far enough. The privacy of family members, who share the participant’s DNA, could also be at stake.
"There are real ethical questions about whether informed consent procedures work when we're talking about information that is deeply revealing not only about individuals, but about their networks," said Susser to session participants.
National Science Foundation support of collaborations
Participants had a chance to hear a keynote speech from Chaitan Baru, senior science adviser at the National Science Foundation’s Convergence Accelerator, about the accelerator’s model. The model is a big-tent approach to science, using the combined resources of industry and academia to promote projects for the public good. For example, Baru mentioned that the accelerator model includes a pitch to a panel that may include venture capitalists and other members of industry.
Multidisciplinarity — stretching across academia and across industries — is the key to tackling big challenges, said Baru.
“The kind of problems that we want to tackle are the kinds of ones that you couldn’t just do by one domain — computer scientists doing it, or neuroscientists doing it, but requires a real multidisciplinary team of folks working together,” said Baru.
ICDS 2020 was sponsored by IBM, Dell Technologies, AWS, VMware and Intel.