Institute for Computational and Data Sciences

ICDS Fall Symposium speakers offer insights on digital fairness

Institute for Computational and Data Sciences fosters collaborations, encourages action to make data science fair for everyone

The ICDS 2021 Fall Symposium featured keynote speakers and interactive discussions on making data science fairer for everyone. The discussions included a chat about energy, justice and big data, guided by experts Katlyn Turner, Jessica Omukuti, Jean Paul Allain and Helen Greatrex. Credit: Penn State. Creative Commons

UNIVERSITY PARK, Pa. — Experts from industry and academia offered their insights and led panel discussions on ways to improve digital fairness at the recent Institute for Computational and Data Sciences (ICDS) Fall Symposium.

The event was held virtually and included an interdisciplinary group of Penn State scientists who served as speakers and moderators, along with experts from Microsoft, New York University and other world-class institutions. Topics for the symposium included: using data to promote justice in energy use issues; understanding the ethical challenges and considerations in using machine learning methods to explore clinical medicine; reviewing the dimensions of law, policy and data; and forging partnerships with industry.

Jenni Evans, professor of meteorology and atmospheric science and ICDS director, told the symposium attendees that the data science community is constantly reminded of what can go wrong in the unsophisticated use of data in the development of ever-more powerful data science technologies, such as artificial intelligence and machine learning.

“For example, a health care algorithm used to estimate risk for more than 200 million people showed bias against African Americans, or a hiring algorithm for a major tech company showed an inherent gender bias,” she said. “Only after deploying these algorithms did the community recognize their levels of inherent unfairness.”

Evans added that the speed of development in data science sometimes outpaces full understanding of the ramifications.

“In the computational and data sciences, algorithms and techniques can sometimes evolve faster than we can fully consider their impacts,” said Evans. “Treatment of the data underpinning artificial intelligence, or AI, and other modes of algorithm development has increasingly come under scrutiny. These insights are improving the quality of these tools and the potential for a positive impact across society.”

Nicholas P. Jones, Penn State’s executive vice president and provost, told attendees that fairness is vital for aligning technological prowess with human values.

“As Penn State's computational research hub, ICDS has become increasingly integral to our research enterprise,” said Jones. “Rapid advances in high-performance computing are enabling us to take big data and make positive impacts in many areas of daily life from energy, communications and medicine, to infrastructure, finance and manufacturing, among others. However, as we use data science and artificial intelligence to benefit society and conduct related research, we know that the models relying on data and algorithms must align with and support core human values to achieve fairness.”

Jones said the conversations during the symposium would help inspire action toward making data science fairer and more equitable.

“Your work will facilitate new collaborations and endeavors to ensure that fairness is a fundamental and requisite attribute of effective data science,” he added.

Genomic data

Lorin Crawford, a senior researcher at Microsoft Research and the RGSS Assistant Professor of Biostatistics at Brown University, served as one of the keynote speakers and discussed the lack of diversity in genomic data and its impact on scientific discovery, as well as health disparity.

Genome-wide association studies, for example, lack diversity of participants, said Crawford.

“Most of these studies are dominated by people who have self-identified as having European ancestry,” said Crawford. “In 2009, only 4% of people were self-identified as having non-European ancestry. This has improved a little bit in 2016. But still an overwhelming amount of what we know about genetic architecture has been driven by individuals of one group.”

This lack of diversity impacts models that are used in studies that can have major scientific and health impacts, he said.

“Without being more inclusive, research has shown that the clinical use of things like polygenic risk scores may exacerbate health disparities,” Crawford said. “If I trained my models on one group and then I tried to use those polygenic risk scores to predict phenotypes in other groups, what you would see is my predictive accuracy goes down in my model.”

Encouraging more diverse participants in these genome-wide studies offers numerous advantages for science and society.

“People have really started to look even more recently at this idea of what happens when we start to include more individuals in these studies,” said Crawford. “And it’s even been shown explicitly that when we are more inclusive, our understanding of complex traits improves quite radically.”

Technology and race

Charlton McIlwain, vice provost for faculty engagement and development and professor of media, culture and communication at NYU, gave the second keynote speech and offered insights on the relationship between race and technology, a subject of his book, “Black Software: The Internet & Racial Justice, from the AfroNet to Black Lives Matter.”

McIlwain said that race, racism and white supremacy structure and guide, in very specific ways, the first principles and first uses of computing technology.

“When we think about race and technology (the history of race and technology), typically people's minds go to tech first: Can we fix the tech? What are the problems in the technology?” said McIlwain. “But we haven't so much asked ourselves a question about what preceded the technology, what particular structure (social, economic and political) preceded the advent of computing, computing technology and the rise of computing technology.”

Energy, justice and big data

Scientists and policymakers are relying on data to help society with the fight against climate change and in the adoption of alternative carbon-free energy sources. However, these efforts must be conducted responsibly and with fairness in mind, according to panelists at the symposium’s session on energy, justice and big data.

One of the panelists, Jessica Omukuti, a research fellow at Oxford Net Zero and a postdoctoral research associate in the Interdisciplinary Global Development Centre at the University of York, spoke on the need not only to understand how climate change is affecting certain locations, but also to assign responsibility to those causing the pollution that is changing the climate.

“The last few IPCC [Intergovernmental Panel on Climate Change] reports have highlighted that a warming of two degrees or even 1.5 degrees is very, very dangerous for a lot of people,” said Omukuti. “And so we'll have biodiversity losses, we'll have ecosystems collapsing, some communities will lose their lands, most of them in coastal areas, but also those in very dry land areas. And so thinking about this brings in the question of who's responsible for this? And who should be addressing this? And how is it related to energy?”

Omukuti added that climate change is caused by emissions, and most of the emissions that have caused climate change so far have come from a specific group of people in the Global North, including parts of North America, Europe and Asia.

Most of the people who pose little risk to the climate live in the Global South, which includes countries in Africa, parts of Asia and Latin America, according to Omukuti. The Global North (part of the Americas, most of Europe and part of Asia), on the other hand, has been responsible for a vast amount of the emissions that threaten the climate.

“But, then when you think about the effects of global warming, you see them more biased toward the very poor countries,” said Omukuti.

Omukuti suggested that data could be used to both prioritize help for those most negatively affected by climate change and properly balance accountability for the emissions that influence climate change.

The panel also included Penn State faculty members Jean Paul Allain and Helen Greatrex, who are both ICDS co-hires. Allain is the head of the Ken and Mary Alice Lindquist Department of Nuclear Engineering and the Lloyd and Dorothy Foehr Huck Chair in Plasma Medicine in the Huck Institutes of the Life Sciences, and Greatrex is an assistant professor of geography and statistics.

Action for fairness

The goal of the symposium was to bring people together and to inspire them to tackle these critical challenges in data science, according to Evans.

“Ensuring fairness will require not only being able to predict some of these adverse consequences during the development phase, but also that we have the right people at the table from the start,” she said. “It’s no simple task, and the talks during this symposium will shed some light on possible ways that the scientific community can move forward with digital fairness.”

The symposium also featured a virtual poster presentation. The winning entries in the 2021 ICDS Fall Symposium’s virtual poster presentation included research on autism detection and urban flooding.

Symposium sponsors included IBM, PosterSmith, and the National Institute of Statistical Sciences.

Last Updated November 29, 2021
