UNIVERSITY PARK, Pa. — Never has the world been better positioned to predict and respond to natural disasters. The stream of data at our fingertips is seemingly endless.
But the size of this mounting trove of information poses a problem of its own. Running flood calculations for a city facing heavy rains against a century of data, for example, yields highly accurate results. But those results are useless if they take days or weeks to compute.
That’s where Penn State’s Geoinformatics and Earth Observation laboratory (GEOlab) comes in. Led by Guido Cervone, associate professor of geoinformatics and associate director of the Institute for CyberScience, it’s tasked with three main objectives: developing computational algorithms to swiftly analyze massive amounts of data; improving numerical modeling; and incorporating volunteered geographical information such as social media and citizen science data into the equation.
Through these objectives, Cervone’s team tackles a range of areas including flood prediction and response; short and long-term climate predictions for agriculture and energy forecasting; and immediate data collection for events such as the 2011 Fukushima Daiichi nuclear accident.
“Our research is improving society’s ability to forecast and respond to a range of research areas, including flooding and other natural disasters, as well as segments of the energy market, including solar and wind energy prediction,” Cervone said. “Our methodologies could ultimately have a big impact in both disaster response and our transition to renewable energies.”
GEOlab was established in 2014 within the Department of Geography and Institute for CyberScience to address big-data problems related to Earth science. Their research, funded largely by the Office of Naval Research and the National Science Foundation, sits at the intersection between computer science, meteorology and geospatial science. The group primarily works out of Penn State but spends time at the National Center for Atmospheric Research in Colorado.
GEOlab is currently composed of two postdoctoral researchers, Martina Calovi and Liping Yang, and six doctoral students, Laura Clemente-Harding, Elena Sava, Carolynne Hultquist, Yanan Xin, Weiming Hu and Courtney Jackson. Here’s a look at a few of the projects GEOlab is working on.
Social media filling gaps
During a flooding event, responders rely on remote-sensing data from satellite imaging to target the hardest hit areas.
That’s supplemented by models used to forecast rising waters. But these methods are limited. It could be days before the satellite again passes over that region, with weather often masking the view. And the models aren’t perfectly accurate.
Sava, a doctoral candidate in geography, is using non-conventional data to give emergency responders a clearer picture of the damage from flooding and other disasters.
Sava generates real-time maps by combining these data sets, applying geolocations to images gleaned from social media such as Twitter to fill in the gaps. She integrates different data sources, often generated during an event, to assess the extent of flooding over space and time and to estimate damage. This can complement model simulations.
“We use social media as part of our data source. Being able to give meaning to something someone posted that may not have a scientific intent to it and turn that into something useful is very promising,” Sava said. “Because people on the ground are the ones who are mostly impacted. Being able to use some of that and bring it into the science is an exciting new opportunity.”
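The core idea of overlaying geotagged posts on a flood extent can be sketched in a few lines. This is a simplified illustration, not GEOlab's actual pipeline: the post records and the bounding-box flood extent are hypothetical, and a real extent would be a polygon derived from satellite imagery or model output.

```python
def posts_in_flood_extent(posts, bbox):
    """Filter geotagged social-media posts to those falling inside a
    flood extent, simplified here to a lat/lon bounding box."""
    lat_min, lat_max, lon_min, lon_max = bbox
    return [p for p in posts
            if lat_min <= p["lat"] <= lat_max
            and lon_min <= p["lon"] <= lon_max]

# Hypothetical geotagged posts near a flooded river reach.
posts = [
    {"id": 1, "lat": 40.80, "lon": -77.86, "text": "street underwater"},
    {"id": 2, "lat": 41.50, "lon": -76.00, "text": "sunny here"},
]
hits = posts_in_flood_extent(posts, bbox=(40.7, 40.9, -78.0, -77.7))
print([p["id"] for p in hits])  # [1]
```

Posts that land inside the extent become ground-level evidence of flooding in places the satellite or model may have missed.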
Sava is also strengthening the ability for computers to detect floodwaters from images. Automating the process would paint a quicker and more accurate picture of potential flood areas and could ultimately save lives.
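One standard way computers detect water in satellite imagery is the Normalized Difference Water Index (NDWI), which exploits the fact that water reflects green light but absorbs near-infrared. The article does not specify Sava's method, so the sketch below is only an illustration of the general approach, with toy reflectance values.

```python
import numpy as np

def ndwi_water_mask(green, nir, threshold=0.0):
    """Flag likely water pixels using NDWI = (green - nir) / (green + nir).
    Water pixels typically have NDWI above zero."""
    green = green.astype(float)
    nir = nir.astype(float)
    ndwi = (green - nir) / np.maximum(green + nir, 1e-9)  # avoid divide-by-zero
    return ndwi > threshold

# Toy 2x2 scene: top row water-like (high green, low NIR),
# bottom row land-like (low green, high NIR).
green = np.array([[0.30, 0.28], [0.10, 0.12]])
nir   = np.array([[0.05, 0.04], [0.40, 0.35]])
mask = ndwi_water_mask(green, nir)
print(mask)  # top row True, bottom row False
```

Automating a rule like this over every pixel of every image is what makes rapid, large-area flood mapping feasible.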
Citizen science helps disaster response
During the Fukushima power plant crisis, radioactive particles dispersed into the atmosphere and deposited on the ground in the Fukushima prefecture. Government agencies responded to monitor the spread of the contamination. So did the Safecast project, a citizen-led group that immediately distributed more than 8,000 portable radiation-measurement devices to the public and added more in the months that followed.
Citizens gathered more than 80 million geolocated radiation data points. The information helped fill the data gaps left when power was knocked out, disabling many of the government’s measuring tools. Safecast data was often gleaned from areas such as roads or communities.
Carolynne Hultquist, a doctoral candidate in geography, compared these results with the government analysis. The goal was to determine if crowd-sourced measurements could provide reliable data during emergencies.
“A lot of times, citizen science is framed as scientists asking citizens to help with a project, but there are a lot of emerging citizen-led projects where scientists can help validate or analyze the data,” Hultquist said. “The validation process can also add a lot of credibility to citizen-led projects and could be used during a whole host of disasters.”
Managing big data
Imagine you wanted to forecast the amount of energy that could be harnessed from the sun nationwide using photovoltaic solar panels. Or that you wanted to chart the most productive regions. For accuracy, you could run models over 100 years of data for thousands of locations. But that computation could take months, or longer.
How much data could you omit while still retaining accuracy?
That’s what Weiming Hu, a doctoral candidate in geography, is trying to figure out using his programming skills and statistics knowledge.
You could simply start cutting data points from each region. But that’s not the best way, Hu says.
“By targeting areas that have a larger weather variability we can achieve the same level of accuracy with far less computation,” Hu said. “We’re saving our computing resources and speeding up the process. We’re steering the computation to where it is needed.”
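Hu's idea of steering computation toward variable regions can be sketched as ranking locations by the spread of their historical weather. This is a toy illustration, not his algorithm: the location names and temperature histories are invented, and his actual work builds an unstructured grid rather than a simple ranking.

```python
import statistics

def rank_by_variability(series_by_location):
    """Rank locations by the standard deviation of their historical
    weather series; high-variability sites keep a fine grid, while
    stable sites can be computed on a coarser one."""
    return sorted(series_by_location,
                  key=lambda loc: statistics.stdev(series_by_location[loc]),
                  reverse=True)

# Hypothetical temperature histories (degrees C).
histories = {
    "coastal": [10, 18, 6, 22, 9],    # highly variable weather
    "inland":  [15, 16, 15, 17, 16],  # stable weather
}
print(rank_by_variability(histories))  # ['coastal', 'inland']
```

Spending compute on the volatile coastal site and thinning the stable inland one preserves accuracy where it matters while cutting the total workload.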
Hu, in collaboration with Michael Mann, distinguished professor of atmospheric science, and Shantenu Jha, associate professor at Rutgers University, is integrating machine learning algorithms and the Analog Ensemble technique to automatically construct an unstructured grid topology over the United States for faster and better weather forecasts.
These forecasts are then included in a model that predicts energy production under different atmospheric conditions with the goal of determining which locations are best suited for long-term solar power generation, factoring for changes in climate.
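The Analog Ensemble technique named above has a simple core: for today's deterministic forecast, search the historical record for the most similar past forecasts, and use the weather actually observed on those days as a probabilistic ensemble. The sketch below shows that core with a single variable and made-up irradiance values; the real method compares multi-variable forecasts with weighted distance metrics.

```python
def analog_ensemble(current_forecast, past_forecasts, past_observations, k=3):
    """Analog Ensemble sketch: find the k historical forecasts most
    similar to the current one and return the corresponding observed
    values as an ensemble of plausible outcomes."""
    ranked = sorted(range(len(past_forecasts)),
                    key=lambda i: abs(past_forecasts[i] - current_forecast))
    return [past_observations[i] for i in ranked[:k]]

# Hypothetical solar-irradiance forecasts and matching observations (W/m^2).
past_fc  = [420, 610, 590, 130, 605]
past_obs = [400, 630, 570, 150, 615]
ensemble = analog_ensemble(600, past_fc, past_obs, k=3)
print(ensemble)  # [615, 630, 570]
```

The spread of the ensemble conveys forecast uncertainty, which is exactly what an energy-production model needs to weigh candidate solar sites.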
The future of GEOlab
New challenges arise from data that has promising potential but is unstructured. Drones are among the fastest-growing platforms for generating high-quality data, and GEOlab is investigating their use.
Cervone said that data generation has outpaced our ability to utilize the information. Solving that will improve our ability to adapt and respond in a range of areas.
“The challenge is that data have grown at a faster rate than our ability to analyze them,” Cervone said. “Our group will have the greatest impact in areas related to both disaster and renewable energy by developing scalable methodologies to run on high-performance computers designed to tackle this ever-increasing and unprecedented amount of information.”
Cervone envisions the creation of a new center with the Institute for CyberScience (ICS) and the Environmental and Earth Systems Institute (EESI) to advance geoinformatics research at Penn State.