
Building a landslide prediction tool with Google and AI

In "deepLDB," Google Earth images are used to identify past rainfall-induced landslide events around the world. Credit: Shen Multi-scale Hydrology, Processes and Intelligence Group. All Rights Reserved.

UNIVERSITY PARK, Pa. — In their 2019 AI Impact Challenge, Google asked nonprofits, social enterprises and research institutions around the world, “How would you use artificial intelligence (AI) for social good?” 

“We had a good idea that was looking for such an opportunity,” said Chaopeng Shen, associate professor of civil and environmental engineering at Penn State and principal investigator of “deepLDB,” one of 20 projects awarded funding by Google in the challenge last year. “Rainfall-induced landslides are a huge risk for people who live in mountainous areas, and we thought there was a possibility to use AI to better forecast them.”

Worldwide, landslides cause thousands of deaths and injuries and cost billions of dollars each year, according to the United States Geological Survey (USGS). The most frequent of these are induced by rainfall, often transforming into fast-moving debris flows like the 2018 mudslides in Montecito, California.

But Shen said that many of these events also go unreported, complicating efforts to study and eventually predict them.

“Most of the information comes from news reports, and there are a lot of missing events,” Shen said. “In order for us to better forecast landslides, we need to start with a good landslide database.”

From left: Tong Qiu, Chaopeng Shen and Daniel Kifer. Credit: Penn State. Creative Commons

Shen noted that with the availability of satellite images from Google Earth, past landslides can be identified from space. However, finding just one — much less the thousands needed to populate a comprehensive database — requires an entire team to scour imagery for evidence of a past event.

Unless you have AI.

“The first goal of our work was to produce an artificial intelligence method to identify these events from the satellite images,” Shen said. “Once the AI is trained — when it can determine what’s a landslide and what’s not — we can apply it to a very large area, and it will automatically find the place with a suspected event.” 

At the start of the project, Shen and Penn State co-investigators Tong Qiu, associate professor of civil and environmental engineering, and Daniel Kifer, professor of computer science, were provided with an initial dataset of known rainfall-induced landslides by the USGS. After finding the events in Google Earth, they used the satellite images as training examples in a process called “supervised learning.” 

“It’s basically object identification,” Shen said. “By looking at the satellite image, you get a sense that there might have been an event because the scene changed dramatically. Most of the visual cues come from the vegetation.”

Over time, the AI began recognizing the cues it could use to identify a landslide, but it also needed to tell landslides apart from other kinds of disturbance. The shape of a disturbance might have indicated a landslide, but it could also have come from a wildfire, an excavated mine or a torn-down building.

“It has to be able to differentiate the real signals from the noise,” Shen said. “What’s a rainfall-induced landslide, and what’s not?”
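The supervised-learning idea described above, learning from labeled examples and then separating landslides from look-alike disturbances, can be illustrated with a very small sketch. This is not the deepLDB model, which works on satellite images; it is a toy nearest-centroid classifier, and the two features (`vegetation_loss`, `elongation`), the labels and all numbers are hypothetical values invented for illustration.

```python
from math import dist


def fit_centroids(examples):
    """Average the feature vectors of each label.

    examples: list of (features, label) pairs, where features is a list of floats.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training examples: [vegetation_loss, elongation], both scaled 0..1.
training = [
    ([0.8, 0.9], "landslide"),
    ([0.7, 0.8], "landslide"),
    ([0.9, 0.2], "other"),   # e.g. a wildfire scar: heavy vegetation loss, not elongated
    ([0.2, 0.1], "other"),   # e.g. a small cleared plot
]

centroids = fit_centroids(training)
guess_a = predict(centroids, [0.75, 0.7])
guess_b = predict(centroids, [0.6, 0.1])
```

In this toy setup, a new scene is assigned whichever label its features sit closest to, which is the same "learn from known cases, then apply to new ones" pattern, only far simpler than an image-based model.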

According to Associate Professor Chaopeng Shen, past landslide events can be found using just one satellite image. However, having both a "before" and "after" image increases the accuracy of the identification. Credit: Shen Multi-scale Hydrology, Processes and Intelligence Group. All Rights Reserved.

After a year of training, Shen said the model is now correctly identifying a landslide 97% of the time, but he emphasized more training examples are still needed. The researchers set up a website where people could upload their own Google Earth images to help train the model. 

“If an aerial image of a landslide is not from an area we’ve been focused on, they can help us correct it,” Shen said. “The more data we have, the more accurate the model will be.”

According to Shen, the level of precision in the database is what sets “deepLDB” apart, and it allows them to start moving on to the second goal of the project: prediction. 

“The second step is to use AI to associate the events in the database with rainfall and other local conditions to try to predict what’s going to happen next,” Shen said. “The novel aspect of the project is we have a very high spatial accuracy, meaning we know exactly where these events are. With this kind of precision, we can overlay the events with other datasets like soil texture and elevation and find out some of the fundamental reasons why it happens in one area and not the other. Or why yesterday and not the day before.”
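The overlay Shen describes, matching precisely located events against layers such as elevation and soil texture, could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Grid` class, the `overlay` function, the grid layout (regular latitude/longitude cells with row 0 at the northern edge) and every value in it are hypothetical and are not taken from the deepLDB project.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    """A regular lat/lon raster (hypothetical layout; row 0 is the northern edge)."""
    north: float     # latitude of the top edge, in degrees
    west: float      # longitude of the left edge, in degrees
    cell_deg: float  # cell size, in degrees
    values: list     # list of rows, each a list of cell values

    def sample(self, lat, lon):
        """Return the cell value at (lat, lon), or None if the point is off the grid."""
        row = int((self.north - lat) // self.cell_deg)
        col = int((lon - self.west) // self.cell_deg)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[row]):
            return self.values[row][col]
        return None


def overlay(events, layers):
    """Attach each layer's value at the event location to a copy of the event."""
    joined = []
    for event in events:
        record = dict(event)
        for name, grid in layers.items():
            record[name] = grid.sample(event["lat"], event["lon"])
        joined.append(record)
    return joined


# Made-up layers and events, for illustration only.
elevation_m = Grid(north=41.0, west=-78.0, cell_deg=0.5,
                   values=[[350, 420], [510, 600]])
soil_texture = Grid(north=41.0, west=-78.0, cell_deg=0.5,
                    values=[["loam", "clay"], ["sand", "loam"]])
events = [
    {"id": "A", "lat": 40.8, "lon": -77.9},
    {"id": "B", "lat": 40.2, "lon": -77.2},
]

table = overlay(events, {"elevation_m": elevation_m, "soil_texture": soil_texture})
```

Each row of `table` then carries the event's location together with the local conditions at that spot. The better the location is known, the more likely the sampled cell is the right one, which is why spatial accuracy matters for this step.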

He added that work has just begun on the prediction model, and they have worked with Google AI experts to find the best way to build the AI as it looks for patterns in the growing database. 

“The folks that I’ve worked with at Google and their philanthropic organization, Google.org, really want to create some positive impacts in the world,” Shen said. “Hopefully, we’ll be able to save lives with this effort.”

Last Updated November 9, 2020
