UNIVERSITY PARK, Pa. — An operating system’s kernel acts as the translator between user and machine. To improve the reliability of a kernel, developers can isolate the operating system’s device drivers and prevent a failure in one component from affecting other components. Isolation, however, requires impractical amounts of human effort.
A team of researchers, led by G. Gary Tan and Trent Jaeger, professors of computer science and engineering at Penn State, set out to develop a framework that automates device driver isolation and reduces the manual work it requires, even in the presence of challenging kernel patterns.
The researchers presented their framework at the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’22), which took place July 11-13 in Carlsbad, California. OSDI is a premier conference in operating systems research and brings together professionals from academic and industrial backgrounds to discuss the design, implementation and implications of systems software, according to its website.
The operating system kernel controls and coordinates all hardware and software in the computer. Device drivers allow the kernel to interact with hardware without knowing the component’s details. For example, when a user directs their computer to print a document, the kernel invokes certain interface functions provided by a printer driver, which processes the data and sends the job to the printer.
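The printing example can be pictured as a small table of driver-provided functions that the kernel calls without knowing how the device works. The following is a minimal sketch for illustration only, not code from the researchers' framework or from any real kernel; the names `DriverOps`, `ToyPrinterDriver` and `ToyKernel` are hypothetical, and the "printer" is simply a Python list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DriverOps:
    """Hypothetical table of entry points a driver hands to the kernel."""
    submit: Callable[[bytes], int]


class ToyPrinterDriver:
    """Stand-in for a printer driver; the 'device' is just a list."""

    def __init__(self) -> None:
        self.sent_to_device: List[bytes] = []

    def submit(self, data: bytes) -> int:
        # Device-specific processing: wrap the payload in a made-up frame.
        framed = b"<job>" + data + b"</job>"
        self.sent_to_device.append(framed)
        return len(data)  # number of payload bytes accepted


class ToyKernel:
    """Stand-in for the kernel; it only knows the DriverOps interface."""

    def __init__(self) -> None:
        self.drivers: Dict[str, DriverOps] = {}

    def register(self, name: str, ops: DriverOps) -> None:
        self.drivers[name] = ops

    def print_document(self, data: bytes) -> int:
        # The kernel invokes the driver's interface function and never
        # touches the framing details hidden inside the driver.
        return self.drivers["printer"].submit(data)


driver = ToyPrinterDriver()
kernel = ToyKernel()
kernel.register("printer", DriverOps(submit=driver.submit))
accepted = kernel.print_document(b"hello")
```

In this sketch the kernel and the driver share one address space, so a bug inside `submit` could corrupt anything the kernel holds. That shared state is what driver isolation aims to separate.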
According to Tan, to isolate device drivers effectively while maintaining kernel-driver communication, developers had to inspect the large, complex interface between a driver and the kernel and examine every interaction across it to decide what data needed to be synchronized. They also had to handle challenging synchronization patterns, such as concurrent access to data, and write thousands of lines of code to keep operations smooth.
“Isolation is an effective technique for improving reliability in software systems, such as the kernel, but relying on human effort to isolate drivers is unrealistic, so we set out to develop a framework to automate the process,” Tan said. “With isolation, failure in one component is constrained within its own domain; bugs in one component cannot directly affect the rest of the system. This significantly improves reliability.”
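The containment Tan describes can be illustrated with a loose user-space analogy: running a component in its own process so that its crash is reported to the caller instead of taking the caller down. This is a sketch under stated assumptions, not the researchers' framework and not how kernel driver isolation is implemented. The helper `call_isolated_driver` and the two driver snippets are hypothetical, and the process boundary stands in for an isolation domain.

```python
import subprocess
import sys
from typing import Tuple

# Hypothetical "drivers", written as tiny standalone programs.
GOOD_DRIVER = "import sys; print(sys.argv[1].upper())"
BUGGY_DRIVER = "import sys; raise RuntimeError('driver bug')"


def call_isolated_driver(driver_source: str, request: str) -> Tuple[bool, str]:
    """Run driver code in a separate process and return (ok, output).

    A crash or hang in the driver stays inside the child process; the
    caller only sees a failure status.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", driver_source, request],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except subprocess.TimeoutExpired:
        return (False, "")
    return (proc.returncode == 0, proc.stdout.strip())


# The buggy driver fails, but the caller keeps running and can still
# use the healthy driver afterwards.
first = call_isolated_driver(GOOD_DRIVER, "hello")
failed = call_isolated_driver(BUGGY_DRIVER, "hello")
second = call_isolated_driver(GOOD_DRIVER, "again")
```

The price of this boundary is that the two sides no longer share memory: every piece of data the driver needs has to be passed across explicitly, here as a command-line argument and standard output. Deciding what must cross the boundary is the kind of manual work the framework is meant to reduce.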