Seminar | Argonne Leadership Computing Facility

msr-safe, libmsr, and the Variorum Project: Hoisting Low-Level Processor Features into Userspace

ALCF Seminar

Abstract: Intel processors have a wealth of features for power and energy measurement and control, cache allocation, and performance telemetry. Accessing these features requires “ring 0” access, which in practice means working within the operating system kernel. Performing kernel-level research on production supercomputers tends to be slow at best, with the review process for new kernel code being understandably conservative.

My team at Lawrence Livermore National Laboratory (LLNL) has solved this problem by creating the msr-safe Linux kernel module. msr-safe allows users to program individual model-specific registers (which control a majority of the features we’re interested in), but also allows system administrators to provide group-level, bitwise whitelist control over which registers and portions of registers are exposed. This approach has allowed us to minimize the amount of code running in the kernel and has moved the security evaluation to the question of whether specific users may be trusted with specific capabilities. By doing so, we have been able to do cutting-edge systems research well before the official Linux kernel provided “approved” device drivers, and this in turn has allowed us to influence the direction Intel is taking with future architectures. libmsr provides a friendly (well, “friendlier”) interface to the most commonly used processor features. The Variorum project, funded in part by a recent TechBase round at LLNL, expands our approach to other architectures (Power9, NVIDIA, ARM, Xilinx) as well as other protocols (particularly IPMI and CSRs/PCI).
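To make the idea of a bitwise whitelist concrete, here is a minimal sketch in Python. It is not msr-safe code, and it is not the format msr-safe uses for its whitelist. The names `ALLOWLIST`, `filtered_read`, and `filtered_write`, the dictionary standing in for hardware, and the mask values are all assumptions made for illustration. The two register addresses are the Intel package power limit (0x610) and package energy status (0x611) registers.

```python
# Illustrative model of a per-register, bitwise write filter.
# Everything here is hypothetical; it only mimics the idea described above.

MASK_64 = 0xFFFFFFFFFFFFFFFF

# Register address -> write mask. A 1 bit means "user may change this bit".
# The mask values are examples, not a recommended or real policy.
ALLOWLIST = {
    0x610: 0x7FFF,  # package power limit: expose only the low 15 bits
    0x611: 0x0,     # package energy status: readable, no writable bits
}


def filtered_read(addr, hw):
    """Return the register value only if the register is listed."""
    if addr not in ALLOWLIST:
        raise PermissionError(f"MSR {addr:#x} is not exposed")
    return hw[addr]


def filtered_write(addr, requested, hw):
    """Apply a write, changing only the bits the mask exposes."""
    if addr not in ALLOWLIST:
        raise PermissionError(f"MSR {addr:#x} is not exposed")
    mask = ALLOWLIST[addr]
    if mask == 0:
        raise PermissionError(f"MSR {addr:#x} is read-only")
    # Keep protected bits from the current value, take exposed bits
    # from the requested value.
    hw[addr] = (hw[addr] & ~mask & MASK_64) | (requested & mask)
    return hw[addr]


if __name__ == "__main__":
    hw = {0x610: 0xFFFF0000, 0x611: 5}  # dictionary standing in for hardware
    print(hex(filtered_write(0x610, 0xFFFFFFFF, hw)))
```

In this sketch the kernel-side logic stays small (a lookup and a mask), and the policy question of which bits a given group of users may touch lives entirely in the table the administrator supplies.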

The talk will cover a handful of otherwise inaccessible processor features and how we’ve leveraged them. I will also touch on practical issues of doing systems research on production machines, security issues, feature documentation issues, and how msr-safe has deepened our vendor collaboration and commitments. Feature requests are welcome, as are patches.

This seminar will be streamed.