Spectre is here to stay: An analysis of side-channels and speculative execution by Ross McIlroy, Jaroslav Sevcik, Tobias Tebbi, Ben L. Titzer, Toon Verwaest (2019) (read in 2019)

Published by marco on

Disclaimer: these are notes I took while reading this book. They include citations I found interesting or enlightening or particularly well-written. In some cases, I’ve pointed out which of these applies to which citation; in others, I have not. Any benefit you gain from reading these notes is purely incidental to the purpose they serve in reminding me of what I once read. Please see Wikipedia for a summary if I’ve failed to provide one sufficient for your purposes. If my notes serve to trigger an interest in this book, then I’m happy for you.

This paper (arXiv) focuses only on intra-process attacks. It advocates mitigating them by separating work into multiple processes, because hardware-level protection between processes provides a stronger guarantee.

“This paper explores speculative side-channel attacks and their implications for programming languages. These attacks leak information through micro-architectural side-channels which we show are not mere bugs, but in fact lie at the foundation of optimization.”

This paper proves unequivocally that software mitigation of intra-process side-channel attacks is futile. An attack is always possible, given patience and a way to amplify timing differences until a timer of any resolution can resolve them. “As we have seen, access to a timer, no matter the resolution, leaks µ-architectural information.” Type checks in any language may partially mitigate, but this paper proves that they cannot fully mitigate and are, therefore, largely useless for data security.

Mitigations change only the bit-rate of extracted information, not whether information can be extracted. The authors not only proved this with theorems; they also implemented many of the attacks to provide accurate estimates of the expected extraction bit-rates in the presence of various mitigations.
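To make the timer argument concrete for myself, here is a minimal sketch of why a coarse timer only slows an attacker down. It is a toy model, not the paper’s method: the 1 ms resolution, the hit and miss latencies, and the names `coarse_clock` and `measure` are all assumptions of mine, and the model ignores noise and the work needed to make an access miss repeatedly.

```python
import math

def coarse_clock(true_time_ns: float, resolution_ns: float) -> float:
    """Round a true timestamp down to the timer's resolution."""
    return math.floor(true_time_ns / resolution_ns) * resolution_ns

def measure(op_cost_ns: float, repetitions: int, resolution_ns: float) -> float:
    """Time `repetitions` back-to-back operations using only the coarse clock."""
    start = coarse_clock(0.0, resolution_ns)
    end = coarse_clock(op_cost_ns * repetitions, resolution_ns)
    return end - start

# Hypothetical numbers, chosen only for illustration.
RESOLUTION_NS = 1_000_000.0  # a deliberately coarse 1 ms timer
CACHE_HIT_NS = 5.0
CACHE_MISS_NS = 100.0

# A single access is invisible: both cases round to the same reading.
single_hit = measure(CACHE_HIT_NS, 1, RESOLUTION_NS)
single_miss = measure(CACHE_MISS_NS, 1, RESOLUTION_NS)

# Repeating the access amplifies the difference past the timer resolution.
amplified_hit = measure(CACHE_HIT_NS, 100_000, RESOLUTION_NS)
amplified_miss = measure(CACHE_MISS_NS, 100_000, RESOLUTION_NS)

print(single_hit, single_miss, amplified_hit, amplified_miss)
```

The cost of the coarse timer is the 100,000 repetitions per measurement, which is exactly a reduction in bit-rate rather than a defense.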

There is one particularly virulent variant (#4) for which the authors found no effective software mitigation at all.

“Mitigating type confusion for stack slots alone would have required a complete redesign of the backend of the optimizing compiler, perhaps man years of work, without a guarantee of completeness.”

As anyone who’s been following this problem suspected (or pretty much already knew), the world traded security for performance long ago. Though chip manufacturers and operating-system designers paid lip service to security, performance improvements were paramount.

We’ve known since the end of 2017, but now we have proof. The paper sums it up more nicely than I could:

“Our models, our mental models, are wrong; we have been trading security for performance and complexity all along and didn’t know it. It is now a painful irony that today, defense requires even more complexity with software mitigations, most of which we know to be incomplete. And complexity makes these three open problems all that much harder. Spectre is perhaps, too appropriately named, as it seems destined to haunt us for a long time.”

Citations

“This paper explores speculative side-channel attacks and their implications for programming languages. These attacks leak information through micro-architectural side-channels which we show are not mere bugs, but in fact lie at the foundation of optimization.”

“In variant 1 we’ve shown that indirect jump prediction can be exploited to bypass the implicit type checks that are part of a typical language’s virtual dispatch mechanism. As it turns out, the branch target buffer on most CPUs are approximate in order to save space. For example, Intel 64-bit CPUs only store the low-order 32 bits of the from address (the address of the indirect jump) and the low-order 32 bits of the relative target address (the predicted address). Upon lookup, the predictor ignores the upper 32 bits of the from address and reuses a prediction for an aliased from address. This allows an attacker to train a target indirect branch to speculatively jump to any address within a 4GiB range without ever executing the victim branch.”

Position 397-402
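The aliasing described in this citation can be sketched as a toy model. This is only an illustration under my own assumptions: the class `ToyBranchTargetBuffer`, the signed treatment of the 32-bit relative target, and the example addresses are hypothetical and do not describe any real CPU’s predictor.

```python
MASK_32 = (1 << 32) - 1

def btb_tag(from_address: int) -> int:
    """Keep only the low-order 32 bits of the branch (from) address."""
    return from_address & MASK_32

def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a signed offset (an assumption of this model)."""
    return value - (1 << 32) if value & (1 << 31) else value

class ToyBranchTargetBuffer:
    def __init__(self):
        self._entries = {}  # truncated from-address -> truncated relative target

    def train(self, from_address: int, target_address: int) -> None:
        relative = (target_address - from_address) & MASK_32
        self._entries[btb_tag(from_address)] = relative

    def predict(self, from_address: int):
        relative = self._entries.get(btb_tag(from_address))
        if relative is None:
            return None  # no prediction stored for this tag
        return from_address + to_signed32(relative)

# Hypothetical addresses that differ only above bit 32.
ATTACKER_BRANCH = 0x0000_7000_1234_5678
VICTIM_BRANCH = 0x0000_7001_1234_5678
UNRELATED_BRANCH = 0x0000_7001_1234_9999

btb = ToyBranchTargetBuffer()
btb.train(ATTACKER_BRANCH, ATTACKER_BRANCH + 0x40)  # attacker runs only its own branch

predicted_for_victim = btb.predict(VICTIM_BRANCH)
predicted_for_unrelated = btb.predict(UNRELATED_BRANCH)
print(hex(predicted_for_victim), predicted_for_unrelated)
```

In this model the victim branch inherits a prediction even though it was never executed, because the upper 32 bits of its address are ignored on lookup.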

“This is particularly bad, because the attacker can create speculative indirect jumps to anywhere, i.e., control flow that cannot possibly exist in the original code, such as jumping into the middle of arbitrary machine code that simply happens to be a leaking gadget. That means an attacker may not even need to craft an instruction sequence, but find an extant instruction sequence in the victim’s code, similar to return-oriented programming. This can even work across processes. [20] found that the branch target buffer on Intel chips is shared across hyperthreads on the same core, allowing one process to inject predictions into another.”

Position 402-407

“User programs should not be able to access unmapped virtual memory addresses, write to read-only memory [18], or read from kernel memory. Such attempts should result in a fault. Some CPUs seem to check for a fault too late, effectively speculating through a hardware permission check. This depends on the specific details of a CPU’s trap mechanism of course; e.g. faulting at retirement is too late if the processor has already accessed the memory and supplied its value to dependent instructions, which leaked the value into µ-state. Lipp et al. [22], describe a variant 3 attack that enables leakage of data in kernel memory to a userspace process.”

Position 409-414

“Since memory is often the bottleneck in many programs, modern CPUs utilize not only caching but dynamic alias analysis known as memory disambiguation. When executing a store, the CPU uses a predictor to determine which, if any, subsequent loads will depend on the store. If the prediction is no-alias, the CPU may speculatively execute a later load before the store. If the prediction turns out to be incorrect and the store address and load address are in fact aliases, this will be detected when instructions are being retired in program order, and the load will be aborted and re-executed. This, too, represents a vulnerability, since loads that are speculatively executed out of order observe stale values from memory.”

Position 419-423
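The stale-load behavior in this citation can be illustrated with a very small model of one store followed by one load. The function `store_then_load` and its parameters are hypothetical names of mine, and the model ignores everything a real memory disambiguator does beyond the single no-alias prediction.

```python
def store_then_load(memory, store_addr, store_value, load_addr, predicted_no_alias):
    """Model one store followed (in program order) by one load.

    Returns (speculative_value, retired_value): what the load saw when it
    first executed, and what it finally commits after any re-execution.
    """
    memory = dict(memory)  # work on a copy so the caller's state is untouched
    if predicted_no_alias:
        # The load is hoisted above the store and reads whatever is there now.
        speculative_value = memory[load_addr]
        memory[store_addr] = store_value
    else:
        # The load waits for the store, so it sees the new value.
        memory[store_addr] = store_value
        speculative_value = memory[load_addr]
    # At retirement the aliasing is detected and the load is re-executed,
    # so the architecturally visible value is always the stored one.
    retired_value = memory[load_addr]
    return speculative_value, retired_value

SLOT = 0x10
before = {SLOT: "old secret"}

mispredicted = store_then_load(before, SLOT, "overwritten", SLOT, predicted_no_alias=True)
ordered = store_then_load(before, SLOT, "overwritten", SLOT, predicted_no_alias=False)
print(mispredicted, ordered)
```

The retired value is correct in both cases; the leak is that instructions depending on the speculative value may already have left traces in µ-state before the load was re-executed.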

“Bypassing stores is only one way a memory disambiguator can speculatively accelerate loads [33]. As long as violations are detected and repaired before retirement, other aggressive forwarding strategies could be implemented. If the memory disambiguator learns that a load typically aliases a store, it could speculatively forward the value even if the source address for the load is not yet known. Similarly the disambiguator could learn that two consecutive loads typically load from the same address, and inject the result from the first load into the second.”

Position 423-427

“What characteristics of a programming language make it exploitable on today’s modern hardware? As we have seen, access to a timer, no matter the resolution, leaks µ-architectural information. We point out several language features whose typical implementations may be vulnerable to Spectre. In these we found that a key to constructing the universal read gadget was speculative pointer crafting, whereby an attacker exploits speculation to trick the implementation into interpreting attacker-controlled input as a machine-level pointer, feeding this pointer into a (normally innocent, but speculatively dangerous) load to achieve the universal read gadget.”

Position 440-444

“In particular, we found variant 1 to be quite simple to exploit. For managed languages, variant 3 is only different from variant 1 in that superuser memory can be accessed. Variant 2 is only easily exploitable if one has direct control over the virtual memory addresses of code. Variant 4 can be difficult to exploit reliably due to the black box nature of the memory disambiguator state. We focused exclusively on in-process attacks and not cross-process attacks.”

Position 463-466
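As a reminder of what the “quite simple” variant 1 looks like, here is a toy simulation of a bounds-check bypass that leaks through cache state. Nothing here touches real hardware: `ToyCpu`, the set standing in for a probe array’s cache lines, and the memory layout are all assumptions I made for illustration, and the predictor is assumed to have been trained on valid indices beforehand.

```python
class ToyCpu:
    def __init__(self, memory: bytes):
        self.memory = memory              # public array followed directly by a secret
        self.cached_probe_lines = set()   # µ-state that survives a squashed speculation

    def victim(self, index: int, array_len: int, predicted_in_bounds: bool):
        """Bounds-checked read; returns None when the check fails."""
        actually_in_bounds = index < array_len
        value = None
        if predicted_in_bounds or actually_in_bounds:
            value = self.memory[index]          # may execute speculatively out of bounds
            self.cached_probe_lines.add(value)  # dependent load touches probe line `value`
        # On a misprediction the architectural result is discarded,
        # but the cache footprint above is not rolled back.
        return value if actually_in_bounds else None

def recover_byte(cpu: ToyCpu, index: int, array_len: int) -> int:
    cpu.cached_probe_lines.clear()                   # stand-in for flushing the probe array
    architectural = cpu.victim(index, array_len, predicted_in_bounds=True)
    assert architectural is None                     # the bounds check "worked"
    (leaked,) = cpu.cached_probe_lines               # stand-in for timing each probe line
    return leaked

PUBLIC = b"public!!"
SECRET = b"SECRET"
cpu = ToyCpu(PUBLIC + SECRET)

recovered = bytes(
    recover_byte(cpu, len(PUBLIC) + offset, len(PUBLIC)) for offset in range(len(SECRET))
)
print(recovered)
```

The victim never returns an out-of-bounds value, yet every secret byte is recovered from the side effect of the speculative load.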

“Variant 4 defeats everything we could think of. We explored more mitigations for variant 4 but the threat proved to be more pervasive and dangerous than we anticipated. For example, stack slots used by the register allocator in the optimizing compiler could be subject to type confusion, leading to pointer crafting. Mitigating type confusion for stack slots alone would have required a complete redesign of the backend of the optimizing compiler, perhaps man years of work, without a guarantee of completeness.

“We recognized quickly that a compiler backend overhaul, a complete audit of the entire runtime system, and application of (not yet designed) mitigations in the C++ compiler for the VM’s code itself were intractable for essentially any-sized codebase.

“For this reason we do not believe that variant 4 can be effectively mitigated in software, due not just to manpower, but a lack of architectural options, since reasoning about variant 4 requires the confounding assumption that in speculation, writes to memory may not be visible to subsequent reads at all.”

Position 558-565

“Spectre defeats an important layer of software security. The community has assumed for decades that programming language security enforced with static and dynamic checks could guarantee confidentiality between computations in the same address space. Our work has discovered there are numerous vulnerabilities in today’s languages that when run on today’s CPUs allow construction of the universal read gadget, which completely destroys language-enforced confidentiality.”

Position 570-573

“1. Finding µ-architectural side channels requires enumerating and modeling relevant µ-state, a difficult task for processors that are closed source and full of valuable and carefully-guarded intellectual property. 2. Understanding vulnerabilities requires us to model how programs can manipulate and observe µ-state, which also requires us to understand complex µ-state in black-box processors. 3. Mitigating vulnerabilities is perhaps the most challenging of all, since efficient software mitigations needed for extant hardware seem to be in their infancy, and hardware mitigation for future designs is a completely open design problem.”

Position 577-581

“We were able to leak over 1KB/s from variant 1 gadgets in C++ using rdtsc with 99.99% accuracy and over 10B/s from JavaScript using a low resolution timer. We demonstrated a potential 2.5KB/s variant 4 vulnerability, but with low reliability, starting at 0.01% but amplifiable up to 20% through various techniques. We found that using shared memory to construct a timer worked well enough in JavaScript to measure individual cache hits and misses and exploit any of the known leaks.”
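To put the quoted rates in perspective, a quick back-of-the-envelope calculation shows how long an attacker would need for a secret of a given size. The 4096-byte secret is a hypothetical example of mine, and I am assuming 1KB means 1000 bytes; only the two rates come from the citation, which gives them as lower bounds (“over”).

```python
# Rates taken from the citation above (assuming 1 KB = 1000 bytes).
CPP_RDTSC_BYTES_PER_SECOND = 1000.0       # variant 1, C++ with rdtsc
JS_COARSE_TIMER_BYTES_PER_SECOND = 10.0   # variant 1, JavaScript with a low-resolution timer

# Hypothetical secret size, chosen only for illustration.
SECRET_BYTES = 4096

def seconds_to_leak(secret_bytes: int, bytes_per_second: float) -> float:
    """Time needed to extract a secret at a constant leak rate."""
    return secret_bytes / bytes_per_second

cpp_seconds = seconds_to_leak(SECRET_BYTES, CPP_RDTSC_BYTES_PER_SECOND)
js_seconds = seconds_to_leak(SECRET_BYTES, JS_COARSE_TIMER_BYTES_PER_SECOND)

print(f"C++ with rdtsc: about {cpp_seconds:.1f} s")
print(f"JavaScript with a coarse timer: about {js_seconds / 60:.1f} min")
```

Even the slow JavaScript path needs only minutes for a secret of this size, which is why a lower bit-rate is an inconvenience rather than a mitigation.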

“Our models, our mental models, are wrong; we have been trading security for performance and complexity all along and didn’t know it. It is now a painful irony that today, defense requires even more complexity with software mitigations, most of which we know to be incomplete. And complexity makes these three open problems all that much harder. Spectre is perhaps, too appropriately named, as it seems destined to haunt us for a long time.”

Position 586-589