Stacked Borrows: An Aliasing Model for Rust by Ralf Jung, Hoang-hai Dang, Jeehoon Kang, Derek Dreyer (2020, read in 2020)

Published by marco on

Updated by marco on

Disclaimer: these are notes I took while reading this book. They include citations I found interesting or enlightening or particularly well-written. In some cases, I’ve pointed out which of these applies to which citation; in others, I have not. Any benefit you gain from reading these notes is purely incidental to the purpose they serve of reminding me what I once read. Please see Wikipedia for a summary if I’ve failed to provide one sufficient for your purposes. If my notes serve to trigger an interest in this book, then I’m happy for you.

This paper (PDF) (Saarland Informatics Campus, Germany) is a recent one addressing one of the main weaknesses of the “borrow checker” in Rust. What’s a borrow checker? Where other programming languages have chosen from manual memory-management (C, early C++, early Pascal, etc.), garbage-collection (C#, Java, etc.) and reference-counting (Swift, modern C++, etc.), Rust chose a new path: the borrow checker.

The rules of the language require that the program be written in a manner where it is always clear who owns memory and who is “using” memory. Allocated memory is always deallocated when it leaves the scope in which it was allocated, unless the memory was passed on to a different scope by “loaning” it. If a “borrowed” reference lingers after this scope is closed, the compiler flags it as an error. Rust also has a lot of support for explicitly copying references, when the borrow checker can’t be satisfied any other way. The algorithm treats mutable and immutable references accordingly. See the Rust documentation on references and borrowing or the paper itself for an in-depth discussion.

This approach requires more involvement on the part of the programmer, but also results in programs that provably have neither memory leaks nor access violations should the program pass compilation. This is a very interesting property for Rust to be able to guarantee—and it accomplishes it without a garbage collector (with its associated performance issues and difficulty in being predictable enough for real-time, system-level code) or reference-counting (which also incurs performance overhead and require participation of the runtime). Instead, a Rust program’s memory usage is guaranteed by the compiler to be correct, so there is no need for asynchronous tasks or runtime support.

In the current incarnation of Rust, this guarantee comes with a big caveat: the borrow checker does not deal with unsafe blocks at all. It is hoped that most Rust code can avoid unsafe blocks, relying instead of higher-level abstractions that hide unsafe code. However, what guarantee is there that the unsafe code used in the common libraries or popular crates is not misallocating or misusing memory? While the code in the base libraries is fastidiously written and covered by myriad tests, errors are bound to slip in.

This is where the software proposed in this paper comes in. The authors build on the borrow-checker concept to extend it to unsafe regions as well. They come up with a new borrow-checking algorithm called “Stacked Borrows” that is more sophisticated than the initial algorithm introduced with and still used in modern Rust. The authors show that the their algorithm not only allows a compiler to improve its borrow-checking but also increases the number of situations in which a compiler can be 100% sure of the placement of variable manipulations so that it can optimize many more situations to produce much more performant code.[1]

That is,

“In this work, we propose Stacked Borrows, an operational semantics for memory accesses in Rust. Stacked Borrows defines an aliasing discipline and declares programs violating it to have undefined behavior, meaning the compiler does not have to consider such programs when performing optimizations.”

They obviously spent quite a bit of time honing their algorithm, running their test interpreter and compiler against a large part of the Rust standard library and popular crates. Their paper includes not only the algorithm, but also several proofs for why the algorithm can guarantee certain properties that a compiler can use to optimize code much better than today’s Rust compiler. Not only that, but their attack on the corpus of Rust code yielded several cases in which highly central Rust runtime/system unsafe code was incorrect and possibly leaked or violated memory. Several of their pull requests have been accepted and the team and their tool is quite well-received in the Rust community.

They have made this tool, an interpreter named Miri available for anyone to use and test until “[we can] eventually mak[e] a variant of Stacked Borrows part of the official semantics of Rust.”

[1] A lot of code optimization done by compilers involves being able to move variable loads and stores in a way that avoids touching registers more often than necessary, improving cache locality, etc. without changing the semantics of the program. Understandably, a compiler is required to be pessimistic about changing anything from what the developer actually specified. Where “lowering” is a free transformation from one formulation to another that is exactly equivalent, the optimizations we are talking about are still provable, but are no longer so “obviously” transformations between equivalent states. For example, a compiler could avoid certain volatility constraints if it can guarantee that a certain change is always preceded by a memory fence, or the compiler can leave off array-bounds-checking if it knows that the indexing will never leave the bounds of the array, etc.↩

Citations

None.