Skip to content

aarch64-dit crate is missing fencing (for both compiler and CPU) #1472

@thomcc

Description

@thomcc

The aarch64-dit crate has two notable flaws:

  1. It is possible for the compiler to reorder operations outside the critical section where DIT is enabled.
  2. It does not perform speculation fencing, which is documented as necessary at least on apple platforms in this document: https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms#Enable-DIT-for-constant-time-cryptographic-operations.

Flaw 1: Compiler Reordering

Here's an example of the first flaw: https://godbolt.org/z/8fdsnxrc5. Essentially the compiler is free to move instructions outside of the asm!("msr DIT #1") and asm!("msr DIT #0") guarded region, and you can see that it essentially compiles:

if have_dit {
    dit_enable();
}
let res = sensitive(a, b);
if have_dit {
    dit_disable();
}
return res;

into

if have_dit {
    dit_enable();
    dit_disable();
}
return sensitive(a, b);

which defeats the point, since the instructions which are data-dependent are no longer in the block where DIT is enabled.

In an ideal world, inserting compiler fences (or atomic fences, or adjusting the asm! options...) would fix this, however they only apply to memory accesses, so that does not work. The only approach I can think of that does work is to use an indirect jump whose target we obscure from the vision of the compiler. Something like:

impl Dit {
    // ...snip ...
    /// Call `f` with DIT enabled, returning the result. Takes care to ensure
    /// that that any computations which occur during `f` do not leak outside
    /// the critical section.
    pub fn with<R, F: FnOnce() -> R>(s: &self, f: F) -> R {
        let guard = self.enable();
        let mut o: Option<F> = Some(f);
        // Note: Could be unwrap_unchecked().
        let f: &mut dyn FnMut() -> R = &mut || o.take().unwrap()();
        core::hint::black_box(f)()
    }
    // ...snip...
}

This works, as demonstrated here: https://godbolt.org/z/fv1Gofjxc (note: the add instruction was changed to floating point addition just so that it's clearer in the assembly listing what is happening—you can see that the fadd takes place in the indirectly called function, which is called (the blr x8) between the msr DIT #1; and msr DIT #0 instructions.

Flaw 2: Speculation

I'm unsure if this is needed on targets besides apple (my suspicion is yes, or at least that there's no guarantee for it to be a no), but apple documents it as necessary in the linked documentation.

Essentially, you need something like this after you enable DIT, and before you disable it (actually, I think you might not need it before disabling it, since it's fine if a DIT-mode processor speculates operations outside of DIT mode).

if is_aarch64_feature_detected!("sb") {
    // note: in the real impl this must happen
    // in a `#[target_feature(enable = "sb")]` function)
    asm!("sb");
} else {
    asm!("dsb nsh", "isb sy");
}

Of course, a real impl needs to:

  • Put asm!("sb") inside a #[target_feature(enable = "sb")] function.
  • Use cpufeatures rather than is_aarch64_feature_detected (which requires adding sb support to cpufeatures).
  • And so on.

A patch fixing these issues is here: master...thomcc:rust-crypto-utils:aarch64-dit-sb-fix. I don't have time to do a PR and respond to review feedback, feel free to mess with it however you please, or to take a different approach that solves the same problem.

One issue with my patch is that it leaves the Dit::enable() function present and non-deprecated, even though it's a footgun. I think probably the right call would be to remove it. But that begs the question of whether or not the API should have a different structure in other ways too.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions