Date:   Thu, 4 Feb 2021 16:11:12 -0800
From:   Andy Lutomirski <>
To:     Andrew Cooper <>,
        "H. Peter Anvin" <>
Cc:     Dave Hansen <>,
        LKML <>,
        Jan Kiszka <>, X86 ML <>,
        Peter Zijlstra <>
Subject: Re: [RFC][PATCH 2/2] x86: add extra serialization for non-serializing MSRs

On Thu, Feb 4, 2021 at 3:37 PM Andrew Cooper <> wrote:
> On 05/03/2020 17:47, Dave Hansen wrote:
> > Jan Kiszka reported that the x2apic_wrmsr_fence() function uses a
> > plain "mfence" while the Intel SDM (10.12.3 MSR Access in x2APIC
> > Mode) calls for "mfence;lfence".
> >
> > Short summary: we have special MSRs that have weaker ordering
> > than all the rest.  Add fencing consistent with current SDM
> > recommendations.
> >
> > This is not known to cause any issues in practice, only in
> > theory.
> So, I accept that Intel have their own reasons for what is written in
> the SDM, but "not ordered with stores" is at best misleading.
> The x2APIC (and other) MSRs aren't serialising.  That's fine, as is the
> fact that the WRMSR to trigger them doesn't have memory operands, and is
> therefore not explicitly ordered with other loads and stores.
> Consider:
>     xor %edi, %edi
>     movb (%rdi), %dl
>     wrmsr
> It is fine for a non-serialising wrmsr here to execute speculatively in
> terms of internal calculations, but nothing it does can escape the local
> core until the movb has fully retired, and is therefore globally visible.
> Otherwise, I can send IPIs from non-architectural paths (in this case,
> behind a page fault), and causality is broken.

I'm wondering if a milder violation is possible:

Initialize *addr = 0.

mov $1, (addr)

remote cpu's IDT vector:

mov (addr), %rax
%rax == 0!

There's no speculative-execution-becoming-visible-even-if-it-doesn't-retire
here -- there's just an ordering violation.  For Linux, this would
presumably only manifest as a potential deadlock or confusion if the
IPI vector code looks at the list of pending work and doesn't find the
expected work in it.
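
As a rough userspace analogy (not APIC code), the ordering that the IPI
handler relies on can be sketched with C11 atomics.  Everything here is
hypothetical and chosen for illustration: "work" plays the role of *addr,
"ipi_pending" stands in for the x2APIC ICR write, and a second thread
stands in for the remote CPU's vector.  The release store models a
properly fenced wrmsr; if that store could pass the earlier write to
"work", the handler could read 0, which is the violation described above.

```c
#include <pthread.h>
#include <stdatomic.h>

static int work;                /* plays the role of *addr */
static atomic_int ipi_pending;  /* hypothetical stand-in for the ICR write */
static int seen;                /* what the "handler" read from work */

/* Models the remote CPU's IDT vector: runs once the "IPI" arrives. */
static void *handler(void *arg)
{
    (void)arg;
    while (!atomic_load_explicit(&ipi_pending, memory_order_acquire))
        ;                       /* wait for the "IPI" */
    seen = work;                /* mov (addr), %rax */
    return NULL;
}

/* Returns the value the handler observed, or -1 if the thread failed. */
int send_and_observe(void)
{
    pthread_t t;

    work = 0;                   /* Initialize *addr = 0. */
    atomic_store_explicit(&ipi_pending, 0, memory_order_relaxed);
    if (pthread_create(&t, NULL, handler, NULL) != 0)
        return -1;

    work = 1;                   /* mov $1, (addr) */
    /* Release ordering models a wrmsr that cannot pass the store above. */
    atomic_store_explicit(&ipi_pending, 1, memory_order_release);

    pthread_join(t, NULL);
    return seen;
}
```

With the release/acquire pair in place, send_and_observe() returns 1.
Weakening the store to memory_order_relaxed removes that guarantee in the
C memory model, which is the userspace counterpart of the question about
the non-serializing wrmsr.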

Dave?  hpa?  What is the SDM trying to tell us?
