Date:   Thu, 31 Jan 2019 10:27:06 +0100
From:   Christoffer Dall <christoffer.dall@....com>
To:     Julien Thierry <julien.thierry@....com>
Cc:     James Morse <james.morse@....com>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        daniel.thompson@...aro.org, joel@...lfernandes.org,
        marc.zyngier@....com, catalin.marinas@....com, will.deacon@....com,
        mark.rutland@....com, Arnd Bergmann <arnd@...db.de>,
        linux-arch@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts

On Thu, Jan 31, 2019 at 08:56:04AM +0000, Julien Thierry wrote:
> 
> 
> On 31/01/2019 08:19, Christoffer Dall wrote:
> > On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
> >> Hi James,
> >>
> >> On 28/01/2019 11:48, James Morse wrote:
> >>> Hi Julien,
> >>>
> >>> On 21/01/2019 15:33, Julien Thierry wrote:
> >>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> >>>> to interract with guest TLBs, switching from EL2&0 translation regime
> >>>
> >>> (interact)
> >>>
> >>>
> >>>> to EL1&0.
> >>>>
> >>>> However, some non-maskable asynchronous event could happen while TGE is
> >>>> cleared like SDEI. Because of this address translation operations
> >>>> relying on EL2&0 translation regime could fail (tlb invalidation,
> >>>> userspace access, ...).
> >>>>
> >>>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> >>>> clear it if necessary when returning to the interrupted context.
> >>>
> >>> Yes please. This would not have been fun to debug!
> >>>
> >>> Reviewed-by: James Morse <james.morse@....com>
> >>>
> >>>
> >>
> >> Thanks.
> >>
> >>>
> >>> I was looking for why we need core code to do this, instead of updating the
> >>> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
> >>> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
> >>> itself.
> >>>
> >>
> >> Yes, that's the main reason.
> >>
> > I wondered the same thing, but I don't understand the explanation :(
> > 
> > Why can't we do a local_daif_mask() around the (very small) calls that
> > clear TGE instead?
> > 
> 
> That would protect against the pseudo-NMIs, but you can still get an
> SDEI at that point even with all daif bits set. Or did I misunderstand
> how SDEI works?
> 

I don't know the details of SDEI.  From looking at this patch, the
logical conclusion would be that SDEIs can only be delivered once we've
called nmi_enter().  But since we don't call that directly from the code
that clears TGE for guest TLB invalidation (or do we?), masking
interrupts at the PSTATE level should be sufficient.

Surely I'm missing some part of the bigger picture here.

Thanks,

    Christoffer
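
For readers following the thread, the behaviour described in the quoted
commit message (set HCR_EL2.TGE on NMI entry, clear it again on exit if
the interrupted context had it cleared) can be sketched as a small
user-space model.  This is not the code from the patch: sim_hcr_el2,
struct sim_nmi_ctx, sim_nmi_enter() and sim_nmi_exit() are hypothetical
names, the system register is replaced by a plain variable, and the
barriers a real kernel would need after writing the register are left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TGE is bit 27 of HCR_EL2. */
#define HCR_TGE (UINT64_C(1) << 27)

/*
 * Hypothetical stand-in for the HCR_EL2 system register, so the model
 * runs in user space. A VHE host normally runs with TGE set.
 */
static uint64_t sim_hcr_el2 = HCR_TGE;

/* Hypothetical per-CPU save area for the interrupted context. */
struct sim_nmi_ctx {
    uint64_t saved_hcr;
    bool active;
};

/* On NMI entry: remember the register, then force TGE on if it was off. */
static inline void sim_nmi_enter(struct sim_nmi_ctx *ctx)
{
    assert(!ctx->active);           /* this model does not handle nesting */
    ctx->saved_hcr = sim_hcr_el2;
    ctx->active = true;
    if (!(ctx->saved_hcr & HCR_TGE))
        sim_hcr_el2 = ctx->saved_hcr | HCR_TGE;
}

/* On NMI exit: restore the old value only if entry had to change it. */
static inline void sim_nmi_exit(struct sim_nmi_ctx *ctx)
{
    assert(ctx->active);
    if (!(ctx->saved_hcr & HCR_TGE))
        sim_hcr_el2 = ctx->saved_hcr;
    ctx->active = false;
}
```

In this model, an NMI that arrives while the host has cleared TGE for
guest TLB maintenance runs its handler with TGE set and returns with TGE
cleared again.  An NMI that arrives with TGE already set leaves the
register untouched.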
