linux-kernel - Re: [PATCHv2 3/3] x86/tdx: Handle load_unaligned

Message-ID: <YoflYI6AACAqAt9l@google.com>
Date:   Fri, 20 May 2022 19:00:48 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:     tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...el.com, luto@...nel.org, peterz@...radead.org,
        ak@...ux.intel.com, dan.j.williams@...el.com, david@...hat.com,
        hpa@...or.com, linux-kernel@...r.kernel.org,
        sathyanarayanan.kuppuswamy@...ux.intel.com,
        thomas.lendacky@....com, x86@...nel.org
Subject: Re: [PATCHv2 3/3] x86/tdx: Handle load_unaligned_zeropad()
 page-cross to a shared page

On Fri, May 20, 2022, Kirill A. Shutemov wrote:
> On Fri, May 20, 2022 at 05:47:30PM +0000, Sean Christopherson wrote:
> > On Fri, May 20, 2022, Kirill A. Shutemov wrote:
> > > @@ -299,6 +301,24 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
> > >  	if (WARN_ON_ONCE(user_mode(regs)))
> > >  		return -EFAULT;
> > >  
> > > +	/*
> > > +	 * load_unaligned_zeropad() relies on exception fixups in case of the
> > > +	 * word being a page-crosser and the second page is not accessible.
> > > +	 *
> > > +	 * In TDX guests, the second page can be shared page and VMM may
> > > +	 * configure it to trigger #VE.
> > > +	 *
> > > +	 * Kernel assumes that #VE on a shared page is MMIO access and tries to
> > > +	 * decode instruction to handle it. In case of load_unaligned_zeropad()
> > > +	 * it may result in confusion as it is not MMIO access.
> > 
> > The guest kernel can't know that it's not "MMIO", e.g. nothing prevents the host
> > from manually serving accesses to some chunk of shared memory instead of backing
> > the shared chunk with host DRAM.
> 
> It would require the guest to access shared memory only with instructions
> that we can deal with. I don't think we have such guarantee.

Ya, it's purely theoretical behavior.  But panicking if the kernel can't decode
the instruction is really all the guest can do.

> > > +	 * Check fixup table before trying to handle MMIO.
> > 
> > This ordering is wrong, fixup should be done if and only if the instruction truly
> > "faults".  E.g. if there's an MMIO access lurking in the kernel that is wrapped in
> > exception fixup, then this will break that usage and provide garbage data on a read
> > and drop any write.
> 
> When I tried to trigger the bug, the #VE handling actually succeeded, because
> load_unaligned_zeropad() uses an instruction we can decode. But due to the
> misalignment, the part that came from the non-shared page got overwritten
> with data that came from the VMM.

That's a bug in the emulation then.  I.e. it needs to deal with page splits.

> I guess we can try to detect misaligned accesses and handle them
> correctly. But it gets complicated and easier to screw up.

At a minimum, it should reject EPT violation #VEs that split pages (on either side).
That's needed irrespective of fixup, e.g. if there's a bug in the kernel that
results in splitting an MMIO region, then panicking is better than data corruption.

Then the post-failure fixup will work, i.e. the load_unaligned_zeropad() will work
like you intend here, without risking spurious fixup.
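
As a rough illustration of the ordering described above, here is a minimal
userspace sketch in C.  It is not the actual handle_mmio() code: the names
access_splits_page(), handle_ve_sketch(), SKETCH_PAGE_SIZE and the ve_result
enum are all hypothetical, and a 4 KiB page size is assumed.  The point is only
that emulation is attempted first, a page-splitting access is rejected instead
of emulated, and exception fixup is consulted only after that rejection.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical helper: true if [vaddr, vaddr + size) touches two pages. */
static bool access_splits_page(uintptr_t vaddr, size_t size)
{
	if (size == 0)
		return false;
	/* Compare the page number of the first and the last byte accessed. */
	return vaddr / SKETCH_PAGE_SIZE !=
	       (vaddr + size - 1) / SKETCH_PAGE_SIZE;
}

/* Hypothetical outcomes of handling one #VE on a shared page. */
enum ve_result { VE_EMULATED, VE_FIXED_UP, VE_PANIC };

/*
 * Hypothetical model of the ordering: emulate the access as MMIO when the
 * instruction is decodable and stays within one page; otherwise treat the
 * access as a real fault and fall back to exception fixup if one exists.
 */
static enum ve_result handle_ve_sketch(uintptr_t vaddr, size_t size,
				       bool decodable, bool has_fixup)
{
	if (decodable && !access_splits_page(vaddr, size))
		return VE_EMULATED;
	if (has_fixup)
		return VE_FIXED_UP;	/* e.g. load_unaligned_zeropad() */
	return VE_PANIC;
}
```

With this ordering, an 8-byte load at 0x1ffd that has a fixup entry takes the
fixup path, while an aligned, decodable access is still emulated even if it
happens to be wrapped in exception fixup.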
 
> Do we ever use exception fixups for MMIO accesses to justify the
> complication?

It's essentially impossible to prove because identifying all the MMIO accesses in
the kernel (and drivers!) is extremely difficult, e.g. see the I/O APIC code which
uses a struct to overlay MMIO.
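
To show why such accesses are hard to find, here is a small hypothetical
sketch of the struct-overlay pattern.  The struct demo_regs layout and the
demo_read() helper are made up for illustration and are not the kernel's
I/O APIC code.  Once a device window is described by a struct, register
accesses look like ordinary field accesses, so searching for MMIO accessor
calls alone will not find them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register window: an index register and a data register. */
struct demo_regs {
	uint32_t index;		/* offset 0x00: selects an internal register */
	uint32_t unused[3];	/* padding up to offset 0x10 */
	uint32_t data;		/* offset 0x10: value of the selected register */
};

/*
 * Hypothetical indirect read.  Both accesses are plain field accesses
 * through a volatile pointer; nothing in the source text says "MMIO".
 */
static uint32_t demo_read(volatile struct demo_regs *regs, uint32_t reg)
{
	regs->index = reg;
	return regs->data;
}
```

In this sketch the pointer could just as well refer to ordinary memory, which
is exactly why auditing every such access for exception fixup usage is
impractical.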