linux-kernel - Re: [PATCH 0/2] KVM: x86/mmu: .change

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5d81a9cd-f96d-bcdb-7878-74c2ead26cfb@arm.com>
Date:   Wed, 6 Sep 2023 17:18:51 +0100
From:   Robin Murphy <robin.murphy@....com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Yan Zhao <yan.y.zhao@...el.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, pbonzini@...hat.com,
        Christoph Hellwig <hch@....de>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH 0/2] KVM: x86/mmu: .change_pte() optimization in TDP MMU

On 2023-09-06 15:44, Sean Christopherson wrote:
> On Wed, Sep 06, 2023, Robin Murphy wrote:
>> On 2023-09-05 19:59, Sean Christopherson wrote:
>>> And if the driver *doesn't* initialize the data, then the copy is at best pointless,
>>> and possibly even worse than leaking stale swiotlb data.
>>
>> Other than the overhead, done right it can't be any worse than if SWIOTLB
>> were not involved at all.
> 
> Yep.
> 
>>> Looking at commit ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE"),
>>> IIUC the data leak was observed with a synthetic test "driver" that was developed
>>> to verify a real data leak fixed by commit a45b599ad808 ("scsi: sg: allocate with
>>> __GFP_ZERO in sg_build_indirect()").  Which basically proves my point: copying
>>> from the source only adds value absent a bug in the owning driver.
>>
>> Huh? IIUC the bug there was that the SCSI layer failed to sanitise
>> partially-written buffers. That bug was fixed, and the scrutiny therein
>> happened to reveal that SWIOTLB *also* had a lower-level problem with
>> partial writes, in that it was corrupting DMA-mapped memory which was not
>> updated by the device. Partial DMA writes are not in themselves indicative
>> of a bug, they may well be a valid and expected behaviour.
> 
> The problem is that the comment only talks about leaking data to userspace, and
> doesn't say anything about data corruption or the "swiotlb needs to match hardware"
> justification that Linus pointed out.  I buy both of those arguments for copying
> data from the original page, but the "may prevent leaking swiotlb content" is IMO
> completely nonsensical, because if preventing leakage is the only goal, then
> explicitly initializing the memory is better in every way.
> 
> If no one objects, I'll put together a patch to rewrite the comment in terms of
> mimicking hardware and not corrupting the caller's data.

Sounds good to me. I guess the trouble is that as soon as a CVE is 
involved it can then get hard to look past it, or want to risk appearing 
to downplay it :)

>>> IMO, rather than copying from the original memory, swiotlb_tbl_map_single() should
>>> simply zero the original page(s) when establishing the mapping.  That would harden
>>> all usage of swiotlb and avoid the read-before-write behavior that is problematic
>>> for KVM.
>>
>> Depends on one's exact definition of "harden"... Corrupting memory with
>> zeros is less bad than corrupting memory with someone else's data if you
>> look at it from an information security point of view, but from a
>> not-corrupting-memory point of view it's definitely still corrupting memory
>> :/
>>
>> Taking a step back, is there not an argument that if people care about
>> general KVM performance then they should maybe stop emulating obsolete PC
>> hardware from 30 years ago, and at least emulate obsolete PC hardware from
>> 20 years ago that supports 64-bit DMA?
> 
> Heh, I don't think there's an argument per se, people most definitely shouldn't
> be emulating old hardware if they care about performance.  I already told Yan as
> much.
> 
>> Even non-virtualised, SWIOTLB is pretty horrible for I/O performance by its
>> very nature - avoiding it if at all possible should always be preferred.
> 
> Yeah.  The main reason I didn't just sweep this under the rug is the confidential
> VM use case, where SWIOTLB is used to bounce data from guest private memory into
> shread buffers.  There's also a good argument that anyone that cares about I/O
> performance in confidential VMs should put in the effort to enlighten their device
> drivers to use shared memory directly, but practically speaking that's easier said
> than done.

Indeed a bunch of work has gone into SWIOTLB recently trying to make it 
a bit more efficient for such cases where it can't be avoided, so it is 
definitely still interesting to learn about impacts at other levels like 
this. Maybe there's a bit of a get-out for confidential VMs though, 
since presumably there's not much point COW-ing encrypted private 
memory, so perhaps KVM might end up wanting to optimise that out and 
thus happen to end up less sensitive to unavoidable SWIOTLB behaviour 
anyway?

Cheers,
Robin.