Message-ID: <20201130185500.GB744128@google.com>
Date: Mon, 30 Nov 2020 10:55:00 -0800
From: Tom Roeder <tmroeder@...gle.com>
To: Keith Busch <kbusch@...nel.org>
Cc: Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...com>,
Sagi Grimberg <sagi@...mberg.me>,
Peter Gonda <pgonda@...gle.com>,
Marios Pomonis <pomonis@...gle.com>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] nvme: Cache DMA descriptors to prevent corruption.
On Fri, Nov 20, 2020 at 06:29:54AM -0800, Keith Busch wrote:
>On Fri, Nov 20, 2020 at 09:02:43AM +0100, Christoph Hellwig wrote:
>> On Thu, Nov 19, 2020 at 05:27:37PM -0800, Tom Roeder wrote:
>> > This patch changes the NVMe PCI implementation to cache host_mem_descs
>> > in non-DMA memory instead of depending on descriptors stored in DMA
>> > memory. This change is needed under the malicious-hypervisor threat
>> > model assumed by the AMD SEV and Intel TDX architectures, which encrypt
>> > guest memory to make it unreadable. Some versions of these architectures
>> > also make it cryptographically hard to modify guest memory without
>> > detection.
>>
>> I don't think this is a useful threat model, and I've not seen a
>> discussion on lkml where we had any discussion on this kind of threat
>> model either.
>>
>> Before you start sending patches that regress optimizations in various
>> drivers (and there will be lots with this model) we need to have a
>> broader discussion first.
>>
>> And HMB support, which is for low-end consumer devices that are usually
>> not directly assigned to VMs aren't a good starting point for this.
>
>Yeah, while doing this for HMB isn't really a performance concern, this
>method for chaining SGL/PRP lists would be.
I see that this answers a question I just asked in my reply to the
previous message. Sorry about that. Can you please point me to the code
in question?
>
>And perhaps more importantly, the proposed mitigation only lets the
>guest silently carry on from such an attack while the device is surely
>corrupting something. I think we'd rather free the wrong address since
>that may at least eventually raise an error.
From a security perspective, I'd rather not free the wrong address,
since that could lead to an attack on the guest (use-after-free). But I
agree with the concern about fixing the problem silently. Maybe this
code should instead raise an error itself in this case after comparing
the cached values with the values stored in the DMA memory?
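
To make that suggestion concrete, here is a rough user-space sketch of the
comparison I have in mind. It is not the patch and not the driver code:
struct hmb_desc and hmb_descs_verify() are hypothetical names, with the
struct only loosely modeled on the driver's host memory descriptor, and the
sketch ignores endianness conversion. It assumes the cached copy lives in
memory the hypervisor cannot modify.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a host memory buffer descriptor. */
struct hmb_desc {
	uint64_t addr;
	uint32_t size;
	uint32_t rsvd;
};

/*
 * Compare the descriptors in DMA-visible memory against the cached copy.
 * Returns 0 if every entry matches, or -1 on the first mismatch, so the
 * caller can report an error instead of silently using the cached values.
 */
static int hmb_descs_verify(const struct hmb_desc *cached,
			    const volatile struct hmb_desc *dma, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Read each DMA field once; the memory may change under us. */
		uint64_t addr = dma[i].addr;
		uint32_t size = dma[i].size;

		if (addr != cached[i].addr || size != cached[i].size)
			return -1;
	}
	return 0;
}
```

On a mismatch the caller would still free using the cached addresses, but
would also log or return an error rather than carrying on silently.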