linux-kernel - Re: [bug] __blk_mq_run_hw_queue suspicious rcu usage

Message-ID: <alpine.DEB.2.21.1912130121500.215313@chino.kir.corp.google.com>
Date:   Fri, 13 Dec 2019 01:33:35 -0800 (PST)
From:   David Rientjes <rientjes@...gle.com>
To:     Christoph Hellwig <hch@....de>
cc:     "Lendacky, Thomas" <Thomas.Lendacky@....com>,
        Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...nel.dk>,
        "Singh, Brijesh" <brijesh.singh@....com>,
        Ming Lei <ming.lei@...hat.com>,
        Peter Gonda <pgonda@...gle.com>,
        Jianxiong Gao <jxgao@...gle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>
Subject: Re: [bug] __blk_mq_run_hw_queue suspicious rcu usage

On Thu, 12 Dec 2019, David Rientjes wrote:

> Since all DMA must be unencrypted in this case, what happens if all 
> dma_direct_alloc_pages() calls go through the DMA pool in 
> kernel/dma/remap.c when force_dma_unencrypted(dev) == true since 
> _PAGE_ENC is cleared for these ptes?  (Ignoring for a moment that this 
> special pool should likely be a separate dma pool.)
> 
> I assume a general depletion of that atomic pool so 
> DEFAULT_DMA_COHERENT_POOL_SIZE becomes insufficient.  I'm not sure what 
> size any DMA pool wired up for this specific purpose would need to be 
> sized at, so I assume dynamic resizing is required.
> 
> It shouldn't be *that* difficult to supplement kernel/dma/remap.c with the 
> ability to do background expansion of the atomic pool when nearing its 
> capacity for this purpose?  I imagine that if we just can't allocate pages 
> within the DMA mask that it's the only blocker to dynamic expansion and we 
> don't oom kill for lowmem.  But perhaps vm.lowmem_reserve_ratio is good 
> enough protection?
> 
> Beyond that, I'm not sure what sizing would be appropriate if this is to 
> be a generic solution in the DMA API for all devices that may require 
> unencrypted memory.
> 

Secondly, I'm wondering how the DMA pool for atomic allocations compares 
with the lowmem reserve for both ZONE_DMA and ZONE_DMA32.  For 
allocations where the classzone index is one of these zones, the lowmem 
reserve is static: we don't account for the amount of lowmem already 
allocated and adjust the reserve for future watermark checks in the page 
allocator.  We always guarantee that the reserve is free (absent depletion 
of the zone by GFP_ATOMIC allocations that fall below the min watermarks).
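
To make the "static reserve" point concrete, here is a minimal userspace 
sketch of a watermark check with a fixed lowmem reserve.  It is a 
simplified model and not the kernel's actual code: struct zone_model, its 
fields, and watermark_ok() are hypothetical names chosen for illustration, 
and details such as alloc flags and per-order free counts are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a zone; not the kernel's struct zone. */
struct zone_model {
	long free_pages;     /* pages currently free in the zone */
	long min_wmark;      /* min watermark, in pages */
	long lowmem_reserve; /* static reserve held back from requests that
	                      * could have been served by a higher zone */
};

/*
 * Return true if an allocation of 2^order pages may proceed.
 *
 * The reserve is a fixed number added to the floor.  Nothing here tracks
 * how much lowmem has already been handed out, so the reserve neither
 * shrinks nor grows with usage: it is only a threshold on free_pages.
 */
static bool watermark_ok(const struct zone_model *z, unsigned int order,
			 bool higher_classzone)
{
	/* Pages that would remain usable once this request is counted. */
	long usable = z->free_pages - ((1L << order) - 1);
	long floor = z->min_wmark;

	if (higher_classzone)
		floor += z->lowmem_reserve;

	return usable > floor;
}
```

In this model a request that could fall back to a higher zone is refused 
as soon as free_pages reaches min_wmark + lowmem_reserve, while a request 
that can only be served from this zone is checked against min_wmark alone.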

If all DMA memory needs to have _PAGE_ENC cleared when the guest is SEV 
encrypted, I'm wondering if the entire lowmem reserve could be designed as 
a pool of lowmem pages rather than a watermark check.  If implemented as a 
pool of pages in the page allocator itself, and today's reserve is static, 
maybe we could get away with a dynamic resizing based on that static 
amount?  We could offload the handling of this reserve to kswapd such that 
when the pool falls below today's reserve amount, we dynamically expand 
it, do the necessary decryption in blockable context, and add the pages to 
the pool.  A bonus is that this provides a high-order lowmem reserve if 
implemented as per-order freelists, rather than the current watermark 
check, which provides no guarantees for any high-order lowmem.
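
The pool idea above can be sketched roughly as follows.  This is a 
userspace model under stated assumptions, not a proposed implementation: 
struct reserve_pool, POOL_NR_ORDERS, and the pool_*() helpers are all 
hypothetical, the pool tracks only block counts rather than real pages, 
and the refill step merely stands in for work that would run in a 
blockable context such as kswapd.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_NR_ORDERS 4	/* hypothetical: orders 0..3 */

/* Hypothetical reserve pool with one freelist (here, a counter) per order. */
struct reserve_pool {
	size_t nr_free[POOL_NR_ORDERS];	/* free blocks per order */
	size_t target[POOL_NR_ORDERS];	/* blocks per order to keep available */
};

/* Total free pages held in the pool across all orders. */
static size_t pool_free_pages(const struct reserve_pool *p)
{
	size_t pages = 0;

	for (unsigned int o = 0; o < POOL_NR_ORDERS; o++)
		pages += p->nr_free[o] << o;
	return pages;
}

/* Atomic-context side: take a block of the given order, never refill. */
static bool pool_alloc(struct reserve_pool *p, unsigned int order)
{
	if (order >= POOL_NR_ORDERS || p->nr_free[order] == 0)
		return false;
	p->nr_free[order]--;
	return true;
}

/* True if any order has dropped below its target. */
static bool pool_below_target(const struct reserve_pool *p)
{
	for (unsigned int o = 0; o < POOL_NR_ORDERS; o++)
		if (p->nr_free[o] < p->target[o])
			return true;
	return false;
}

/*
 * Blockable-context side: top every order back up to its target and
 * return the number of pages added.  In a real implementation this is
 * where pages would be allocated and have their encryption attribute
 * cleared before being placed on the freelist.
 */
static size_t pool_refill(struct reserve_pool *p)
{
	size_t added = 0;

	for (unsigned int o = 0; o < POOL_NR_ORDERS; o++) {
		while (p->nr_free[o] < p->target[o]) {
			p->nr_free[o]++;
			added += (size_t)1 << o;
		}
	}
	return added;
}
```

Because each order has its own target, a high-order request succeeds 
whenever its freelist is non-empty, which a check on the total free page 
count cannot promise.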

I don't want to distract from the first set of questions in my previous 
email because I need to understand those anyway, but I'm hoping 
Christoph can guide me on why the above wouldn't be an improvement even 
for non-encrypted guests.