linux-kernel - Re: [bug] __blk_mq_run_hw_queue suspicious rcu usage

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.DEB.2.21.1909051534050.245316@chino.kir.corp.google.com>
Date:   Thu, 5 Sep 2019 15:37:16 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Christoph Hellwig <hch@....de>
cc:     Tom Lendacky <thomas.lendacky@....com>,
        Brijesh Singh <brijesh.singh@....com>,
        Jens Axboe <axboe@...nel.dk>, Ming Lei <ming.lei@...hat.com>,
        Peter Gonda <pgonda@...gle.com>,
        Jianxiong Gao <jxgao@...gle.com>, linux-kernel@...r.kernel.org,
        x86@...nel.org, iommu@...ts.linux-foundation.org
Subject: Re: [bug] __blk_mq_run_hw_queue suspicious rcu usage

On Thu, 5 Sep 2019, Christoph Hellwig wrote:

> > Hi Christoph, Jens, and Ming,
> > 
> > While booting a 5.2 SEV-enabled guest we have encountered the following 
> > WARNING that is followed up by a BUG because we are in atomic context 
> > while trying to call set_memory_decrypted:
> 
> Well, this really is a x86 / DMA API issue unfortunately.  Drivers
> are allowed to do GFP_ATOMIC dma allocation under locks / rcu critical
> sections and from interrupts.  And it seems like the SEV case can't
> handle that.  We have some semi-generic code to have a fixed sized
> pool in kernel/dma for non-coherent platforms that have similar issues
> that we could try to wire up, but I wonder if there is a better way
> to handle the issue, so I've added Tom and the x86 maintainers.
> 
> Now independent of that issue using DMA coherent memory for the nvme
> PRPs/SGLs doesn't actually feel very optional.  We could do with
> normal kmalloc allocations and just sync it to the device and back.
> I wonder if we should create some general mempool-like helpers for that.
> 

Thanks for looking into this.  I assume it's a non-starter to try to 
address this in _vm_unmap_aliases() itself, i.e. rely on a purge spinlock 
to do all synchronization (or trylock if not forced) for 
purge_vmap_area_lazy() rather than only the vmap_area_lock within it.  In 
other words, no mutex.

If that's the case, and set_memory_encrypted() can't be fixed to not need 
to sleep by changing _vm_unmap_aliases() locking, then I assume dmapool is 
our only alternative?  I have no idea with how large this should be.