Message-ID: <CAKgT0Ue1+wvoFzymvMhUvbbSTRgW8=qYySkH80KqRKHCXdHWPg@mail.gmail.com>
Date: Tue, 6 Aug 2024 06:09:23 +0530
From: Alexander Duyck <alexander.duyck@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: Yonglong Liu <liuyonglong@...wei.com>, "David S. Miller" <davem@...emloft.net>, 
	Jakub Kicinski <kuba@...nel.org>, pabeni@...hat.com, hawk@...nel.org, 
	ilias.apalodimas@...aro.org, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>, 
	"shenjian (K)" <shenjian15@...wei.com>, Salil Mehta <salil.mehta@...wei.com>, iommu@...ts.linux.dev
Subject: Re: [BUG REPORT]net: page_pool: kernel crash at iommu_get_dma_domain+0xc/0x20

On Mon, Aug 5, 2024 at 6:20 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2024/8/3 0:38, Alexander Duyck wrote:
>
> ...
>
> >
> > The issue as I see it is that we aren't unmapping the pages when we
> > call page_pool_destroy. There must be no pages remaining that still
> > need a DMA unmap *after* that is called. Otherwise we will see this
> > issue regularly.
> >
> > What we probably need to look at doing is beefing up page_pool_release
> > to add a step that will take an additional reference on the inflight
> > pages, then call __page_pool_put_page to switch them to a reference
> > counted page.
>
> I am not sure I understand what you meant. Did you mean making
> page_pool_destroy() synchronously wait for all the in-flight pages to
> come back before returning to the driver?

Part of the issue is that the device appears to be removed from the iommu
before all the pages have been unmapped. To fix that we would either need
to unmap all the pages ourselves or force the kernel to wait until all of
the pages have been unmapped before the device can be removed from the
iommu group.
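
For the "wait" option, I am thinking of something along the lines of the
(completely untested) sketch below. If I am reading page_pool_release()
right, it already scrubs the pool's ring/cache, unmaps those pages, and
returns the in-flight count, freeing the pool once that count hits zero,
so this just spins on it:

static void page_pool_destroy_sync(struct page_pool *pool)
{
	/* page_pool_release() unmaps whatever is sitting in the
	 * pool and returns how many pages are still in flight.
	 * Once it returns 0 the pool itself has been freed, so it
	 * must not be touched again.
	 */
	while (page_pool_release(pool) > 0)
		msleep(100);	/* stall until users return their pages */
}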

> >
> > Seems like the worst case scenario is that we are talking about having
> > to walk the page table to do the above for any inflight pages but it
>
> Which page table are we talking about here?

The internal memory being managed by the kernel in the form of struct
page. Basically we would need to walk through all the struct page
entries, and if they are set up to use the page_pool we are freeing, we
would have to force them out of the pool.
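
Roughly, and purely to illustrate the cost, I am picturing a walker like
the one below. The pp_magic test is the same check the rest of the stack
uses to recognize page_pool pages; the function name is made up, and this
ignores locking and what happens when the page finally comes back:

static void page_pool_scan_inflight(struct page_pool *pool)
{
	unsigned long pfn;

	for (pfn = 0; pfn < max_pfn; pfn++) {
		struct page *page;

		if (!pfn_valid(pfn))
			continue;

		page = pfn_to_page(pfn);

		/* Skip anything that doesn't belong to this pool */
		if ((page->pp_magic & ~0x3UL) != PP_SIGNATURE ||
		    page->pp != pool)
			continue;

		/* Drop the DMA mapping so nothing references the
		 * device/iommu once it is gone.
		 */
		dma_unmap_page_attrs(pool->p.dev,
				     page_pool_get_dma_addr(page),
				     PAGE_SIZE << pool->p.order,
				     pool->p.dma_dir,
				     DMA_ATTR_SKIP_CPU_SYNC);

		/* A real version would also have to make sure the
		 * normal return path doesn't try to unmap this page
		 * a second time once it comes back.
		 */
	}
}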

> > would certainly be a much more deterministic amount of time needed to
> > do that versus waiting on a page that may or may not return.
> >
> > Alternatively a quick hack that would probably also address this would
> > be to clear pool->dma_map in page_pool_destroy or maybe in
>
> It seems we may need to clear pool->dma_sync too, and there may be a
> time window between the clearing and the checking/dma_unmap?

That is a possibility. However, for many platforms dma_sync is a no-op.
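
The hack I had in mind is basically just the two lines below, run while
tearing the pool down. It deliberately leaks whatever mappings are still
outstanding, and it ignores for the moment the window you point out
between these writes and a concurrent check/dma_unmap:

	/* in page_pool_destroy()/page_pool_unreg_netdev, before the
	 * device can go away: stop treating returned pages as
	 * DMA-mapped so we never call back into a device/iommu that
	 * may no longer exist.
	 */
	pool->dma_map = false;
	pool->dma_sync = false;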

> > page_pool_unreg_netdev so that any of those residual mappings would
> > essentially get leaked, but we wouldn't have to worry about trying to
> > unmap while the device doesn't exist.
>
> But how does the page_pool know whether it is the normal unloading case
> without VF disabling, where the device still exists, or the abnormal one
> caused by VF disabling, where the device will disappear? If it is the
> first one, does it cause a resource leaking problem for the iommu if some
> calls into the iommu are skipped?

It wouldn't. Basically we would have to do this for any page pool that
is being destroyed with pages left in flight. That is why the
preference would likely be to stall for some time and hope that the
pages get unmapped on their own, and then, if they aren't, we would need
to force this process to kick in so we don't wait forever.
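
So the overall flow I am envisioning is something like the sketch below,
with the timeout value pulled out of thin air and
page_pool_scan_inflight() being the hypothetical walker sketched earlier:

static void page_pool_destroy_bounded(struct page_pool *pool)
{
	unsigned long deadline = jiffies + 10 * HZ;

	while (page_pool_release(pool) > 0) {
		if (time_after(jiffies, deadline)) {
			/* Grace period expired, force the remaining
			 * in-flight pages out of the pool.  A real
			 * version would still need to release the
			 * pool itself afterwards.
			 */
			page_pool_scan_inflight(pool);
			break;
		}
		msleep(100);
	}
}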
