lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0Ue1+wvoFzymvMhUvbbSTRgW8=qYySkH80KqRKHCXdHWPg@mail.gmail.com>
Date: Tue, 6 Aug 2024 06:09:23 +0530
From: Alexander Duyck <alexander.duyck@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: Yonglong Liu <liuyonglong@...wei.com>, "David S. Miller" <davem@...emloft.net>, 
	Jakub Kicinski <kuba@...nel.org>, pabeni@...hat.com, hawk@...nel.org, 
	ilias.apalodimas@...aro.org, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>, 
	"shenjian (K)" <shenjian15@...wei.com>, Salil Mehta <salil.mehta@...wei.com>, iommu@...ts.linux.dev
Subject: Re: [BUG REPORT]net: page_pool: kernel crash at iommu_get_dma_domain+0xc/0x20

On Mon, Aug 5, 2024 at 6:20 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2024/8/3 0:38, Alexander Duyck wrote:
>
> ...
>
> >
> > The issue as I see it is that we aren't unmapping the pages when we
> > call page_pool_destroy. There need to be no pages remaining with a DMA
> > unmapping needed *after* that is called. Otherwise we will see this
> > issue regularly.
> >
> > What we probably need to look at doing is beefing up page_pool_release
> > to add a step that will take an additional reference on the inflight
> > pages, then call __page_pool_put_page to switch them to a reference
> > counted page.
>
> I am not sure if I understand what you meant about, did you mean making
> page_pool_destroy() synchronously wait for the all in-flight pages to
> come back before returning to driver?

Part of the issue is the device appears to be removed from the iommu
before all the pages have been unmap. To fix that we would either need
to unmap all the pages or force the kernel to wait until all of the
pages have been unmapped before the device can be removed from the
iommu group.

> >
> > Seems like the worst case scenario is that we are talking about having
> > to walk the page table to do the above for any inflight pages but it
>
> Which page table are we talking about here?

The internal memory being managed by the kernel in the form of struct
page. Basically we would need to walk through all the struct page
entries and if they are setup to use the page_pool we are freeing we
would have to force them out of the pool.

> > would certainly be a much more deterministic amount of time needed to
> > do that versus waiting on a page that may or may not return.
> >
> > Alternatively a quick hack that would probably also address this would
> > be to clear poll->dma_map in page_pool_destroy or maybe in
>
> It seems we may need to clear pool->dma_sync too, and there may be some
> time window between clearing and checking/dma_unmap?

That is a possibility. However for many platforms dma_sync is a no-op.

> > page_pool_unreg_netdev so that any of those residual mappings would
> > essentially get leaked, but we wouldn't have to worry about trying to
> > unmap while the device doesn't exist.
>
> But how does the page_pool know if it is just the normal unloading processing
> without VF disabling where the device still exists or it is the abnormal one
> caused by the VF disabling where the device will disappear? If it is the first
> one, does it cause resource leaking problem for iommu if some calling for iommu
> is skipped?

It wouldn't. Basically we would have to do this for any page pool that
is being destroyed with pages left in-flight. That is why the
preference would likely be to stall for some time and hope that the
pages get unmapped on their own, and then if they can't we would need
to force this process to kick in so we don't spend forever.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ