Message-ID: <aEzhcIouq6OfJAF6@MiWiFi-R3L-srv>
Date: Sat, 14 Jun 2025 10:41:52 +0800
From: Baoquan He <bhe@...hat.com>
To: David Hildenbrand <david@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Jiri Bohac <jbohac@...e.cz>,
Vivek Goyal <vgoyal@...hat.com>, Dave Young <dyoung@...hat.com>,
kexec@...ts.infradead.org, Philipp Rudo <prudo@...hat.com>,
Donald Dutile <ddutile@...hat.com>, Pingfan Liu <piliu@...hat.com>,
Tao Liu <ltao@...hat.com>, linux-kernel@...r.kernel.org,
David Hildenbrand <dhildenb@...hat.com>,
Michal Hocko <mhocko@...e.cz>
Subject: Re: [PATCH v5 4/5] kdump: wait for DMA to finish when using CMA

On 06/13/25 at 11:19am, David Hildenbrand wrote:
> On 13.06.25 01:47, Andrew Morton wrote:
> > On Thu, 12 Jun 2025 12:18:40 +0200 Jiri Bohac <jbohac@...e.cz> wrote:
> >
> > > When re-using the CMA area for kdump there is a risk of pending DMA
> > > into pinned user pages in the CMA area.
> > >
> > > Pages residing in CMA areas can usually not get long-term pinned and
> > > are instead migrated away from the CMA area, so long-term pinning is
> > > typically not a concern. (BUGs in the kernel might still lead to
> > > long-term pinning of such pages if everything goes wrong.)
> > >
> > > Pages pinned without FOLL_LONGTERM remain in the CMA and may possibly
> > > be the source or destination of a pending DMA transfer.
> > >
> > > Although there is no clear specification how long a page may be pinned
> > > without FOLL_LONGTERM, pinning without the flag shows an intent of the
> > > caller to only use the memory for short-lived DMA transfers, not a transfer
> > > initiated by a device asynchronously at a random time in the future.
> > >
> > > Add a delay of CMA_DMA_TIMEOUT_SEC seconds before starting the kdump
> > > kernel, giving such short-lived DMA transfers time to finish before
> > > the CMA memory is re-used by the kdump kernel.
> > >
> > > Set CMA_DMA_TIMEOUT_SEC to 10 seconds - chosen arbitrarily as both
> > > a huge margin for a DMA transfer, yet not increasing the kdump time
> > > too significantly.
> >
> > Oh. 10s sounds a lot. How long does this process typically take?
> >
> > It's sad to add a 10s delay for something which some systems will never
> > do. I wonder if there's some simple hack we can add. Like having a
> > global flag which gets set the first time someone pins a CMA page
I have the same worry as Andrew. When a system has run off the rails, we
should slam on the brakes right away, yet here we wait 10 seconds before
doing so. Fortunately, people have been made aware of the risk.
>
> We would likely have to do that for any GUP on such a page (FOLL_GET |
> FOLL_PIN), both from gup-fast and gup-slow.
Such GUP-pinned pages may exist, but not always, right? And since this
feature is opt-in for users, could they also decide on or tune the waiting
time?
This is just my personal opinion: I would not recommend that people use it
in RHEL, but others should feel free to try it, since they have been warned
of the risk.
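
To make the flag idea above concrete, here is a rough userspace sketch in C,
not code from this series. It assumes a set-once global flag that the GUP
paths would raise when a CMA page is pinned, and a helper that the crash path
would ask for the number of seconds to wait. Only CMA_DMA_TIMEOUT_SEC and its
value of 10 come from the patch description; cma_page_pinned, note_cma_pin()
and crash_cma_wait_secs() are hypothetical names used for illustration.

```c
/* Userspace model only; every name except CMA_DMA_TIMEOUT_SEC is hypothetical. */
#include <stdatomic.h>
#include <stdbool.h>

#define CMA_DMA_TIMEOUT_SEC 10	/* value taken from the patch description */

/* Set once and never cleared: "some CMA page has been pinned since boot". */
static atomic_bool cma_page_pinned;

/* Would be called from both gup-fast and gup-slow when a CMA page is pinned. */
static void note_cma_pin(void)
{
	atomic_store_explicit(&cma_page_pinned, true, memory_order_relaxed);
}

/* Seconds the crash path should wait before the kdump kernel reuses CMA. */
static unsigned int crash_cma_wait_secs(bool cma_used_for_kdump)
{
	if (!cma_used_for_kdump)
		return 0;	/* feature not opted into: nothing to wait for */
	if (!atomic_load_explicit(&cma_page_pinned, memory_order_relaxed))
		return 0;	/* no CMA page was ever pinned: skip the delay */
	return CMA_DMA_TIMEOUT_SEC;
}
```

With this model, a system that never pins a CMA page pays no delay at all,
and the full timeout applies only after the first pin. A tunable would simply
replace the constant in the last return statement.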
>
> Should work, but IMHO can be optimized later, on top of this series.
>
> --
> Cheers,
>
> David / dhildenb
>