Message-Id: <20250512201408.81acaf861581f6fb98e8a3b0@linux-foundation.org>
Date: Mon, 12 May 2025 20:14:08 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Alexander Graf <graf@...zon.com>, kexec@...ts.infradead.org,
linux-kernel@...r.kernel.org, Baoquan He <bhe@...hat.com>, Pasha Tatashin
<pasha.tatashin@...een.com>, nh-open-source@...zon.com
Subject: Re: [PATCH] kexec: Enable CMA based contiguous allocation
On Mon, 12 May 2025 19:02:20 -0700 David Woodhouse <dwmw2@...radead.org> wrote:
> > > 2) More robust. Even if by accident some page is still in use for DMA,
> > > the new kernel image will be safe from that access because it resides
> > > in a memory region that is considered allocated in the old kernel and
> > > has a chance to reinitialize that component.
> >
>
> > Is this known to be a problem in current code?
>
> Oh $DEITY yes. The Arm Generic Interrupt Controller is, to quote a dear
> friend, "a cautionary tale of how not to approach a hardware design".
>
> It does a whole bunch of arbitrary DMA all over the place, and doesn't
> even live behind an IOMMU. And doesn't *stop* doing DMA unless you ask
> it *really* nicely; merely shutting down the offending high-level
> components isn't always enough, because they might still need to write back
> some caches.
>
> Here's one of the latest examples (not actually the one which has been
> breaking kexec for us, as far as we know, but an example of the genre):
> https://lore.kernel.org/all/20250512140909.3464-1-dssauerw@amazon.de/
>
> So putting the new kernel into a physical memory region which was
> considered allocated by the previous kernel, as Alex explains, is actually
> a very good defence-in-depth mechanism to protect against such issues.
Lol, it sounds like you're having fun over there.
Alexander, can you please repackage David's info to your taste and
include it in the changelog? Escalating the value of the patch from
"might speed it up, don't know how much" to "addresses grievous
real-world issues" is helpful to the patch's case!