Message-ID: <aIPUkzn3ZdgbKRzG@Asurada-Nvidia>
Date: Fri, 25 Jul 2025 12:01:39 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Mostafa Saleh <smostafa@...gle.com>
CC: Pranjal Shrivastava <praan@...gle.com>, <jgg@...dia.com>,
<will@...nel.org>, <joro@...tes.org>, <robin.murphy@....com>,
<linux-arm-kernel@...ts.infradead.org>, <iommu@...ts.linux.dev>,
<linux-kernel@...r.kernel.org>, <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v3 2/2] iommu/arm-smmu-v3: Replace vsmmu_size/type with
get_viommu_size

On Fri, Jul 25, 2025 at 06:12:07PM +0000, Mostafa Saleh wrote:
> On Fri, Jul 25, 2025 at 09:24:23AM -0700, Nicolin Chen wrote:
> > On Fri, Jul 25, 2025 at 09:18:35AM +0000, Mostafa Saleh wrote:
> > > > > > > On Wed, Jul 23, 2025 at 01:37:53PM +0000, Pranjal Shrivastava wrote:
> > > > > > > > On Mon, Jul 21, 2025 at 01:04:44PM -0700, Nicolin Chen wrote:
> > > > > Had the
> > > > > vintf_size rejected it, we wouldn't be calling the init op.
> > > >
> > > > A data corruption could happen any time, not related to the
> > > > init op. A concurrent buggy thread can overwrite the vIOMMU
> > > > object when a write access to its adjacent memory overflows.
> > >
> > > Can you please elaborate on that, as memory corruption can happen
> > > any time, even after the next check, and there is no way to defend
> > > against that?
> >
> > That narrative is under a condition (in the context) "when there
> > is a kernel bug corrupting data" :)
> >
> > E.g. some new code allocates the wrong amount of memory and writes
> > past the end of it. If that memory is adjacent to this vIOMMU object,
> > the write might overwrite the vIOMMU object that this function gets.
> >
> > This certainly won't happen if everything is sane.
>
> I see, but I don't think we should do anything about that, there are
> 100s of structs in the kernel, we can't add checks everywhere, and I
> don't see anything special about this path to add an assertion, this
> kind of defensive programming isn't really helpful. We just need to
> review any new code properly :)

It could help with debugging when writing new code. The kernel has
quite a lot of WARN_ONs guarding against things that shouldn't
happen.
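
To make the scenario concrete, here is a rough userspace sketch of the
kind of check being discussed. It is not the driver code: every sketch_*
name and the SKETCH_WARN_ON macro are made up for illustration, and the
"overflow" is simulated inside one enclosing struct so that the example
stays well defined.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the kernel's WARN_ON(): it reports the
 * condition and evaluates to its truth value, without stopping execution.
 */
#define SKETCH_WARN_ON(cond)                                           \
	((cond) ? (fprintf(stderr, "WARN: %s at %s:%d\n", #cond,       \
			   __FILE__, __LINE__), 1) : 0)

/* Hypothetical type tag; not the real iommufd/SMMUv3 enum. */
enum sketch_viommu_type {
	SKETCH_TYPE_DEFAULT = 0,
	SKETCH_TYPE_EXPECTED = 1,
};

/* Hypothetical object carrying a type field that should never change. */
struct sketch_viommu {
	enum sketch_viommu_type type;
};

/* A buffer that happens to sit right before the object in memory. */
struct sketch_arena {
	unsigned char neighbour[16];
	struct sketch_viommu viommu;
};

/*
 * Models the buggy writer: it fills len bytes starting at the buffer
 * without checking len against sizeof(neighbour). The caller must keep
 * len <= sizeof(*arena) so the sketch itself stays within bounds.
 */
void sketch_buggy_fill(struct sketch_arena *arena, size_t len)
{
	memset(arena, 0xff, len);
}

/* Init-style function that sanity-checks the type before using it. */
int sketch_viommu_init(struct sketch_viommu *viommu)
{
	if (SKETCH_WARN_ON(viommu->type != SKETCH_TYPE_EXPECTED))
		return -1;
	return 0;
}
```

A write that stays within neighbour[] leaves the check silent. A write
of sizeof(struct sketch_arena) bytes spills into the object, and the
next sketch_viommu_init() call warns and returns -1.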

That said, I admit that this particular line is a bit of an
overreaction. Removing it wouldn't have much impact, as something
else would likely crash if such a corruption did happen.

Nicolin