Message-ID: <20231109160517.3d1c1e17@meshulam.tesarici.cz>
Date: Thu, 9 Nov 2023 16:05:17 +0100
From: Petr Tesařík <petr@...arici.cz>
To: Niklas Schnelle <schnelle@...ux.ibm.com>
Cc: Petr Tesarik <petrtesarik@...weicloud.com>,
Christoph Hellwig <hch@....de>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Robin Murphy <robin.murphy@....com>,
Petr Tesarik <petr.tesarik.ext@...wei.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"open list:DMA MAPPING HELPERS" <iommu@...ts.linux.dev>,
open list <linux-kernel@...r.kernel.org>,
Wangkefeng <wangkefeng.wang@...wei.com>,
Roberto Sassu <roberto.sassu@...weicloud.com>,
Petr Tesarik <petr.tesarik1@...wei-partners.com>,
Halil Pasic <pasic@...ux.ibm.com>, stable@...r.kernel.org
Subject: Re: [PATCH 1/1] swiotlb: fix out-of-bounds TLB allocations with CONFIG_SWIOTLB_DYNAMIC
On Thu, 09 Nov 2023 13:24:48 +0100
Niklas Schnelle <schnelle@...ux.ibm.com> wrote:
> On Wed, 2023-11-08 at 13:21 +0100, Petr Tesařík wrote:
> > On Wed, 8 Nov 2023 12:12:49 +0100
> > Petr Tesarik <petrtesarik@...weicloud.com> wrote:
> >
> > > From: Petr Tesarik <petr.tesarik1@...wei-partners.com>
> > >
> > > Limit the free list length to the size of the IO TLB. Transient pool can be
> > > smaller than IO_TLB_SEGSIZE, but the free list is initialized with the
> > > assumption that the total number of slots is a multiple of IO_TLB_SEGSIZE.
> > > As a result, swiotlb_area_find_slots() may allocate slots past the end of
> > > a transient IO TLB buffer.
> >
> > Just to make it clear, this patch addresses only the memory corruption
> > reported by Niklas, without addressing the underlying issues. Where
> > corruption happened before, allocations will fail with this patch.
> >
> > I am still looking into improving the allocation strategy itself.
> >
> > Petr T
>
> I know this has already been applied but for what it's worth I did
> finally manage to test this with my reproducer and the allocation
> overrun is fixed by this change. I also confirmed that at least my
> ConnectX VF TCP/IP test case seems to handle the DMA error gracefully
> enough.
Thank you for testing!
Indeed, the failed request is often retried at a later time. For example,
I tested with a SCSI driver, and by the time the SCSI layer retried the
request, a new standard pool was already available. But this situation
is not ideal. If nothing else, it incurs an unnecessary delay.
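
For anyone reading this later, the free list problem from the patch
description can be illustrated roughly as below. This is a simplified
user-space sketch, not the kernel code: SEGSIZE stands in for
IO_TLB_SEGSIZE, and the helper names list_len_unclamped() and
list_len_clamped() are made up for this example. It assumes that each
slot records how many contiguous free slots follow it within its
segment, and shows how limiting that count to the pool size keeps a
search from running past the end of a small transient pool.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel's IO_TLB_SEGSIZE. */
#define SEGSIZE 128

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Free-list length for slot i when the pool is assumed to be a whole
 * number of segments. For a pool smaller than SEGSIZE this claims
 * more free slots than the pool actually has.
 */
size_t list_len_unclamped(size_t i)
{
	return SEGSIZE - (i % SEGSIZE);
}

/*
 * Same computation, but limited to the slots that really exist, so
 * the list never extends past the end of the pool.
 */
size_t list_len_clamped(size_t i, size_t nslabs)
{
	assert(i < nslabs);
	return min_size(SEGSIZE - (i % SEGSIZE), nslabs - i);
}
```

With a hypothetical 16-slot transient pool, the unclamped variant
reports 128 free slots at index 0, while the clamped one reports 16. A
request for more than 16 slots then fails instead of overrunning the
buffer, which matches the behaviour change described above.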
Petr T