Message-ID: <aEsiJFku+wR9KxE8@nvidia.com>
Date: Thu, 12 Jun 2025 11:53:24 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Thomas Weißschuh <thomas.weissschuh@...utronix.de>,
Shuah Khan <shuah@...nel.org>, Shuah Khan <skhan@...uxfoundation.org>, "Willy
Tarreau" <w@....eu>, Thomas Weißschuh
<linux@...ssschuh.net>, Kees Cook <kees@...nel.org>, Andy Lutomirski
<luto@...capital.net>, Will Drewry <wad@...omium.org>, Mark Brown
<broonie@...nel.org>, Muhammad Usama Anjum <usama.anjum@...labora.com>,
<linux-kernel@...r.kernel.org>, <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH v4 09/14] selftests: harness: Move teardown conditional
into test metadata
On Thu, Jun 12, 2025 at 10:53:34AM -0700, Nicolin Chen wrote:
> On Thu, Jun 12, 2025 at 12:42:42PM -0300, Jason Gunthorpe wrote:
> > On Thu, Jun 12, 2025 at 05:23:01PM +0200, Thomas Weißschuh wrote:
> > > On Thu, Jun 12, 2025 at 11:58:01AM -0300, Jason Gunthorpe wrote:
> > > > On Thu, Jun 12, 2025 at 04:27:41PM +0200, Thomas Weißschuh wrote:
> > > >
> > > > > If the assumption is that this is most likely a kernel bug,
> > > > > shouldn't it be fixed properly rather than worked around?
> > > > > After all the job of a selftest is to detect bugs to be fixed.
> > > >
> > > > I investigated the history for a bit and it seems likely we cannot
> > > > change the kernel here. Call it an undocumented "feature".
> > >
> > > I looked a bit and it seems to be mentioned in mmap(2):
> > >
> > > For mmap(), offset must be a multiple of the underlying huge page size.
> > > The system automatically aligns length to be a multiple of the underlying huge page size.
> >
> > Oh there you go then :) Horrible design. No way for userspace to know
> > what the rounded up length actually was and thus no way for
> > userspace to unmap it.
>
> OK. I think we would have to skip those cases then.
Or, maybe we could just round the allocation up to a whole huge page:
@@ -2022,7 +2023,19 @@ FIXTURE_SETUP(iommufd_dirty_tracking)
 	self->fd = open("/dev/iommu", O_RDWR);
 	ASSERT_NE(-1, self->fd);
-	rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, variant->buffer_size);
+	if (variant->hugepages) {
+		/*
+		 * Allocation must be aligned to the HUGEPAGE_SIZE, because the
+		 * following mmap() will automatically align the length to be a
+		 * multiple of the underlying huge page size. Failing to do the
+		 * same at this allocation will result in a memory overwrite by
+		 * the mmap().
+		 */
+		size = __ALIGN_KERNEL(variant->buffer_size, HUGEPAGE_SIZE);
+	} else {
+		size = variant->buffer_size;
+	}
+	rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, size);
 	if (rc || !self->buffer) {
 		SKIP(return, "Skipping buffer_size=%lu due to errno=%d",
 			   variant->buffer_size, rc);
This just upsizes the allocation, i.e. the test case will only
use the first 64MB or 128MB of the reserved 512MB huge page.
Thanks
Nicolin