linux-kernel - Re: [BUG] seltests/iommu: runaway ./iommufd consuming 99% CPU after a failed assert()

Message-ID: <20240327163832.GJ946323@nvidia.com>
Date: Wed, 27 Mar 2024 13:38:32 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Joao Martins <joao.m.martins@...cle.com>
Cc: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>, iommu@...ts.linux.dev,
	Kevin Tian <kevin.tian@...el.com>, Shuah Khan <shuah@...nel.org>,
	linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [BUG] seltests/iommu: runaway ./iommufd consuming 99% CPU after
 a failed assert()

On Wed, Mar 27, 2024 at 03:04:09PM +0000, Joao Martins wrote:
> On 27/03/2024 11:40, Jason Gunthorpe wrote:
> > On Wed, Mar 27, 2024 at 10:41:52AM +0000, Joao Martins wrote:
> >> On 25/03/2024 13:52, Jason Gunthorpe wrote:
> >>> On Mon, Mar 25, 2024 at 12:17:28PM +0000, Joao Martins wrote:
> >>>>> However, I am not smart enough to figure out why ...
> >>>>>
> >>>>> Apparently, from the source, mmap() fails to allocate pages at the desired address:
> >>>>>
> >>>>>         assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0);
> >>>>>         vrc = mmap(self->buffer, variant->buffer_size,
> >>>>>                    PROT_READ | PROT_WRITE, mmap_flags, -1, 0);
> >>>>> →       assert(vrc == self->buffer);
> >>>>>
> >>>>>
> >>>>> But I am not that deep into the source to figure out what was intended and
> >>>>> what went wrong :-/
> >>>>
> >>>> I can SKIP() the test rather than assert() here if it helps, though there
> >>>> are other tests that fail if no hugetlb pages are reserved.
> >>>>
> >>>> But I am not sure if this is the problem here, as the initial bug email had an
> >>>> entirely different set of failures? Maybe all you need is an assert() and it
> >>>> gets into this state?
> >>>
> >>> I feel like there is something wrong with the kselftest framework,
> >>> there should be some way to fail the setup/teardown operations without
> >>> triggering an infinite loop :(
> >>
> >> I am now wondering if the problem is the fact that we have an assert() in the
> >> middle of FIXTURE_{TEST,SETUP} where we should be using ASSERT_TRUE() (or any
> >> other kselftest macro like that). The expect/assert macros from kselftest don't
> >> call assert(), and it looks like we are failing mid-test in the assert().
> > 
> > Those ASSERT_TRUE cause infinite loops when used within the setup
> > context, I removed them and switched to assert because of this - which
> > did work OK in my testing at least.
> 
> Strange because we make use of ASSERT* widely in our selftests fixture-setup.
> 
> setup_sizes() is run before the tests so it can't use ASSERT macros for sure;
> maybe that's what you are referring to?

No, it was definitely ASSERT/etc.; if you hit those in the wrong spot,
the thing loops infinitely. Maybe that was teardown only.

Jason
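
For reference, the mmap() check discussed above can be written so that
it reports a failure to the caller instead of aborting. This is only a
minimal sketch: try_map_at() is a hypothetical helper, not code from
iommufd.c or the kselftest harness, and using -EADDRNOTAVAIL for "mapped
at a different address" is an arbitrary choice made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from iommufd.c): map `size` bytes of
 * anonymous memory at `addr` and report the outcome instead of aborting.
 * `flags` is passed to mmap() unchanged, so the caller decides whether
 * to use MAP_FIXED, MAP_HUGETLB, and so on.
 *
 * Returns 0 when the mapping landed exactly at `addr`, otherwise a
 * negative errno value.
 */
static int try_map_at(void *addr, size_t size, int flags)
{
	void *p = mmap(addr, size, PROT_READ | PROT_WRITE, flags, -1, 0);

	if (p == MAP_FAILED)
		return -errno;		/* e.g. -ENOMEM or -EINVAL */

	if (p != addr) {
		/* Without MAP_FIXED the address is only a hint. */
		munmap(p, size);
		return -EADDRNOTAVAIL;	/* illustrative choice of code */
	}

	return 0;
}
```

A fixture setup could then test the return value and skip or fail the
test through whatever mechanism is safe in that context, rather than
calling assert(). Which mechanism is actually safe inside setup and
teardown is the question this thread leaves open.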