Message-ID: <20251126195036.GA738503@ziepe.ca>
Date: Wed, 26 Nov 2025 15:50:36 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Borislav Petkov <bp@...en8.de>
Cc: iommu@...ts.linux.dev, Vasant Hegde <vasant.hegde@....com>,
Joerg Roedel <joro@...tes.org>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
linux-kernel@...r.kernel.org
Subject: Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on
CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
On Wed, Nov 26, 2025 at 05:00:31PM +0100, Borislav Petkov wrote:
> + Vasant.
>
> On Wed, Nov 26, 2025 at 03:32:19PM +0100, Borislav Petkov wrote:
> > Hi,
> >
> > this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
> > because the backtrace points to it.
> >
> > Ideas?
> >
> > [ 12.946913] (journald)[506]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> > [ 12.948083] (journald)[506]: Successfully forked off '(sd-mkuserns)' as PID 507.
> > [ 12.977579] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
> > [ 12.983638] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x1/.
> > [ 12.983644] Sending NMI from CPU 1 to CPUs 0:
> > [ 12.983652] NMI backtrace for cpu 0
> > [ 12.983655] CPU: 0 UID: 0 PID: 504 Comm: (modprobe) Not tainted 6.18.0-rc7+ #1 PREEMPT(voluntary)
> > [ 12.983658] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.5 09/08/2022
> > [ 12.983660] RIP: 0010:delay_halt_mwaitx+0x37/0x40
> > [ 12.983665] Code: 01 31 d2 89 d1 48 05 00 d0 3a 83 0f 01 fa b8 ff ff ff ff b9 02 00 00 00 48 39 c6 48 0f 46 c6 48 89 c3 b8 f0 00 00 00 0f 01 fb <5b> e9 ae 2b 1d ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90
> > [ 12.983666] RSP: 0018:ffffc90000003d88 EFLAGS: 00000097
> > [ 12.983668] RAX: 00000000000000f0 RBX: 0000000000000b9b RCX: 0000000000000002
> > [ 12.983670] RDX: 0000000000000000 RSI: 0000000000000b9b RDI: 00000034e14b49dc
> > [ 12.983671] RBP: 00000034e14b49dc R08: 000000000000006c R09: 0000000000000002
> > [ 12.983672] R10: 0000000000000050 R11: 0000000000000002 R12: 00000000000001a4
> > [ 12.983673] R13: 0000000000000002 R14: ffff888100064018 R15: ffff888100064000
> > [ 12.983674] FS: 00007f8ae7452e00(0000) GS:ffff889086e58000(0000) knlGS:0000000000000000
> > [ 12.983675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 12.983676] CR2: 00007f8ae6c9a8a0 CR3: 0000000141913004 CR4: 0000000000770ef0
> > [ 12.983677] PKRU: 55555554
> > [ 12.983678] Call Trace:
> > [ 12.983680] <IRQ>
> > [ 12.983682] delay_halt+0x3b/0x60
> > [ 12.983685] iommu_completion_wait.part.0.isra.0+0xd3/0x100
> > [ 12.983693] domain_flush_complete+0x64/0xc0
> > [ 12.983696] amd_iommu_flush_iotlb_all+0x33/0x50
> > [ 12.983700] fq_flush_timeout+0x34/0xd0
This code in the AMD driver is spinning until the HW completes some
work.
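For reference, the wait boils down to a polling loop on the
completion-wait semaphore, roughly like the sketch below (modeled on
wait_on_sem() in drivers/iommu/amd/iommu.c; the exact code differs
across kernel versions). The udelay() in the loop is what shows up as
delay_halt_mwaitx in the backtrace:

	/*
	 * Simplified sketch: after queueing a COMPLETION_WAIT command,
	 * busy-wait until the IOMMU writes the expected value back to
	 * the semaphore, or give up after LOOP_TIMEOUT iterations.
	 */
	static int wait_on_sem(struct amd_iommu *iommu, u64 data)
	{
		int i = 0;

		while (*iommu->cmd_sem != data && i < LOOP_TIMEOUT) {
			udelay(1);	/* -> delay_halt_mwaitx on Zen */
			i += 1;
		}

		if (i == LOOP_TIMEOUT) {
			pr_alert("Completion-Wait loop timed out\n");
			return -EIO;
		}

		return 0;
	}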
Maybe the HW has a lot of work to do, or is slow, and this is
legitimately taking a long time? I don't expect
amd_iommu_flush_iotlb_all() to have a problem; that's pretty weird.
Does it eventually boot?
Jason