Message-ID: <87a6av0wxk.fsf@linux.ibm.com>
Date: Thu, 02 Jun 2022 12:58:31 -0500
From: Nathan Lynch <nathanl@...ux.ibm.com>
To: Laurent Dufour <ldufour@...ux.ibm.com>
Cc: linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
mpe@...erman.id.au, benh@...nel.crashing.org, paulus@...ba.org,
haren@...ux.vnet.ibm.com, npiggin@...il.com
Subject: Re: [PATCH 0/2] Disabling NMI watchdog during LPM's memory transfer
Laurent Dufour <ldufour@...ux.ibm.com> writes:
> When a partition is transferred, once it arrives at the destination node,
> the partition is active but much of its memory must be transferred from the
> start node.
>
> It depends on the activity in the partition, but the more CPU the partition
> has, the more memory to be transferred is likely to be. This causes latency
> when accessing pages that need to be transferred, and often, for large
> partitions, it triggers the NMI watchdog.
It also triggers warnings from other watchdogs and subsystems that
have soft latency requirements: the softlockup detector, RCU stall
detection, and the workqueue watchdog. The issue is more general than
the NMI watchdog.
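For reference, a rough sketch of how one might check which of these
detectors are configured on a running system. It assumes the usual
procfs/sysfs locations for the knobs; each one depends on kernel
configuration and may be absent. The names KNOBS and read_knobs are
made up for illustration.

```python
from pathlib import Path

# Assumed knob locations. Each file exists only if the corresponding
# detector is built into the running kernel.
KNOBS = {
    "nmi_watchdog": "/proc/sys/kernel/nmi_watchdog",
    "soft_watchdog": "/proc/sys/kernel/soft_watchdog",
    "watchdog_thresh": "/proc/sys/kernel/watchdog_thresh",
    "rcu_cpu_stall_timeout": "/sys/module/rcupdate/parameters/rcu_cpu_stall_timeout",
    "wq_watchdog_thresh": "/sys/module/workqueue/parameters/watchdog_thresh",
}


def read_knobs(knobs=KNOBS):
    """Return {name: value string, or None if the file cannot be read}."""
    values = {}
    for name, path in knobs.items():
        try:
            values[name] = Path(path).read_text().strip()
        except OSError:
            # Missing file or insufficient permission: report as unavailable.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_knobs().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```

This only reads the settings. It shows that the hard lockup detector is
one of several independently tunable detectors, so turning it off alone
would leave the others to fire on the same latency.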
> The NMI watchdog causes the CPU stack to dump where it appears to be
> stuck. In this case, it does not bring much information since it can happen
> during any memory access of the kernel.
When the site of a watchdog backtrace shows a thread stuck on a routine
memory access as opposed to something like a lock acquisition, that is
actually useful information that shouldn't be discarded. It tells us the
platform is failing to adequately virtualize partition memory. This
isn't a benign situation and it's likely to unacceptably affect real
workloads. The kernel is ideally situated to detect and warn about this.
> In addition, the NMI interrupt mechanism is not secure and can generate a
> dump system in the event that the interruption is taken while
> MSR[RI]=0.
This sounds like a general problem with that facility that isn't
specific to partition migration? Maybe it should be disabled altogether
until that can be fixed?
> Given how often hard lockups are detected when transferring large
> partitions, it seems best to disable the watchdog NMI until the memory
> transfer from the start node is complete.
At this time, I'm far from convinced. Disabling the watchdog is going to
make the underlying problems in the platform and/or network harder to
understand.