Message-ID: <20241126143024.EKo6QfKL@linutronix.de>
Date: Tue, 26 Nov 2024 15:30:24 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: kernel test robot <oliver.sang@...el.com>
Cc: Peter Zijlstra <peterz@...radead.org>, oe-lkp@...ts.linux.dev,
lkp@...el.com, linux-kernel@...r.kernel.org,
"Paul E. McKenney" <paulmck@...nel.org>, rcu@...r.kernel.org
Subject: Re: [linus:master] [sched, x86] 476e8583ca:
WARNING:at_kernel/rcu/update.c:#torture_sched_setaffinity
On 2024-11-25 22:29:09 [+0800], kernel test robot wrote:
> Hello,
Hi,
> by this commit, we see the config has below diff:
>
> --- /pkg/linux/x86_64-randconfig-161-20241120/gcc-12/35772d627b55cc7fb4f33bae57c564a25b3121a9/.config 2024-11-22 17:03:32.458344665 +0800
> +++ /pkg/linux/x86_64-randconfig-161-20241120/gcc-12/476e8583ca16eecec0a3a28b6ee7130f4e369389/.config 2024-11-22 17:02:59.440805587 +0800
> @@ -121,9 +121,11 @@ CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
> # end of BPF subsystem
>
> CONFIG_PREEMPT_BUILD=y
> -CONFIG_PREEMPT_NONE=y
> +CONFIG_ARCH_HAS_PREEMPT_LAZY=y
> +# CONFIG_PREEMPT_NONE is not set
> # CONFIG_PREEMPT_VOLUNTARY is not set
> # CONFIG_PREEMPT is not set
> +CONFIG_PREEMPT_LAZY=y
> # CONFIG_PREEMPT_RT is not set
> CONFIG_PREEMPT_COUNT=y
> CONFIG_PREEMPTION=y
>
…
> commit: 476e8583ca16eecec0a3a28b6ee7130f4e369389 ("sched, x86: Enable Lazy preemption")
…
> runtime: 300s
> test: cpuhotplug
> torture_type: trivial
…
> [ 150.797530][ T445] ------------[ cut here ]------------
> [ 150.797915][ T445] torture_sched_setaffinity: sched_setaffinity(445) returned -22
> [ 150.798353][ T445] WARNING: CPU: 0 PID: 445 at kernel/rcu/update.c:535 torture_sched_setaffinity (kernel/rcu/update.c:535 (discriminator 3))
I've been staring at this, and this is actually fine. Your config changes
from CONFIG_PREEMPT_NONE to CONFIG_PREEMPT_LAZY, which implies
CONFIG_PREEMPTION. The trivial RCU test there does sched_setaffinity()
while preemption is enabled and CPU-hotplug runs in the background. So
the -22 (-EINVAL) is what you would expect: either the test attempts to
move to a CPU which is no longer valid, or it gets migrated to another
CPU in the middle of the operation.
This is all fine. You need to update your config file or your test.
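For the "update your test" route, a minimal userspace sketch of what
tolerating the failure could look like. This is not the code in
kernel/rcu/update.c; pin_to_cpu() and pin_to_some_cpu() are made-up
helper names, and the only assumption is the documented behaviour that
sched_setaffinity() fails with EINVAL when the mask holds no CPU that is
currently usable by the thread:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for CPU_SET() and sched_setaffinity() */
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper: pin the calling thread to a single CPU.
 * Returns 0 on success or a negative errno.
 */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return -EINVAL;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: walk the CPUs starting at 'first' and pin to the
 * first one that accepts us. -EINVAL is treated as "that CPU is offline
 * or not allowed right now", which is expected while CPU hotplug runs,
 * so the next CPU is tried instead of warning.
 * Returns the chosen CPU or a negative errno.
 */
int pin_to_some_cpu(int first, int ncpus)
{
	assert(ncpus > 0);

	for (int i = 0; i < ncpus; i++) {
		int cpu = (first + i) % ncpus;
		int ret = pin_to_cpu(cpu);

		if (ret == 0)
			return cpu;
		if (ret != -EINVAL)
			return ret;	/* unexpected failure, report it */
	}
	return -EINVAL;
}
```

A caller would use the return value of pin_to_some_cpu() as the CPU it
ended up on and only complain about errors other than -EINVAL. Even
then the task may be moved again if that CPU goes offline later, so
this only avoids the spurious warning, it does not make the placement
stable.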
Sebastian