lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZxCt/JQkGZHo0n2o@xsang-OptiPlex-9020>
Date: Thu, 17 Oct 2024 14:26:04 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
CC: Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Peter Zijlstra
	<peterz@...radead.org>, <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
	<linux-kernel@...r.kernel.org>, <rcu@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: Re: [peterz-queue:sched/lazy] [sched, x86]  74d850cd4c:
 WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print[rcutorture]

hi, Paul,

On Wed, Oct 16, 2024 at 09:12:35AM -0700, Paul E. McKenney wrote:
> On Wed, Oct 16, 2024 at 05:40:48PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2024-10-15 15:47:20 [+0800], kernel test robot wrote:
> > > Hello,
> > Hi,
> > 
> > it took me a while to reproduce this because it does not trigger without
> > the ltp userland and this is not downloaded properly so I had to
> > workaround it. However…
> > 
> > > [   98.006999][  T543] busted-torture: rtc: 00000000639e821e ver: 6796 tfle: 0 rta: 6796 rtaf: 0 rtf: 6787 rtmbe: 1 rtmbkf: 0/0 rtbe: 0 rtbke: 0 rtbf: 0 rtb: 0 nt: 89 barrier: 0/0:0 read-exits: 64 nocb-toggles: 0:0
> > > [   98.017386][  T543] busted-torture: !!! 
> > > [   98.017662][  T543] ------------[ cut here ]------------
> > > [   98.019330][  T543] WARNING: CPU: 0 PID: 543 at kernel/rcu/rcutorture.c:2258 rcu_torture_stats_print+0x24c/0x610 [rcutorture]
> > > [   98.021831][  T543] Modules linked in: rcutorture torture
> > …
> > > [   98.237025][  T543] ------------[ cut here ]------------
> > > [   98.238052][  T543] WARNING: CPU: 0 PID: 543 at kernel/rcu/rcutorture.c:2263 rcu_torture_stats_print+0x373/0x610 [rcutorture]
> > > [   98.240000][  T543] Modules linked in: rcutorture torture
> > …
> > 
> > It took me a while to figure out that this test is using
> > rcu_busted_torture_deferred_free() which in turn invokes the callback
> > before the grace period. Buh.
> > So it looks like LAZY preempt triggers this more reliably than the
> > normal preempt version…
> 
> Agreed!
> 
> Please don't run any rcutorture scenario whose name contains "BUSTED"
> or "busted" unless you are trying to test rcutorture's ability to find
> broken RCU implementations.

thanks for information!

we will refine our test to avoid rcutorture 'busted' related testing.


BTW, for this commit, our bot finished several bisect. we happened to report
based on this 300s-cpuhotplug-busted test since we found the parent is quite
clean:

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/test/torture_type:
  vm-snb/rcutorture/debian-12-x86_64-20240206.cgz/x86_64-randconfig-014-20241011/clang-18/300s/cpuhotplug/busted

693ed5cdf46ac062 74d850cd4c5bc26ce83511b4247
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :6          100%           6:6     rcutorture.cpuhotplug-busted.fail
           :6          100%           6:6     dmesg.RIP:rcu_torture_stats_print[rcutorture]
           :6          100%           6:6     dmesg.WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print[rcutorture]


in fact, we found this commit also causes some new issues in other tests, such
like below 300s-cpuhotplug-trivial test


=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/test/torture_type:
  vm-snb/rcutorture/debian-11.1-i386-20220923.cgz/x86_64-randconfig-103-20241011/gcc-12/300s/cpuhotplug/trivial

693ed5cdf46ac062 74d850cd4c5bc26ce83511b4247
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          6:6            0%           6:6     dmesg.RIP:rcu_torture_stats_print[rcutorture]
          6:6            0%           6:6     dmesg.RIP:synchronize_rcu_trivial[rcutorture]
           :6          100%           6:6     dmesg.RIP:torture_sched_setaffinity
          6:6            0%           6:6     dmesg.WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print[rcutorture]
          6:6            0%           6:6     dmesg.WARNING:at_kernel/rcu/rcutorture.c:#synchronize_rcu_trivial[rcutorture]
           :6          100%           6:6     dmesg.WARNING:at_kernel/rcu/update.c:#torture_sched_setaffinity    <------ new stat


the config for this 300s-cpuhotplug-trivial test is attached as
config-6.12.0-rc1-00037-g74d850cd4c5b
which is a randconfig, too

it has diff with config from parent as below [1]

also attached a full dmesg, from it:

[   92.540279][  T461] ------------[ cut here ]------------
[   92.540718][  T461] torture_sched_setaffinity: sched_setaffinity(461) returned -22
[   92.541280][  T461] WARNING: CPU: 0 PID: 461 at kernel/rcu/update.c:535 torture_sched_setaffinity+0x3b/0x4f
[   92.541998][  T461] Modules linked in: rcutorture torture crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni sha1_ssse3 rapl bochs drm_vram_helper
 drm_ttm_helper evbug ttm input_leds drm_kms_helper serio_raw qemu_fw_cfg drm ipv6
[   92.543610][  T461] CPU: 0 UID: 0 PID: 461 Comm: rcu_torture_fak Tainted: G        W       T  6.12.0-rc1-00037-g74d850cd4c5b #1
[   92.544421][  T461] Tainted: [W]=WARN, [T]=RANDSTRUCT
[   92.544781][  T461] RIP: 0010:torture_sched_setaffinity+0x3b/0x4f
[   92.545216][  T461] Code: c0 74 29 80 3d cd 7b d5 01 00 75 20 89 c1 c6 05 c2 7b d5 01 01 89 ea 48 c7 c6 c0 92 01 82 48 c7 c7 a6 03 5d 82 e8 2a 26 f7 ff <0
f> 0b 48 63 c3 5b 5d 31 d2 31 c9 31 f6 31 ff c3 cc cc cc cc f3 0f
[   92.546555][  T461] RSP: 0018:ffffc90001cf3e70 EFLAGS: 00210246
[   92.546995][  T461] RAX: 0000000000000000 RBX: ffffffffffffffea RCX: 0000000000000000
[   92.547568][  T461] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   92.551112][  T461] RBP: 00000000000001cd R08: 0000000000000000 R09: 0000000000000000
[   92.551698][  T461] R10: 0000000000000000 R11: 5f65727574726f74 R12: 0000000000000002
[   92.552290][  T461] R13: 0000000000000000 R14: 0000000000000000 R15: ffffc90001fbbb50
[   92.552867][  T461] FS:  0000000000000000(0000) GS:ffff88842fc00000(0000) knlGS:0000000000000000
[   92.553467][  T461] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   92.553909][  T461] CR2: 00000000f6a3e040 CR3: 000000011fd2f000 CR4: 00000000000406b0
[   92.554446][  T461] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   92.555018][  T461] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   92.555555][  T461] Call Trace:
[   92.555809][  T461]  <TASK>
[   92.556024][  T461]  ? __warn+0xa1/0x16e
[   92.556327][  T461]  ? torture_sched_setaffinity+0x3b/0x4f
[   92.556726][  T461]  ? report_bug+0x13b/0x1d1
[   92.557070][  T461]  ? handle_bug+0x53/0x8c
[   92.557390][  T461]  ? exc_invalid_op+0x17/0x74
[   92.557732][  T461]  ? asm_exc_invalid_op+0x1a/0x20
[   92.558140][  T461]  ? torture_sched_setaffinity+0x3b/0x4f
[   92.558558][  T461]  synchronize_rcu_trivial+0x67/0xaf [rcutorture]
[   92.559068][  T461]  ? rcu_lock_release+0xc/0xc [rcutorture]
[   92.559494][  T461]  rcu_torture_fakewriter+0x264/0x469 [rcutorture]
[   92.559972][  T461]  kthread+0xff/0x107
[   92.560271][  T461]  ? list_del_init+0x2e/0x2e
[   92.560608][  T461]  ret_from_fork+0x23/0x35
[   92.560943][  T461]  ? list_del_init+0x2e/0x2e
[   92.561294][  T461]  ret_from_fork_asm+0x11/0x20
[   92.561665][  T461]  </TASK>
[   92.561899][  T461] irq event stamp: 36775
[   92.562218][  T461] hardirqs last  enabled at (36783): [<ffffffff8111d295>] __up_console_sem+0x4e/0x5a
[   92.562956][  T461] hardirqs last disabled at (36792): [<ffffffff8111d27a>] __up_console_sem+0x33/0x5a
[   92.563663][  T461] softirqs last  enabled at (36554): [<ffffffff810b148d>] handle_softirqs+0x393/0x3e0
[   92.564375][  T461] softirqs last disabled at (36539): [<ffffffff810b1594>] __irq_exit_rcu+0x52/0xac
[   92.565083][  T461] ---[ end trace 0000000000000000 ]---



[1]
--- /pkg/linux/x86_64-randconfig-103-20241011/gcc-12/693ed5cdf46ac062aa7713700c77019c60f0ae41/.config   2024-10-12 20:01:25.345180715 +0800
+++ /pkg/linux/x86_64-randconfig-103-20241011/gcc-12/74d850cd4c5bc26ce83511b4247ea786eb73217c/.config   2024-10-12 16:00:42.428850680 +0800
@@ -121,12 +121,15 @@ CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
 # CONFIG_BPF_PRELOAD is not set
 # end of BPF subsystem

-CONFIG_PREEMPT_NONE_BUILD=y
-CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_BUILD=y
+CONFIG_ARCH_HAS_PREEMPT_LAZY=y
+# CONFIG_PREEMPT_NONE is not set
 # CONFIG_PREEMPT_VOLUNTARY is not set
 # CONFIG_PREEMPT is not set
+CONFIG_PREEMPT_LAZY=y
 # CONFIG_PREEMPT_RT is not set
 CONFIG_PREEMPT_COUNT=y
+CONFIG_PREEMPTION=y
 # CONFIG_PREEMPT_DYNAMIC is not set
 CONFIG_SCHED_CORE=y

@@ -148,6 +151,7 @@ CONFIG_HAVE_SCHED_AVG_IRQ=y
 # RCU Subsystem
 #
 CONFIG_TREE_RCU=y
+CONFIG_PREEMPT_RCU=y
 CONFIG_RCU_EXPERT=y
 CONFIG_TREE_SRCU=y
 CONFIG_TASKS_RCU_GENERIC=y
@@ -162,6 +166,9 @@ CONFIG_RCU_STALL_COMMON=y
 CONFIG_RCU_NEED_SEGCBLIST=y
 CONFIG_RCU_FANOUT=64
 CONFIG_RCU_FANOUT_LEAF=16
+CONFIG_RCU_BOOST=y
+CONFIG_RCU_BOOST_DELAY=500
+# CONFIG_RCU_EXP_KTHREAD is not set
 # CONFIG_RCU_NOCB_CPU is not set
 # CONFIG_TASKS_TRACE_RCU_READ_MB is not set
 # CONFIG_RCU_DOUBLE_CHECK_CB_TIME is not set
@@ -6722,6 +6729,7 @@ CONFIG_SCHEDSTATS=y
 # end of Scheduler Debugging

 # CONFIG_DEBUG_TIMEKEEPING is not set
+CONFIG_DEBUG_PREEMPT=y

 #
 # Lock Debugging (spinlocks, mutexes, etc...)
@@ -6820,6 +6828,7 @@ CONFIG_FTRACE=y
 # CONFIG_FUNCTION_TRACER is not set
 # CONFIG_STACK_TRACER is not set
 # CONFIG_IRQSOFF_TRACER is not set
+# CONFIG_PREEMPT_TRACER is not set
 # CONFIG_SCHED_TRACER is not set
 # CONFIG_HWLAT_TRACER is not set
 # CONFIG_OSNOISE_TRACER is not set


> 
> 							Thanx, Paul
> 

View attachment "config-6.12.0-rc1-00037-g74d850cd4c5b" of type "text/plain" (173946 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (46864 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ