Message-ID: <BYAPR02MB4501B990E353A2616594821294510@BYAPR02MB4501.namprd02.prod.outlook.com>
Date:   Fri, 20 Jul 2018 23:05:52 +0000
From:   David Chen <david.chen@...anix.com>
To:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
CC:     David Chen <david.chen@...anix.com>
Subject: RCU nocb list not reclaiming causing OOM

Hi Paul,

We hit an RCU issue on the 4.9.37 kernel. One of the nocb_follower lists grows
too large and is not reclaimed, causing the system to OOM.

Printing the culprit rcu_sched_data:

  nocb_q_count = {
    counter = 32369635
  },
  nocb_follower_head = 0xffff88ae901c0a00,
  nocb_follower_tail = 0xffff88af1538b8d8,
  nocb_kthread = 0xffff88b06d290000,

As you can see, nocb_follower_head is not empty, so in theory the
nocb_kthread shouldn't go to sleep. However, if we dump the stack of the kthread:

crash> bt 0xffff88b06d290000
PID: 21     TASK: ffff88b06d290000  CPU: 3   COMMAND: "rcuos/1"
 #0 [ffffafc9020b7dc0] __schedule at ffffffff8d8789dc
 #1 [ffffafc9020b7e38] schedule at ffffffff8d878e76
 #2 [ffffafc9020b7e50] rcu_nocb_kthread at ffffffff8d112337
 #3 [ffffafc9020b7ec8] kthread at ffffffff8d0c6ce7
 #4 [ffffafc9020b7f50] ret_from_fork at ffffffff8d87d755

And if we disassemble around address ffffffff8d112337:

/usr/src/debug/kernel-4.9.37/linux-4.9.37-29.nutanix.07142017.el7.centos.x86_64/kernel/rcu/tree_plugin.h: 2106
0xffffffff8d11232d <rcu_nocb_kthread+381>:      test   %rax,%rax
0xffffffff8d112330 <rcu_nocb_kthread+384>:      jne    0xffffffff8d112355 <rcu_nocb_kthread+421>
0xffffffff8d112332 <rcu_nocb_kthread+386>:      callq  0xffffffff8d878e40 <schedule>
0xffffffff8d112337 <rcu_nocb_kthread+391>:      lea    -0x40(%rbp),%rsi

So the kthread is blocked at swait_event_interruptible() in nocb_follower_wait().
This contradicts the fact that nocb_follower_head is not empty. So I wonder
whether this is caused by the lack of a memory barrier in the place shown below.
If the store of NULL to the head takes effect after the xchg, it will overwrite
the head set by the leader. The kthread would then sleep on the next iteration,
and the leader won't wake it up because the tail no longer points to the head.
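
To make the suspected interleaving concrete, here is a minimal single-threaded
sketch that replays it step by step. It is only a model under assumptions:
struct cb, struct follower, xchg_tail(), leader_append() and replay() are
hypothetical names that do not exist in the kernel, and the leader-side append
is a simplified rendering of my reading of the code. It shows what the
reordering would lead to if it happened; it does not show that it can happen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct rcu_head: only the link matters here. */
struct cb {
	struct cb *next;
};

/* Hypothetical model of the follower-side fields of rcu_data. */
struct follower {
	struct cb *head;	/* models nocb_follower_head */
	struct cb **tail;	/* models nocb_follower_tail */
};

/* Single-threaded stand-in for xchg(): store the new value, return the old. */
static struct cb **xchg_tail(struct cb ***p, struct cb **new_tail)
{
	struct cb **old = *p;

	*p = new_tail;
	return old;
}

/* Leader appends one callback; returns true if it would wake the follower. */
static bool leader_append(struct follower *f, struct cb *first,
			  struct cb **new_tail)
{
	struct cb **tail = xchg_tail(&f->tail, new_tail);

	*tail = first;
	return tail == &f->head;	/* list looked empty: wake up */
}

/*
 * Replay one follower pass with a leader append racing against it.
 * If null_store_after_xchg is true, the follower's NULL store to the
 * head is replayed after its xchg and after the leader's append.
 * Returns true if the follower ends up stuck: empty-looking head,
 * callbacks still queued, and no wakeup on later appends.
 */
static bool replay(bool null_store_after_xchg)
{
	struct cb a = { NULL }, b = { NULL }, c = { NULL };
	struct follower f;
	struct cb *list;
	bool woke_for_c;

	f.head = NULL;
	f.tail = &f.head;

	leader_append(&f, &a, &a.next);		/* first callback queued */

	list = f.head;				/* follower picks up the list */
	assert(list != NULL);			/* mirrors BUG_ON(!list) */
	(void)list;

	if (!null_store_after_xchg)
		f.head = NULL;			/* program order */
	(void)xchg_tail(&f.tail, &f.head);	/* follower resets the tail */

	leader_append(&f, &b, &b.next);		/* leader sets head = &b */

	if (null_store_after_xchg)
		f.head = NULL;			/* late store wipes out &b */

	woke_for_c = leader_append(&f, &c, &c.next);

	return f.head == NULL && f.tail != &f.head && !woke_for_c;
}
```

In this model replay(true) reports the stuck state (head is NULL while b and c
are still queued behind the tail, and the later append does not trigger a
wakeup), whereas replay(false), which keeps the program order, does not.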

Please tell me what you think.

Thanks,
David

diff -ru linux-4.9.37.orig/kernel/rcu/tree_plugin.h linux-4.9.37/kernel/rcu/tree_plugin.h
--- linux-4.9.37.orig/kernel/rcu/tree_plugin.h	2017-07-12 06:42:41.000000000 -0700
+++ linux-4.9.37/kernel/rcu/tree_plugin.h	2018-07-20 15:25:57.311206343 -0700
@@ -2149,6 +2149,7 @@
 		BUG_ON(!list);
 		trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, "WokeNonEmpty");
 		WRITE_ONCE(rdp->nocb_follower_head, NULL);
+		smp_mb();
 		tail = xchg(&rdp->nocb_follower_tail, &rdp->nocb_follower_head);
 
 		/* Each pass through the following loop invokes a callback. */
