Message-Id: <e840280bba54050816470caa870bd9f561efc545.1417649608.git.luto@amacapital.net>
Date:	Wed,  3 Dec 2014 15:37:08 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	unlisted-recipients:; (no To-header on input)
Cc:	Linux Kernel <linux-kernel@...r.kernel.org>,
	Richard Guy Briggs <rgb@...hat.com>,
	Eric Paris <eparis@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Dave Jones <davej@...hat.com>,
	Andy Lutomirski <luto@...capital.net>,
	Oleg Nesterov <oleg@...hat.com>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>
Subject: [PATCH v2] context_tracking: Restore previous state in schedule_user

It appears that some SCHEDULE_USER (asm for schedule_user) callers
in arch/x86/kernel/entry_64.S are called from RCU kernel context,
and schedule_user will return in RCU user context.  This causes RCU
warnings and possible failures.

This is intended to be a minimal fix suitable for 3.18.

Reported-and-tested-by: Dave Jones <davej@...hat.com>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Frédéric Weisbecker <fweisbec@...il.com>
Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com>
Signed-off-by: Andy Lutomirski <luto@...capital.net>
---

Hi all-

This is intended to be a suitable last-minute fix for the RCU issue that
Dave saw.

Dave, can you confirm that this fixes it?

Frédéric, can you confirm that you think that this will have no effect
on correct callers of schedule_user and that it will do the right thing
for incorrect callers of schedule_user?

I don't like the x86 asm that calls this at all, and I don't really
like the fragility of the mechanism in general, but I think that this
improves the situation enough to avoid problems in the short term.

With the obvious warning added, I get:

[    0.751022] ------------[ cut here ]------------
[    0.751937] WARNING: CPU: 0 PID: 72 at kernel/sched/core.c:2883 schedule_user+0xcf/0xe0()
[    0.753477] Modules linked in:
[    0.754089] CPU: 0 PID: 72 Comm: mount Not tainted 3.18.0-rc7+ #653
[    0.755258] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    0.757655]  0000000000000009 ffff880005c13f00 ffffffff81741dca ffff8800069f5a50
[    0.759228]  0000000000000000 ffff880005c13f40 ffffffff8108e781 0000000000000246
[    0.760758]  0000000000000000 00007fff970441c8 00007fff97043fd0 00007f67794ebcc8
[    0.762294] Call Trace:
[    0.762775]  [<ffffffff81741dca>] dump_stack+0x46/0x58
[    0.763739]  [<ffffffff8108e781>] warn_slowpath_common+0x81/0xa0
[    0.764865]  [<ffffffff8108e85a>] warn_slowpath_null+0x1a/0x20
[    0.765958]  [<ffffffff8174565f>] schedule_user+0xcf/0xe0
[    0.766974]  [<ffffffff8174ae69>] sysret_careful+0x19/0x1c
[    0.768011] ---[ end trace 329f34db2b3be966 ]---

So, yes, we have a bug, and this could cause any number of strange
problems.
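
For anyone who wants to see the state transitions without booting a
kernel, here is a rough userspace sketch of the difference between the
old and new schedule_user bodies. It is a toy model and not kernel code:
cur_state, the toy_* helpers, schedule_user_old, schedule_user_new and
would_warn are names made up for illustration, and the model assumes
that exception_enter records the previous state before forcing kernel
context and that exception_exit re-enters user context only if the
previous state was IN_USER.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the context tracking state (hypothetical model). */
enum ctx_state { IN_KERNEL = 0, IN_USER = 1 };

static enum ctx_state cur_state = IN_KERNEL;

/* Toy versions of user_exit()/user_enter(): unconditional transitions. */
static void toy_user_exit(void)  { cur_state = IN_KERNEL; }
static void toy_user_enter(void) { cur_state = IN_USER; }

/* Toy exception_enter(): remember the old state, then force kernel context. */
static enum ctx_state toy_exception_enter(void)
{
	enum ctx_state prev = cur_state;

	toy_user_exit();
	return prev;
}

/* Toy exception_exit(): go back to user context only if we came from it. */
static void toy_exception_exit(enum ctx_state prev)
{
	if (prev == IN_USER)
		toy_user_enter();
}

/* Toy schedule(): must always run in kernel context. */
static void toy_schedule(void)
{
	assert(cur_state == IN_KERNEL);
}

/* Old body: always returns in user context, whatever the caller's state. */
static enum ctx_state schedule_user_old(enum ctx_state caller_state)
{
	cur_state = caller_state;
	toy_user_exit();
	toy_schedule();
	toy_user_enter();
	return cur_state;
}

/* New body: returns in whatever context the caller was in. */
static enum ctx_state schedule_user_new(enum ctx_state caller_state)
{
	enum ctx_state prev;

	cur_state = caller_state;
	prev = toy_exception_enter();
	toy_schedule();
	toy_exception_exit(prev);
	return cur_state;
}

/* The check the patch comment says we would ideally warn on. */
static bool would_warn(enum ctx_state prev)
{
	return prev != IN_USER;
}
```

In this model, a caller that is already in kernel context comes back
from schedule_user_old in IN_USER (the mismatch described above), while
schedule_user_new hands back IN_KERNEL. A caller in IN_USER gets IN_USER
back from both, so correct callers should see no change.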

Changes from v1:
 - Added Dave's Tested-by.
 - Fixed a comment typo.

 kernel/sched/core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 24beb9bb4c3e..89e7283015a6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2874,10 +2874,14 @@ asmlinkage __visible void __sched schedule_user(void)
 	 * or we have been woken up remotely but the IPI has not yet arrived,
 	 * we haven't yet exited the RCU idle mode. Do it here manually until
 	 * we find a better solution.
+	 *
+	 * NB: There are buggy callers of this function.  Ideally we
+	 * should warn if prev_state != IN_USER, but that will trigger
+	 * too frequently to make sense yet.
 	 */
-	user_exit();
+	enum ctx_state prev_state = exception_enter();
 	schedule();
-	user_enter();
+	exception_exit(prev_state);
 }
 #endif
 
-- 
1.9.3
