lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 19 Mar 2019 09:35:18 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org
Subject: [PATCH] sched: Do not re-read h_load_next during hierarchical load
 calculation

A NULL pointer dereference bug was reported on a distribution kernel but
the same issue should be present on mainline kernel. It occured on s390
but should not be arch-specific.  A partial oops looks like

[775277.408564] Unable to handle kernel pointer dereference in virtual kernel address space
...
[775277.408759] Call Trace:
[775277.408763] ([<0002c11c56899c61>] 0x2c11c56899c61)
[775277.408766]  [<0000000000177bb4>] try_to_wake_up+0xfc/0x450
[775277.408773]  [<000003ff81ede872>] vhost_poll_wakeup+0x3a/0x50 [vhost]
[775277.408777]  [<0000000000194ae4>] __wake_up_common+0xbc/0x178
[775277.408779]  [<0000000000194f86>] __wake_up_common_lock+0x9e/0x160
[775277.408780]  [<00000000001950de>] __wake_up_sync_key+0x4e/0x60
[775277.408785]  [<00000000005d911e>] sock_def_readable+0x5e/0x98

The bug hits any time between 1 hour to 3 days. The dereference occurs
in update_cfs_rq_h_load when accumulating h_load. The problem is that
cfq_rq->h_load_next is not protected by any locking and can be updated
by parallel calls to task_h_load. Depending on the compiler, code may be
generated that re-reads cfq_rq->h_load_next after the check for NULL and
then oops when reading se->avg.load_avg. The dissassembly showed that it
was possible to reread h_load_next after the check for NULL.

While this does not appear to be an issue for later compilers, it's still
an accident if the correct code is generated. Full locking in this path
would have high overhead so this patch uses READ_ONCE to read h_load_next
only once and check for NULL before dereferencing. It was confirmed that
there were no further oops after 10 days of testing.

Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 310d0637fe4b..34aeb40e69d2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7726,7 +7726,7 @@ static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
 		cfs_rq->last_h_load_update = now;
 	}
 
-	while ((se = cfs_rq->h_load_next) != NULL) {
+	while ((se = READ_ONCE(cfs_rq->h_load_next)) != NULL) {
 		load = cfs_rq->h_load;
 		load = div64_ul(load * se->avg.load_avg,
 			cfs_rq_load_avg(cfs_rq) + 1);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ