linux-kernel - [PATCH] KGDB: add smp_mb() in synchronisation during exception handler exit

Message-Id: <1268158831-6976-1-git-send-email-will.deacon@arm.com>
Date:	Tue,  9 Mar 2010 18:20:31 +0000
From:	Will Deacon <will.deacon@....com>
To:	linux-kernel@...r.kernel.org
Cc:	Will Deacon <will.deacon@....com>,
	KGDB Mailing List <kgdb-bugreport@...ts.sourceforge.net>,
	Catalin Marinas <catalin.marinas@....com>,
	Russell King - ARM Linux <linux@....linux.org.uk>,
	linux-arm-kernel@...ts.infradead.org
Subject: [PATCH] KGDB: add smp_mb() in synchronisation during exception handler exit

KGDB uses atomic variables and busy-wait loops to co-ordinate between
multiple CPUs on an SMP system. When an exception is handled, the primary
CPU executes kgdb_handle_exception() whilst the others execute kgdb_wait().

There comes a point when the waiters are waiting for the primary CPU to finish:

	/* Wait till primary CPU is done with debugging */
(1)	while (atomic_read(&passive_cpu_wait[cpu]))
		cpu_relax();

	/* Do important KGDB stuff */

	/* Signal the primary CPU that we are done: */
	atomic_set(&cpu_in_kgdb[cpu], 0);

In parallel to this, the primary CPU is doing:

	for (i = NR_CPUS-1; i >= 0; i--)
		atomic_set(&passive_cpu_wait[i], 0);
	/*
	 * Wait till all the CPUs have quit
	 * from the debugger.
	 */
	for_each_online_cpu(i) {
(1)		while (atomic_read(&cpu_in_kgdb[i]))
			cpu_relax();
	}

There is a potential deadlock at point (1) because the primary CPU's earlier
writes to the passive_cpu_wait variables may not yet be visible to the other
CPUs (for instance, they may still be sitting in the local store buffer). Until
those writes become visible, the waiter CPUs cannot exit their while loop and
so never write to the cpu_in_kgdb variables, on which the primary CPU is
blocked. Furthermore, because the primary CPU is aggressively performing
reads, the store buffer may never get a chance to drain, so the system
deadlocks.

This deadlock has been experienced on a quad-core ARM11MPCore platform.

The following patch addresses the issue by adding a memory barrier on the
primary CPU before the polling loop, thereby forcing the previous atomic_set()
calls to be visible before the primary CPU waits for the waiters to finish.
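
For illustration only, the same release-then-poll handshake can be sketched in
userspace with C11 atomics and pthreads. This is not the kernel code: the names
NUM_WAITERS, passive_wait, in_debugger, waiter() and run_handshake() are
hypothetical, threads stand in for CPUs, and atomic_thread_fence() with
memory_order_seq_cst is used only as a rough analogue of smp_mb(). It shows
where the barrier sits relative to the two loops; it is not expected to
reproduce the ARM11MPCore deadlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_WAITERS 3	/* hypothetical number of passive "CPUs" */

static atomic_int passive_wait[NUM_WAITERS];	/* stands in for passive_cpu_wait[] */
static atomic_int in_debugger[NUM_WAITERS];	/* stands in for cpu_in_kgdb[] */

/* Waiter side: spin until released, then signal that we are done. */
static void *waiter(void *arg)
{
	int cpu = (int)(intptr_t)arg;

	while (atomic_load_explicit(&passive_wait[cpu], memory_order_relaxed))
		;	/* the kernel code calls cpu_relax() here */

	atomic_store_explicit(&in_debugger[cpu], 0, memory_order_relaxed);
	return NULL;
}

/* Primary side: release the waiters, issue a full fence, then poll. */
int run_handshake(void)
{
	pthread_t threads[NUM_WAITERS];
	int i, rc;

	for (i = 0; i < NUM_WAITERS; i++) {
		atomic_store(&passive_wait[i], 1);
		atomic_store(&in_debugger[i], 1);
	}

	for (i = 0; i < NUM_WAITERS; i++) {
		rc = pthread_create(&threads[i], NULL, waiter,
				    (void *)(intptr_t)i);
		assert(rc == 0);
		(void)rc;
	}

	for (i = NUM_WAITERS - 1; i >= 0; i--)
		atomic_store_explicit(&passive_wait[i], 0,
				      memory_order_relaxed);

	/* Full fence between the releasing stores and the polling loads. */
	atomic_thread_fence(memory_order_seq_cst);

	for (i = 0; i < NUM_WAITERS; i++)
		while (atomic_load_explicit(&in_debugger[i],
					    memory_order_relaxed))
			;

	for (i = 0; i < NUM_WAITERS; i++) {
		rc = pthread_join(threads[i], NULL);
		assert(rc == 0);
		(void)rc;
	}

	return 0;	/* every waiter left its loop and signalled back */
}
```

Calling run_handshake() from a program built with -pthread returns 0 once all
waiter threads have been released and have cleared their in_debugger flag.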

Cc: KGDB Mailing List <kgdb-bugreport@...ts.sourceforge.net>
Cc: Catalin Marinas <catalin.marinas@....com>
Cc: Russell King - ARM Linux <linux@....linux.org.uk>
Cc: linux-arm-kernel@...ts.infradead.org
Signed-off-by: Will Deacon <will.deacon@....com>
---
 kernel/kgdb.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kernel/kgdb.c b/kernel/kgdb.c
index 761fdd2..ee7694b 100644
--- a/kernel/kgdb.c
+++ b/kernel/kgdb.c
@@ -1537,6 +1537,7 @@ acquirelock:
 		 * Wait till all the CPUs have quit
 		 * from the debugger.
 		 */
+		smp_mb();
 		for_each_online_cpu(i) {
 			while (atomic_read(&cpu_in_kgdb[i]))
 				cpu_relax();
-- 
1.6.3.3
