lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20191030102810.20744-1-vincent.whitchurch@axis.com>
Date:   Wed, 30 Oct 2019 11:28:10 +0100
From:   Vincent Whitchurch <vincent.whitchurch@...s.com>
To:     axboe@...nel.dk
Cc:     linux@...linux.org.uk, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        Vincent Whitchurch <rabinv@...s.com>
Subject: [PATCH] buffer: Work around I/O errors due to ARM CPU bug

On my dual-core ARM Cortex-A9, reading from squashfs (over
dm-verity/ubi/mtd) in a loop for hundreds of hours inevitably results in
a read failure in squashfs_read_data().  The errors occur because the
buffer_uptodate() check fails after wait_on_buffer().  Further debugging
shows that the bh was in fact uptodate and that there is no actual I/O
error in the lower layers.

The problem appears to be caused by the read-after-read hazards in the
ARM Cortex-A9 MPCore (erratum #761319, see [1]).  The new value of the
BH_Lock flag is seen but the new value of BH_Uptodate is not even though
both the bits are read from the same memory location.  Work around it by
adding a DMB between the two reads of bh->flags.

 27c:	9d08      	ldr	r5, [sp, #32]
 27e:	2400      	movs	r4, #0
 280:	e006      	b.n	290 <squashfs_read_data+0x290>
 282:	6803      	ldr	r3, [r0, #0]
 284:	07da      	lsls	r2, r3, #31
 286:	f140 810d 	bpl.w	4a4 <squashfs_read_data+0x4a4>
 28a:	3401      	adds	r4, #1
 28c:	42bc      	cmp	r4, r7
 28e:	da08      	bge.n	2a2 <squashfs_read_data+0x2a2>
 290:	f855 0f04 	ldr.w	r0, [r5, #4]!
 294:	6803      	ldr	r3, [r0, #0]
 296:	0759      	lsls	r1, r3, #29
 298:	d5f3      	bpl.n	282 <squashfs_read_data+0x282>
 29a:	f7ff fffe 	bl	0 <__wait_on_buffer>

With this barrier, no failures have been seen in 2500+ hours of the same
test.

[1] http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@...s.com>
---
 include/linux/buffer_head.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 7b73ef7f902d..4ef909a91f8c 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -352,6 +352,14 @@ static inline void wait_on_buffer(struct buffer_head *bh)
 	might_sleep();
 	if (buffer_locked(bh))
 		__wait_on_buffer(bh);
+
+#ifdef CONFIG_ARM
+	/*
+	 * Work around ARM Cortex-A9 MPcore Read-after-Read Hazards (erratum
+	 * 761319).
+	 */
+	smp_rmb();
+#endif
 }
 
 static inline int trylock_buffer(struct buffer_head *bh)
-- 
2.20.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ