linux-kernel - Re: [PATCH v2] buffer: Fix I/O error due to ARM read-after-read hazard

Message-ID: <20191112182249.GB22025@arrakis.emea.arm.com>
Date:   Tue, 12 Nov 2019 18:22:49 +0000
From:   Catalin Marinas <catalin.marinas@....com>
To:     Will Deacon <will@...nel.org>
Cc:     Vincent Whitchurch <vincent.whitchurch@...s.com>,
        torvalds@...ux-foundation.org, axboe@...nel.dk,
        linux@...linux.org.uk, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        Vincent Whitchurch <rabinv@...s.com>,
        Richard Earnshaw <Richard.Earnshaw@....com>
Subject: Re: [PATCH v2] buffer: Fix I/O error due to ARM read-after-read
 hazard

On Tue, Nov 12, 2019 at 06:00:35PM +0000, Will Deacon wrote:
> On Tue, Nov 12, 2019 at 04:08:57PM +0000, Catalin Marinas wrote:
> > On Tue, Nov 12, 2019 at 02:02:44PM +0100, Vincent Whitchurch wrote:
> > > On my dual-core ARM Cortex-A9, reading from squashfs (over
> > > dm-verity/ubi/mtd) in a loop for hundreds of hours invariably results in
> > > a read failure in squashfs_read_data().  The errors occur because the
> > > buffer_uptodate() check fails after wait_on_buffer().  Further debugging
> > > shows that the bh was in fact uptodate and that there is no actual I/O
> > > error in the lower layers.
> > > 
> > > The problem is caused by the read-after-read hazards in the ARM
> > > Cortex-A9 MPCore (erratum #761319, see [1]).  The code generated by the
> > > compiler for the combination of the wait_on_buffer() and
> > > buffer_uptodate() calls reads the flags value twice from memory (see the
> > > excerpt of the assembly below).  The new value of the BH_Lock flag is
> > > seen but the new value of BH_Uptodate is not even though both the bits
> > > are read from the same memory location.
> > > 
> > >  27c:	9d08      	ldr	r5, [sp, #32]
> > >  27e:	2400      	movs	r4, #0
> > >  280:	e006      	b.n	290 <squashfs_read_data+0x290>
> > >  282:	6803      	ldr	r3, [r0, #0]
> > >  284:	07da      	lsls	r2, r3, #31
> > >  286:	f140 810d 	bpl.w	4a4 <squashfs_read_data+0x4a4>
> > >  28a:	3401      	adds	r4, #1
> > >  28c:	42bc      	cmp	r4, r7
> > >  28e:	da08      	bge.n	2a2 <squashfs_read_data+0x2a2>
> > >  290:	f855 0f04 	ldr.w	r0, [r5, #4]!
> > >  294:	6803      	ldr	r3, [r0, #0]
> > >  296:	0759      	lsls	r1, r3, #29
> > >  298:	d5f3      	bpl.n	282 <squashfs_read_data+0x282>
> > >  29a:	f7ff fffe 	bl	0 <__wait_on_buffer>
> > > 
> > > Work around this problem by adding a DMB between the two reads of
> > > bh->flags, as recommended in the ARM document.  With this barrier, no
> > > failures have been seen in more than 5000 hours of the same test.
> > > 
> > > [1] http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf
> > 
> > I thought we were going to fix the compiler. I found an old thread here:
> > 
> > https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00714.html
> > 
> > Also cc'ing Richard Earnshaw as he may have been involved in the gcc
> > discussion at the time.
> > 
> > While you can add some barrier here, there may be other cases where this
> > can go wrong.
> 
> Hmm, and afaict, even if the compiler was modified to emit LDREX instructions
> for volatile loads, it wouldn't help in this case because test_bit() isn't
> using READ_ONCE().

I think changing volatile accesses to LDREX in gcc wasn't acceptable,
since they may read Device memory, where LDREX is not allowed on ARM.

> It's also slightly odd that the proposed patch makes the code look like:
> 
> 	for (i = 0; i < b; i++) {
> 		if (buffer_locked(bh[i])) {
> 			__wait_on_buffer(bh[i]);
> 			smp_rmb();
> 		}

The proposed patch actually keeps smp_rmb() outside the 'if' block but
your point below still stands.

> 		if (!buffer_uptodate(bh[i]))
> 			goto block_release;
> 	}
> 
> whereas there are other potential RAR orderings between buffer_locked()
> and __wait_on_buffer() and also probably between successive iterations
> of the loop.
> 
> So, really, the only way I see to solve this is for us to use READ_ONCE
> consistently for all relaxed atomic loads (KCSAN is starting to tread on
> this), and then to patch READ_ONCE to emit a DMB at runtime for arch/arm/
> (maybe a static key would work if you can avoid the recursion).

OK, so this includes changing test_bit() to perform a READ_ONCE.
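
For illustration only, a minimal userspace sketch of what a test_bit()
built on a READ_ONCE-style load followed by a barrier could look like.
All names ending in _SKETCH, plus rar_barrier() and test_bit_once(), are
hypothetical and not kernel APIs. The fence is a stand-in for the DMB
(the assumption being that the compiler emits a DMB for it on ARMv7),
and the runtime patching / static key part is not modelled at all. The
bit numbers follow enum bh_state_bits in include/linux/buffer_head.h.

```c
#include <stdbool.h>

/* Stand-in for READ_ONCE(): force exactly one load of x (GCC/Clang typeof). */
#define READ_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

#define BITS_PER_LONG_SKETCH (8 * sizeof(unsigned long))

/* Bit positions as in enum bh_state_bits: BH_Uptodate = 0, BH_Lock = 2. */
enum { BH_UPTODATE_SKETCH = 0, BH_LOCK_SKETCH = 2 };

/*
 * Hypothetical placeholder for the DMB that would order two reads of the
 * same location.  A full fence is used here; that it becomes a DMB on
 * ARMv7 is an assumption of this sketch.
 */
static inline void rar_barrier(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

/* test_bit() variant: one forced load of the word, then the barrier. */
static inline bool test_bit_once(unsigned int nr, const unsigned long *addr)
{
	unsigned long word = READ_ONCE_SKETCH(addr[nr / BITS_PER_LONG_SKETCH]);

	rar_barrier();	/* keeps a later load of the same word behind this one */
	return (word >> (nr % BITS_PER_LONG_SKETCH)) & 1UL;
}
```

With something along these lines, buffer_locked() and buffer_uptodate()
would each go through test_bit_once(), so every pair of back-to-back
reads of bh->b_state has a barrier in between, including the reads in
successive loop iterations.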

-- 
Catalin