Date: Wed, 8 Aug 2018 14:40:16 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Catalin Marinas <catalin.marinas@....com>
cc: Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
Joao Pinto <Joao.Pinto@...opsys.com>,
libc-alpha@...rceware.org,
Ard Biesheuvel <ard.biesheuvel@...aro.org>,
Jingoo Han <jingoohan1@...il.com>,
Will Deacon <will.deacon@....com>,
Russell King <linux@...linux.org.uk>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Matt Sealey <neko@...uhatsu.net>, linux-pci@...r.kernel.org,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>
Subject: Re: framebuffer corruption due to overlapping stp instructions on arm64
On Wed, 8 Aug 2018, Catalin Marinas wrote:
> On Wed, Aug 08, 2018 at 10:12:27AM -0400, Mikulas Patocka wrote:
> > On Wed, 8 Aug 2018, Catalin Marinas wrote:
> > > On Fri, Aug 03, 2018 at 01:09:02PM -0400, Mikulas Patocka wrote:
> > > > 	while (1) {
> > > > 		start = (unsigned)random() % (LEN + 1);
> > > > 		end = (unsigned)random() % (LEN + 1);
> > > > 		if (start > end)
> > > > 			continue;
> > > > 		for (i = start; i < end; i++)
> > > > 			data[i] = val++;
> > > > 		memcpy(map + start, data + start, end - start);
> > > > 		if (memcmp(map, data, LEN)) {
> > >
> > > It may be worth trying to do a memcmp(map+start, data+start, end-start)
> > > here to see whether the hazard logic fails when the writes are unaligned
> > > but the reads are not.
> > >
> > > This problem may as well appear if you do byte writes and read longs
> > > back (and I consider this a hardware problem on this specific board).
> >
> > I tried to insert usleep(10000) between the memcpy and memcmp, but the
> > same corruption occurs. So, it can't be a read-after-write hazard. It is
> > caused by the improper handling of the hazard between the overlapping
> > writes inside memcpy.
>
> It could get it wrong between subsequent writes to the same 64-bit range
> (e.g. the address & ~63 is the same but the data strobes for which bytes
> to write are different). If it somehow thinks that it's a
> write-after-write hazard even though the strobes are different, it could
> cancel one of the writes.
I believe that the SoC has logic for write-after-write detection, but the
logic is broken and corrupts data.
If I insert "dmb sy" between the overlapping writes, there's no corruption
(the PCIe controller won't see any overlapping writes in that case).
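For illustration only, here is a minimal sketch of what "a barrier between the overlapping writes" means. It is not the change that was tested: the helper names full_barrier() and copy16to32_with_barrier() are hypothetical, and the 16-byte memcpy() calls merely stand in for the stp pairs that an optimized arm64 memcpy may emit (the compiler is not guaranteed to generate stp for them). On non-arm64 hosts the barrier falls back to the GCC/Clang builtin __sync_synchronize() so the sketch still builds.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "dmb sy" on arm64; a generic full barrier elsewhere (assumption:
 * GCC or Clang, which provide __sync_synchronize()). */
static inline void full_barrier(void)
{
#if defined(__aarch64__)
	__asm__ volatile("dmb sy" ::: "memory");
#else
	__sync_synchronize();
#endif
}

/* Hypothetical helper: copy n bytes, 16 <= n <= 32, as a 16-byte head
 * store and a 16-byte tail store. For n < 32 the two stores overlap,
 * which is the pattern suspected of triggering the corruption. */
static void copy16to32_with_barrier(void *dst, const void *src, size_t n)
{
	unsigned char head[16], tail[16];

	/* Load both chunks first, as an optimized memcpy would. */
	memcpy(head, src, 16);
	memcpy(tail, (const unsigned char *)src + n - 16, 16);

	memcpy(dst, head, 16);				/* first store */
	full_barrier();					/* order it before the overlap */
	memcpy((unsigned char *)dst + n - 16, tail, 16);	/* overlapping store */
}
```

With n = 24, for example, bytes 8..15 of the destination are written by both stores; the barrier keeps the second store from being issued until the first has completed.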
> It may be worth trying with a byte-only memcpy() function while keeping
> the default memcmp().
I tried that and byte-only memcpy works without any corruption.
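A byte-only memcpy of the kind meant here could look roughly like the sketch below. The name memcpy_bytewise() is hypothetical, and this is not necessarily the exact routine used in the test. The destination pointer is volatile-qualified so that the compiler emits one byte store per byte and does not merge the stores or turn the loop back into a call to the library memcpy.

```c
#include <stddef.h>

/* Hypothetical byte-only copy: every destination byte is written
 * exactly once with a single-byte store, so there are no overlapping
 * or multi-byte writes for the hardware to mishandle. */
static void *memcpy_bytewise(void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	for (i = 0; i < n; i++)
		d[i] = s[i];

	return dst;
}
```

In the test loop it would replace the memcpy(map + start, data + start, end - start) call, while the default memcmp() stays in place for the comparison.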
> --
> Catalin
Mikulas