Message-ID: <CAPv3WKeVkWChEE7WjKuu__bg+zvh9dTNy=U_2jiSyWxFWiatbA@mail.gmail.com>
Date: Mon, 6 Aug 2018 16:07:39 +0200
From: Marcin Wojtas <mw@...ihalf.com>
To: Ard Biesheuvel <ard.biesheuvel@...aro.org>, mpatocka@...hat.com
Cc: Robin Murphy <robin.murphy@....com>,
Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
Joao Pinto <Joao.Pinto@...opsys.com>,
Catalin Marinas <catalin.marinas@....com>,
linux-pci@...r.kernel.org, Will Deacon <will.deacon@....com>,
Russell King - ARM Linux <linux@...linux.org.uk>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Matt Sealey <neko@...uhatsu.net>,
Jingoo Han <jingoohan1@...il.com>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: framebuffer corruption due to overlapping stp instructions on arm64
Hi Ard, Mikulas,
On Mon, 6 Aug 2018 at 15:48, Ard Biesheuvel <ard.biesheuvel@...aro.org> wrote:
>
> On 6 August 2018 at 15:41, Marcin Wojtas <mw@...ihalf.com> wrote:
> > Hi Mikulas,
> >
> > On Mon, 6 Aug 2018 at 14:42, Robin Murphy <robin.murphy@....com> wrote:
> >>
> >> On 06/08/18 11:25, Mikulas Patocka wrote:
> >> [...]
> >> >> None of this explains why some transactions fail to make it across
> >> >> entirely. The overlapping writes in question write the same data to
> >> >> the memory locations that are covered by both, and so the ordering in
> >> >> which the transactions are received should not affect the outcome.
> >> >
> >> > You're right that the corruption couldn't be explained just by reordering
> >> > writes. My hypothesis is that the PCIe controller tries to disambiguate
> >> > the overlapping writes, but the disambiguation logic was not tested and it
> >> > is buggy. If there's a barrier between the overlapping writes, the PCIe
> >> > controller won't see any overlapping writes, so it won't trigger the
> >> > faulty disambiguation logic and it works.
> >> >
> >> > Could the ARM engineers look if there's some chicken bit in Cortex-A72
> >> > that could insert barriers between non-cached writes automatically?
> >>
> >> I don't think there is, and even if there was I imagine it would have a
> >> pretty hideous effect on non-coherent DMA buffers and the various other
> >> places in which we have Normal-NC mappings of actual system RAM.
> >>
> >> > I observe these kinds of corruptions:
> >> > - failing to write a few bytes
> >>
> >> That could potentially be explained by the reordering/atomicity issues
> >> Matt mentioned, i.e. the load is observing part of the store, before the
> >> store has fully completed.
> >>
> >> > - writing a few bytes that were written 16 bytes before
> >> > - writing a few bytes that were written 16 bytes after
> >>
> >> Those sound more like the interconnect or root complex ignoring the byte
> >> strobes on an unaligned burst, of which I think the simplistic view
> >> would be "it's broken".
> >>
> >> FWIW I stuck my old Nvidia 7600GT card in my Arm Juno r2 board (2x
> >> Cortex-A72), built your test program natively with GCC 8.1.1 at -O2, and
> >> it's still happily flickering pixels in the corner of the console after
> >> nearly an hour (in parallel with some iperf3 just to ensure plenty of
> >> PCIe traffic). I would strongly suspect this issue is particular to
> >> Armada 8k, so it's probably one for the Marvell folks to take a closer
> >> look at - I believe some previous interconnect issues on those SoCs were
> >> actually fixable in firmware.
> >>
> >>
> >
> > On my Macchiato I use a GT630 card (nouveau driver) + Debian + Xfce
> > desktop, and in dual-monitor mode I could run a couple of 1080p
> > streams. All smooth, and I've never noticed any image corruption
> > whatsoever (I've spent a lot of time in front of this setup). Just to be
> > on the safe side, can you send me a boot log and your board revision? I'd
> > like to see your firmware version and type.
> >
>
> Hi Marcin,
>
> Could you please try running his reproducer?
This is exactly what I plan to do, as soon as I can plug my GFX card
back into the board (tomorrow). Just to stay aligned: is it OK if I
boot my Debian with the GT630 plugged in, compile the program with -O2 and
simply run it on /dev/fb0?
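
For reference, a minimal sketch of the access pattern discussed above: two
16-byte stores that overlap and carry identical data in the overlapping
bytes, followed by a read-back check. This is only an illustration under
stated assumptions, not the actual reproducer. The names store_overlapping
and check_overlapping are hypothetical, and whether the compiler turns each
16-byte memcpy into an stp pair is an assumption that would have to be
confirmed in the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy len bytes (16..32) as two 16-byte chunks.
 * The second chunk overlaps the first whenever len < 32, and both chunks
 * carry the same data for the bytes they share. */
static void store_overlapping(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(len >= 16 && len <= 32);
    memcpy(dst, src, 16);                        /* first 16 bytes */
    memcpy(dst + len - 16, src + len - 16, 16);  /* last 16 bytes */
}

/* Hypothetical checker: fill buf with overlapping stores of every length
 * from 16 to 32, read everything back, and return the number of bytes
 * that do not match what was written. */
static size_t check_overlapping(uint8_t *buf, size_t buf_len, unsigned rounds)
{
    uint8_t pattern[32];
    size_t errors = 0;

    for (unsigned r = 0; r < rounds; r++) {
        for (size_t len = 16; len <= 32; len++) {
            /* A pattern that changes with round, length and position. */
            for (size_t i = 0; i < len; i++)
                pattern[i] = (uint8_t)(r + len + i);

            /* Write pass: back-to-back blocks of len bytes. */
            for (size_t off = 0; off + len <= buf_len; off += len)
                store_overlapping(buf + off, pattern, len);

            /* Read-back pass: count every mismatching byte. */
            for (size_t off = 0; off + len <= buf_len; off += len)
                for (size_t i = 0; i < len; i++)
                    if (buf[off + i] != pattern[i])
                        errors++;
        }
    }
    return errors;
}
```

On ordinary RAM check_overlapping() should always return 0. On the real
target, buf would instead be the pointer returned by mmap() on an open
/dev/fb0 descriptor, with buf_len no larger than the mapped size.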
Best regards,
Marcin