Message-ID: <CAHCPf3tFGqkYEcWNN4LaWThw_rVqT316pzLv6T7RfxwO-eZ0EA@mail.gmail.com>
Date: Thu, 2 Aug 2018 15:49:17 -0500
From: Matt Sealey <neko@...uhatsu.net>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Catalin Marinas <catalin.marinas@....com>,
Russell King <linux@...linux.org.uk>,
Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
Will Deacon <will.deacon@....com>, libc-alpha@...rceware.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: framebuffer corruption due to overlapping stp instructions on arm64
The easiest explanation for this would be that the memory isn’t mapped
correctly. You can’t use PCIe memory spaces with anything other than
Device-nGnRE or stricter mappings. That’s just down to the differences
between the AMBA and PCIe (posted/unposted) memory models.
Normal memory (cacheable or uncacheable, which Linux tends to call “memory”
and “writecombine” respectively) is not a good idea.
There are two options. The first is to make sure that Links, the driver, or
both map the framebuffer as Device memory, that only aligned accesses
happen (otherwise you’ll just get a synchronous exception), and that there
isn’t a Normal memory alias.
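As a rough illustration of the "only aligned accesses" part, here is a
minimal sketch in C of a copy routine that issues only naturally aligned
stores to the destination and never writes the same address twice. The name
fb_copy_aligned is hypothetical, and the sketch assumes the destination is
already mapped as Device memory by whatever mechanism the driver provides;
it is not what glibc or Links actually does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from glibc or Links): copy len bytes to a
 * framebuffer using only naturally aligned stores, each destination
 * byte written exactly once. Assumes dst points into a mapping where
 * unaligned or overlapping stores are a problem.
 */
static void fb_copy_aligned(volatile uint8_t *dst, const uint8_t *src,
                            size_t len)
{
    /* Head: single-byte stores until dst is 4-byte aligned. */
    while (len > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = *src++;
        len--;
    }

    /* Body: aligned 32-bit stores. src is ordinary memory and may be
     * unaligned, so load it through memcpy into a local word. */
    while (len >= 4) {
        uint32_t word;
        memcpy(&word, src, sizeof word);
        *(volatile uint32_t *)dst = word;
        dst += 4;
        src += 4;
        len -= 4;
    }

    /* Tail: remaining bytes, again one store per byte. */
    while (len > 0) {
        *dst++ = *src++;
        len--;
    }
}
```

This trades speed for predictability: it gives up the wide, overlapping
stores that the optimised arm64 memcpy uses to avoid branches, so expect it
to be slower.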
Alternatively, tell the PCIe driver that the framebuffer is in system
memory. You can then map it however you like; there’ll be a performance hit
if you start to use GPU acceleration, but a significant performance boost
from the PoV of the CPU. Only memory accessed from the PCIe master
interface (i.e. reads and writes generated by the card itself - telling the
GPU to pull from system memory or other DMA) can be in Normal memory, and
this allows PCIe to be cache coherent with the right interconnect. The
slave port on a PCIe root complex (i.e. CPU writes) can’t be used with
Normal, or otherwise reorderable, mappings, and therefore your 2GB of
graphics memory is going to be slow from the point of view of the CPU.
To find the correct mapping you’ll need to know just how cache coherent the
PCIe RC is...
Ta,
Matt
On Thu, Aug 2, 2018 at 14:31 Mikulas Patocka <mpatocka@...hat.com> wrote:
> Hi
>
> I tried to use a PCIe graphics card on the MacchiatoBIN board and I hit a
> strange problem.
>
> When I use the links browser in graphics mode on the framebuffer, I get
> occasional pixel corruption. Links does memcpy, memset and 4-byte writes
> on the framebuffer - nothing else.
>
> I found out that the pixel corruption is caused by overlapping unaligned
> stp instructions inside memcpy. In order to avoid branching, the arm64
> memcpy implementation may write the same destination twice with different
> alignment. If I put "dmb sy" between the overlapping stp instructions, the
> pixel corruption goes away.
>
> This seems like a hardware bug. Is it a known errata? Do you have any
> workarounds for it?
>
> I tried AMD card (HD 6350) and NVidia (NVS 285) and both exhibit the same
> corruption. OpenGL doesn't work (it results in artifacts on the AMD card
> and lock-up on the NVidia card), but it's quite expected if even simple
> writing to the framebuffer doesn't work.
>
> Mikulas
>