Message-ID: <f696ebe8605840e3bb04bb78b60a6cfa@AcuMS.aculab.com>
Date: Fri, 3 Aug 2018 11:24:28 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Ard Biesheuvel' <ard.biesheuvel@...aro.org>,
Ramana Radhakrishnan <ramana.gcc@...glemail.com>
CC: Florian Weimer <fweimer@...hat.com>,
Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
GNU C Library <libc-alpha@...rceware.org>,
Andrew Pinski <pinskia@...il.com>,
"Catalin Marinas" <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
"Russell King" <linux@...linux.org.uk>,
LKML <linux-kernel@...r.kernel.org>,
"Mikulas Patocka" <mpatocka@...hat.com>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>
Subject: RE: framebuffer corruption due to overlapping stp instructions on
arm64
From: Ard Biesheuvel
> Sent: 03 August 2018 10:30
...
> The discussion about whether memcpy() should rely on unaligned
> accesses, and whether you should use it on device memory is orthogonal
> to that, and not the heart of the matter IMO
Even on x86, using memcpy() on PCIe memory (perhaps mmap()ed into userspace)
isn't a good idea.
In the kernel, memcpy_toio()/memcpy_fromio() ought to be a better choice, but
they are just alternate names for memcpy().
The problem on x86 is that memcpy() is likely to be implemented as
'rep movsb' on modern CPUs, relying on the CPU hardware to perform
cache-line sized transfers (etc).
Unfortunately, on uncached locations it has to revert to byte copies,
so PCIe transfers (especially reads) are very slow.
For such memory, the transfers need to use the largest register size available.
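
As a rough illustration of that last point, here is a minimal userspace sketch that reads from a device mapping with explicit 64-bit loads instead of calling memcpy(). The helper name copy_from_device64() is hypothetical (it is not a kernel or libc API), and the sketch assumes the source pointer is 8-byte aligned, as an mmap()ed PCIe BAR normally would be. The volatile qualifier is there to stop the compiler merging or splitting the accesses; a wider SIMD register could move more per read, but that is left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: copy 'len' bytes from a
 * device mapping into ordinary memory using 8-byte reads where possible.
 * Assumption: 'src' is 8-byte aligned.
 */
static void copy_from_device64(void *dst, const volatile void *src, size_t len)
{
    const volatile uint64_t *s64 = (const volatile uint64_t *)src;
    unsigned char *d = (unsigned char *)dst;

    /* Bulk of the copy: one 64-bit load from the device per iteration. */
    while (len >= sizeof(uint64_t)) {
        uint64_t v = *s64++;
        memcpy(d, &v, sizeof(v));   /* destination is normal memory and may be unaligned */
        d += sizeof(v);
        len -= sizeof(v);
    }

    /* Tail: fewer than 8 bytes remain, fall back to byte reads. */
    const volatile unsigned char *s8 = (const volatile unsigned char *)s64;
    while (len--)
        *d++ = *s8++;
}
```

A caller would pass the mmap()ed device address as src and a normal buffer as dst; whether each load really becomes a single bus transaction still depends on the compiler and the memory type of the mapping.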
David