Date:	Wed, 25 Jun 2008 22:33:13 -0700
From:	Keith Packard <keithp@...thp.com>
To:	Dave Airlie <airlied@...il.com>
Cc:	keithp@...thp.com, Arjan van de Ven <arjan@...radead.org>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Dave Airlie <airlied@...ux.ie>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: kmap_atomic_pfn for PCI BAR access?

On Thu, 2008-06-26 at 15:02 +1000, Dave Airlie wrote:

> That's why keithp wants something like kmap_atomic but for ioremap
> instead of kmaps.

Right, the usage is precisely the same as kmap_atomic -- reading and
writing across a wide range of physical addresses without needing a
permanent mapping or paying the huge cost of TLB flushing.

The existing kmap_atomic_pfn is exactly what I need (modulo the lack of
prot bits), except that it only handles non-memory pages on kernels with
CONFIG_HIGHMEM set.

It seems like making this function work on non-HIGHMEM kernels would
require only the reservation of the same few PTEs that HIGHMEM kernels
use, along with suitable hacking to make them work.
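
To illustrate the idea of funnelling all access through a few reserved
slots, here is a minimal userspace sketch. It is a toy model only, not
kernel code: the toy_* names, the simulated "physical" array and the slot
table are all hypothetical, and it does not model real page tables, TLB
invalidation or caching attributes such as WC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_NR_PAGES  8u   /* size of the simulated "physical" space */
#define TOY_NR_SLOTS  2u   /* stands in for the few reserved PTEs */

/* Simulated physical pages (hypothetical; stands in for RAM or a BAR). */
static uint8_t toy_phys[TOY_NR_PAGES][TOY_PAGE_SIZE];

/* Which pfn each reserved slot currently maps, or -1 if the slot is free. */
static long toy_slot_pfn[TOY_NR_SLOTS] = { -1, -1 };

/* Install pfn in a fixed slot and return an address for that page. */
static void *toy_map_atomic_pfn(unsigned long pfn, unsigned slot)
{
    assert(slot < TOY_NR_SLOTS);
    assert(pfn < TOY_NR_PAGES);
    assert(toy_slot_pfn[slot] == -1);   /* slot must not already be in use */
    toy_slot_pfn[slot] = (long)pfn;
    return toy_phys[pfn];
}

/* Release the slot so the next short-lived mapping can reuse it. */
static void toy_unmap_atomic(unsigned slot)
{
    assert(slot < TOY_NR_SLOTS);
    assert(toy_slot_pfn[slot] != -1);
    toy_slot_pfn[slot] = -1;
}

/* Copy len bytes to a byte offset in the simulated space, mapping one
 * page at a time through slot 0 and unmapping it before moving on. */
static void toy_write(unsigned long offset, const void *src, size_t len)
{
    const uint8_t *s = src;

    while (len > 0) {
        unsigned long pfn = offset / TOY_PAGE_SIZE;
        size_t in_page = offset % TOY_PAGE_SIZE;
        size_t chunk = TOY_PAGE_SIZE - in_page;
        uint8_t *va;

        if (chunk > len)
            chunk = len;

        va = toy_map_atomic_pfn(pfn, 0);
        memcpy(va + in_page, s, chunk);
        toy_unmap_atomic(0);

        offset += chunk;
        s += chunk;
        len -= chunk;
    }
}
```

A write that straddles a page boundary, for example
toy_write(TOY_PAGE_SIZE - 2, buf, 6), maps page 0, copies two bytes,
unmaps, then repeats for page 1. No more than one slot is ever held, no
matter how large the address range being touched.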

> This would be for short temporary ioremaps like kmap_atomic is.

For an integrated graphics device, this is just an optimization. I take
physical pages, map them to the graphics GTT which makes them visible to
the CPU up in I/O space. Then, I map addresses in the GTT back to the CPU
with kmap_atomic_pfn and voilà -- WC mapped access to regular pages, all
without touching the low memory mappings.

Before trying this, we just mapped the 'real' address of the page and
then used clflush to get the contents out to memory where the graphics
device could pick it up. However, the clflush is fairly expensive,
enough so that using the WC mapping turns out to be faster in practice.

Beyond simple integrated graphics performance benefits, we're looking
towards discrete graphics cards where we need to write to VRAM which can
only be made visible through the aperture; in that environment, we're
stuck choosing between ioremap (urf) or the same kmap_atomic_pfn as
above. In this case, there's no question that kmap_atomic_pfn will be a
huge performance benefit.

-- 
keith.packard@...el.com
