Date:	Sat, 1 Aug 2015 10:03:16 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Borislav Petkov <bp@...en8.de>
Cc:	"Luis R. Rodriguez" <mcgrof@...e.com>,
	Toshi Kani <toshi.kani@...com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>, Peter Anvin <hpa@...or.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Borislav Petkov <bp@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Brian Gerst <brgerst@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-mm <linux-mm@...ck.org>,
	Andy Lutomirski <luto@...capital.net>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"linux-tip-commits@...r.kernel.org" 
	<linux-tip-commits@...r.kernel.org>
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()

On Sat, Aug 1, 2015 at 9:49 AM, Borislav Petkov <bp@...en8.de> wrote:
>
> My simplistic mental picture while thinking of this is the IO range
> where you send the commands to the device and you don't really want to
> delay those but they should reach the device as they get issued.

Well, even for command streams, people often do go for a
write-combining approach, simply because it is *so* much more
efficient on the bus to buffer and burst things. The interface is set
up to not really "combine" things in the over-writing sense, but just
in the "combine continuous writes into bigger buffers on the CPU, and
then write it out as efficiently as possible" sense.

Of course, the device (and the driver) has to be designed properly for
that, and it makes sense only with certain kinds of models, but it can
actually be much more efficient to make the device interface be
something like "write 32-byte command packets to a circular
write-combining buffer" than it is to do things other ways. Back in
the day, that was one of the most efficient ways to try to fill up
the PCI bandwidth.
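
To make the "32-byte command packets in a circular write-combining buffer" model concrete, here is a minimal user-space sketch. It is an illustration only, not code from this thread or from any real driver: `cmd_ring`, `ring_submit()`, `RING_SLOTS`, and the head/tail bookkeeping are all hypothetical names, and an ordinary array stands in for the device window. A real Linux driver would typically map that window with something like `ioremap_wc()` and would need the appropriate write barriers before telling the device about new packets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CMD_BYTES  32u  /* packet size mentioned in the text */
#define RING_SLOTS 64u  /* hypothetical ring depth; must be a power of two */

/* One fixed-size command packet: 8 x 32-bit words = 32 bytes. */
struct cmd_packet {
    uint32_t words[CMD_BYTES / sizeof(uint32_t)];
};

/* Hypothetical ring. In a real driver `slots` would live in a
 * write-combining MMIO mapping rather than in normal memory. */
struct cmd_ring {
    struct cmd_packet slots[RING_SLOTS];
    uint32_t head;  /* free-running count of packets written by the CPU */
    uint32_t tail;  /* free-running count of packets consumed by the device */
};

/* Number of slots the CPU may still fill without overwriting
 * packets the device has not consumed yet. */
static uint32_t ring_free_slots(const struct cmd_ring *r)
{
    return RING_SLOTS - (r->head - r->tail);
}

/* Copy one whole packet into the next slot. Writing a full, contiguous
 * 32-byte packet in one go is what gives a write-combining mapping the
 * chance to merge the stores into a single burst on the bus.
 * Returns 0 on success, -1 if the ring is full. */
static int ring_submit(struct cmd_ring *r, const struct cmd_packet *pkt)
{
    assert(r != NULL && pkt != NULL);
    if (ring_free_slots(r) == 0)
        return -1;
    memcpy(&r->slots[r->head % RING_SLOTS], pkt, sizeof(*pkt));
    r->head++;
    return 0;
}
```

The point of the layout is that the CPU only ever appends whole packets at sequential addresses and never overwrites pending ones, which matches the "combine continuous writes into bigger buffers" sense of write-combining rather than the over-writing sense.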

There are other approaches too, of course, with the modern variation
tending to be "the device does all real accesses by reading over DMA,
and the only time you use IO accesses is for setup and as a 'start
your DMA transfers now' kind of interface". But write-combining MMIO
used to be a very common model for high-performance IO not that long
ago, because DMA didn't actually use to be all that efficient at all
(nasty behavior with caches and snooping etc back before the memory
controller was on-die and DMA accesses snooped caches directly). So
the "DMA is efficient even for smaller things" thing is relatively
recent.
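
For contrast, the DMA-plus-doorbell model described above can be sketched in the same style. Again this is only a user-space illustration under stated assumptions: `dma_desc`, `dma_queue`, `fake_device`, and all the function names are hypothetical, the "device" is simulated by a plain function, and real hardware would additionally need DMA mapping and memory barriers.

```c
#include <stdint.h>

#define DESC_COUNT 16u  /* hypothetical queue depth; power of two */

/* Descriptor the device fetches over DMA (hypothetical layout). */
struct dma_desc {
    uint64_t addr;   /* bus address of the data buffer */
    uint32_t len;    /* length of the buffer in bytes */
    uint32_t flags;  /* unused in this sketch */
};

/* Descriptor queue kept in ordinary, cacheable host memory. */
struct dma_queue {
    struct dma_desc ring[DESC_COUNT];
    uint32_t producer;  /* free-running count of descriptors posted */
};

/* Simulated device: one doorbell register plus internal state. */
struct fake_device {
    uint32_t doorbell;    /* stand-in for the only MMIO register written */
    uint32_t consumed;    /* free-running count of descriptors processed */
    uint64_t bytes_seen;  /* total length of the processed descriptors */
};

/* Post a descriptor. This touches normal memory only, no IO access.
 * Returns 0 on success, -1 if the queue is full. */
static int queue_post(struct dma_queue *q, uint32_t consumed,
                      uint64_t addr, uint32_t len)
{
    if (q->producer - consumed == DESC_COUNT)
        return -1;
    struct dma_desc *d = &q->ring[q->producer % DESC_COUNT];
    d->addr = addr;
    d->len = len;
    d->flags = 0;
    q->producer++;
    return 0;
}

/* The single IO access: "start your DMA transfers now". */
static void ring_doorbell(struct fake_device *dev, const struct dma_queue *q)
{
    dev->doorbell = q->producer;
}

/* What the device would do by itself: read descriptors up to the
 * doorbell value and process them. */
static void device_process(struct fake_device *dev, const struct dma_queue *q)
{
    while (dev->consumed != dev->doorbell) {
        dev->bytes_seen += q->ring[dev->consumed % DESC_COUNT].len;
        dev->consumed++;
    }
}
```

Here the CPU batches any number of descriptors in memory and pays for exactly one IO write per batch, which is why this model depends on DMA reads being cheap.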

                     Linus