Message-ID: <CA+55aFxQHPPsviY=CcwLKi-u1q_vizQd5ANqaKCBeasu-r0sQQ@mail.gmail.com>
Date: Sat, 1 Aug 2015 10:03:16 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Borislav Petkov <bp@...en8.de>
Cc: "Luis R. Rodriguez" <mcgrof@...e.com>,
Toshi Kani <toshi.kani@...com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, Peter Anvin <hpa@...or.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Borislav Petkov <bp@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Brian Gerst <brgerst@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-mm <linux-mm@...ck.org>,
Andy Lutomirski <luto@...capital.net>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"linux-tip-commits@...r.kernel.org"
<linux-tip-commits@...r.kernel.org>
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
On Sat, Aug 1, 2015 at 9:49 AM, Borislav Petkov <bp@...en8.de> wrote:
>
> My simplistic mental picture while thinking of this is the IO range
> where you send the commands to the device and you don't really want to
> delay those but they should reach the device as they get issued.
Well, even for command streams, people often do go for a
write-combining approach, simply because it is *so* much more
efficient on the bus to buffer and burst things. The interface is set
up to not really "combine" things in the over-writing sense, but just
in the "combine continuous writes into bigger buffers on the CPU, and
then write it out as efficiently as possible" sense.
Of course, the device (and the driver) has to be designed properly for
that, and it makes sense only with certain kinds of models, but it can
actually be much more efficient to make the device interface be
something like "write 32-byte command packets to a circular
write-combining buffer" than it is to do things other ways. Back in
the day, that was one of the most efficient ways to try to fill up
the PCI bandwidth.
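
A minimal user-space sketch of the "32-byte command packets in a circular
buffer" interface described above. Everything here is an assumption made for
illustration: the packet layout, the slot count, and the names `cmd_packet`,
`cmd_ring`, `ring_push` and `ring_pop` are hypothetical, and the ring is
ordinary memory standing in for a write-combining MMIO window. A real driver
would map the window as write-combining and would also need a write barrier
to flush the CPU's combining buffers before telling the device to look.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CMD_PACKET_SIZE 32u  /* packet size taken from the text above */
#define RING_SLOTS      8u   /* hypothetical; must be a power of two */

/* One fixed-size command packet. The field layout is made up. */
struct cmd_packet {
    uint32_t opcode;
    uint32_t args[7];
};
_Static_assert(sizeof(struct cmd_packet) == CMD_PACKET_SIZE,
               "command packet must be exactly 32 bytes");

/* Plain memory standing in for the device's write-combining window. */
struct cmd_ring {
    struct cmd_packet slots[RING_SLOTS];
    uint32_t head;  /* free-running count of packets written by the CPU */
    uint32_t tail;  /* free-running count of packets consumed by the device */
};

static uint32_t ring_used(const struct cmd_ring *r)
{
    return r->head - r->tail;  /* unsigned wrap-around is intended */
}

/* CPU side: copy one whole packet into the next slot.
 * Returns 0 on success, -1 if the ring is full. */
static int ring_push(struct cmd_ring *r, const struct cmd_packet *p)
{
    assert(r != NULL && p != NULL);
    if (ring_used(r) == RING_SLOTS)
        return -1;
    /* The whole packet goes to consecutive addresses in one go, which is
     * what lets write-combining hardware merge it into a single burst. */
    memcpy(&r->slots[r->head & (RING_SLOTS - 1)], p, sizeof(*p));
    r->head++;
    return 0;
}

/* Simulated device side: take the oldest packet out of the ring.
 * Returns 0 on success, -1 if the ring is empty. */
static int ring_pop(struct cmd_ring *r, struct cmd_packet *out)
{
    assert(r != NULL && out != NULL);
    if (ring_used(r) == 0)
        return -1;
    memcpy(out, &r->slots[r->tail & (RING_SLOTS - 1)], sizeof(*out));
    r->tail++;
    return 0;
}
```

The point of the fixed packet size is that every write is sequential and
never overwrites a slot the device has not consumed yet, so the buffering is
"combining" only in the bursting sense, not the over-writing sense.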
There are other approaches too, of course, with the modern variation
tending to be "the device does all real accesses by reading over DMA,
and the only time you use IO accesses is for setup and as a 'start
your DMA transfers now' kind of interface". But write-combining MMIO
used to be a very common model for high-performance IO not that long
ago, because DMA didn't actually use to be all that efficient at all
(nasty behavior with caches and snooping etc back before the memory
controller was on-die and DMA accesses snooped caches directly). So
the "DMA is efficient even for smaller things" thing is relatively
recent.
Linus