Message-Id: <1304360823.19493.27.camel@sbsiddha-MOBL3.sc.intel.com>
Date: Mon, 02 May 2011 11:27:03 -0700
From: Suresh Siddha <suresh.b.siddha@...el.com>
To: Cyrill Gorcunov <gorcunov@...nvz.org>
Cc: Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [patch 1/2] x86, x2apic: minimize IPI register writes using cluster groups v4
On Mon, 2011-05-02 at 07:02 -0700, Cyrill Gorcunov wrote:
> On 05/02/2011 05:22 PM, Ingo Molnar wrote:
> >
> > * Cyrill Gorcunov <gorcunov@...nvz.org> wrote:
> >
> >> With this change, microbenchmark measuring the cost
> >> of flush_tlb_others(), with the flush tlb IPI being
> >> sent from a cpu in the socket-1 to all the logical
> >> cpus in socket-2 (on a Westmere-EX system that has
> >> 20 logical cpus in a socket) is 3x times better now
> >> (compared to the former 'send one-by-one' algorithm).
> >
> > What kind of microbenchmark was this, could the actual results and measurement
> > methods be shared as well?
>
> Suresh, could you please post the microbenchmark?
It is a simple kernel hack that measures the TSC cost of flush_tlb_others()
with and without this change. The 3x improvement was specifically for the
test condition where we called flush_tlb_others() on a logical cpu in
socket-1, which sent the flush tlb IPI to all the logical cpus in
another socket.
This was done on WSM-EX, which has 20 logical cpus per socket. The 20
logical cpus in that socket fall under two cluster groups, so it is 2
batches of grouped IPIs vs 20 serialized (at least on the sending side)
IPIs.
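
To make the 2-vs-20 arithmetic concrete, here is a minimal user-space sketch of the grouping idea. It is not the code from the patch. It relies only on the x2APIC cluster-mode layout of the logical ID (cluster ID in bits 31:16, a one-hot cpu bit in bits 15:0, so at most 16 cpus per cluster). The helper names logical_id() and build_cluster_dests() are hypothetical, and the contiguous x2APIC IDs used in the usage note are an assumption, not values read from a real WSM-EX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLUSTER_SHIFT 16
#define CPU_BITS_MASK 0xFFFFu

/*
 * Hypothetical helper: logical x2APIC ID derived from the x2APIC ID.
 * Cluster ID = x2APIC ID bits 19:4, cpu bit = 1 << (x2APIC ID bits 3:0).
 */
uint32_t logical_id(uint32_t x2apic_id)
{
    return ((x2apic_id >> 4) << CLUSTER_SHIFT) | (1u << (x2apic_id & 0xFu));
}

/*
 * Hypothetical helper: fold the logical IDs of all destination cpus into
 * one destination value per cluster. Each entry written to dests[] stands
 * for a single IPI register write that reaches every cpu whose bit is set
 * in that cluster. Returns the number of entries used.
 */
size_t build_cluster_dests(const uint32_t *ids, size_t n,
                           uint32_t *dests, size_t cap)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t cluster = ids[i] >> CLUSTER_SHIFT;
        size_t j;

        /* Look for an existing entry for this cluster. */
        for (j = 0; j < used; j++) {
            if ((dests[j] >> CLUSTER_SHIFT) == cluster)
                break;
        }

        /* First cpu seen in this cluster: open a new entry. */
        if (j == used) {
            assert(used < cap);
            dests[used++] = cluster << CLUSTER_SHIFT;
        }

        /* Add this cpu's bit to the cluster's destination mask. */
        dests[j] |= ids[i] & CPU_BITS_MASK;
    }

    return used;
}
```

With 20 destination cpus at assumed x2APIC IDs 32..51, this returns 2 entries: 0x0002FFFF (16 cpus in cluster 2) and 0x0003000F (4 cpus in cluster 3). The one-by-one algorithm would need 20 writes for the same set.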
thanks,
suresh