Message-Id: <200903182243.34090.nickpiggin@yahoo.com.au>
Date: Wed, 18 Mar 2009 22:43:33 +1100
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Mathieu Desnoyers <compudj@...stal.dyndns.org>
Cc: ltt-dev@...ts.casi.polymtl.ca, Ingo Molnar <mingo@...e.hu>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Josh Boyer <jwboyer@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [ltt-dev] cli/sti vs local_cmpxchg and local_add_return
On Wednesday 18 March 2009 02:14:37 Mathieu Desnoyers wrote:
> * Nick Piggin (nickpiggin@...oo.com.au) wrote:
> > On Tuesday 17 March 2009 12:32:20 Mathieu Desnoyers wrote:
> > > Hi,
> > >
> > > I am trying to get access to some non-x86 hardware to run some atomic
> > > primitive benchmarks for a paper on LTTng I am preparing. That should
> > > be useful to argue about performance benefit of per-cpu atomic
> > > operations vs interrupt disabling. I would like to run the following
> > > benchmark module on CONFIG_SMP :
> > >
> > > - PowerPC
> > > - MIPS
> > > - ia64
> > > - alpha
> > >
> > > usage :
> > > make
> > > insmod test-cmpxchg-nolock.ko
> > > insmod: error inserting 'test-cmpxchg-nolock.ko': -1 Resource
> > > temporarily unavailable
> > > dmesg (see dmesg output)
> > >
> > > If some of you would be kind enough to run my test module provided
> > > below and provide the results of these tests on a recent kernel
> > > (2.6.26~2.6.29 should be good) along with their cpuinfo, I would
> > > greatly appreciate it.
> > >
> > > Here are the CAS results for various Intel-based architectures :
> > >
> > > Architecture        | Speedup                     | CAS          | Interrupts
> > >                     | (cli + sti) / local cmpxchg | local | sync | Enable (sti) | Disable (cli)
> > > ----------------------------------------------------------------------------------------------
> > > Intel Pentium 4     | 5.24                        | 25    | 81   | 70           | 61
> > > AMD Athlon(tm)64 X2 | 4.57                        |  7    | 17   | 17           | 15
> > > Intel Core2         | 6.33                        |  6    | 30   | 20           | 18
> > > Intel Xeon E5405    | 5.25                        |  8    | 24   | 20           | 22
> > >
> > > The benefit expected on PowerPC, ia64 and alpha should principally come
> > > from removed memory barriers in the local primitives.
> >
> > Benefit versus what? I think all of those architectures can do SMP
> > atomic compare exchange sequences without barriers, can't they?
>
> Hi Nick,
>
> I want to compare if it is faster to use SMP cas without barriers to
> perform synchronization of the tracing hot path wrt interrupts or if it
> is faster to disable interrupts. These decisions will depend on the
> benchmark I propose, because it is comparing the time it takes to
> perform both.
>
> Overall, the benchmarks will make it possible to choose between these
> two simplified hotpath pseudo-codes (offset is global to the buffer,
> commit_count is per-subbuffer).
>
>
> * lockless :
>
> do {
>   old_offset = local_read(&offset);
>   get_cycles();
>   compute needed size.
>   new_offset = old_offset + size;
> } while (local_cmpxchg(&offset, old_offset, new_offset) != old_offset);
>
> /*
> * note : writing to buffer is done out-of-order wrt buffer slot
> * physical order.
> */
> write_to_buffer(offset);
>
> /*
> * Make sure the data is written in the buffer before commit count is
> * incremented.
> */
> smp_wmb();
>
> /* note : incrementing the commit count is also done out-of-order */
> count = local_add_return(size, &commit_count[subbuf_index]);
> if (count is filling a subbuffer)
>   allow to wake up readers

Ah OK, so you just mean the benefit of using local atomics is avoiding
the barriers that you get with atomic_t.

I'd thought you were referring to some benefit over the irq disable
pattern.

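For illustration only, here is a userspace sketch of the lockless reserve/commit pattern quoted above, written with C11 atomics rather than the kernel's local_t API. It is an analogy under stated assumptions, not the LTTng implementation: relaxed ordering stands in for the barrier-free local_cmpxchg(), release ordering on the commit stands in for smp_wmb(), and SUBBUF_SIZE, N_SUBBUF, reserve_slot() and commit_slot() are hypothetical names. Slots that straddle a sub-buffer boundary are ignored.

```c
#include <stdatomic.h>

/* Hypothetical geometry, chosen only to keep the sketch small. */
#define SUBBUF_SIZE 64UL
#define N_SUBBUF    4UL

static atomic_ulong offset;                  /* global to the buffer */
static atomic_ulong commit_count[N_SUBBUF];  /* one counter per sub-buffer */

/* Reserve `size` bytes and return the start offset of the reserved slot. */
static unsigned long reserve_slot(unsigned long size)
{
	unsigned long old_offset =
		atomic_load_explicit(&offset, memory_order_relaxed);
	unsigned long new_offset;

	do {
		new_offset = old_offset + size;
		/* On failure, old_offset is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak_explicit(&offset, &old_offset,
			new_offset, memory_order_relaxed, memory_order_relaxed));

	return old_offset;
}

/*
 * Account for `size` committed bytes in the sub-buffer holding slot_offset.
 * Returns nonzero when the commit count reaches a sub-buffer multiple,
 * i.e. the point where readers could be woken up.
 */
static int commit_slot(unsigned long slot_offset, unsigned long size)
{
	unsigned long idx = (slot_offset / SUBBUF_SIZE) % N_SUBBUF;
	/* Release ordering: earlier data writes are visible before the count. */
	unsigned long count = atomic_fetch_add_explicit(&commit_count[idx],
			size, memory_order_release) + size;

	return count % SUBBUF_SIZE == 0;
}
```

A writer would call reserve_slot(), copy its payload at the returned offset, then call commit_slot() with the same offset and size.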
> * irq off :
>
> (note : offset and commit count would each be written to atomically
> (type unsigned long))
>
> local_irq_save(flags);
>
> get_cycles();
> compute needed size;
> offset += size;
>
> write_to_buffer(offset);
>
> /*
> * Make sure the data is written in the buffer before commit count is
> * incremented.
> */
> smp_wmb();
>
> commit_count[subbuf_index] += size;
> if (commit_count[subbuf_index] is filling a subbuffer)
>   allow to wake up readers
>
> local_irq_restore(flags);
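
As a worked check of the numbers quoted earlier in this thread, the "Speedup" column is (cli + sti) / local cmpxchg. The small C sketch below recomputes it from the four rows Mathieu posted. The struct and function names are hypothetical and exist only for this illustration; the values are copied from the table as given, in whatever unit the benchmark module reports.

```c
#include <stdio.h>

/* One row of the quoted results table. */
struct cas_result {
	const char *arch;
	int local_cas;  /* local cmpxchg */
	int sync_cas;   /* synchronized (SMP) cmpxchg */
	int sti;        /* interrupt enable */
	int cli;        /* interrupt disable */
};

static const struct cas_result results[] = {
	{ "Intel Pentium 4",     25, 81, 70, 61 },
	{ "AMD Athlon(tm)64 X2",  7, 17, 17, 15 },
	{ "Intel Core2",          6, 30, 20, 18 },
	{ "Intel Xeon E5405",     8, 24, 20, 22 },
};

/* Speedup of local cmpxchg over an irq disable/enable pair. */
static double speedup(const struct cas_result *r)
{
	return (double)(r->cli + r->sti) / r->local_cas;
}

/* Print one line per architecture with the recomputed speedup. */
static void print_speedups(void)
{
	size_t i;

	for (i = 0; i < sizeof(results) / sizeof(results[0]); i++)
		printf("%-20s %.2f\n", results[i].arch, speedup(&results[i]));
}
```

For the Pentium 4 row this gives (61 + 70) / 25 = 5.24, matching the quoted column.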