Message-ID: <4B7D5BB4.4000307@zytor.com>
Date:	Thu, 18 Feb 2010 07:24:36 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Luca Barbieri <luca@...a-barbieri.com>
CC:	Andi Kleen <andi@...stfloor.org>, mingo@...e.hu,
	a.p.zijlstra@...llo.nl, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 09/10] x86-32: use SSE for atomic64_read/set if available

On 02/18/2010 02:27 AM, Luca Barbieri wrote:
>> CR changes are slow and synchronize the CPU. The latter is always slow.
>>
>> It sounds like you didn't time it?
> I didn't, because I think it strongly depends on the microarchitecture
> and I don't have a comprehensive set of machines to test on, so it
> would just be a single data point.
> 
> The lock prefix on cmpxchg8b is also serializing so it might be as bad.

No.  LOCK isn't serializing in the same way CRx writes are.

> Anyway, if we use this, we should keep TS cleared in kernel mode and
> lazily restore it on return to userspace.
> This would make clts/stts performance mostly moot.

This is what kernel_fpu_begin/kernel_fpu_end is all about.  We
definitely cannot leave TS cleared without the user space CPU state
moved to its home location, or we have yet another complicated state to
worry about.
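
To make the constraint concrete, here is a rough userspace toy model of the
TS handling described above.  It is only a sketch, not kernel code: every
name in it (struct cpu, the model_* helpers, the values 42 and 7) is
hypothetical, and a single int stands in for the whole FPU/SSE register
file.  It only illustrates why the user state has to be in its home
location before the kernel touches the registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names are hypothetical, none of this is kernel API. */
struct cpu {
    bool ts;              /* models CR0.TS: FPU use traps while set */
    int  fpu_regs;        /* stands in for the live FPU/SSE registers */
    bool user_state_live; /* registers currently hold user state */
    int  user_save_area;  /* "home location" of the user FPU state */
};

/* Models the begin side: move user state home, then clear TS (clts). */
static void model_kernel_fpu_begin(struct cpu *c)
{
    if (c->user_state_live) {
        c->user_save_area = c->fpu_regs;
        c->user_state_live = false;
    }
    c->ts = false;
}

/* Models the end side: set TS (stts) so the next user FPU use traps. */
static void model_kernel_fpu_end(struct cpu *c)
{
    c->ts = true;
}

/* Kernel SSE use clobbers the registers; only legal with TS clear. */
static void model_kernel_sse_use(struct cpu *c, int scratch)
{
    assert(!c->ts);
    c->fpu_regs = scratch;
}

/* User FPU access: if TS is set, the trap path reloads the saved state. */
static int model_user_fpu_read(struct cpu *c)
{
    if (c->ts) {
        c->fpu_regs = c->user_save_area;
        c->user_state_live = true;
        c->ts = false;
    }
    return c->fpu_regs;
}

/* Bracketed use: user state is saved first and lazily restored later. */
static int run_bracketed(void)
{
    struct cpu c = { .ts = false, .fpu_regs = 42, .user_state_live = true };
    model_kernel_fpu_begin(&c);
    model_kernel_sse_use(&c, 7);
    model_kernel_fpu_end(&c);
    return model_user_fpu_read(&c);
}

/* TS left clear with user state still in the registers: it is lost. */
static int run_unbracketed(void)
{
    struct cpu c = { .ts = false, .fpu_regs = 42, .user_state_live = true };
    model_kernel_sse_use(&c, 7);
    return model_user_fpu_read(&c);
}
```

In this model run_bracketed() hands the user back its own value (42), while
run_unbracketed() hands back the kernel's scratch value (7).  Keeping TS
clear in kernel mode would need an extra "state not yet saved" case on top
of this, which is the additional complicated state I mean.
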
I really feel that without a *strong* use case for this, there is
absolutely no point.

	-hpa
-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.