Message-ID: <4A9636EB.7020408@kernel.org>
Date:	Thu, 27 Aug 2009 16:34:03 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Jan Beulich <JBeulich@...ell.com>
CC:	"H. Peter Anvin" <hpa@...or.com>, mingo@...e.hu,
	tglx@...utronix.de, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: make use of inc/dec conditional

Hello, Jan.

Jan Beulich wrote:
>>>> "H. Peter Anvin" <hpa@...or.com> 19.08.09 18:48 >>>
>> On 08/19/2009 12:48 AM, Jan Beulich wrote:
>>> According to gcc's instruction selection, inc/dec can be used without
>>> penalty on most CPU models, but should be avoided on others. Hence we
>>> should have a config option controlling the use of inc/dec, and
>>> respective abstraction macros to avoid making the resulting code too
>>> ugly. There are a few instances of inc/dec that must be retained in
>>> assembly code, due to that code's dependency on the instruction not
>>> changing the carry flag.
>> One thing: I doubt it matters one measurable iota when it comes to
>> locked operations.
> 
> Okay, I think I agree to this point.
> 
>> Furthermore:
>>
>> -		     "decl %2	;\n"
>> +		     _ASM_DECL "%2	;\n"
>> 		     "jne 1b		;\n"
>> 		     "adcl $0, %0	;\n"
>>
>> It looks to me that the carry flag is live across the dec there.  The
> 
> Indeed, I overlooked that when going through and checking for the
> CF-is-live instances.
> 
>> other csum code look scary to me too.
>>
>> The rest of them look technically okay, but you're bloating them by two
>> bytes (one byte in 64-bit mode) for every instance.  You may want to
>> consider if any particular instance is more icache-critical than
>> stall-critical.  This is probably more of a concern for inlines than for
>> regular single-instance code like the string operations.
> 
> So the background really is that I wanted to introduce a percpu_inc()
> operation subsequently (here with the goal to reduce code size by one
> byte in a couple of places - initially just for inc_irq_stat(), didn't look
> for other potential users), but then realized that it wouldn't be nice
> to unconditionally introduce a possible stall here. Hence I went and
> first created said config option, and then also went through and
> identified the uses of inc/dec that could be replaced based on that
> config option.

Given that we're already sprinkling inc/dec's via atomic ops, I think
this part can proceed independently.  Also, if the only affected
machine is the hot p4, I don't think it would be worth any amount of
code.  :-)

For the percpu part, wouldn't it be better to use
__builtin_constant_p() on the add/sub parameter, use inc/dec if the
param is constant and 1, and make simple wrappers for inc/dec if still
necessary?
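
Something along these lines, as a rough userspace sketch only.  The
names counter_add/counter_inc/counter_dec and the n_* tallies are made
up for illustration; real percpu code would emit the instructions via
inline asm rather than plain C, and the -1 case is my assumption of how
dec would be handled symmetrically:

```c
#include <assert.h>

/* Hypothetical tallies of which branch was taken; illustration only. */
static int n_inc, n_dec, n_add;

/*
 * Pick the inc/dec form only when the compiler can prove that the
 * operand is the constant 1 or -1; otherwise fall back to a plain add.
 * __builtin_constant_p() is a GCC/clang builtin.
 */
#define counter_add(var, val)					\
do {								\
	if (__builtin_constant_p(val) && (val) == 1) {		\
		n_inc++;					\
		(var)++;					\
	} else if (__builtin_constant_p(val) && (val) == -1) {	\
		n_dec++;					\
		(var)--;					\
	} else {						\
		n_add++;					\
		(var) += (val);					\
	}							\
} while (0)

/* Simple wrappers, in case callers still want explicit inc/dec. */
#define counter_inc(var)	counter_add(var, 1)
#define counter_dec(var)	counter_add(var, -1)

/* step is volatile so that it is never treated as a constant. */
static int counter_demo(volatile int step)
{
	int c = 0;

	counter_inc(c);		/* literal 1: inc branch */
	counter_add(c, step);	/* non-constant: add branch */
	counter_dec(c);		/* literal -1: dec branch */
	return c;
}
```

With this, callers that pass a literal 1 get the short form without a
separate config option deciding it at every call site.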

Thanks.

-- 
tejun