lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <yw1xr3jmdjme.fsf@unicorn.mansr.com>
Date:	Thu, 19 Nov 2015 16:44:25 +0000
From:	Måns Rullgård <mans@...sr.com>
To:	Nicolas Pitre <nicolas.pitre@...aro.org>
Cc:	Alexey Brodkin <Alexey.Brodkin@...opsys.com>,
	Arnd Bergmann <arnd@...db.de>, rmk+kernel@....linux.org.uk,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] ARM: asm/div64.h: adjust to generic codde

Nicolas Pitre <nicolas.pitre@...aro.org> writes:

> On Thu, 19 Nov 2015, Måns Rullgård wrote:
>
>> Nicolas Pitre <nicolas.pitre@...aro.org> writes:
>> 
>> > +static inline uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
>> > +{
>> > +	unsigned long long res;
>> > +	unsigned int tmp = 0;
>> > +
>> > +	if (!bias) {
>> > +		asm (	"umull	%Q0, %R0, %Q1, %Q2\n\t"
>> > +			"mov	%Q0, #0"
>> > +			: "=&r" (res)
>> > +			: "r" (m), "r" (n)
>> > +			: "cc");
>> > +	} else if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
>> > +		res = m;
>> > +		asm (	"umlal	%Q0, %R0, %Q1, %Q2\n\t"
>> > +			"mov	%Q0, #0"
>> > +			: "+&r" (res)
>> > +			: "r" (m), "r" (n)
>> > +			: "cc");
>> > +	} else {
>> > +		asm (	"umull	%Q0, %R0, %Q2, %Q3\n\t"
>> > +			"cmn	%Q0, %Q2\n\t"
>> > +			"adcs	%R0, %R0, %R2\n\t"
>> > +			"adc	%Q0, %1, #0"
>> > +			: "=&r" (res), "+&r" (tmp)
>> > +			: "r" (m), "r" (n)
>> 
>> Why is tmp using a +r constraint here?  The register is not written, so
>> using an input-only operand could/should result in better code.  That is
>> also what the old code did.
>
> No, it is worse. gcc allocates two registers because, somehow, it 
> doesn't think that the first one still holds zero after the first usage.  
> This way usage of only one temporary register is forced throughout, 
> producing better code.

Makes sense.  Thanks for explaining.

-- 
Måns Rullgård
mans@...sr.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ