lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.20.1511191137350.22569@knanqh.ubzr>
Date:	Thu, 19 Nov 2015 11:42:45 -0500 (EST)
From:	Nicolas Pitre <nicolas.pitre@...aro.org>
To:	Måns Rullgård <mans@...sr.com>
cc:	Alexey Brodkin <Alexey.Brodkin@...opsys.com>,
	Arnd Bergmann <arnd@...db.de>, rmk+kernel@....linux.org.uk,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] ARM: asm/div64.h: adjust to generic codde

On Thu, 19 Nov 2015, Måns Rullgård wrote:

> Nicolas Pitre <nicolas.pitre@...aro.org> writes:
> 
> > +static inline uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
> > +{
> > +	unsigned long long res;
> > +	unsigned int tmp = 0;
> > +
> > +	if (!bias) {
> > +		asm (	"umull	%Q0, %R0, %Q1, %Q2\n\t"
> > +			"mov	%Q0, #0"
> > +			: "=&r" (res)
> > +			: "r" (m), "r" (n)
> > +			: "cc");
> > +	} else if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
> > +		res = m;
> > +		asm (	"umlal	%Q0, %R0, %Q1, %Q2\n\t"
> > +			"mov	%Q0, #0"
> > +			: "+&r" (res)
> > +			: "r" (m), "r" (n)
> > +			: "cc");
> > +	} else {
> > +		asm (	"umull	%Q0, %R0, %Q2, %Q3\n\t"
> > +			"cmn	%Q0, %Q2\n\t"
> > +			"adcs	%R0, %R0, %R2\n\t"
> > +			"adc	%Q0, %1, #0"
> > +			: "=&r" (res), "+&r" (tmp)
> > +			: "r" (m), "r" (n)
> 
> Why is tmp using a +r constraint here?  The register is not written, so
> using an input-only operand could/should result in better code.  That is
> also what the old code did.

No, it is worse. gcc allocates two registers because, somehow, it 
doesn't think that the first one still holds zero after the first usage.  
This way usage of only one temporary register is forced throughout, 
producing better code.

I meant to have this split out in a separate patch but messed it up 
somehow.



> 
> > +			: "cc");
> > +	}
> > +
> > +	if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
> > +		asm (	"umlal	%R0, %Q0, %R1, %Q2\n\t"
> > +			"umlal	%R0, %Q0, %Q1, %R2\n\t"
> > +			"mov	%R0, #0\n\t"
> > +			"umlal	%Q0, %R0, %R1, %R2"
> > +			: "+&r" (res)
> > +			: "r" (m), "r" (n)
> > +			: "cc");
> > +	} else {
> > +		asm (	"umlal	%R0, %Q0, %R2, %Q3\n\t"
> > +			"umlal	%R0, %1, %Q2, %R3\n\t"
> > +			"mov	%R0, #0\n\t"
> > +			"adds	%Q0, %1, %Q0\n\t"
> > +			"adc	%R0, %R0, #0\n\t"
> > +			"umlal	%Q0, %R0, %R2, %R3"
> > +			: "+&r" (res), "+&r" (tmp)
> > +			: "r" (m), "r" (n)
> > +			: "cc");
> > +	}
> > +
> > +	return res;
> > +}
> 
> -- 
> Måns Rullgård
> mans@...sr.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ