Message-ID: <alpine.LFD.2.20.1511161710270.5826@knanqh.ubzr>
Date:	Mon, 16 Nov 2015 20:20:38 -0500 (EST)
From:	Nicolas Pitre <nicolas.pitre@...aro.org>
To:	Arnd Bergmann <arnd@...db.de>
cc:	Russell King - ARM Linux <linux@....linux.org.uk>,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [GIT PULL] optimize 64-by-32 division for constant divisors on
 32-bit machines

Arnd,

Please pull the following branch:

	git://git.linaro.org/people/nicolas.pitre/linux div64

This contains the patches I initially posted here:

	https://lkml.org/lkml/2015/11/2/715

The only changes to the posted patches are cosmetic improvements, such as
the use of ilog2() in place of the custom __div64_ffs(). Exposure in
linux-next would be a good thing.

I also included fixes for a couple of do_div() misuses that an allyesconfig
build turned up after switching ARM to the generic do_div() code.
Those patches have been posted separately and addressed to the relevant
maintainers. They are included here until/unless those maintainers
include them in their trees.

Original cover letter:

This is a generalization of the optimization I produced for ARM a decade
ago to turn constant divisors into a multiplication by the divisor
reciprocal. Turns out that after all those years gcc is still not
optimizing things on its own for that case.

This has important performance benefits as discussed in this thread:

	https://lkml.org/lkml/2015/10/28/851

This series brings the formerly ARM-only optimization to all 32-bit
architectures, using C code by default.  An architecture may still
implement the actual multiplication in assembly in order to get
optimal code.  The ARM version can serve as an example for other
interested architectures.
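
For readers unfamiliar with the trick, here is a minimal user-space sketch
of the underlying arithmetic. It is not the code from this series: the
names sketch_ilog2() and sketch_div_by_reciprocal() are made up for
illustration, the reciprocal is computed at run time rather than folded
at compile time, it relies on the unsigned __int128 extension available
in gcc and clang on 64-bit hosts, and it uses a single correction step
instead of the bias an optimized implementation would precompute.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(x)) for x != 0; stands in for the kernel's ilog2(). */
static unsigned int sketch_ilog2(uint32_t x)
{
	unsigned int k = 0;

	while (x >>= 1)
		k++;
	return k;
}

/*
 * Divide n by b using a multiplication by the reciprocal of b.
 * Hypothetical helper: the reciprocal is computed here at run time,
 * which only demonstrates the arithmetic, not the performance gain.
 */
static uint64_t sketch_div_by_reciprocal(uint64_t n, uint32_t b, uint32_t *rem)
{
	unsigned int k;
	uint64_t m, q;

	assert(b != 0);
	k = sketch_ilog2(b);

	if ((b & (b - 1)) == 0) {
		/* Power of two: a plain shift, no reciprocal needed. */
		*rem = (uint32_t)(n & (b - 1));
		return n >> k;
	}

	/* m = floor(2^(64+k) / b); fits in 64 bits because b > 2^k. */
	m = (uint64_t)(((unsigned __int128)1 << (64 + k)) / b);

	/* q = floor(n * m / 2^(64+k)): the exact quotient or one less. */
	q = (uint64_t)(((unsigned __int128)n * m) >> (64 + k));

	/* Single correction step in place of a precomputed bias. */
	if (n - q * b >= b)
		q++;

	*rem = (uint32_t)(n - q * b);
	return q;
}
```

Because m is rounded down, the estimated quotient can never exceed the
true one, and it falls short by at most one, so one compare-and-increment
is enough to make the result exact.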
