Date:	Thu, 10 Mar 2011 16:48:03 +0200
From:	Phil Carmody <ext-phil.2.carmody@...ia.com>
To:	akpm@...ux-foundation.org
Cc:	gregkh@...e.de, linux-kernel@...r.kernel.org, sboyd@...eaurora.org
Subject: [PATCHv3 0/4] Improve fallback LPJ calculation


Apologies for picking on you, Andrew, and sending this out of the blue,
but I didn't have much luck with my previous attempt, and I quite like
this patchset, so thought it was worth trying again.
(http://lkml.org/lkml/2010/9/28/121)

The guts of this patchset are in patch 2/4. The motivation for that patch
is that currently our OMAP calibrates itself using the trial-and-error 
binary chop fallback that some other architectures no longer need to 
perform. This is a lengthy process, taking 0.2s in an environment where 
boot time is of great interest.

Patch 2/4 has two optimisations. Firstly, it replaces the initial repeated-
doubling to find the relevant power of 2 with a tight loop that just does
as much as it can in a jiffy. Secondly, rather than binary chopping over an
entire power-of-2 range, it chooses a much smaller range based on how much
it squeezed in, and failed to squeeze in, during the first stage. Both
are significant optimisations: they bring our calibration down from 23
jiffies to 5 and, in the process, often arrive at a more accurate lpj
value.
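
For readers who want to see the shape of the two stages, here is a rough
userspace sketch against a simulated clock. It is not the code from patch
2/4: the sim_* helpers, sketch_calibrate(), the 4096-loop chunk, the band
growth schedule and the 8-bit precision cut-off are all assumptions made
for illustration.

```c
/* Simulated clock (hypothetical): time is counted in delay-loop units. */
static unsigned long long sim_now;
static unsigned long sim_true_lpj;	/* loops per jiffy of the fake CPU */

static unsigned long sim_jiffies(void)
{
	return (unsigned long)(sim_now / sim_true_lpj);
}

static void sim_delay(unsigned long loops)
{
	sim_now += loops;
}

/* Advance to the start of the next simulated jiffy. */
static void sim_wait_for_tick(void)
{
	sim_now = (sim_now / sim_true_lpj + 1) * sim_true_lpj;
}

/* Illustrative two-stage estimate; all names here are assumptions. */
static unsigned long sketch_calibrate(unsigned long true_lpj)
{
	const unsigned long chunk = 1UL << 12;	/* assumed initial step */
	unsigned long done = 0, step = 0, lpj, add, limit, t;
	int band = 0, in_band = 0;

	sim_true_lpj = true_lpj;
	sim_now = 0;

	/* Stage 1: squeeze as many slowly growing steps as fit in one jiffy. */
	sim_wait_for_tick();
	t = sim_jiffies();
	do {
		if (++in_band == (1 << band)) {	/* band b lasts 2^b steps */
			++band;
			in_band = 0;
		}
		step = chunk * band;
		sim_delay(step);
		done += step;
	} while (sim_jiffies() == t);

	/* The last step overshot: the answer lies in (done - step, done]. */
	lpj = done - step;
	add = step;

	/* Stage 2: binary chop over that small range only. */
	limit = lpj >> 8;			/* assumed precision */
	while (add > limit) {
		add >>= 1;
		sim_wait_for_tick();
		t = sim_jiffies();
		sim_delay(lpj + add);
		if (sim_jiffies() == t)		/* still fits in one jiffy */
			lpj += add;
	}
	return lpj;
}
```

On this idealised clock, sketch_calibrate(1234567) returns a slight
underestimate, within 1% of the true figure, after a handful of chop
trials. Real hardware is less metronomic, which is what the bands are
meant to absorb.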

The 'bands' and 'sub-logarithmic' growth may look over-engineered, but 
they only cost a small level of inaccuracy in the initial guess (for all 
architectures) in order to avoid the very large inaccuracies that appeared
during testing (on x86_64 architectures, and presumably others with less 
metronomic operation). Note that due to the existence of the TSC and 
other timers, the x86_64 will not typically use this fallback routine, 
but I wanted to code defensively, able to cope with all kinds of processor 
behaviours and kernel command line options.

Patch 3/4 is an additional trap for the nightmare scenario where the
initial estimate is very inaccurate, possibly due to things like SMIs.
It simply retries with a larger bound.
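
One way such a retry could look is sketched below. This is a guess at the
idea rather than the code in patch 3/4: too_long() stands in for a real
timed delay, and chop_with_retry(), the "every trial was accepted" test
and the factor-of-4 widening are assumptions for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "did a delay of lpj loops outlast one jiffy?" */
static bool too_long(unsigned long lpj, unsigned long true_lpj)
{
	return lpj >= true_lpj;
}

/*
 * Binary chop upwards from 'base' over 'range'. If no trial was ever
 * rejected, assume the initial estimate was a gross underestimate and
 * retry from where we got to, with a wider range.
 */
static unsigned long chop_with_retry(unsigned long base, unsigned long range,
				     unsigned long true_lpj, int *retries)
{
	for (;;) {
		unsigned long lpj = base, add = range;
		unsigned long limit = base >> 8;	/* assumed precision */
		bool rejected = false;

		while (add > limit) {
			add >>= 1;
			if (too_long(lpj + add, true_lpj))
				rejected = true;
			else
				lpj += add;
		}
		if (rejected)
			return lpj;

		base = lpj;		/* keep the progress made so far */
		range <<= 2;		/* assumed widening factor */
		++*retries;
	}
}
```

Starting from a deliberately poor estimate (base 1000000, range 4096,
true value 1234567), the sketch needs a few retries before the range
finally brackets the true value, and then converges as usual.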

1/4 is simply cosmetic to prepare for 2/4. 
4/4 is simply to assist testing and not intended for integration.


Changes since initial RFC:
 - More informational commit messages
 - Inserted patch 3/4 after discovering that x86_64 had a failure case.


Thanks for your time,
Phil
