lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 11 Apr 2011 16:00:27 +0300
From:	Phil Carmody <ext-phil.2.carmody@...ia.com>
To:	ext Andrew Morton <akpm@...ux-foundation.org>
Cc:	Andrew Worsley <amworsley@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: Fix for critical bogoMIPS intermittent calculation failure

On 07/04/11 15:17 -0700, ext Andrew Morton wrote:
> On Thu, 10 Mar 2011 17:03:51 +1100
> Andrew Worsley <amworsley@...il.com> wrote:
> 
> > Please find attach a fix to the TSC  (Time Stamp Counter) based
> > bogoMIPS calculation used on secondary CPUs which has two faults:
> 
> Phil, you've had your nose in init/calibrate.c recently.  Would you
> have time to review Andrew's changes?  Thanks.

No problem. Nokia's trained me to be a bit pedantic, so I noticed a few
style issues or niggles.
 
> Andrew, the patch was mssing a signed-off-by:.  Please review the
> description of this in Docuemntations/SubmuittingPatches and then, if
> agreeable, send one along?
> 
> 
> 
> From: Andrew Worsley <amworsley@...il.com>
> 
> A fix to the TSC (Time Stamp Counter) based bogoMIPS calculation used on
> secondary CPUs which has two faults:
> 
> 1: Not handling wrapping of the lower 32 bits of the TSC counter on
>    32bit kernel - perhaps TSC is not reset by a warm reset?
> 
> 2: TSC and Jiffies are no incrementing together properly.  Either
>    jiffies increment too quickly or Time Stamp Counter isn't incremented
>    in during an SMI but the real time clock is and jiffies are
>    incremented.
> 
> Case 1 can result in a factor of 16 too large a value which makes udelay()
> values too small and can cause mysterious driver errors.  Case 2 appears
> to give smaller 10-15% errors after averaging but enough to cause
> occasional failures on my own board
> 
> I have tested this code on my own branch and attach patch suitable for
> current kernel code.  See below for examples of the failures and how the
> fix handles these situations now.
> 
> I reported this issue earlier here:
>      Intermittent problem with BogoMIPs calculation on Intel AP CPUs -
> http://marc.info/?l=linux-kernel&m=129947246316875&w=4
> 
> I suspect this issue has been seen by others but as it is intermittent and
> bogoMIPS for secondary CPUs are no longer printed out it might have been
> difficult to identify this as the cause.  Perhaps these unresolved issues,
> although quite old, might be relevant as possibly this fault has been
> around for a while.  In particular Case 1 may only be relevant to 32bit
> kernels on newer HW (most people run 64bit kernels?).  Case 2 is less
> dramatic since the earlier fix in this area and also intermittent.
> 
>    Re: bogomips discrepancy on Intel Core2 Quad CPU -
> http://marc.info/?l=linux-kernel&m=118929277524298&w=4
>    slow system and bogus bogomips  -
> http://marc.info/?l=linux-kernel&m=116791286716107&w=4
>    Re: Re: [RFC-PATCH] clocksource: update lpj if clocksource has -
> http://marc.info/?l=linux-kernel&m=128952775819467&w=4
> 
> This issue is masked a little by commit feae3203d711db0a ("timers, init:
> Limit the number of per cpu calibration bootup messages") which only
> prints out the first bogoMIPS value making it much harder to notice other
> values differing.  Perhaps it should be changed to only suppress them when
> they are similar values?
> 
> Here are some outputs showing faults occurring and the new code handling
> them properly.  See my earlier message for examples of the original
> failure.
> 
>     Case 1:   A Time Stamp Counter wrap:
> ...
> Calibrating delay loop (skipped), value calculated using timer
> frequency.. 6332.70 BogoMIPS (lpj=31663540)
> ....
> calibrate_delay_direct() timer_rate_max=31666493
> timer_rate_min=31666151 pre_start=4170369255 pre_end=4202035539
> calibrate_delay_direct() timer_rate_max=2425955274
> timer_rate_min=2425954941 pre_start=4265368533 pre_end=2396356387
> calibrate_delay_direct() ignoring timer_rate as we had a TSC wrap
> around start=4265368581 >=post_end=2396356511
> calibrate_delay_direct() timer_rate_max=31666274
> timer_rate_min=31665942 pre_start=2440373374 pre_end=2472039515
> calibrate_delay_direct() timer_rate_max=31666492
> timer_rate_min=31666160 pre_start=2535372139 pre_end=2567038422
> calibrate_delay_direct() timer_rate_max=31666455
> timer_rate_min=31666207 pre_start=2630371084 pre_end=2662037415
> Calibrating delay using timer specific routine.. 6333.28 BogoMIPS (lpj=31666428)
> Total of 2 processors activated (12665.99 BogoMIPS).
> ....
> 
>     Case 2:  Some thing (presumably the SMM interrupt?) causing the
> very low increase in TSC counter for the DELAY_CALIBRATION_TICKS
> increase in jiffies
> ...
> Calibrating delay loop (skipped), value calculated using timer
> frequency.. 6333.25 BogoMIPS (lpj=31666270)
> ...
> calibrate_delay_direct() timer_rate_max=31666483
> timer_rate_min=31666074 pre_start=4199536526 pre_end=4231202809
> calibrate_delay_direct() timer_rate_max=864348 timer_rate_min=864016
> pre_start=2405343672 pre_end=2406207897
> calibrate_delay_direct() timer_rate_max=31666483
> timer_rate_min=31666179 pre_start=2469540464 pre_end=2501206823
> calibrate_delay_direct() timer_rate_max=31666511
> timer_rate_min=31666122 pre_start=2564539400 pre_end=2596205712
> calibrate_delay_direct() timer_rate_max=31666084
> timer_rate_min=31665685 pre_start=2659538782 pre_end=2691204657
> calibrate_delay_direct() dropping min bogoMips estimate 1 = 864348
> Calibrating delay using timer specific routine.. 6333.27 BogoMIPS (lpj=31666390)
> Total of 2 processors activated (12666.53 BogoMIPS).
> ...
> 
> After 70 boots I saw 2 variations <1% slip through
> 
> Cc: Phil Carmody <ext-phil.2.carmody@...ia.com>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> ---
> 
>  init/calibrate.c |   70 +++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 65 insertions(+), 5 deletions(-)
> 
> diff -puN init/calibrate.c~init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure init/calibrate.c
> --- a/init/calibrate.c~init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure
> +++ a/init/calibrate.c
> @@ -38,6 +38,9 @@ static unsigned long __cpuinit calibrate
>  	unsigned long timer_rate_min, timer_rate_max;
>  	unsigned long good_timer_sum = 0;
>  	unsigned long good_timer_count = 0;
> +	unsigned long measured_times[MAX_DIRECT_CALIBRATION_RETRIES];
> +	int max = -1; /* index of measured_times with max/min values or not set */
> +	int min = -1;
>  	int i;
>  
>  	if (read_current_timer(&pre_start) < 0 )
> @@ -90,17 +93,74 @@ static unsigned long __cpuinit calibrate
>  		 * If the upper limit and lower limit of the timer_rate is
>  		 * >= 12.5% apart, redo calibration.
>  		 */
> -		if (pre_start != 0 && pre_end != 0 &&
> +		printk(KERN_DEBUG
> +"calibrate_delay_direct() timer_rate_max=%lu timer_rate_min=%lu pre_start=%lu pre_end=%lu\n",
> +			  timer_rate_max, timer_rate_min, pre_start, pre_end);
> +		if (start >= post_end)
> +			printk(KERN_NOTICE
> +				"calibrate_delay_direct() ignoring timer_rate as we had a TSC wrap around start=%lu >=post_end=%lu\n",
> +				start, post_end);
> +		if (start < post_end && pre_start != 0 && pre_end != 0 &&

This could be split off into a separate patch, as it is independent of the rest of the patch.

>  		    (timer_rate_max - timer_rate_min) < (timer_rate_max >> 3)) {
>  			good_timer_count++;
>  			good_timer_sum += timer_rate_max;
> -		}
> +			measured_times[i] = timer_rate_max;
> +			if (max < 0 || timer_rate_max > measured_times[max])
> +				max = i;
> +			if (min < 0 || timer_rate_max < measured_times[min])
> +				min = i;
> +		} else
> +			measured_times[i] = 0;

Inconsistent use of {}.

> +
>  	}
>  
> -	if (good_timer_count)
> -		return (good_timer_sum/good_timer_count);
> +	/*
> +	 * Find the maximum & minimum - if they differ too much throw out the one with
> +	 * the largest difference from the mean and try again...
> +	 */
> +	while (good_timer_count > 1) {
> +		unsigned long estimate;
> +		unsigned long maxdiff;
> +
> +		/* compute the estimate */
> +		estimate = (good_timer_sum/good_timer_count);

Unnecessary (), no spaces around / operator.

> +		maxdiff = estimate >> 3;
> +
> +		/* if range is within 12% let's take it */
> +		if ((measured_times[max] - measured_times[min]) < maxdiff)
> +			return (estimate);

Unnecessary ().

> +
> +		/* ok - drop the worse value and try again... */
> +		good_timer_sum = 0;
> +		good_timer_count = 0;
> +		if ((measured_times[max] - estimate) < (estimate - measured_times[min])) {
> +			printk(KERN_NOTICE
> +	"calibrate_delay_direct() dropping min bogoMips estimate %d = %lu\n",
> +				min, measured_times[min]);
> +			measured_times[min] = 0;
> +			min = max;
> +		} else {
> +			printk(KERN_NOTICE
> +	"calibrate_delay_direct() dropping max bogoMips estimate %d = %lu\n",

You could squash those two strings into one with a %s, and trust the compiler to 
do the right thing. However, apart from those, there are no functional problems.
I currently do not have a system upon which I can test the patch, alas.

Reviewed-by: Phil Carmody <ext-phil.2.carmody@...ia.com>

Phil

> +				max, measured_times[max]);
> +			measured_times[max] = 0;
> +			max = min;
> +		}
> +
> +		for (i = 0; i < MAX_DIRECT_CALIBRATION_RETRIES; i++) {
> +			if (measured_times[i] == 0)
> +				continue;
> +			good_timer_count++;
> +			good_timer_sum += measured_times[i];
> +			if (measured_times[i] < measured_times[min])
> +				min = i;
> +			if (measured_times[i] > measured_times[max])
> +				max = i;
> +		}
> +
> +	}
>  
> -	printk(KERN_WARNING "calibrate_delay_direct() failed to get a good "
> +	printk(KERN_NOTICE "calibrate_delay_direct() failed to get a good "
>  	       "estimate for loops_per_jiffy.\nProbably due to long platform interrupts. Consider using \"lpj=\" boot option.\n");
>  	return 0;
>  }
> _
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ