Date:	Mon, 12 Nov 2012 15:27:19 -0800
From:	John Stultz <johnstul@...ibm.com>
To:	Prarit Bhargava <prarit@...hat.com>
CC:	paulmck@...ux.vnet.ibm.com,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Marcelo Tosatti <mtosatti@...hat.com>
Subject: Re: RCU NOHZ, tsc, and clock_gettime

On 10/12/2012 08:40 AM, Prarit Bhargava wrote:
>
>>> One possibility is that if the cpu we're doing our timekeeping
>>> accumulation on is different than the one running the test, we might
>>> go into deeper idle for longer periods of time. Then when we
>>> accumulate time, we have more than a single tick to accumulate and
>>> that might require holding the timekeeper/xtime lock for longer
>>> times.
>>>
>>> And the max 2.9ns variance seems particularly low, given that we do
>>> call update_vsyscall every so often, and that should block
>>> clock_gettime() callers while we update the vsyscall data.  Could it
>>> be that the test is too short to see the locking effect, so you're
>>> just getting lucky, and that adding nohz is jostling the regularity
>>> of the execution so you then see the lock wait times?  If you
>>> increase the samples and sample loops by 1000 does that change the
>>> behavior?
> That's a possibility, although I suspect that this has more to do with not
> executing the RCU NOHZ code given that we don't see a problem with the
> clock_gettime() vs clock_gettime() test.  I wonder if not executing the RCU NOHZ
> code somehow introduces a "regularity" with execution that results in the CPU
> always being in C0/polling when the test is run?
Hey Prarit,
     Just back from being on leave, and wanted to check in on this. Did
you ever get to run with an increased sample size to see how that
affected things?  It's exactly your point that the non-NOHZ case could
align the execution of a short run in a way that you always see good
results, whereas with NOHZ the alignment might not be the same, so you
see periodic delays from timer interrupts, etc.
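
A minimal sketch of such a longer sampling run, assuming a simple
back-to-back clock_gettime() loop. The original test program is not
shown in this thread, so the helper name measure_gettime_deltas(), the
delta_stats struct, and the choice of CLOCK_MONOTONIC are all
hypothetical and only illustrate the idea of collecting min/max/average
gaps over many more samples:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical summary of gaps between consecutive calls, in ns. */
struct delta_stats {
	int64_t min_ns;
	int64_t max_ns;
	double  avg_ns;
	long    samples;
};

static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Call clock_gettime() back to back `samples` times and record the
 * smallest, largest, and average gap between consecutive readings.
 * Returns 0 on success, -1 on bad arguments or a clock_gettime() error.
 */
static int measure_gettime_deltas(long samples, struct delta_stats *out)
{
	struct timespec prev, cur;
	int64_t sum = 0;

	if (samples <= 0 || !out)
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC, &prev))
		return -1;

	out->min_ns = INT64_MAX;
	out->max_ns = INT64_MIN;

	for (long i = 0; i < samples; i++) {
		int64_t d;

		if (clock_gettime(CLOCK_MONOTONIC, &cur))
			return -1;
		d = ts_to_ns(&cur) - ts_to_ns(&prev);
		if (d < out->min_ns)
			out->min_ns = d;
		if (d > out->max_ns)
			out->max_ns = d;
		sum += d;
		prev = cur;
	}

	out->samples = samples;
	out->avg_ns = (double)sum / (double)samples;
	return 0;
}
```

Running this with a sample count 1000x larger than the short run, with
and without NOHZ, would show whether max_ns picks up occasional long
gaps that a short, well-aligned run never happens to hit.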

Anyway, let me know if this got resolved or not.

thanks
-john
