Date:	Tue, 21 Jan 2014 17:28:37 -0500
From:	Sasha Levin <sasha.levin@...cle.com>
To:	Peter Zijlstra <peterz@...radead.org>,
	Arjan van de Ven <arjan@...ux.intel.com>, lenb@...nel.org,
	rjw@...ysocki.net, Eliezer Tamir <eliezer.tamir@...ux.intel.com>,
	rui.zhang@...el.com, jacob.jun.pan@...ux.intel.com,
	Mike Galbraith <bitbucket@...ine.de>,
	Ingo Molnar <mingo@...nel.org>, hpa@...or.com,
	paulmck@...ux.vnet.ibm.com, Thomas Gleixner <tglx@...utronix.de>,
	John Stultz <john.stultz@...aro.org>,
	Andy Lutomirski <luto@...capital.net>
CC:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 13/15] sched: Use a static_key for sched_clock_stable

On 12/12/2013 09:08 AM, Peter Zijlstra wrote:
> In order to avoid the runtime condition and variable load turn
> sched_clock_stable into a static_key.
>
> Also provide a shorter implementation of local_clock() and
> cpu_clock(int) when sched_clock_stable==1.
>
>                          MAINLINE   PRE       POST
>
>      sched_clock_stable: 1          1         1
>      (cold) sched_clock: 329841     221876    215295
>      (cold) local_clock: 301773     234692    220773
>      (warm) sched_clock: 38375      25602     25659
>      (warm) local_clock: 100371     33265     27242
>      (warm) rdtsc:       27340      24214     24208
>      sched_clock_stable: 0          0         0
>      (cold) sched_clock: 382634     235941    237019
>      (cold) local_clock: 396890     297017    294819
>      (warm) sched_clock: 38194      25233     25609
>      (warm) local_clock: 143452     71234     71232
>      (warm) rdtsc:       27345      24245     24243
>
> Signed-off-by: Peter Zijlstra<peterz@...radead.org>

Hi Peter,

This patch seems to cause a problem when booting a KVM guest: the time jumps to a
seemingly random value during early boot:

	[    0.000000] Initmem setup node 30 [mem 0x12ee000000-0x138dffffff]
	[    0.000000]   NODE_DATA [mem 0xcfa42000-0xcfa72fff]
	[    0.000000]     NODE_DATA(30) on node 1
	[    0.000000] Initmem setup node 31 [mem 0x138e000000-0x142fffffff]
	[    0.000000]   NODE_DATA [mem 0xcfa11000-0xcfa41fff]
	[    0.000000]     NODE_DATA(31) on node 1
	[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
	[    0.000000] kvm-clock: cpu 0, msr 0:cf991001, boot clock
	[133538.294040] Zone ranges:
	[133538.294338]   DMA      [mem 0x00001000-0x00ffffff]
	[133538.294804]   DMA32    [mem 0x01000000-0xffffffff]
	[133538.295223]   Normal   [mem 0x100000000-0x142fffffff]
	[133538.295670] Movable zone start for each node

Looking at the code, I initially thought that the problem was with:

	+void set_sched_clock_stable(void)
	+{
	+       if (!sched_clock_stable())
	+               static_key_slow_dec(&__sched_clock_stable);
	+}
	+
	+void clear_sched_clock_stable(void)
	+{
	+       /* XXX worry about clock continuity */
	+       if (sched_clock_stable())
	+               static_key_slow_inc(&__sched_clock_stable);
	+}

I think the jump label inc/dec is reversed here. We would want to inc it when enabling
and dec when disabling, no?
However, swapping the two didn't help; I was still seeing the same odd behaviour.

I then tried converting it back to a plain variable, as it was before, which looks like this:

diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index 6bd6a67..a035932 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -76,26 +76,21 @@ EXPORT_SYMBOL_GPL(sched_clock);
  __read_mostly int sched_clock_running;

  #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
-static struct static_key __sched_clock_stable = STATIC_KEY_INIT;
+static int __sched_clock_stable;

  int sched_clock_stable(void)
  {
-       if (static_key_false(&__sched_clock_stable))
-               return false;
-       return true;
+       return __sched_clock_stable;
  }

  void set_sched_clock_stable(void)
  {
-       if (!sched_clock_stable())
-               static_key_slow_dec(&__sched_clock_stable);
+       __sched_clock_stable = 1;
  }

  static void __clear_sched_clock_stable(struct work_struct *work)
  {
-       /* XXX worry about clock continuity */
-       if (sched_clock_stable())
-               static_key_slow_inc(&__sched_clock_stable);
+       __sched_clock_stable = 0;
  }

  static DECLARE_WORK(sched_clock_work, __clear_sched_clock_stable);
@@ -340,7 +335,7 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event);
   */
  u64 cpu_clock(int cpu)
  {
-       if (static_key_false(&__sched_clock_stable))
+       if (!sched_clock_stable())
                 return sched_clock_cpu(cpu);

         return sched_clock();
@@ -355,7 +350,7 @@ u64 cpu_clock(int cpu)
   */
  u64 local_clock(void)
  {
-       if (static_key_false(&__sched_clock_stable))
+       if (!sched_clock_stable())
                 return sched_clock_cpu(raw_smp_processor_id());

         return sched_clock();


This has corrected the issue:

	[    0.000000] Initmem setup node 31 [mem 0x138e000000-0x142fffffff]
	[    0.000000]   NODE_DATA [mem 0xcfa11000-0xcfa41fff]
	[    0.000000]     NODE_DATA(31) on node 1
	[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
	[    0.000000] kvm-clock: cpu 0, msr 0:cf991001, boot clock
	[    0.000000] Zone ranges:
	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
	[    0.000000]   Normal   [mem 0x100000000-0x142fffffff]
	[    0.000000] Movable zone start for each node
	[    0.000000] Early memory node ranges
		[ timing is correct for the rest of the boot]

At this point I suspected a problem with jump labels being used this early (?) and
tried building with CONFIG_JUMP_LABEL=n, but this didn't solve the issue.

This makes me think there's something else related to jump labels that we're missing, as the
no-jump-label config should be very similar to the patch I did above. I just can't figure
out what it is.


Thanks,
Sasha
