lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Tue, 19 Mar 2013 18:03:06 +0100
From:	Jiri Bohac <jbohac@...e.cz>
To:	Thomas Gleixner <tglx@...utronix.de>,
	Dimitri Sivanich <sivanich@....com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH][RFC] specific do_timer_cpu value for nohz off mode

Hi,

following up on a very old thread:
http://thread.gmane.org/gmane.linux.kernel/1212777

On Thu, Feb 16, 2012 at 08:59:00AM -0600, Dimitri Sivanich wrote:
> On Wed, Feb 15, 2012 at 09:36:47PM +0100, Thomas Gleixner wrote:
> > On Wed, 15 Feb 2012, Dimitri Sivanich wrote:
> > > On Wed, Feb 15, 2012 at 03:52:06PM +0100, Thomas Gleixner wrote:
> > > > So the first CPU which registers a clock event device takes it. That's
> > > > the boot CPU, no matter what.
> > > >
> > > Both kernel tracing and the original patch that I proposed for this
> > > showed plainly (at the time) that the tick_do_timer_cpu was not always cpu 0
> > > prior to modifying it for nohz=off.  Maybe that is no longer the case?
> > 
> > This logic has not been changed in years.
> 
> I did some tracing of all points where tick_do_timer_cpu is set in the
> 3.3.0-rc3+ kernel.
> 
> > 
> > tick_do_timer_cpu is initialized to TICK_DO_TIMER_BOOT and the first
> > cpu which registers either a global or a per cpu clock event device
> > takes it over. This is at least on x86 always the boot cpu, i.e. cpu0.
> > After that point nothing touches that variable when nohz is disabled
> > (runtime or compile time).
> 
> At that point it is set to cpu 0.  However, when we go into highres mode
> it does change.  Below are the two places it was set during boot with
> nohz=off on one of our x86 based machines.
> 
> [    0.000000] tick_setup_device: tick_do_timer_cpu 0
> [    1.924098] tick_broadcast_setup_oneshot: tick_do_timer_cpu 17
> 
> So in this example it's now cpu 17, and it stays that way from that point on.
> 
> A traceback at that point shows tick_init_highres is indeed initiating this:
> 
> [    1.924863]  [<ffffffff81087e01>] tick_broadcast_setup_oneshot+0x71/0x160
> [    1.924863]  [<ffffffff81087f23>] tick_broadcast_switch_to_oneshot+0x33/0x50
> [    1.924863]  [<ffffffff81088841>] tick_switch_to_oneshot+0x81/0xd0
> [    1.924863]  [<ffffffff810888a0>] tick_init_highres+0x10/0x20
> [    1.924863]  [<ffffffff81061e71>] hrtimer_run_pending+0x71/0xd0
> 
> > 
> > So I really want to see proper proof why that would not be the
> > case. If it really happens then we fix the root cause instead of
> > adding random sysfs interfaces.

As Dimitri wrote above, the switch from cpu 0 is done by
tick_broadcast_setup_oneshot. The first CPU switching to highres
takes the broadcast responsibility and also sets
tick_do_timer_cpu to itself.

This behaviour has been introduced by 7300711e
(clockevents: broadcast fixup possible waiters).

I don't see a good reason assign tick_do_timer_cpu to the CPU
doing the one-shot timer broadcasts. The timer interrupt will be
generated on any other CPU as well, be it through the broadcast
IPI or a per-CPU clockevent device. Any online CPU can do that
job, so how about just dropping the assignment?

The do_timer() code should not suffer from the jitter introduced
by the interrupt being generated by the broadcast, should it?

Signed-off-by: Jiri Bohac <jbohac@...e.cz>

--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -572,9 +572,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
 
 		bc->event_handler = tick_handle_oneshot_broadcast;
 
-		/* Take the do_timer update */
-		tick_do_timer_cpu = cpu;
-
 		/*
 		 * We must be careful here. There might be other CPUs
 		 * waiting for periodic broadcast. We need to set the



-- 
Jiri Bohac <jbohac@...e.cz>
SUSE Labs, SUSE CZ

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ