linux-kernel - Re: runqueue locks in schedule()


Message-Id: <1203799242.6242.108.camel@lappy>
Date:	Sat, 23 Feb 2008 21:40:42 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	stephane eranian <eranian@...glemail.com>
Cc:	linux-kernel@...r.kernel.org, ia64 <linux-ia64@...r.kernel.org>,
	Stephane Eranian <eranian@...il.com>,
	Corey J Ashford <cjashfor@...ibm.com>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: runqueue locks in schedule()


On Sat, 2008-02-23 at 15:50 +0100, stephane eranian wrote:
> Peter,
> 
> >  On Wed, 2008-01-16 at 16:29 -0800, stephane eranian wrote:
> >  > Hello,
> >  >
> >  > As suggested by people on this list, I have changed perfmon2 to use
> >  > the high resolution timers as the interface to allow timeout-based
> >  > event set multiplexing. This works around the problems I had with
> >  > tickless-enabled kernels.
> >  >
> >  > Multiplexing is supported in per-thread as well. In that case, the
> >  > timeout measures virtual time. When the thread is context switched
> >  > out, we need to save the remainder of the timeout and cancel the
> >  > timer. When the thread is context switched in, we need to reinstall
> >  > the timer. These timer save/restore operations have to be done in the
> >  > switch_to() code near the end of schedule().
> >  >
> >  > There are situations where hrtimer_start() may end up trying to
> >  > acquire the runqueue lock. This happens on a context switch where the
> >  > current thread is blocking (not preempted) and the new timeout happens
> >  > to be either in the past or just expiring. We've run into such
> >  > situations with simple tests.
> >  >
> >  > On all architectures but IA-64, it seems that the runqueue lock is
> >  > held until the end of schedule(). On IA-64, the lock is released
> >  > BEFORE switch_to() for some reason I don't quite remember. That may
> >  > not even be needed anymore.
> >  >
> >  > The early unlocking is controlled by a macro named
> >  > __ARCH_WANT_UNLOCKED_CTXSW. Defining this macro on X86 (or PPC) fixed
> >  > our problem.
> >  >
> >  > It is not clear to me why the runqueue lock needs to be held up until
> >  > the end of schedule() on some platforms and not on others. Note that
> >  > releasing the lock earlier does not necessarily introduce more
> >  > overhead because the lock is never re-acquired later in the schedule()
> >  > function.
> >  >
> >  > Question:
> >  >    - is it safe to release the lock before switch_to() on all architectures?
> >
> >  I had a similar problem when using hrtimers from the scheduler, so I
> >  extended the HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timer type to run with
> >  cpu_base->lock unlocked.
> >
> I am running into an issue when enabling this flag. Basically, the timer
> never fires when it gets into the situation where, in hrtimer_start(), the
> timer ends up being the next one to fire. In this mode,
> hrtimer_enqueue_reprogram() becomes a NOP, but then nobody ever inserts the
> timer into any queue. There is a comment that says "caller site takes care
> of this". Could you elaborate on this?

That would mean the timer already expired by the time you get to program
it.

The way to handle these is:

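for (;;) {
	/* done once the timer is actually queued */
	if (hrtimer_active(timer))
		break;

	/* not queued: push the expiry past 'now' in whole periods, retry */
	now = hrtimer_cb_get_time(timer);
	hrtimer_forward(timer, now, period);
	hrtimer_start(timer, timer->expires, HRTIMER_MODE_ABS);
}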

You could use the return value from hrtimer_forward() to determine how
many events you missed if that is needed. The timer function needs a
similar loop if it wants to use HRTIMER_RESTART.
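
To make the arithmetic behind that return value concrete, here is a minimal
user-space sketch of the re-arm loop. It is not kernel code: struct
model_timer, model_forward(), model_start() and model_rearm() are
hypothetical names invented for illustration, times are plain nanosecond
integers, and model_start() simply assumes that a start request whose expiry
is not in the future leaves the timer inactive, which is the situation
discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a periodic timer; not the kernel struct. */
struct model_timer {
	int64_t expires;	/* absolute expiry time, in ns */
	bool active;		/* true once the timer is queued */
};

/*
 * Advance the expiry past 'now' in whole periods and return how many
 * periods were skipped (0 if the expiry is still in the future).
 * 'period' must be positive.
 */
static uint64_t model_forward(struct model_timer *t, int64_t now, int64_t period)
{
	int64_t delta = now - t->expires;
	uint64_t overruns;

	if (delta < 0)
		return 0;

	overruns = (uint64_t)(delta / period) + 1;
	t->expires += (int64_t)overruns * period;
	return overruns;
}

/* Assumption: a start request that is not in the future is not queued. */
static void model_start(struct model_timer *t, int64_t expires, int64_t now)
{
	t->expires = expires;
	t->active = expires > now;
}

/* Re-arm until the timer is queued; return the number of missed periods. */
static uint64_t model_rearm(struct model_timer *t, int64_t now, int64_t period)
{
	uint64_t missed = 0;

	while (!t->active) {
		missed += model_forward(t, now, period);
		model_start(t, t->expires, now);
	}
	return missed;
}
```

With a fixed 'now' the loop in this model finishes after one pass. In the
kernel the clock keeps advancing between the forward and the start, which is
why the real code re-reads the time and retries.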

Single-shot timers can handle it the way kernel/hrtimer.c:do_nanosleep() does:

  hrtimer_start(timer, ...);
  if (!hrtimer_active(timer))
	/* handle the missed expiration */
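
The single-shot check can be sketched the same way in user space. Again this
is only an illustrative model under stated assumptions, not kernel code:
struct oneshot and the oneshot_*() helpers are hypothetical names, and the
start helper assumes that an expiry which is not in the future leaves the
timer inactive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a single-shot timer. */
struct oneshot {
	int64_t expires;	/* absolute expiry time, in ns */
	bool active;		/* true once the timer is queued */
	int fired;		/* how many times the expiry work has run */
};

/* Assumption: a start request that is not in the future is not queued. */
static void oneshot_start(struct oneshot *t, int64_t expires, int64_t now)
{
	t->expires = expires;
	t->active = expires > now;
}

/* Stand-in for whatever work the expiry is supposed to trigger. */
static void oneshot_expired(struct oneshot *t)
{
	t->fired++;
}

/*
 * Start the timer, then check whether it was really queued. If it was
 * not, the expiry was missed and is handled inline by the caller.
 * Returns true if the timer is queued, false if handled inline.
 */
static bool oneshot_arm(struct oneshot *t, int64_t expires, int64_t now)
{
	oneshot_start(t, expires, now);
	if (!t->active) {
		oneshot_expired(t);
		return false;
	}
	return true;
}
```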


