linux-kernel - Re: runqueue locks in schedule()

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <7c86c4470802230650l1605db93o3c9b38ec52bcba89@mail.gmail.com>
Date:	Sat, 23 Feb 2008 15:50:33 +0100
From:	"stephane eranian" <eranian@...glemail.com>
To:	"Peter Zijlstra" <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, ia64 <linux-ia64@...r.kernel.org>,
	"Stephane Eranian" <eranian@...il.com>,
	"Corey J Ashford" <cjashfor@...ibm.com>,
	"Ingo Molnar" <mingo@...e.hu>
Subject: Re: runqueue locks in schedule()

Peter,

>  On Wed, 2008-01-16 at 16:29 -0800, stephane eranian wrote:
>  > Hello,
>  >
>  > As suggested by people on this list, I have changed perfmon2 to use
>  > the high resolution timers as the interface to allow timeout-based
>  > event set multiplexing. This works around the problems I had with
>  > tickless-enabled kernels.
>  >
>  > Multiplexing is supported in per-thread as well. In that case, the
>  > timeout measures virtual time. When the thread is context switched
>  > out, we need to save the remainder of the timeout and cancel the
>  > timer. When the thread is context switched in, we need to reinstall
>  > the timer. These timer save/restore operations have to be done in the
>  > switch_to() code near the end of schedule().
>  >
>  > There are situations where hrtimer_start() may end up trying to
>  > acquire the runqueue lock. This happens on a context switch where the
>  > current thread is blocking (not preempted) and the new timeout happens
>  > to be either in the past or just expiring. We've run into such
>  > situations with simple tests.
>  >
>  > On all architectures, but IA-64, it seems thet the runqueue lock is
>  > held until the end of schedule(). On IA-64, the lock is released
>  > BEFORE switch_to() for some reason I don't quite remember. That may
>  > not even be needed anymore.
>  >
>  > The early unlocking is controlled by a macro named
>  > __ARCH_WANT_UNLOCKED_CTXSW. Defining this macros on X86 (or PPC) fixed
>  > our problem.
>  >
>  > It is not clear to me why the runqueue lock needs to be held up until
>  > the end of schedule() on some platforms and not on others. Not that
>  > releasing the lock earlier does not necessarily introduce more
>  > overhead because the lock is never re-acquired later in the schedule()
>  > function.
>  >
>  > Question:
>  >    - is it safe to release the lock before switch_to() on all architectures?
>
>  I had similar problem when using hrtimers from the scheduler, I extended
>  the HRTIMER_CB_IRQSAFE_NO_SOFTIRQ time type to run with cpu_base->lock
>  unlocked.
>
I am running into an issue when enabling this flag. Basically, the
timer never fires
when it gets into this situation where in hrtimer_start() the timer
ends up being the
next one to fire. In this mode,  hrtimer_enqueue_reprogram() become a NOP. But
then nobody never inserts the time into any queue. There is a comment that
says "caller site takes care of this". Could you elaborate on this?


Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/