Date:	Tue, 24 Feb 2015 09:19:06 -0800
From:	Jörn Engel <joern@...estorage.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Steven Rostedt <srostedt@...hat.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	Gregory Haskins <ghaskins@...ell.com>,
	linux-kernel@...r.kernel.org
Subject: Re: RFC: revert 43fa5460fe60

On Tue, Feb 24, 2015 at 10:33:44AM -0500, Steven Rostedt wrote:
> On Tue, 24 Feb 2015 09:55:09 -0500
> Jörn Engel <joern@...estorage.com> wrote:
> 
> > I came across a silly problem that tempted me to revert 43fa5460fe60.
> > We had a high-priority realtime thread woken, TIF_NEED_RESCHED was set
> > for the running thread and the realtime thread didn't run for >2s.
> > Problem was a system call that mapped a ton of device memory and never
> > hit a cond_resched() point.  Obvious solution is to fix the long-running
> > system call.
> 
> I'm assuming that you are running a non PREEMPT kernel
> (PREEMPT_VOLUNTARY doesn't count). If that's the case, then it is very
> likely that you will hit long latencies regardless of this change, and
> you will continue to need to play whack-a-mole here.

Correct.
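
For illustration, the kind of mole fix this turns into looks roughly
like the following -- a hypothetical long-running mapping loop, not the
actual code path we hit:

#include <linux/mm.h>		/* remap_pfn_range() */
#include <linux/sched.h>	/* cond_resched() */

static int map_device_pages(struct vm_area_struct *vma, unsigned long addr,
			    unsigned long pfn, unsigned long nr_pages,
			    pgprot_t prot)
{
	unsigned long i;
	int ret;

	for (i = 0; i < nr_pages; i++) {
		ret = remap_pfn_range(vma, addr + i * PAGE_SIZE,
				      pfn + i, PAGE_SIZE, prot);
		if (ret)
			return ret;
		/*
		 * Without this, TIF_NEED_RESCHED is never acted upon
		 * until the syscall returns on a non-PREEMPT kernel.
		 */
		cond_resched();
	}
	return 0;
}

One missing cond_resched() in a loop like this is enough to keep the
woken realtime thread off the cpu for seconds.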

> > Applying that solution quickly turns into a game of whack-a-mole.  Not
> > the worst game in the world and all those moles surely deserve a solid
> > hit on the head.  But what is annoying in my case is that I had plenty
> > of idle cpus during the entire time and the high-priority thread was
> > allowed to run anywhere.  So if the thread had been moved to a different
> > runqueue immediately there would have been zero delay.  Sure, the cache
> > is semi-cold or the migration may even be cross-package.  That is a
> > tradeoff we are willing to make and we set the cpumask explicitly that
> > way.  We want this thread to run quickly, anywhere.
> > 
> > So we could check for currently idle cpus when waking realtime threads.
> > If there are any, immediately move the woken thread over.  Maybe have a
> > check whether the running thread on the current cpu is in a syscall and
> > retain current behaviour if not.
> > 
> > Now this is not quite the same as reverting 43fa5460fe60 and I would
> > like to verify the idea before I spend time on a patch you would never
> > consider merging anyway.
> 
> The thing is, the patch you want to revert helped a lot for
> CONFIG_PREEMPT kernels. It doesn't make sense to gut a change for a non
> CONFIG_PREEMPT kernel as that already suffers huge latencies.
> 
> This assumes that you are indeed not running a CONFIG_PREEMPT
> kernel, since you stated that it doesn't preempt until it hits a
> cond_resched().

Well, reverting was my first instinct, but I now think it is wrong, for
different reasons.  Simply reverting can result in the high-priority
thread moving from one cpu with a running process to a different cpu
with a running process.  In both cases you may trip over a mole, so not
much is gained.

But if you know that the destination cpu is idle, you can avoid the
moles entirely, give or take a small race window.  The moles are still
present and you still need some debug tool to detect and fix them over
time.  But as cpu counts grow, your chances of getting lucky in spite
of bad kernel code also increase.
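
Roughly what I have in mind, as a completely untested sketch (the
helper name is made up; the hook would be somewhere around the rt
wakeup/select_task_rq_rt() path):

#include <linux/sched.h>	/* idle_cpu(), tsk_cpus_allowed() */
#include <linux/cpumask.h>	/* for_each_cpu(), cpu_online() */

/*
 * Pick any online, currently idle cpu the task is allowed to run on,
 * or -1 if there is none.
 */
static int find_idle_allowed_cpu(struct task_struct *p)
{
	int cpu;

	for_each_cpu(cpu, tsk_cpus_allowed(p)) {
		if (cpu_online(cpu) && idle_cpu(cpu))
			return cpu;
	}
	return -1;
}

If that returns a cpu, the woken realtime thread would be placed there
immediately instead of waiting for the current cpu to reach a
scheduling point; otherwise we keep the current behaviour.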

Is that a worthwhile approach, at least for non PREEMPT?

Jörn

--
Schrödinger's cat is <BLINK>not</BLINK> dead.
-- Illiad
