Message-ID: <20150224103344.5f92a507@gandalf.local.home>
Date:	Tue, 24 Feb 2015 10:33:44 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Jörn Engel <joern@...estorage.com>
Cc:	Steven Rostedt <srostedt@...hat.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	Gregory Haskins <ghaskins@...ell.com>,
	linux-kernel@...r.kernel.org
Subject: Re: RFC: revert 43fa5460fe60

On Tue, 24 Feb 2015 09:55:09 -0500
Jörn Engel <joern@...estorage.com> wrote:

> Hello Steven!
> 
> I came across a silly problem that tempted me to revert 43fa5460fe60.
> We had a high-priority realtime thread woken, TIF_NEED_RESCHED was set
> for the running thread and the realtime thread didn't run for >2s.
> Problem was a system call that mapped a ton of device memory and never
> hit a cond_resched() point.  Obvious solution is to fix the long-running
> system call.

I'm assuming that you are running a non-PREEMPT kernel
(PREEMPT_VOLUNTARY doesn't count). If that's the case, then you are very
likely to hit long latencies, with or without this change (you will
still need to play whack-a-mole here).

> 
> Applying that solution quickly turns into a game of whack-a-mole.  Not
> the worst game in the world and all those moles surely deserve a solid
> hit on the head.  But what is annoying in my case is that I had plenty
> of idle cpus during the entire time and the high-priority thread was
> allowed to run anywhere.  So if the thread had been moved to a different
> runqueue immediately there would have been zero delay.  Sure, the cache
> is semi-cold or the migration may even be cross-package.  That is a
> tradeoff we are willing to make and we set the cpumask explicitly that
> way.  We want this thread to run quickly, anywhere.
> 
> So we could check for currently idle cpus when waking realtime threads.
> If there are any, immediately move the woken thread over.  Maybe have a
> check whether the running thread on the current cpu is in a syscall and
> retain current behaviour if not.
> 
> Now this is not quite the same as reverting 43fa5460fe60 and I would
> like to verify the idea before I spend time on a patch you would never
> consider merging anyway.
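
A minimal userspace sketch of the placement rule proposed in the quoted
text, for illustration only. It is a toy model, not scheduler code: the
names struct cpu_state and pick_wake_cpu() are hypothetical and do not
exist in the kernel, and a real implementation would have to deal with
runqueue locking and with idle state changing underneath it. The rule
modelled is: if the task running on the waking CPU is in a syscall and
an allowed CPU is idle, place the woken realtime task there; otherwise
keep the current behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot; not a kernel structure. */
struct cpu_state {
    bool idle;            /* nothing is running on this CPU */
    bool curr_in_syscall; /* the current task is inside a system call */
};

/*
 * Choose a CPU for a freshly woken realtime task.
 * allowed[i] tells whether the task's cpumask permits CPU i.
 * Returns the index of the chosen CPU.
 */
int pick_wake_cpu(const struct cpu_state *cpus, const bool *allowed,
                  size_t ncpus, size_t this_cpu)
{
    assert(cpus && allowed && this_cpu < ncpus);

    /* Retain current behaviour unless the running task is in a syscall. */
    if (!cpus[this_cpu].curr_in_syscall)
        return (int)this_cpu;

    /* Otherwise take the first idle CPU that the cpumask allows. */
    for (size_t i = 0; i < ncpus; i++) {
        if (i != this_cpu && allowed[i] && cpus[i].idle)
            return (int)i;
    }

    /* No idle CPU available: fall back to the waking CPU. */
    return (int)this_cpu;
}
```

Taking the first idle CPU ignores cache and package topology, which
matches the tradeoff described in the quoted text (run quickly,
anywhere) but is otherwise an arbitrary choice for this sketch.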

The thing is, the patch you want to revert helped a lot for
CONFIG_PREEMPT kernels. It doesn't make sense to gut that change for the
sake of a non-CONFIG_PREEMPT kernel, as that already suffers huge
latencies.

This assumes that you are indeed not running a CONFIG_PREEMPT kernel,
since you stated that it doesn't preempt until it hits a
cond_resched().

-- Steve

