Message-ID: <1355234577.17101.288.camel@gandalf.local.home>
Date:	Tue, 11 Dec 2012 09:02:57 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	frank.rowand@...sony.com,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	linux-rt-users <linux-rt-users@...r.kernel.org>,
	Carsten Emde <C.Emde@...dl.org>,
	John Kacur <jkacur@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Clark Williams <clark.williams@...il.com>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [RFC][PATCH RT 3/4] sched/rt: Use IPI to trigger RT task push
 migration instead of pulling

On Tue, 2012-12-11 at 13:43 +0100, Thomas Gleixner wrote:
> On Mon, 10 Dec 2012, Steven Rostedt wrote:
> > On Mon, 2012-12-10 at 17:15 -0800, Frank Rowand wrote:
> > 
> > > I should have also mentioned some previous experience using IPIs to
> > > avoid runq lock contention on wake up.  Someone encountered IPI
> > > storms when using the TTWU_QUEUE feature, thus it defaults to off
> > > for CONFIG_PREEMPT_RT_FULL:
> > > 
> > >   #ifndef CONFIG_PREEMPT_RT_FULL
> > >   /*
> > >    * Queue remote wakeups on the target CPU and process them
> > >    * using the scheduler IPI. Reduces rq->lock contention/bounces.
> > >    */
> > >   SCHED_FEAT(TTWU_QUEUE, true)
> > >   #else
> > >   SCHED_FEAT(TTWU_QUEUE, false)
> > >   #endif
> > > 
> > 
> > Interesting, but I'm wondering whether this also sends an IPI for every
> > wakeup. If you have 1000 tasks waking up on another CPU, this could
> > potentially send out 1000 IPIs. The number of IPIs here looks to be the
> > number of tasks waking up, and perhaps more than that, as there could be
> > multiple instances that try to wake up the same task.
> 
> Not using the TTWU_QUEUE feature limits the IPIs to a single one,
> which is only sent if the newly woken task preempts the current task
> on the remote cpu and the NEED_RESCHED flag was not yet set.
>  
> With TTWU_QUEUE you can induce massive latencies just by starting
> hackbench. You get a herd wakeup on CPU0 which then enqueues hundreds
> of tasks to the remote pull list and sends IPIs. The remote CPUs pull
> the tasks and activate them on their runqueues in hard interrupt
> context. That can easily accumulate to hundreds of microseconds when
> you do a mass push of newly woken tasks.
> 
> Of course it avoids fiddling with the remote rq lock, but it becomes
> massively non-deterministic.

Agreed. I never suggested using TTWU_QUEUE. I was just stating the
difference between that and my patches.
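
Just to make that failure mode concrete, here is a rough user-space sketch
of the TTWU_QUEUE idea: wakeups get queued on a per-CPU list and the whole
list is drained when the "scheduler IPI" arrives. This is only an analogy
(pthreads stand in for CPUs, a condvar stands in for the IPI), and the
names and counts are made up; it is not the actual kernel code path:

/*
 * User-space analogy of queued remote wakeups. Illustration only:
 * pthread = CPU, condvar signal = scheduler IPI, made-up numbers.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct task {
	int id;
	struct task *next;
};

/* Per-"CPU" wake list, protected by a mutex instead of an rq lock. */
static struct task *wake_list;
static pthread_mutex_t wake_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake_ipi = PTHREAD_COND_INITIALIZER;
static int done;

/* The "remote CPU": drains everything queued for it in one go. */
static void *remote_cpu(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&wake_lock);
		while (!wake_list && !done)
			pthread_cond_wait(&wake_ipi, &wake_lock);
		struct task *list = wake_list;
		wake_list = NULL;
		int stop = done;
		pthread_mutex_unlock(&wake_lock);

		/*
		 * "Activation" happens here, back to back for every queued
		 * entry, which is where the latency accumulates on a herd
		 * wakeup.
		 */
		int n = 0;
		while (list) {
			struct task *t = list;
			list = t->next;
			n++;
			free(t);
		}
		if (n)
			printf("drained %d queued wakeups in one burst\n", n);
		if (stop)
			return NULL;
	}
}

int main(void)
{
	pthread_t cpu;
	pthread_create(&cpu, NULL, remote_cpu, NULL);

	/* The "waking CPU": queues a burst of wakeups, then signals once. */
	pthread_mutex_lock(&wake_lock);
	for (int i = 0; i < 500; i++) {
		struct task *t = malloc(sizeof(*t));
		t->id = i;
		t->next = wake_list;
		wake_list = t;
	}
	done = 1;
	pthread_cond_signal(&wake_ipi);
	pthread_mutex_unlock(&wake_lock);

	pthread_join(cpu, NULL);
	return 0;
}

The point is the drain: a herd wakeup queues hundreds of entries, and they
all get processed back to back in the receiver's context, which is where
the latency Thomas describes comes from.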

> 
> > Now with this patch set, the number of IPIs is limited to the number of
> > CPUs. If you have 4 CPUs, you'll get a storm of 3 IPIs. That's a big
> > difference.
> 
> Yeah, the big difference is that you offload the double lock to the
> IPI. So in the worst case you interrupt the most latency-sensitive
> task running on the remote CPU. Not sure if I really like that
> "feature".
>  

First, the pulled CPU isn't necessarily running the most latency-sensitive
task. It just happens to be running more than one RT task, and the waiting
RT task can migrate. The running task may be of the same priority as the
waiting task. They may even both be the lowest-priority RT tasks in the
system, with some other CPU having just gone idle.

Currently, what we have is huge contention on the pulled CPU's rq lock.
We've measured latencies of over 500us due to it. This hurts even the
CPU that has the overloaded task, since the contention is on its lock.
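
To make the trade-off concrete, here is a rough user-space analogy of the
two schemes: in the current pull path every CPU that can take the waiting
task grabs the pulled CPU's rq lock, while with the IPI approach the other
CPUs only set a flag (the "IPI") and the overloaded CPU pushes under its
own lock. Again, pthreads stand in for CPUs, a mutex for the rq lock, an
atomic flag for the IPI; the names are made up and this is not the kernel
code path:

/*
 * User-space analogy of pull-with-remote-lock vs. IPI-triggered push.
 * Illustration only: made-up names, not the scheduler code.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NR_OTHER_CPUS 3

/* "rq lock" of the CPU that has the extra, migratable RT task. */
static pthread_mutex_t overloaded_rq_lock = PTHREAD_MUTEX_INITIALIZER;
static int waiting_rt_tasks = NR_OTHER_CPUS;	/* pretend queue depth */
static atomic_int push_requested;		/* stands in for the IPI */

/* Current scheme: each CPU that drops below the task's prio tries to pull. */
static void *puller_cpu(void *arg)
{
	long id = (long)arg;

	/* Every puller contends on the same remote "rq lock". */
	pthread_mutex_lock(&overloaded_rq_lock);
	if (waiting_rt_tasks > 0) {
		waiting_rt_tasks--;
		printf("cpu%ld pulled a task (remote lock taken)\n", id);
	}
	pthread_mutex_unlock(&overloaded_rq_lock);
	return NULL;
}

/* Patch scheme: would-be pullers only send an "IPI"; no remote lock. */
static void *ipi_sender_cpu(void *arg)
{
	(void)arg;
	atomic_store(&push_requested, 1);
	return NULL;
}

static void overloaded_cpu_check_ipi(void)
{
	if (atomic_exchange(&push_requested, 0)) {
		/* The push happens under our own lock, taken locally. */
		pthread_mutex_lock(&overloaded_rq_lock);
		if (waiting_rt_tasks > 0) {
			waiting_rt_tasks--;
			printf("overloaded cpu pushed a task after the IPI\n");
		}
		pthread_mutex_unlock(&overloaded_rq_lock);
	}
}

int main(void)
{
	pthread_t t[NR_OTHER_CPUS];

	/* Pull model: everyone grabs the overloaded CPU's lock. */
	for (long i = 0; i < NR_OTHER_CPUS; i++)
		pthread_create(&t[i], NULL, puller_cpu, (void *)i);
	for (int i = 0; i < NR_OTHER_CPUS; i++)
		pthread_join(t[i], NULL);

	/* Push model: one flag set, the owner does the work locally. */
	waiting_rt_tasks = 1;
	pthread_t s;
	pthread_create(&s, NULL, ipi_sender_cpu, NULL);
	pthread_join(s, NULL);
	overloaded_cpu_check_ipi();

	return 0;
}

The contention moves off the overloaded CPU's rq lock; the cost, as Thomas
points out, is that the overloaded CPU gets interrupted to do the push
itself.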

-- Steve


