lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 27 Nov 2012 08:59:03 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Viresh Kumar <viresh.kumar@...aro.org>
Cc:	pjt@...gle.com, paul.mckenney@...aro.org, tglx@...utronix.de,
	tj@...nel.org, suresh.b.siddha@...el.com, venki@...gle.com,
	mingo@...hat.com, peterz@...radead.org, Arvind.Chauhan@....com,
	linaro-dev@...ts.linaro.org, patches@...aro.org,
	pdsw-power-team@....com, linux-kernel@...r.kernel.org,
	linux-rt-users@...r.kernel.org
Subject: Re: [PATCH V2 Resend 3/4] workqueue: Schedule work on non-idle cpu
 instead of current one

On Tue, 2012-11-27 at 19:18 +0530, Viresh Kumar wrote:
> On 27 November 2012 18:56, Steven Rostedt <rostedt@...dmis.org> wrote:
> > A couple of things. The sched_select_cpu() is not cheap. It has a double
> > loop of domains/cpus looking for a non idle cpu. If we have 1024 CPUs,
> > and we are CPU 1023 and all other CPUs happen to be idle, we could be
> > searching 1023 CPUs before we come up with our own.
> 
> Not sure if you missed the first check sched_select_cpu()
> 
> +int sched_select_cpu(unsigned int sd_flags)
> +{
> +       /* If Current cpu isn't idle, don't migrate anything */
> +       if (!idle_cpu(cpu))
> +               return cpu;
> 
> We aren't going to search if we aren't idle.

OK, we are idle, but CPU 1022 isn't. We still need a large search. But,
heh we are idle we can spin. But then why go through this in the first
place ;-)


> 
> > Also, I really don't like this as a default behavior. It seems that this
> > solution is for a very special case, and this can become very intrusive
> > for the normal case.
> 
> We tried with an KCONFIG option for it, which Tejun rejected.

Yeah, I saw that. I don't like adding KCONFIG options either. Best is to
get something working that doesn't add any regressions. If you can get
this to work without making *any* regressions in the normal case than
I'm totally fine with that. But if this adds any issues with the normal
case, then it's a show stopper.

> 
> > To be honest, I'm uncomfortable with this approach. It seems to be
> > fighting a symptom and not the disease. I'd rather find a way to keep
> > work from being queued on wrong CPU. If it is a timer, find a way to
> > move the timer. If it is something else, lets work to fix that. Doing
> > searches of possibly all CPUs (unlikely, but it is there), just seems
> > wrong to me.
> 
> As Vincent pointed out, on big LITTLE systems we just don't want to
> serve works on big cores. That would be wasting too much of power.
> Specially if we are going to wake up big cores.
> 
> It would be difficult to control the source driver (which queues work) to
> little cores. We thought, if somebody wanted to queue work on current
> cpu then they must use queue_work_on().

As Tejun has mentioned earlier, is there any assumptions anywhere that
expects an unbounded work queue to not migrate? Where per cpu variables
might be used. Tejun had a good idea of forcing this to migrate the work
*every* time. To not let a work queue run on the same CPU that it was
queued on. If it can survive that, then it is probably OK. Maybe add a
config option that forces this? That way, anyone can test that this
isn't an issue.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ