linux-kernel - Re: [PATCH V2 Resend 3/4] workqueue: Schedule work on non-idle cpu instead of current one

Message-ID: <CAKohpo=hhScCDUnDcqnZx4YzU_8nF1658jmSM_wzsOpSCxXwqA@mail.gmail.com>
Date:	Mon, 7 Jan 2013 15:28:33 +0530
From:	Viresh Kumar <viresh.kumar@...aro.org>
To:	rostedt@...dmis.org, Tejun Heo <tj@...nel.org>
Cc:	Vincent Guittot <vincent.guittot@...aro.org>, pjt@...gle.com,
	paul.mckenney@...aro.org, tglx@...utronix.de,
	suresh.b.siddha@...el.com, venki@...gle.com, mingo@...hat.com,
	peterz@...radead.org, Arvind.Chauhan@....com,
	linaro-dev@...ts.linaro.org, patches@...aro.org,
	pdsw-power-team@....com, linux-kernel@...r.kernel.org,
	linux-rt-users@...r.kernel.org
Subject: Re: [PATCH V2 Resend 3/4] workqueue: Schedule work on non-idle cpu
 instead of current one

Hi Tejun,

On 4 January 2013 20:39, Tejun Heo <tj@...nel.org> wrote:
> I don't know either.  Changing behavior subtly like this is hard.  I
> usually try to spot some problem cases and try to identify patterns
> there.  Once you identify a few of them, understanding and detecting
> other problem cases get a lot easier.  In this case, maybe there are
> too many places to audit and the problems are too subtle, and, if we
> *have* to do it, the only thing we can do is implementing a debug
> option to make such problems more visible - say, always schedule to a
> different cpu on queue_work().
>
> That said, at this point, the patchset doesn't seem all that
> convincing to me and if I'm comprehending responses from others
> correctly that seems to be the consensus.  It might be a better
> approach to identify the specific offending workqueue users and make
> them handle the situation more intelligently than trying to impose the
> behavior on all workqueue users.  At any rate, we need way more data
> showing this actually helps and if so why.

I understand your concerns and, believe me, I feel the same :)
I had another idea that I wanted to share.

First, the motivation behind this patchset.

Some others at Linaro and I are working on ARM's upcoming cores:
big.LITTLE systems. These have a few very powerful but power-hungry
cores (big, currently A15s) and a few very efficient ones (LITTLE,
currently A7s).

The ultimate goal is to save as much power as possible without
compromising much on performance. For that, we want most of the work to
run on the LITTLE cores and only the performance-demanding work on the
big cores. Several efforts are under way in this direction. We thought
the A15s (big cores) shouldn't be used for running small tasks like
timers/workqueues, and this patch is an attempt towards that goal.

On top of that, we can load-balance works across the cpus that are
already awake, so that they get done quickly. Also, we shouldn't start
using an idle cpu just for processing work :)

I have another idea that we can try:

queue_work_on_any_cpu().

With this we would not break any existing code, and we could try to
migrate old users to this new infrastructure (at least the ones which
re-arm works from their work handlers). What do you say?

To take care of the cache locality issue, we can pass an argument to
this routine that provides either
- the mask of cpus to schedule this work on
  OR
- Sched Level (SD_LEVEL) of cpus to run it on.
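
To make the cpumask variant concrete, here is a rough userspace sketch
of the selection step only. Everything in it is an assumption made for
illustration: pick_work_cpu(), idle_mask, cpu_is_idle() and the plain
32-bit mask are hypothetical stand-ins, not existing workqueue or
scheduler interfaces, and the fallback policy is just one possible
choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8

/* Hypothetical stand-in for scheduler state: bit n set means cpu n is idle. */
static uint32_t idle_mask;

static bool cpu_is_idle(int cpu)
{
	return (idle_mask >> cpu) & 1u;
}

/*
 * Hypothetical cpu selection for a queue_work_on_any_cpu()-style call:
 *  1. keep the work on this_cpu if it is in the allowed mask and not idle;
 *  2. otherwise use the lowest-numbered allowed cpu that is not idle;
 *  3. if every allowed cpu is idle, fall back to this_cpu rather than
 *     waking another cpu (an assumed policy, mirroring local queueing).
 */
static int pick_work_cpu(int this_cpu, uint32_t allowed)
{
	assert(this_cpu >= 0 && this_cpu < NR_CPUS);

	if (((allowed >> this_cpu) & 1u) && !cpu_is_idle(this_cpu))
		return this_cpu;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (((allowed >> cpu) & 1u) && !cpu_is_idle(cpu))
			return cpu;

	return this_cpu;
}
```

The local cpu is tried first so that a caller that is already running
on a busy, allowed cpu keeps its cache locality; the mask only comes
into play when the local cpu is idle or excluded.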

Waiting for your view on it :)

--
viresh