linux-kernel - Re: [PATCH] workqueue: Restore cpus_allowed mask for sleeping workqueue rescue threads

Message-ID: <20110915161430.GE1548@tusker>
Date:	Thu, 15 Sep 2011 17:14:30 +0100
From:	Ripduman Sohan <Ripduman.Sohan@...cam.ac.uk>
To:	Tejun Heo <tj@...nel.org>
Cc:	linux-kernel@...r.kernel.org, peterz@...radead.org
Subject: Re: [PATCH] workqueue: Restore cpus_allowed mask for sleeping
 workqueue rescue threads

Tejun Heo <tj@...nel.org> wrote:

> Hello,
> 
> On Thu, Sep 01, 2011 at 02:36:33PM +0100, Ripduman Sohan wrote:
> > Rescuer threads may be migrated (and are bound) to particular CPUs when
> > active.  However, the allowed_cpus mask is not restored when they return
> > to sleep rendering inconsistent the presented and actual set of CPUs the
> > process may potentially run on.  This patch fixes this oversight by
> > recording the allowed_cpus mask for rescuer threads when it enters the
> > rescuer_thread() main loop and restoring it every time the thread sleeps.
> 
> Hmmm... so, currently, rescuer is left bound to the last cpu it worked
> on.  Why is this a problem?
> 
> Thanks.
> 
> -- 
> tejun

Hi,

The rescuer being left bound to the last CPU it was active on is not a
problem.  As I pointed out in the commit log, the issue is that the
cpus_allowed mask is not restored when rescuers return to sleep, so the
set of CPUs presented for the process no longer matches the set it may
actually run on.

Perhaps an explanation is in order.  I am working on a system where we
constantly sample process run-state (including the process
Cpus_allowed field in /proc/<pid>/status) to build a forward plan of
where the process _may_ run in the future.  In situations of high
memory pressure (common on our setup), where the rescuers ran often, the
plan began to deviate significantly from the calculated schedule
because rescuer threads were marked as only runnable on a single CPU
when in reality they would bounce across CPUs.
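
For illustration only, the sampling step described above might look
roughly like the following minimal Python sketch.  It is not the code
used on the system in question; the helper names parse_cpu_mask() and
read_cpus_allowed() are hypothetical, and it assumes the Cpus_allowed
line holds a hexadecimal mask, optionally split into comma-separated
32-bit groups, as procfs prints it.

```python
def parse_cpu_mask(mask):
    """Turn a hex CPU mask such as "ff" or "00000001,00000000"
    into the set of CPU ids whose bits are set (hypothetical helper)."""
    # The groups are most-significant first, so dropping the commas
    # yields one large hexadecimal number.
    value = int(mask.replace(",", ""), 16)
    return {cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1}


def read_cpus_allowed(pid):
    """Return the CPUs a task reports as allowed in /proc/<pid>/status
    (hypothetical helper; Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            # "Cpus_allowed_list:" is a different field and will not
            # match because of the trailing colon in the prefix.
            if line.startswith("Cpus_allowed:"):
                return parse_cpu_mask(line.split(":", 1)[1].strip())
    raise KeyError("no Cpus_allowed line found")
```

Calling read_cpus_allowed() on a rescuer thread that has gone back to
sleep would, with the behaviour described in this thread, report only
the last CPU it was bound to, which is the mismatch at issue.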

I've currently put a special-case exception in our code to account
for the fact that rescuer threads may run on _any_ CPU regardless of
the current cpus_allowed mask, but I thought it would be useful to
correct the mask itself.  I'm happy to continue with my current
approach if you deem the patch irrelevant.

Kind regards,

--rip