linux-kernel - Re: [PATCH 5/6] Makes procs file writable to move all threads by tgid at once

Message-ID: <2f86c2480907241353l63818dfehb20c9d4918a3f069@mail.gmail.com>
Date:	Fri, 24 Jul 2009 13:53:53 -0700
From:	Benjamin Blum <bblum@...gle.com>
To:	Paul Menage <menage@...gle.com>
Cc:	Matt Helsley <matthltc@...ibm.com>, linux-kernel@...r.kernel.org,
	containers@...ts.linux-foundation.org, akpm@...ux-foundation.org,
	serue@...ibm.com, lizf@...fujitsu.com
Subject: Re: [PATCH 5/6] Makes procs file writable to move all threads by tgid 
	at once

On Fri, Jul 24, 2009 at 1:47 PM, Paul Menage<menage@...gle.com> wrote:
> On Fri, Jul 24, 2009 at 10:23 AM, Matt Helsley<matthltc@...ibm.com> wrote:
>>
>> Well, I imagine holding tasklist_lock is worse than cgroup_mutex in some
>> ways since it's used even more widely. Makes sense not to use it here..
>
> Just to clarify - the new "procs" code doesn't use cgroup_mutex for
> its critical section, it uses a new cgroup_fork_mutex, which is only
> taken for write during cgroup_proc_attach() (after all setup has been
> done, to ensure that no new threads are created while we're updating
> all the existing threads). So in general there'll be zero contention
> on this lock - the cost will be the cache misses due to the rwlock
> bouncing between the different CPUs that are taking it in read mode.

Right. The different options so far are:

Global rwsem: Needs only one lock, but blocks all forking while a
write is in progress. The write should be quick enough if it is just
"iterate down the threadgroup list in O(n)". In the uncontended case,
fork() slows down by one cache miss when taking the lock in read mode.

Threadgroup-local rwsem: Needs a new field in task_struct. Only
forks within the same threadgroup would block on a write to the procs
file, and the zero-contention case is the same as before.

Using tasklist_lock: Currently, the call to cgroup_fork() (which
starts the race) is far above where tasklist_lock is taken in
fork, so taking tasklist_lock earlier is not feasible. Could
cgroup_fork() be moved down to inside it, and if so, how much
restructuring would be needed? Even then, this adds work that
is done (unnecessarily) while holding a global lock.

> What happened to the big-reader lock concept from 2.4.x? That would be
> applicable here - minimizing the overhead on the critical path when
> the write operation is expected to be very rare.

Seems like a good application, but it appears to be gone from the
current kernel. Also, as I understand it, it would have to be a
global (or at least not threadgroup-local) lock, no? If we used
it and tried to write to the procs file while a bunch of forks were in
progress, how long would the write operation have to block? (At
least with a rwsem, the writing thread seems to get the lock rather
quickly when there's contention.) Depending on just how slow
write-locking one of these is, it might kill any hope of performing a
write while forks are in progress.
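
For reference, the general big-reader pattern I have in mind looks
roughly like the user-space sketch below. This is an assumption about
the shape of such a lock, not the 2.4.x implementation: struct brlock,
BR_SLOTS and the br_*() helpers are made-up names, and a pthread mutex
per slot stands in for a per-CPU lock. A reader touches only its own
slot, while a writer has to take every slot in turn, which is where
the write-side cost comes from.

```c
#include <assert.h>
#include <pthread.h>

#define BR_SLOTS 4 /* hypothetical stand-in for the number of CPUs */

struct brlock {
    pthread_mutex_t slot[BR_SLOTS]; /* one lock per "CPU" */
};

void br_init(struct brlock *l)
{
    for (int i = 0; i < BR_SLOTS; i++) {
        int rc = pthread_mutex_init(&l->slot[i], NULL);
        assert(rc == 0);
    }
}

/* Reader: takes only the slot for its own CPU, so no cross-CPU bouncing. */
void br_read_lock(struct brlock *l, int cpu)
{
    int rc = pthread_mutex_lock(&l->slot[cpu % BR_SLOTS]);
    assert(rc == 0);
}

void br_read_unlock(struct brlock *l, int cpu)
{
    int rc = pthread_mutex_unlock(&l->slot[cpu % BR_SLOTS]);
    assert(rc == 0);
}

/*
 * Writer: takes every slot in a fixed order, waiting out each reader
 * on the way. Returns the number of locks it had to acquire.
 */
int br_write_lock(struct brlock *l)
{
    int taken = 0;
    for (int i = 0; i < BR_SLOTS; i++) {
        int rc = pthread_mutex_lock(&l->slot[i]);
        assert(rc == 0);
        taken++;
    }
    return taken;
}

void br_write_unlock(struct brlock *l)
{
    for (int i = BR_SLOTS - 1; i >= 0; i--) {
        int rc = pthread_mutex_unlock(&l->slot[i]);
        assert(rc == 0);
    }
}
```

In this sketch two readers that map to the same slot exclude each
other; a per-CPU kernel lock would not have that problem, but the
writer still pays for one acquisition per slot.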