Date:	Thu, 03 Apr 2008 17:53:07 -0700
From:	Frank Mayhar <fmayhar@...gle.com>
To:	Roland McGrath <roland@...hat.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: posix-cpu-timers revamp

On Wed, 2008-04-02 at 14:42 -0700, Frank Mayhar wrote:
> On Wed, 2008-04-02 at 13:34 -0700, Frank Mayhar wrote:
> > One little gotcha we just ran into, though:  When checking
> > tsk->signal->(anything) in run_posix_cpu_timers(), we have to hold
> > tasklist_lock to avoid a race with release_task().  This is going to
> > make even the null case always cost more than before.
> 
> This race, by the way, is because we're dereferencing task->signal at
> interrupt once per tick.  We ran into a case where a process was going
> through release_task() and being torn down on one CPU while running a
> timer tick on another.  Under load.  It's not a very likely race but
> with sufficient time or load it's pretty much inevitable.
> 
> My thought is to move thread_group_cputime from the signal structure to
> hanging directly off the task structure.  It would be shared in the same
> way as the signal structure is now but would be deallocated with the
> task structure rather than the signal structure.  This should mean that
> I could avoid getting tasklist_lock under most conditions.

Okay, having run face-first into this race and having every combination
of spinlock serialization fail for me, I've done a variation of the
above scheme.

For the local environment, I solved the problem by moving the percpu
structure out of the signal structure entirely and by making it
refcounted.  It is allocated as before, but now in two parts: a normal
structure holding an atomic refcount and a pointer to the percpu
structure.  The signal structure no longer points to it, but each
task_struct in the thread group does, and each of these references is
counted.  New threads also take a counted reference (at the top of
copy_signal()).  All access goes through the task structure.
References are dropped in __put_task_struct() when the task itself is
destroyed; when the last reference goes away, the structures are
freed.
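
A minimal userspace sketch of the refcounting scheme described above,
not the actual patch.  It assumes C11 atomics in place of the kernel's
atomic_t, malloc/calloc in place of the percpu allocator, and a fixed
CPU count.  Every name in it (tg_cputime, task_sketch, NR_CPUS_SKETCH
and the helper functions) is hypothetical and chosen for illustration
only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4        /* assumption: fixed number of CPUs */

struct cpu_times {
    unsigned long long utime;
    unsigned long long stime;
};

/* The "normal structure": a refcount plus a pointer to the per-CPU part. */
struct tg_cputime {
    atomic_int refcount;
    struct cpu_times *percpu;   /* array of NR_CPUS_SKETCH entries */
};

/* Stand-in for task_struct: each task holds its own counted reference. */
struct task_sketch {
    struct tg_cputime *cputime;
};

/* Allocate both parts; the caller owns the first reference. */
static struct tg_cputime *tg_cputime_alloc(void)
{
    struct tg_cputime *tg = malloc(sizeof(*tg));

    if (!tg)
        return NULL;
    tg->percpu = calloc(NR_CPUS_SKETCH, sizeof(*tg->percpu));
    if (!tg->percpu) {
        free(tg);
        return NULL;
    }
    atomic_init(&tg->refcount, 1);
    return tg;
}

/* Taken once for each new thread that joins the group. */
static struct tg_cputime *tg_cputime_get(struct tg_cputime *tg)
{
    atomic_fetch_add(&tg->refcount, 1);
    return tg;
}

/* Dropped when a task is destroyed; returns 1 if this freed the structures. */
static int tg_cputime_put(struct tg_cputime *tg)
{
    int old = atomic_fetch_sub(&tg->refcount, 1);

    assert(old > 0);
    if (old == 1) {
        free(tg->percpu);
        free(tg);
        return 1;
    }
    return 0;
}

/* Accounting goes through the task, never through a signal structure. */
static void account_user_time(struct task_sketch *t, int cpu,
                              unsigned long long delta)
{
    t->cputime->percpu[cpu].utime += delta;
}

/* Sum the per-CPU counters to get the thread group's total user time. */
static unsigned long long group_user_time(const struct task_sketch *t)
{
    unsigned long long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += t->cputime->percpu[cpu].utime;
    return sum;
}
```

In this sketch the totals stay readable through any task that still
holds a reference, even after other tasks in the group have dropped
theirs, which is the property that removes the dependency on the
lifetime of the signal structure.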

This eliminates the races with signal_struct being freed and has the
nice effect that there is a bit less overhead in places like
account_group_user_time() and friends.  In run_posix_cpu_timers(),
though, I have to pick up the tasklist_lock early (and therefore in
every case) because it's still dereferencing tsk->signal in the early
comparison.

I'm thinking about moving all of the itimer stuff (i.e. the
cputime_expires structures) into the refcounted structure as well, thus
avoiding the signal_struct entirely so we don't need the tasklist_lock
in the fast path.  I don't know how any of this will affect the UP case,
though.  I'll have to continue to think about it and I'm sure you have
something to say as well.  (And if anyone else wants to chime in,
they're welcome.)
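
A rough sketch of what the tick-time fast path could look like if the
expiry values lived in the refcounted structure, as proposed above.
This is only an illustration of the idea, not code from any patch.  The
structure layout, the field names, and the convention that an expiry of
zero means "no timer armed" are all assumptions, and it ignores the
per-CPU split and any memory-ordering concerns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared structure reached through the task alone. */
struct tg_timers_sketch {
    unsigned long long utime_total;   /* accumulated group user time */
    unsigned long long prof_expires;  /* earliest expiry; 0 = none (assumed) */
};

/* Stand-in for task_struct; note there is no signal pointer involved. */
struct tick_task_sketch {
    struct tg_timers_sketch *tg;
};

/*
 * Early comparison done on every tick.  It reads only through the
 * task's own counted reference, so in this sketch it needs no lock.
 * Returns true when the slow path (which may take locks) must run.
 */
static bool tick_needs_slow_path(const struct tick_task_sketch *t)
{
    const struct tg_timers_sketch *tg = t->tg;

    assert(tg != NULL);
    if (tg->prof_expires == 0)
        return false;                 /* nothing armed: the null case */
    return tg->utime_total >= tg->prof_expires;
}
```

The point of the design is that the common case (no timer armed, or not
yet expired) returns after one or two loads from memory that the task
itself keeps alive.
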
-- 
Frank Mayhar <fmayhar@...gle.com>
Google, Inc.
