linux-kernel - Re: posix-cpu-timers revamp

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-Id: <1207758594.29898.11.camel@bobble.smo.corp.google.com>
Date:	Wed, 09 Apr 2008 09:29:54 -0700
From:	Frank Mayhar <fmayhar@...gle.com>
To:	Roland McGrath <roland@...hat.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: posix-cpu-timers revamp

On Tue, 2008-04-08 at 15:49 -0700, Roland McGrath wrote:
> My explanation about the constraints on exit_state was specifically about
> the context of update_process_times(), which is part of the path for a
> clock tick interrupting current.

Understood.

> > And, you guessed it, the invariant gets violated.  Apparently the passed
> > task_struct isn't the same as "current" at this point.
> 
> The scheduler code has gotten a lot more complex since I first implemented
> posix-cpu-timers, and I've never been any expert on the scheduler at all.
> But I'm moderately sure all those things are all involved in context
> switch where the task of interest is about to be on a CPU or just was on a
> CPU.  I doubt those are places where the task in question could be
> simultaneously executing in exit_notify() on another CPU.  But we'd need
> to ask the scheduler experts to be sure we know what we're talking about
> there.

This was my conclusion as well.  Certainly the path through do_fork()
(elucidated below) doesn't even allow the task in question to even be
executing, much less on a different CPU, but all these routines with
"struct task_struct *" parameters make me nervous.  Which is why I
inserted the invariant check in the first place.

> > Found the exception.  do_fork() violates the invariant when it's
> > cranking up a new process.  Hmmm.
> 
> I haven't figured out what actual code path this refers to.

do_fork=>wake_up_new_task=>task_new_fair=>
	enqueue_task_fair=>enqueue_entity=>update_curr

> This sort of concern is among the reasons that checking ->signal was the
> course I found wise to suggest to begin with.  We can figure out what the
> constraints on ->exit_state are in all the places by understanding every
> corner of the scheduler.

Well, as much as I would like to take the time to do that, I do have a
_real_ job, here. :-)

>   We can measure whether it winds up being in a
> cooler cache line than ->signal and a net loss to add the load, or has
> superior performance as you seem to think.  Or we can just test the
> constraint that matters, whether the pointer we loaded was in fact null,
> and rely on RCU to make it not matter if there is a race after that load.
> It doesn't matter whether tsk is current or not, it only matters that we
> have the pointer and that we're using some CPU array slot or other that
> noone else is using simultaneously.
> 
>     static inline void account_group_exec_runtime(struct task_struct *tsk,
> 						  unsigned long long runtime)
>     {
> 	    struct signal_struct *sig;
> 	    struct task_cputime *times;
> 
> 	    rcu_read_lock();
> 	    sig = rcu_dereference(tsk->signal);
> 	    if (likely(sig) && sig->cputime.totals) {
> 		    times = per_cpu_ptr(sig->cputime.totals, get_cpu());
> 		    times->sum_exec_runtime += runtime;
> 		    put_cpu_no_resched();
> 	    }
> 	    rcu_read_unlock();
>     }

Yeah, agreed.  Of course, I was hoping (in vain, apparently) to avoid
this level of overhead here.  And I suspect I'll really have to do it in
each of these routines.  But I suppose it can't be helped.

Even with a thorough understanding of the scheduler(s) and code based on
that understanding, we would still not (necessarily) be protected from
future changes that violate the assumptions we make on that basis.
-- 
Frank Mayhar <fmayhar@...gle.com>
Google, Inc.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/