Message-ID: <20100611164050.GA19325@dhcp-lab-161.englab.brq.redhat.com>
Date: Fri, 11 Jun 2010 18:40:51 +0200
From: Stanislaw Gruszka <sgruszka@...hat.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Ingo Molnar <mingo@...e.hu>, Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/5] thread_group_cputime: simplify, document the
"alive" check
On Fri, Jun 11, 2010 at 05:15:33PM +0200, Oleg Nesterov wrote:
> On 06/11, Stanislaw Gruszka wrote:
> >
> > On Fri, Jun 11, 2010 at 01:09:56AM +0200, Oleg Nesterov wrote:
> > > thread_group_cputime() looks as if it is rcu-safe, but in fact this
> > > was wrong until ea6d290c, which pins task->signal to task_struct.
> > > It checks ->sighand != NULL under rcu, but this can't help if ->signal
> > > can go away. Fortunately the caller either holds ->siglock, or it is
> > > fastpath_timer_check(), which uses current and checks exit_state == 0.
> >
> > Hmm, I thought we avoided calling thread_group_cputime() from
> > fastpath_timer_check(), but it seems it is still possible when we
> > call run_posix_cpu_timers() on two different cpus simultaneously ...
>
> No, we can't. thread_group_cputimer() does test-and-set ->running
> under cputimer->lock.
>
> But when I sent these patches, I realized we have another race here
> (with or without these patches). I am already working on the fix.
I don't know what you caught; I was thinking about:
cpu0                                            cpu1

fastpath_timer_check():
    if (sig->cputimer.running) {
        struct task_cputime group_sample;

                                                stop_process_timers():
                                                    spin_lock_irqsave(&cputimer->lock, flags);
                                                    cputimer->running = 0;
                                                    spin_unlock_irqrestore(&cputimer->lock, flags);

        thread_group_cputimer(tsk, &group_sample);
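
For reference, the test-and-set you mention looks roughly like this (a
sketch from memory of the current kernel/posix-cpu-timers.c, not a
verbatim quote):

    /*
     * Sketch of thread_group_cputimer(): the test-and-set of ->running
     * happens under cputimer->lock, so two concurrent callers cannot
     * both fall into the slow thread_group_cputime() path. The problem
     * in the scenario above is that fastpath_timer_check() reads
     * ->running without taking this lock.
     */
    void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times)
    {
            struct thread_group_cputimer *cputimer = &tsk->signal->cputimer;
            struct task_cputime sum;
            unsigned long flags;

            spin_lock_irqsave(&cputimer->lock, flags);
            if (!cputimer->running) {
                    /* First sampler since the timers were stopped:
                     * re-arm and take a fresh group-wide sample. */
                    cputimer->running = 1;
                    thread_group_cputime(tsk, &sum);
                    update_gt_cputime(&cputimer->cputime, &sum);
            }
            *times = cputimer->cputime;
            spin_unlock_irqrestore(&cputimer->lock, flags);
    }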
> > > - Since commit ea6d290c tsk->signal is stable, so we can read it
> > >   first and avoid the initialization from INIT_CPUTIME.
> > >
> > > - Even if tsk->signal is always valid, we still have to check that
> > >   it is safe to use next_thread() under rcu_read_lock(). Currently
> > >   the code checks ->sighand != NULL; change it to use pid_alive(),
> > >   which is commonly used to ensure the task wasn't unhashed before
> > >   we took rcu_read_lock().
> >
> > I'm not sure how important the values of an almost-dead task are,
> > but perhaps it would be better to return times from all threads,
> > using sig->curr_target as the base of the loop.
>
> Could you clarify?
Avoid the pid_alive() check and loop starting from sig->curr_target:
    t = tsk = sig->curr_target;
    do {
            times->utime = cputime_add(times->utime, t->utime);
            times->stime = cputime_add(times->stime, t->stime);
            times->sum_exec_runtime += t->se.sum_exec_runtime;
    } while_each_thread(tsk, t);
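
Spelled out, the whole function would then be something like this (just
a sketch, assuming the caller already holds ->siglock so that
sig->curr_target and the thread list are stable):

    /* Sketch of the suggested variant, not a tested patch: with
     * ->siglock held, curr_target is a live thread whenever the group
     * is non-empty, so we can walk the list starting from it and drop
     * the pid_alive() check entirely. */
    void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
    {
            struct signal_struct *sig = tsk->signal;
            struct task_struct *g, *t;

            /* Start from the times accumulated by already-dead threads. */
            times->utime = sig->utime;
            times->stime = sig->stime;
            times->sum_exec_runtime = sig->sum_sched_runtime;

            t = g = sig->curr_target;
            do {
                    times->utime = cputime_add(times->utime, t->utime);
                    times->stime = cputime_add(times->stime, t->stime);
                    times->sum_exec_runtime += t->se.sum_exec_runtime;
            } while_each_thread(g, t);
    }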
I don't know what the rules are regarding accessing sig->curr_target,
but if this is done under sighand->siglock we should be safe. The
question is whether we always have the lock taken; we tried to assure
that in the past, but do we really?
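
One way to answer that empirically (a debugging sketch only, nothing
that exists in the tree) would be to assert the lock at the top of
thread_group_cputime() and see what screams:

    /* Hypothetical debugging aid: BUG if nobody holds ->siglock when
     * thread_group_cputime() runs. assert_spin_locked() only verifies
     * that the lock is held by someone, not necessarily by us, so it
     * is a weak check, but it would still catch lockless callers such
     * as the fastpath. */
    void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
    {
            assert_spin_locked(&tsk->sighand->siglock);
            /* ... existing body ... */
    }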
Stanislaw