linux-kernel - Re: [RFC/RFT PATCH v3] sched: automated per tty task groups


Message-ID: <20101112181240.GB8659@redhat.com>
Date:	Fri, 12 Nov 2010 19:12:40 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Mike Galbraith <efault@....de>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Markus Trippelsdorf <markus@...ppelsdorf.de>
Subject: Re: [RFC/RFT PATCH v3] sched: automated per tty task groups

On 11/11, Mike Galbraith wrote:
>
> On Thu, 2010-11-11 at 21:27 +0100, Oleg Nesterov wrote:
>
> > But the real problem is that copy_process() can fail after that,
> > and in this case we have the unbalanced kref_get().
>
> Memory leak, will fix.
>
> > > +++ linux-2.6.36.git/kernel/exit.c
> > > @@ -174,6 +174,7 @@ repeat:
> > >  	write_lock_irq(&tasklist_lock);
> > >  	tracehook_finish_release_task(p);
> > >  	__exit_signal(p);
> > > +	sched_autogroup_exit(p);
> >
> > This doesn't look right. Note that "p" can run/sleep after that
> > (or in parallel), set_task_rq() can use the freed ->autogroup.
>
> So avoiding refcounting rcu released task_group backfired.  Crud.

Just in case, the lock order may be wrong. sched_autogroup_exit()
takes task_group_lock under write_lock(tasklist), while
sched_autogroup_handler() takes them in reverse order.
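
To make the ordering concern concrete, here is a minimal userspace sketch
of a lockdep-style order check. It is not kernel code: the lock(), unlock(),
exit_path() and handler_path() helpers are hypothetical stand-ins, and the
only thing taken from the patch is the order in which the two paths are said
to acquire tasklist_lock and task_group_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two locks discussed above. */
enum lock_id { TASKLIST_LOCK, TASK_GROUP_LOCK, NR_LOCKS };

static bool held[NR_LOCKS];
/* order_seen[a][b] is set once b has been taken while a was held. */
static bool order_seen[NR_LOCKS][NR_LOCKS];
static bool inversion;

static void lock(enum lock_id l)
{
	for (int h = 0; h < NR_LOCKS; h++) {
		if (!held[h])
			continue;
		order_seen[h][l] = true;
		/* The opposite order was recorded earlier: ABBA hazard. */
		if (order_seen[l][h])
			inversion = true;
	}
	held[l] = true;
}

static void unlock(enum lock_id l)
{
	held[l] = false;
}

/* Models the exit path: task_group_lock under tasklist_lock. */
void exit_path(void)
{
	lock(TASKLIST_LOCK);
	lock(TASK_GROUP_LOCK);
	unlock(TASK_GROUP_LOCK);
	unlock(TASKLIST_LOCK);
}

/* Models the handler path: the same two locks, reverse order. */
void handler_path(void)
{
	lock(TASK_GROUP_LOCK);
	lock(TASKLIST_LOCK);
	unlock(TASKLIST_LOCK);
	unlock(TASK_GROUP_LOCK);
}

bool lock_order_inverted(void)
{
	return inversion;
}
```

Running either path alone records a single order; the inversion is flagged
only once both orders have been observed, which is the situation where two
CPUs can each hold one lock and wait for the other.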


I am not sure, but perhaps this can be simpler?
wake_up_new_task() does autogroup_fork(), and do_exit() does
sched_autogroup_exit() before the last schedule. Possible?


> > Btw, I can't apply this patch...
>
> It depends on the patch below from Peter, or manual fixup.

Thanks. It also applies cleanly to 2.6.36.


Very basic question. Currently sched_autogroup_create_attach()
has only one caller, __proc_set_tty(). It is a bit strange that
the signal->tty change is process-wide, but sched_autogroup_create_attach()
moves only a single thread, the caller. What about the other threads in
this thread group? The same applies to proc_clear_tty().
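
As an illustration of what moving the whole thread group could look like,
here is a small userspace sketch. struct task, its next_thread ring and
move_thread_group() are hypothetical names made up for this example; the
kernel's real thread-list iteration and locking are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task: threads of one group form a ring. */
struct task {
	int autogroup_id;
	struct task *next_thread;
};

/* Move every thread in the group, not only the calling thread. */
void move_thread_group(struct task *leader, int new_group)
{
	struct task *t = leader;

	do {
		t->autogroup_id = new_group;
		t = t->next_thread;
	} while (t != leader);
}
```

In this model a tty change on any one thread leaves all of its siblings in
the same group, which matches the process-wide nature of signal->tty.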


> +void sched_autogroup_create_attach(struct task_struct *p)
> +{
> +       autogroup_move_task(p, autogroup_create());
> +
> +       /*
> +        * Correct freshly allocated group's refcount.
> +        * Move takes a reference on destination, but
> +        * create already initialized refcount to 1.
> +        */
> +       if (p->autogroup != &autogroup_default)
> +               autogroup_kref_put(p->autogroup);
> +}

OK, but I don't understand the "p->autogroup != &autogroup_default"
check. This is true if autogroup_create() succeeds. Otherwise
autogroup_create() does autogroup_kref_get(autogroup_default);
doesn't this mean we need an unconditional _put?
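
The reference counting can be checked with a small userspace model. This is
only a sketch under assumptions: the struct layouts, the simulate_failure
flag and create_attach() are hypothetical, and the move helper is assumed to
take a reference on the destination and drop the one on the previous group,
as the quoted comment suggests.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the patch's structures. */
struct autogroup { int refcount; };
struct task { struct autogroup *autogroup; };

static struct autogroup autogroup_default = { .refcount = 1 };

static struct autogroup *autogroup_get(struct autogroup *ag)
{
	ag->refcount++;
	return ag;
}

static void autogroup_put(struct autogroup *ag)
{
	if (--ag->refcount == 0 && ag != &autogroup_default)
		free(ag);
}

/* Always hands one reference to the caller, on success and on failure. */
static struct autogroup *autogroup_create(int simulate_failure)
{
	struct autogroup *ag = simulate_failure ? NULL : malloc(sizeof(*ag));

	if (!ag)
		return autogroup_get(&autogroup_default);
	ag->refcount = 1;
	return ag;
}

/* Assumed semantics: reference the destination, release the old group. */
static void autogroup_move_task(struct task *p, struct autogroup *ag)
{
	struct autogroup *prev = p->autogroup;

	p->autogroup = autogroup_get(ag);
	autogroup_put(prev);
}

void create_attach(struct task *p, int simulate_failure)
{
	autogroup_move_task(p, autogroup_create(simulate_failure));
	/* Drop the reference returned by autogroup_create(), unconditionally. */
	autogroup_put(p->autogroup);
}
```

With the unconditional put, both paths end with exactly one reference per
attached task plus the default group's base reference. If the put were
skipped when p->autogroup is the default group, the failure path in this
model would leave one extra reference on autogroup_default.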

And can't resist, minor cosmetic nit,

>  static inline struct task_group *task_group(struct task_struct *p)
>  {
> +       struct task_group *tg;
>         struct cgroup_subsys_state *css;
>
>         css = task_subsys_state_check(p, cpu_cgroup_subsys_id,
>                         lockdep_is_held(&task_rq(p)->lock));
> -       return container_of(css, struct task_group, css);
> +       tg = container_of(css, struct task_group, css);
> +
> +       autogroup_task_group(p, &tg);

Feel free to ignore, but imho

	return autogroup_task_group(p, tg);

looks a bit better. Why does autogroup_task_group() return its
result via a pointer?
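
For comparison, a userspace sketch of the two calling conventions. The
struct definitions and the "task has an autogroup" condition are
hypothetical; the real autogroup_task_group() may decide differently. Only
the shape of the interface is the point here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
struct task_group { int id; };
struct task { struct task_group *autogroup_tg; };

/* Out-parameter style, as in the quoted hunk. */
static void autogroup_task_group_ptr(struct task *p, struct task_group **tg)
{
	if (p->autogroup_tg)
		*tg = p->autogroup_tg;
}

/* Value-returning style, as suggested above. */
static struct task_group *
autogroup_task_group(struct task *p, struct task_group *tg)
{
	return p->autogroup_tg ? p->autogroup_tg : tg;
}

/* Caller using the out-parameter variant needs a separate return. */
struct task_group *task_group_out(struct task *p, struct task_group *cg)
{
	struct task_group *tg = cg;

	autogroup_task_group_ptr(p, &tg);
	return tg;
}

/* Caller using the returning variant ends in a single expression. */
struct task_group *task_group_ret(struct task *p, struct task_group *cg)
{
	return autogroup_task_group(p, cg);
}
```

Both callers give the same result; the returning form just avoids taking
the address of a local and lets the caller finish with one return statement.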

Oleg.
