Message-ID: <20161110130913.GA11933@redhat.com>
Date:   Thu, 10 Nov 2016 14:09:13 +0100
From:   Oleg Nesterov <oleg@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mike Galbraith <efault@....de>, hartsjc@...hat.com,
        vbendel@...hat.com, vlovejoy@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Re: sched/autogroup: race if !sysctl_sched_autogroup_enabled ?

On 11/09, Peter Zijlstra wrote:
>
> On Wed, Nov 09, 2016 at 05:59:33PM +0100, Oleg Nesterov wrote:
>
> > We need to ensure that autogroup/tg returned by autogroup_task_group()
> > can't go away if we race with autogroup_move_group(), and unless the
> > caller holds ->siglock we rely on the fact that autogroup_move_group()
> > will a) see this task and b) do sched_move_task() which needs the same
> > rq->lock.
> >
> > However. autogroup_move_group() skips for_each_thread/sched_move_task
> > if sysctl_sched_autogroup_enabled == 0.
> >
> > So. Doesn't this mean that cgroup migration to the root cgroup can race
> > with autogroup_move_group() and use the soon-to-be-freed autogroup->tg?
>
> Argh, it's too late for this, also jet-lag. But maybe, I can sort of feel
> a hole here but cannot for the life of me still think.

And there is a 3rd case which I didn't think about yesterday, and now I really
hope it can explain the vmcore we have.

If sysctl_sched_autogroup_enabled was enabled and then disabled, it is
possible that the "autogrouped" process runs with ag->kref.refcount == 1,
and if it does setsid() it frees its active task_group.
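
To make the sequence concrete, here is a toy userspace model of it. This is
only a sketch, not kernel code: every toy_* name is made up for illustration,
a "freed" flag stands in for the real kfree(), and a plain pointer assignment
stands in for for_each_thread/sched_move_task. It assumes the reference
counting works as described above (the new autogroup is created with one
reference, the move takes another, and the creation reference is dropped).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel's structures. */
struct toy_autogroup {
    int  refcount;  /* stands in for ag->kref */
    bool freed;     /* set instead of freeing, so the model can inspect it */
};

struct toy_task {
    struct toy_autogroup *signal_autogroup; /* stands in for signal->autogroup */
    struct toy_autogroup *sched_group;      /* group the scheduler keeps using */
};

static bool toy_sysctl_enabled = true;

static void toy_put(struct toy_autogroup *ag)
{
    assert(ag->refcount > 0);
    if (--ag->refcount == 0)
        ag->freed = true;
}

/* Models autogroup_move_group() with the sysctl check in its current place. */
static void toy_move_group(struct toy_task *t, struct toy_autogroup *ag)
{
    struct toy_autogroup *prev = t->signal_autogroup;

    ag->refcount++;
    t->signal_autogroup = ag;
    if (toy_sysctl_enabled)
        t->sched_group = ag;  /* stands in for sched_move_task() */
    toy_put(prev);
}

/* Models sched_autogroup_create_attach(), i.e. what setsid() triggers. */
static void toy_create_attach(struct toy_task *t, struct toy_autogroup *fresh)
{
    fresh->refcount = 1;      /* reference held by the creator */
    fresh->freed = false;
    toy_move_group(t, fresh);
    toy_put(fresh);           /* drop the creation reference */
}

/* Returns true if the task ends up scheduled in a freed group. */
bool toy_demo_stale_group(void)
{
    struct toy_autogroup init = { 2, false };  /* base ref + this task */
    struct toy_autogroup a = { 0, false }, b = { 0, false };
    struct toy_task t = { &init, &init };

    toy_sysctl_enabled = true;
    toy_create_attach(&t, &a);   /* task now runs in a, a.refcount == 1 */

    toy_sysctl_enabled = false;
    toy_create_attach(&t, &b);   /* move is skipped, last ref on a dropped */

    return t.sched_group == &a && a.freed;
}
```

In this model toy_demo_stale_group() returns true: the second setsid() drops
the only reference to "a" while the task is still scheduled in it.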

> > although this is a bit off-topic. Another question is that I fail to
> > understand why sched_autogroup_create_attach() does autogroup_create()
> > and changes signal->autogroup even if !sysctl_sched_autogroup_enabled.
>
> I really cannot remember back that far, but it could be to allow
> flipping it back on.

Yes, I thought about this too, but I think it is hardly possible to explain
what we actually want when sysctl_sched_autogroup_enabled changes from 0
to 1.

So I am going to send the patch which simply moves the sysctl check from
autogroup_move_group() to sched_autogroup_create_attach(), but perhaps I
should split this change?

I mean, the first patch for -stable could just remove the current check,
the 2nd one will add it into sched_autogroup_create_attach().
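
A rough userspace sketch of where the check would end up after both steps.
This is only one reading of the plan above, not the patch itself: the sk_*
names are hypothetical, a "freed" flag replaces the real free, and a pointer
assignment replaces for_each_thread/sched_move_task.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel's structures. */
struct sk_autogroup {
    int  refcount;
    bool freed;
};

struct sk_task {
    struct sk_autogroup *signal_autogroup;
    struct sk_autogroup *sched_group;
};

static bool sk_sysctl_enabled = true;

static void sk_put(struct sk_autogroup *ag)
{
    assert(ag->refcount > 0);
    if (--ag->refcount == 0)
        ag->freed = true;
}

/* Step 1: the move no longer looks at the sysctl, tasks are always moved. */
static void sk_move_group(struct sk_task *t, struct sk_autogroup *ag)
{
    struct sk_autogroup *prev = t->signal_autogroup;

    ag->refcount++;
    t->signal_autogroup = ag;
    t->sched_group = ag;      /* unconditional */
    sk_put(prev);
}

/* Step 2: the sysctl check lives here, before anything is created. */
static void sk_create_attach(struct sk_task *t, struct sk_autogroup *fresh)
{
    if (!sk_sysctl_enabled)
        return;               /* keep the current autogroup untouched */

    fresh->refcount = 1;
    fresh->freed = false;
    sk_move_group(t, fresh);
    sk_put(fresh);            /* drop the creation reference */
}

/* Returns true if the task's group is still live after enable/disable/setsid. */
bool sk_demo_group_stays_live(void)
{
    struct sk_autogroup init = { 2, false };  /* base ref + this task */
    struct sk_autogroup a = { 0, false }, b = { 0, false };
    struct sk_task t = { &init, &init };

    sk_sysctl_enabled = true;
    sk_create_attach(&t, &a);

    sk_sysctl_enabled = false;
    sk_create_attach(&t, &b); /* no-op: "a" keeps its reference */

    return t.sched_group == &a && t.signal_autogroup == &a &&
           !a.freed && a.refcount == 1;
}
```

With the check moved, the same enable/disable/setsid() sequence leaves
signal->autogroup and the scheduling group in agreement, so the last
reference is never dropped under a running task in this model.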

Oleg.
