Message-ID: <20181113215919.GC15590@tower.DHCP.thefacebook.com>
Date:   Tue, 13 Nov 2018 21:59:23 +0000
From:   Roman Gushchin <guro@...com>
To:     Oleg Nesterov <oleg@...hat.com>
CC:     Roman Gushchin <guroan@...il.com>, Tejun Heo <tj@...nel.org>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH v2 3/6] cgroup: cgroup v2 freezer

Hi Oleg!

On Tue, Nov 13, 2018 at 04:48:25PM +0100, Oleg Nesterov wrote:
> On 11/12, Roman Gushchin wrote:
> >
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -83,7 +83,8 @@ struct task_group;
> >  #define TASK_WAKING			0x0200
> >  #define TASK_NOLOAD			0x0400
> >  #define TASK_NEW			0x0800
> > -#define TASK_STATE_MAX			0x1000
> > +#define TASK_FROZEN			0x1000
> > +#define TASK_STATE_MAX			0x2000
> 
> Just noticed the new task state... Why? Can't we avoid it?

We can, but it's nice to show userspace that tasks are frozen,
rather than just stuck somewhere in the kernel...

> 
> ...
> 
> > +void cgroup_freezer_enter(void)
> > +{
> > +	long state = current->state;
> 
> Why? it must be TASK_RUNNING?
> 
> If not set_current_state() at the end is simply wrong... Yes, __refrigerator()
> does this, but at least it has a reason although it is wrong too.
> 
> > +	struct cgroup *cgrp;
> > +
> > +	if (!current->frozen) {
> > +		spin_lock_irq(&css_set_lock);
> > +		current->frozen = true;
> > +		cgrp = task_dfl_cgroup(current);
> > +		cgrp->freezer.nr_frozen_tasks++;
> > +
> > +		WARN_ON_ONCE(cgrp->freezer.nr_tasks_to_freeze <
> > +			     cgrp->freezer.nr_frozen_tasks);
> > +
> > +		if (cgrp->freezer.nr_tasks_to_freeze ==
> > +		    cgrp->freezer.nr_frozen_tasks)
> > +			cgroup_queue_notify_frozen(cgrp);
> > +		spin_unlock_irq(&css_set_lock);
> > +	}
> > +
> > +	/* refrigerator */
> > +	set_current_state(TASK_WAKEKILL | TASK_INTERRUPTIBLE | TASK_FROZEN);
> 
> Why not __set_current_state() ?

Hm, it's not a hot path at all, so set_current_state() is good enough.
Not a strong preference, of course.

> 
> If ->state include TASK_INTERRUPTIBLE, why do we need TASK_WAKEKILL?
> 
> And again, why TASK_FROZEN?

So, should it be just TASK_INTERRUPTIBLE | TASK_FROZEN ?
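
To make the TASK_WAKEKILL question concrete, here is a minimal userspace
sketch (not kernel code). It assumes the usual wakeup rule: a signal wakeup
uses a mask that always contains TASK_INTERRUPTIBLE, plus TASK_WAKEKILL for
fatal signals, and the task is woken if its state shares any bit with that
mask. The helper names (signal_wake_mask, would_wake, wakekill_is_redundant)
are made up for illustration. The bit values are those from
include/linux/sched.h at the time of this patch, and TASK_FROZEN is the bit
proposed here.

```c
#include <assert.h>
#include <stdbool.h>

/* State bits as in include/linux/sched.h at the time of the patch. */
#define TASK_INTERRUPTIBLE	0x0001
#define TASK_WAKEKILL		0x0100
#define TASK_FROZEN		0x1000	/* proposed by this patch */

/*
 * Simplified model of the mask used for signal wakeups (assumption):
 * TASK_INTERRUPTIBLE is always set, TASK_WAKEKILL only for fatal signals.
 */
static unsigned int signal_wake_mask(bool fatal)
{
	return TASK_INTERRUPTIBLE | (fatal ? TASK_WAKEKILL : 0);
}

/* Simplified model of the wakeup state check: any common bit wakes the task. */
static bool would_wake(unsigned int task_state, unsigned int wake_mask)
{
	return (task_state & wake_mask) != 0;
}

/*
 * Returns true if dropping TASK_WAKEKILL from the sleeping state does not
 * change the wakeup outcome for either fatal or non-fatal signals.
 */
static bool wakekill_is_redundant(void)
{
	unsigned int with = TASK_WAKEKILL | TASK_INTERRUPTIBLE | TASK_FROZEN;
	unsigned int without = TASK_INTERRUPTIBLE | TASK_FROZEN;
	int fatal;

	for (fatal = 0; fatal <= 1; fatal++) {
		unsigned int mask = signal_wake_mask(fatal != 0);

		if (would_wake(with, mask) != would_wake(without, mask))
			return false;
	}
	return true;
}
```

Under this model both states are woken by any signal, so the extra
TASK_WAKEKILL bit makes no difference once TASK_INTERRUPTIBLE is set.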

> 
> > +	clear_thread_flag(TIF_SIGPENDING);
> > +	schedule();
> > +	recalc_sigpending();
> 
> I simply can't understand these 3 lines above but I bet this is not correct ;)

So, yeah, the problem is that if the TIF_SIGPENDING bit is set, schedule()
returns immediately, so we end up with pretty much a busy loop here.
This is a nasty workaround.

I believe we can just clear the flag and not call recalc_sigpending() at all.
Does this seem correct?
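
The busy loop can be illustrated with a small userspace model (a sketch, not
kernel code). It assumes that schedule() puts an interruptible task back to
TASK_RUNNING instead of sleeping when a signal is pending, and that the task
then re-enters the freezer and tries again. struct model_task, model_schedule()
and spins_before_sleep() are hypothetical names, and the retry loop stands in
for the task re-entering the freezer path.

```c
#include <assert.h>
#include <stdbool.h>

#define TASK_RUNNING		0x0000
#define TASK_INTERRUPTIBLE	0x0001

/* Hypothetical stand-in for the few task fields this model needs. */
struct model_task {
	unsigned int state;
	bool sigpending;	/* models TIF_SIGPENDING */
};

/*
 * Simplified model of schedule(): returns true if the task really slept,
 * false if a pending signal made it return immediately.
 */
static bool model_schedule(struct model_task *t)
{
	if ((t->state & TASK_INTERRUPTIBLE) && t->sigpending) {
		t->state = TASK_RUNNING;	/* not dequeued, returns at once */
		return false;
	}
	t->state = TASK_RUNNING;		/* slept and was woken later */
	return true;
}

/*
 * Counts how many times the task returns from schedule() without sleeping,
 * up to max_iters, starting with a signal pending.
 */
static int spins_before_sleep(bool clear_sigpending, int max_iters)
{
	struct model_task t = { .state = TASK_RUNNING, .sigpending = true };
	int spins = 0;

	assert(max_iters > 0);
	while (spins < max_iters) {
		t.state = TASK_INTERRUPTIBLE;
		if (clear_sigpending)
			t.sigpending = false;	/* the workaround in the patch */
		if (model_schedule(&t))
			break;
		spins++;
	}
	return spins;
}
```

In this model, spins_before_sleep(false, n) spins n times without ever
sleeping, while spins_before_sleep(true, n) sleeps on the first attempt.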

> 
> if nothing else recalc_sigpending() without ->siglock is wrong, it can race
> with signal_wakeup/etc.
> 
> > +	set_current_state(state);
> 
> see above...

Thank you for the review!
And looking forward to more comments from you!
