lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20131113170724.GA17739@redhat.com>
Date:	Wed, 13 Nov 2013 18:07:24 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	David Rientjes <rientjes@...gle.com>,
	David Laight <David.Laight@...LAB.COM>,
	Geert Uytterhoeven <geert@...ux-m68k.org>,
	Ingo Molnar <mingo@...nel.org>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: oom-kill && frozen()

On 11/13, Tejun Heo wrote:
>
> Hello,
>
> On Tue, Nov 12, 2013 at 05:56:43PM +0100, Oleg Nesterov wrote:
> > On 11/12, Oleg Nesterov wrote:
> > > I am also wondering if it makes any sense to turn PF_FROZEN into
> > > TASK_FROZEN, something like (incomplete, probably racy) patch below.
> > > Note that it actually adds the new state, not the the qualifier.
> >
> > As for the current usage of PF_FROZEN... David, it seems that
> > oom_scan_process_thread()->__thaw_task() is dead? Probably this
> > was fine before, when __thaw_task() cleared the "need to freeze"
> > condition, iirc it was PF_FROZEN.
> >
> > But today __thaw_task() can't help, no? the task will simply
> > schedule() in D state again.
>
> Yeah, it'll have to be actively excluded using e.g. PF_FREEZER_SKIP,
> which, BTW, can usually only be manipulated by the task itself.

Oh, yes, yes, yes, I agree. PF_FREEZER_SKIP and the growing number of
freezable_schedule() makes this all more confusing.

In fact I was think about something like

	1. Add the new __TASK_FREEZABLE qualifier

	2. Turn freezable_schedule() into

		void freezable_schedule(void)
		{
			spin_lock_irq(&current->pi_lock);
			if (current->state)
				current->state |= __TASK_FREEZABLE
			spin_unlock_irq(&current->pi_lock);
			
			schedule();

			try_to_freeze();
		}

	3. Kill PF_FREEZER_SKIP/freezer_do_not_count/count/should_skip

	4. Change freeze_task() and fake_signal_wake_up()

		-	wake_up_state(p, TASK_INTERRUPTIBLE);
		+	wake_up_state(p, TASK_INTERRUPTIBLE | __TASK_FREEZABLE);

Unfortunately, this can only work if the caller can tolerate the
false wakeup. We can even fix wait_for_vfork_done(), but say
ptrace_stop() can't work this way.

And even if we can make this work, the very fact that freezable_schedule()
does schedule() twice does not look right.

_Perhaps_ we can do something like "selective wakeup"? IOW, ignoring the
races/details,

	1. Add __TASK_FROZEN qualifier _and_ state

	2. Change frozen(),

		static inline bool frozen(struct task_struct *p)
		{
			return p->state & __TASK_FROZEN;
		}

	2. Change freezable_schedule(),

		void freezable_schedule(void)
		{
			spin_lock_irq(&current->pi_lock);
			if (current->state)
				current->state |= __TASK_FROZEN;
			spin_unlock_irq(&current->pi_lock);
			
			schedule();
		}

	3. Change __refrigerator() to use saved_state | __TASK_FROZEN
	   too.

	4. Finally, change try_to_wake_up() path to do

		-	p->state = TASK_WAKING;
		+	p->state &= ~state;
		+	if (p->state & ~(TASK_DEAD | TASK_WAKEKILL | TASK_PARKED))
		+		return;
		+	else
		+		p->state = TASK_WAKING;

	   IOW, if the task sleeps in, say, TASK_INTERRUPTIBLE | __TASK_FROZEN
	   then it need both try_to_wake_up(TASK_INTERRUPTIBLE) and
	   try_to_wake_up(__TASK_FROZEN) to wake up.

	 5. Kill PF_FREEZER_SKIP / etc.

Unfortunately, 4. is obviously needs more changes, although at first glance
nothing really nontrivial... we need a common helper for try_to_wake_up()
and ttwu_remote() which checks/changes ->state and we need to avoid "stat"
if we do not actually wake up.

Hmm. and this all makes me think that at least s/PF_FROZEN/TASK_FROZEN/ as
a first step actually makes some sense... Note the "qualifier _and_ state"
above.

Tejun, Peter, do you think this makes any sense? I am just curious, but
"selective wakeup" looks potentially useful.



And what about oom_scan_process_thread() ? should we simply kill this
dead frozen/__thaw_task code or should we change freezing() to respect
TIF_MEMDIE?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ