linux-kernel - Re: [PATCH 1/3] sched: add sched_task


Message-ID: <20150219041753.GA13423@treble.redhat.com>
Date:	Wed, 18 Feb 2015 22:17:53 -0600
From:	Josh Poimboeuf <jpoimboe@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>, Jiri Kosina <jkosina@...e.cz>,
	Seth Jennings <sjenning@...hat.com>,
	linux-kernel@...r.kernel.org, Vojtech Pavlik <vojtech@...e.cz>
Subject: Re: [PATCH 1/3] sched: add sched_task_call()

On Thu, Feb 19, 2015 at 01:20:58AM +0100, Peter Zijlstra wrote:
> On Wed, Feb 18, 2015 at 11:12:56AM -0600, Josh Poimboeuf wrote:
> > > So uhm, what happens if your target task is running? When will you
> > > retry? The problem I see is that if you do a sample approach you might
> > > never hit an opportune moment.
> > 
> > We attack it from multiple angles.
> > 
> > First we check the stack of all sleeping tasks.  That patches the
> > majority of tasks immediately.  If necessary, we also do that
> > periodically in a workqueue to catch any stragglers.
> 
> So not only do you need an 'atomic' stack save, you need to analyze and
> flip its state in the same atomic region. The task cannot start running
> again after the save and start using old functions while you analyze the
> stack and flip it.

Yes, exactly.
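
To make that requirement concrete, here is a minimal userspace sketch of
the check-and-flip logic.  It is only a model under stated assumptions:
struct task, stack_has_target() and try_patch_task() are hypothetical
names, not code from the patch set, and the stack is a plain array of
function names rather than a real stack trace.  In the kernel, the whole
of try_patch_task() would have to run while the task is prevented from
being scheduled; this single-threaded sketch does not model that locking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8

/* Hypothetical, simplified model of a task; not the kernel's task_struct. */
struct task {
	bool running;                  /* true if currently on a CPU */
	bool patched;                  /* true once flipped to the new functions */
	const char *stack[MAX_DEPTH];  /* function names on the saved stack */
	size_t depth;                  /* number of valid entries in stack[] */
};

/* Return true if any to-be-patched function appears on the task's stack. */
static bool stack_has_target(const struct task *t,
			     const char *const *targets, size_t ntargets)
{
	for (size_t i = 0; i < t->depth; i++)
		for (size_t j = 0; j < ntargets; j++)
			if (strcmp(t->stack[i], targets[j]) == 0)
				return true;
	return false;
}

/*
 * Analyze the stack and flip the task's state in one step.
 * Returns true if the task is (now) patched, false if the caller
 * has to retry later.
 */
static bool try_patch_task(struct task *t,
			   const char *const *targets, size_t ntargets)
{
	if (t->patched)
		return true;
	if (t->running)
		return false;	/* stack is not stable while running */
	if (stack_has_target(t, targets, ntargets))
		return false;	/* still sleeping on an old function */
	t->patched = true;
	return true;
}
```

A caller would run try_patch_task() over all tasks once, and then again
periodically for any task where it returned false.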

> > The next line of attack is patching tasks when exiting the kernel to
> > user space (system calls, interrupts, signals), to catch all CPU-bound
> > and some I/O-bound tasks.  That's done in patch 9 [1] of the consistency
> > model patch set.
> 
> So the HPC people are really into userspace that does for (;;) ; and
> isolate that on CPUs and have the tick interrupt stopped and all that.
> 
> You'll not catch those threads on the sysexit path.
> 
> And I'm fairly sure they'll not want to SIGSTOP/CONT their stuff either.
> 
> Now its fairly easy to also handle this; just mark those tasks with a
> _TIF_WORK_SYSCALL_ENTRY flag, have that slowpath wait for the flag to
> go-away, then flip their state and clear the flag.

I guess you mean patch the task when it makes a syscall?  I'm doing that
already on syscall exit with a bit in _TIF_ALLWORK_MASK and
_TIF_DO_NOTIFY_MASK.
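
Roughly, the exit-path idea looks like the following userspace sketch.
The flag names (HYP_TIF_PATCH_PENDING, HYP_TIF_SIGPENDING,
HYP_ALLWORK_MASK), struct task_flags and exit_to_user_work() are all
hypothetical stand-ins chosen for illustration; they are not the real
kernel definitions or the code in the patch set.

```c
#include <stdbool.h>

/* Hypothetical per-task work bits, modeled on the idea of TIF flags. */
#define HYP_TIF_PATCH_PENDING	(1u << 0)
#define HYP_TIF_SIGPENDING	(1u << 1)
#define HYP_ALLWORK_MASK	(HYP_TIF_PATCH_PENDING | HYP_TIF_SIGPENDING)

/* Hypothetical, simplified task model. */
struct task_flags {
	unsigned int flags;	/* pending work bits */
	bool patched;		/* true once flipped to the new functions */
};

/*
 * Called on the way back to user space.  Returns true if the slow path
 * was taken, false if there was no pending work.
 */
static bool exit_to_user_work(struct task_flags *t)
{
	if (!(t->flags & HYP_ALLWORK_MASK))
		return false;	/* fast path: nothing to do */

	if (t->flags & HYP_TIF_PATCH_PENDING) {
		/* No kernel functions are on the stack here, so flip. */
		t->patched = true;
		t->flags &= ~HYP_TIF_PATCH_PENDING;
	}

	return true;
}
```

A task spinning in user space with the tick stopped never reaches this
path, which is the case Peter raises above.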

> > As a last resort, if there are still any tasks which are sleeping on a
> > to-be-patched function, the user can send them SIGSTOP and SIGCONT to
> > force them to be patched.
> 
> You typically cannot SIGSTOP/SIGCONT kernel threads. Also
> TASK_UNINTERRUPTIBLE sleeps are unaffected by signals.
> 
> Bit pesky that.. needs pondering.

I did have a scheme for patching kthreads which are sleeping on
to-be-patched functions.

But now I'm thinking that kthreads will almost never be a problem.  Most
kthreads are basically this:

int thread_fn(void *data)
{
	while (1) {
		/* do some stuff */

		/* sleep: only schedule() and thread_fn are on the stack */
		schedule();

		/* do other stuff */
	}
}

So a kthread would typically only fail the stack check if we're trying
to patch either schedule() or the top-level thread_fn.

Patching thread_fn wouldn't be possible unless we killed the thread.

And I'd guess we can probably live without being able to patch
schedule() for now :-)

-- 
Josh