linux-kernel - Re: [PATCH 1/3] sched: add sched_task

Message-ID: <20150220095003.GA23506@gmail.com>
Date:	Fri, 20 Feb 2015 10:50:03 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Jiri Kosina <jkosina@...e.cz>
Cc:	Josh Poimboeuf <jpoimboe@...hat.com>,
	Vojtech Pavlik <vojtech@...e.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>,
	Seth Jennings <sjenning@...hat.com>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH 1/3] sched: add sched_task_call()

* Jiri Kosina <jkosina@...e.cz> wrote:

> Alright, so to sum it up:
> 
> - current stack dumping (even looking at /proc/<pid>/stack) is not 
>   guaranteed to yield "correct" results in case the task is running at the 
>   time the stack is being examined

Don't even _think_ about trying to base something as 
dangerous as live patching the kernel image on the concept 
of:

  'We can make all stack backtraces reliably correct all 
   the time, with no false positives, with no false
   negatives, 100% of the time, and quickly discover and
   fix bugs in that'.

It's not going to happen:

 - The correctness of stacktraces partially depends on
   tooling and we don't control those.

 - More importantly, there's no strong force that ensures
   we can rely on stack backtraces: correcting bad stack
   traces depends on people hitting those functions and
   situations that generate them, seeing a bad stack trace,
   noticing that it's weird and correcting whatever code or
   tooling quirk causes the stack entry to be incorrect.

Essentially, unlike other kernel code, which breaks things 
if it's incorrect, there's no _functional_ dependence on 
stack traces, so live patching would be the first (and 
pretty much only) thing that breaks on bad stack traces ...

If you think you can make something like DWARF annotations 
work reliably enough to base kernel live patching on them, 
reconsider.

Even with frame pointers, backtraces can go bad sometimes. 
I wouldn't base live patching even on _that_, and that's a 
very simple concept with a performance cost that most 
distros don't want to pay.

So if your design is based on being able to discover 'live' 
functions in the kernel stack dump of all tasks in the 
system, I think you need a serious reboot of the whole 
approach and get rid of that fragility before any of that 
functionality gets upstream!
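
To make the fragility concrete, here is a minimal Python sketch. It is an 
illustration only, not code from any live patching implementation. It 
assumes a hypothetical check that treats a function as safe to patch when 
it appears in no task's reported backtrace; the task IDs, the stack 
contents and the helper names are all made up for the example.

```python
from typing import Dict, List, Set

# Hypothetical model: task ID -> list of function names in its backtrace.
Stacks = Dict[int, List[str]]


def functions_seen_on_stacks(stacks: Stacks) -> Set[str]:
    """Collect every function name reported in any task's backtrace."""
    seen: Set[str] = set()
    for frames in stacks.values():
        seen.update(frames)
    return seen


def looks_safe_to_patch(target: str, stacks: Stacks) -> bool:
    """Stack-based check: 'safe' means no reported frame names the target."""
    return target not in functions_seen_on_stacks(stacks)


# What the tasks are really executing (illustrative names only).
true_stacks: Stacks = {
    101: ["schedule", "do_wait", "sys_wait4"],
    202: ["schedule", "pipe_read", "vfs_read"],
}

# What an imperfect unwinder reports: one frame of task 202 is missing.
reported_stacks: Stacks = {
    101: ["schedule", "do_wait", "sys_wait4"],
    202: ["schedule", "vfs_read"],
}

verdict_from_truth = looks_safe_to_patch("pipe_read", true_stacks)
verdict_from_report = looks_safe_to_patch("pipe_read", reported_stacks)

print("safe according to real stacks:    ", verdict_from_truth)
print("safe according to reported stacks:", verdict_from_report)
```

In this toy model a single dropped frame flips the verdict from "in use" 
to "safe to patch", and nothing else in the system would have noticed 
that the backtrace was wrong.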

> - For live patching use-case, the stack has to be 
>   analyzed (and decision on what to do based on the 
>   analysis) in the NMI handler itself, otherwise it gets 
>   racy again

You simply cannot reliably determine from a stack backtrace 
whether a function is in use by a task, and then modify the 
kernel image on that basis, as things stand today. Full 
stop.

Thanks,

	Ingo