[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f6f1208a-12c4-77b8-2e1d-fb4a03a2211a@linux.intel.com>
Date: Mon, 7 Dec 2020 18:19:56 +0200
From: Vladimir Kondratiev <vladimir.kondratiev@...ux.intel.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Jonathan Corbet <corbet@....net>,
Luis Chamberlain <mcgrof@...nel.org>,
Kees Cook <keescook@...omium.org>,
Iurii Zaikin <yzaikin@...gle.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Randy Dunlap <rdunlap@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Mauro Carvalho Chehab <mchehab+huawei@...nel.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
"Guilherme G. Piccoli" <gpiccoli@...onical.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Kars Mulder <kerneldev@...smulder.nl>,
Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Kishon Vijay Abraham I <kishon@...com>,
Arvind Sankar <nivedita@...m.mit.edu>,
Joe Perches <joe@...ches.com>,
Rafael Aquini <aquini@...hat.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Alexei Starovoitov <ast@...nel.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
Michel Lespinasse <walken@...gle.com>,
Jann Horn <jannh@...gle.com>, chenqiwu <chenqiwu@...omi.com>,
Minchan Kim <minchan@...nel.org>,
Christophe Leroy <christophe.leroy@....fr>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: [RFC PATCH v2] do_exit(): panic() recursion detected
I see 2 paths how "bad things" can cause recursive do_exit - various
traps that go through die() and therefore covered by panic_on_oops; and
do_group_exit() as result of fatal signal.
Provided one add "panic on coredump" functionality, path through
do_group_exit() covered as well.
Let's drop this patch.
Thanks, Vladimir
On 12/7/20 5:49 PM, Eric W. Biederman wrote:
> Vladimir Kondratiev <vladimir.kondratiev@...ux.intel.com> writes:
>
>> Please ignore version 1 of the patch - it was sent from wrong mail address.
>>
>> To clarify the reason:
>>
>> Situation where do_exit() re-entered, discovered by static code analysis.
>> For safety critical system, it is better to panic() when minimal chance of
>> corruption detected. For this reason, we also panic on fatal signal delivery -
>> patch for this not submitted yet.
>
> What did the static code analysis say? What triggers the recursion.
>
> What makes it safe to even call panic on this code path? Is there
> enough kernel stack?
>
> My sense is that if this actually can happen and is a real concern,
> and that it is safe to do something on this code path it is probably
> better just to ooops. That way if someone is trying to debug such
> a recursion they will have a backtrace to work with. Plus panic
> on oops will work.
>
> Eric
>
>>
>> On 12/7/20 2:44 PM, Vladimir Kondratiev wrote:
>>> Recursive do_exit() is symptom of compromised kernel integrity.
>>> For safety critical systems, it may be better to
>>> panic() in this case to minimize risk.
>>>
>>> Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@...ux.intel.com>
>>> Change-Id: I42f45900a08c4282c511b05e9e6061360d07db60
>>> ---
>>> Documentation/admin-guide/kernel-parameters.txt | 6 ++++++
>>> include/linux/kernel.h | 1 +
>>> kernel/exit.c | 7 +++++++
>>> kernel/sysctl.c | 9 +++++++++
>>> 4 files changed, 23 insertions(+)
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index 44fde25bb221..6e12a6804557 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -3508,6 +3508,12 @@
>>> bit 4: print ftrace buffer
>>> bit 5: print all printk messages in buffer
>>> + panic_on_exit_recursion
>>> + panic() when do_exit() recursion detected, rather then
>>> + try to stay running whenever possible.
>>> + Useful on safety critical systems; re-entry in do_exit
>>> + is a symptom of compromised kernel integrity.
>>> +
>>> panic_on_taint= Bitmask for conditionally calling panic() in add_taint()
>>> Format: <hex>[,nousertaint]
>>> Hexadecimal bitmask representing the set of TAINT flags
>>> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>>> index 2f05e9128201..5afb20534cb2 100644
>>> --- a/include/linux/kernel.h
>>> +++ b/include/linux/kernel.h
>>> @@ -539,6 +539,7 @@ extern int sysctl_panic_on_rcu_stall;
>>> extern int sysctl_panic_on_stackoverflow;
>>> extern bool crash_kexec_post_notifiers;
>>> +extern int panic_on_exit_recursion;
>>> /*
>>> * panic_cpu is used for synchronizing panic() and crash_kexec() execution. It
>>> diff --git a/kernel/exit.c b/kernel/exit.c
>>> index 1f236ed375f8..162799a8b539 100644
>>> --- a/kernel/exit.c
>>> +++ b/kernel/exit.c
>>> @@ -68,6 +68,9 @@
>>> #include <asm/unistd.h>
>>> #include <asm/mmu_context.h>
>>> +int panic_on_exit_recursion __read_mostly;
>>> +core_param(panic_on_exit_recursion, panic_on_exit_recursion, int, 0644);
>>> +
>>> static void __unhash_process(struct task_struct *p, bool group_dead)
>>> {
>>> nr_threads--;
>>> @@ -757,6 +760,10 @@ void __noreturn do_exit(long code)
>>> */
>>> if (unlikely(tsk->flags & PF_EXITING)) {
>>> pr_alert("Fixing recursive fault but reboot is needed!\n");
>>> + if (panic_on_exit_recursion)
>>> + panic("Recursive do_exit() detected in %s[%d]\n",
>>> + current->comm, task_pid_nr(current));
>>> +
>>> futex_exit_recursive(tsk);
>>> set_current_state(TASK_UNINTERRUPTIBLE);
>>> schedule();
>>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>>> index afad085960b8..bb397fba2c42 100644
>>> --- a/kernel/sysctl.c
>>> +++ b/kernel/sysctl.c
>>> @@ -2600,6 +2600,15 @@ static struct ctl_table kern_table[] = {
>>> .extra2 = &one_thousand,
>>> },
>>> #endif
>>> + {
>>> + .procname = "panic_on_exit_recursion",
>>> + .data = &panic_on_exit_recursion,
>>> + .maxlen = sizeof(int),
>>> + .mode = 0644,
>>> + .proc_handler = proc_dointvec_minmax,
>>> + .extra1 = SYSCTL_ZERO,
>>> + .extra2 = SYSCTL_ONE,
>>> + },
>>> {
>>> .procname = "panic_on_warn",
>>> .data = &panic_on_warn,
>>>
Powered by blists - more mailing lists