[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1353648b-f3f7-5b8d-f0bb-28bdb1a66f0f@huawei.com>
Date: Wed, 3 Jun 2020 22:06:07 +0800
From: "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com>
To: Josh Poimboeuf <jpoimboe@...hat.com>
CC: <huawei.libin@...wei.com>, <xiexiuqi@...wei.com>,
<cj.chengjian@...wei.com>, <mingo@...hat.com>, <x86@...nel.org>,
<linux-kernel@...r.kernel.org>, <live-patching@...r.kernel.org>,
<mbenes@...e.cz>, <devel@...ukata.com>, <viro@...iv.linux.org.uk>,
<esyr@...hat.com>
Subject: Re: Question: livepatch failed for new fork() task stack unreliable
在 2020/6/2 21:14, Josh Poimboeuf 写道:
> On Tue, Jun 02, 2020 at 09:22:30AM +0800, Wangshaobo (bobo) wrote:
>> so i think this question is related to ORC unwinder, could i ask if you have
>> strategy or plan to avoid this problem ?
> I suspect something like this would fix it (untested):
>
> diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
> index 6ad43fc44556..8cf95ded1410 100644
> --- a/arch/x86/kernel/stacktrace.c
> +++ b/arch/x86/kernel/stacktrace.c
> @@ -50,7 +50,7 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
> if (regs) {
> /* Success path for user tasks */
> if (user_mode(regs))
> - return 0;
> + break;
>
> /*
> * Kernel mode registers on the stack indicate an
> @@ -81,10 +81,6 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
> if (unwind_error(&state))
> return -EINVAL;
>
> - /* Success path for non-user tasks, i.e. kthreads and idle tasks */
> - if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
> - return -EINVAL;
> -
> return 0;
> }
>
> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> index 7f969b2d240f..d7396431261a 100644
> --- a/arch/x86/kernel/unwind_orc.c
> +++ b/arch/x86/kernel/unwind_orc.c
> @@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
> state->sp = sp;
> state->regs = NULL;
> state->prev_regs = NULL;
> - state->signal = false;
> + state->signal = ((void *)state->ip == ret_from_fork);
> break;
>
> case ORC_TYPE_REGS:
what a awesome job, thanks a lot, Josh
Today I test your fix, but arch_stack_walk_reliable() still return
failed sometimes, I
found one of three scenarios mentioned failed:
1. user task (just fork) but not been scheduled
test FAILED
it is because unwind_next_frame() get the first frame, this time
state->signal is false, and then return
failed in the same place for ret_from_fork has not executed at all.
2. user task (just fork) start excuting ret_from_fork() till
schedule_tail but not UNWIND_HINT_REGS
test condition :loop fork() in current system
result : SUCCESS,
it looks like this modification works for my perspective :
- /* Success path for non-user tasks, i.e. kthreads and idle tasks */
- if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
- return -EINVAL;
but is this possible to miss one invalid judgement condition ? (1)
3. call_usermodehelper_exec_async
test condition :loop call call_usermodehelper() in a module selfmade.
result : SUCCESS,
it looks state->signal==true works when unwind_next_frame() gets the
end ret_from_fork() frame,
but i don't know how does it work, i am confused by this sentences,
how does the comment means sibling calls and
calls to noreturn functions? (2)
/*
* Find the orc_entry associated with the text address.
*
* Decrement call return addresses by one so they work for
sibling
* calls and calls to noreturn functions.
*/
orc = orc_find(state->signal ? state->ip : state->ip - 1);
if (!orc) {
So i slightly modify your code, i move state->signal = ((void
*)state->ip == ret_from_fork) to unwind_start()
and render unwind_next_frame() remain the same as before:
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index e9cc182aa97e..ecce5051e8fd 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state,
struct task_struct *task,
state->sp = task->thread.sp;
state->bp = READ_ONCE_NOCHECK(frame->bp);
state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
+ state->signal = ((void *)state->ip == ret_from_fork);
}
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 7f969b2d240f..d7396431261a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
state->sp = sp;
state->regs = NULL;
state->prev_regs = NULL;
- state->signal = ((void *)state->ip == ret_from_fork);
+ state->signal = false;
break;
After modification all the three scenarios are captured and no longer
return failed, but i don't know
how does it affect the scenarios 3, because current frame->ret_addr(the
first frame) is not ret_from_fork,
it should return failed as scenarios1, but it didn't , i really want to
know the reason. (3)
thanks again
Wang ShaoBo
Powered by blists - more mailing lists