linux-kernel - Re: Question: livepatch failed for new fork() task stack unreliable


Message-ID: <a5e0f476-02b5-cc44-8d4e-d33ff2138143@huawei.com>
Date:   Tue, 2 Jun 2020 09:22:30 +0800
From:   "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
CC:     <huawei.libin@...wei.com>, <xiexiuqi@...wei.com>,
        <cj.chengjian@...wei.com>, <mingo@...hat.com>, <x86@...nel.org>,
        <linux-kernel@...r.kernel.org>, <live-patching@...r.kernel.org>,
        <mbenes@...e.cz>, <devel@...ukata.com>, <viro@...iv.linux.org.uk>,
        <esyr@...hat.com>
Subject: Re: Question: livepatch failed for new fork() task stack unreliable


On 2020/6/2 2:05, Josh Poimboeuf wrote:
> On Sat, May 30, 2020 at 10:21:19AM +0800, Wangshaobo (bobo) wrote:
>> 1) When a user-mode task has just been forked and is executing
>> ret_from_fork() up to schedule_tail(), unwind_next_frame() finds that
>> orc->sp_reg is ORC_REG_UNDEFINED but orc->end is not zero. In this case
>> arch_stack_walk_reliable() terminates its backtracing loop because
>> unwind_done() returns true. The check
>> 'if (!(task->flags & (PF_KTHREAD | PF_IDLE)))' in
>> arch_stack_walk_reliable() is then true, so it returns -EINVAL.
>>
>> * The stack trace looks like this:
>>
>> ret_from_fork
>>
>>        -=> UNWIND_HINT_EMPTY
>>
>>        -=> schedule_tail             /* schedule out */
>>
>>        ...
>>
>>        -=> UNWIND_HINT_REGS      /*  UNDO */
> Yes, makes sense.
>
>> 2) When call_usermodehelper_exec_async() is used to create a user-mode
>> task, ret_from_fork() has still not executed even though the task has
>> been scheduled in __schedule(). At this point orc->sp_reg is
>> ORC_REG_UNDEFINED but orc->end is zero, so unwind_error() returns true
>> and also terminates the backtracing loop in arch_stack_walk_reliable(),
>> which ends up returning from the 'if (unwind_error())' branch.
>>
>> * The stack trace looks like this:
>>
>> -=> call_usermodehelper_exec
>>
>>                   -=> do_exec
>>
>>                             -=> search_binary_handler
>>
>>                                        -=> load_elf_binary
>>
>>                                                  -=> elf_map
>>
>>                                                           -=> vm_mmap_pgoff
>>
>> -=> down_write_killable
>>
>> -=> _cond_resched
>>
>>               -=> __schedule           /* scheduled to work */
>>
>> -=> ret_from_fork       /* UNDO */
> I don't quite follow the stacktrace, but it sounds like the issue is the
> same as the first one you originally reported:

Yes, it is the same as the first one. The only difference I wanted to
point out is that here the task has already been scheduled, whereas in
the first case it has not.
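
The two termination paths described above can be modelled in user space.
This is only a simplified sketch under stated assumptions, not the kernel
source: every model_* name, the flag values and the two-field struct are
hypothetical stand-ins for unwind_next_frame(), the checks at the end of
arch_stack_walk_reliable(), PF_KTHREAD/PF_IDLE and the ORC entry.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's ORC register ids and task flags. */
enum model_sp_reg { MODEL_REG_UNDEFINED, MODEL_REG_SP };
#define MODEL_PF_KTHREAD 0x1u
#define MODEL_PF_IDLE    0x2u

/* Only the two ORC fields discussed in this thread. */
struct model_orc {
	enum model_sp_reg sp_reg;
	bool end;
};

enum model_unwind { MODEL_CONTINUE, MODEL_DONE, MODEL_ERROR };

/* Models the end-of-stack check: an undefined sp_reg stops the walk. */
enum model_unwind model_next_frame(const struct model_orc *orc)
{
	if (orc->sp_reg == MODEL_REG_UNDEFINED)
		return orc->end ? MODEL_DONE : MODEL_ERROR;
	return MODEL_CONTINUE;
}

/*
 * Models the checks made once the walk loop has stopped on this entry.
 * Returns 0 if the trace is reliable, -EINVAL if it is not, and 1 if
 * the walk would simply continue to the next frame.
 */
int model_walk_verdict(const struct model_orc *orc, unsigned int task_flags)
{
	switch (model_next_frame(orc)) {
	case MODEL_ERROR:
		return -EINVAL;		/* case 2: unwind error */
	case MODEL_DONE:
		/* case 1: walk finished, but this is a user task */
		if (!(task_flags & (MODEL_PF_KTHREAD | MODEL_PF_IDLE)))
			return -EINVAL;
		return 0;
	default:
		return 1;
	}
}
```

In this model both cases yield -EINVAL for a user task; they differ only
in which branch reports it.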

>> 1) The task was not actually scheduled to execute. At this time
>> UNWIND_HINT_EMPTY in ret_from_fork() has not reset unwind_hint; its
>> sp_reg and end fields keep their default values and end up throwing an
>> error in unwind_next_frame() when called by arch_stack_walk_reliable();
> Or am I misunderstanding?
>
> And to reiterate, these are not "livepatch failures", right?  Livepatch
> doesn't fail when stack_trace_save_tsk_reliable() returns an error.  It
> recovers gracefully and tries again later.

Yes, you are right. By "livepatch failures" I only meant several failed
retries. We found that when fork() happens frequently on the system, it
is easier to trigger a retry, but the patching still always succeeds in
the end.
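
The retry behaviour can be illustrated with a small user-space model. It
is a sketch only: model_try_transition() and model_fork_window_check()
are hypothetical names, and the real livepatch code retries later rather
than looping immediately as this model does.

```c
#include <errno.h>

/*
 * Hypothetical callback: returns 0 when the task's stack could be walked
 * reliably and -EINVAL otherwise.
 */
typedef int (*model_stack_check_fn)(void *ctx);

/*
 * Retry the check up to max_attempts times.
 * Returns the number of attempts used, or -1 if all attempts failed.
 */
int model_try_transition(model_stack_check_fn check, void *ctx,
			 int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (check(ctx) == 0)
			return attempt;
		/* A real implementation would wait before trying again. */
	}
	return -1;
}

/*
 * Example check: ctx points to an int counting how many more polls the
 * task spends in the freshly-forked window, where the trace is unreliable.
 */
int model_fork_window_check(void *ctx)
{
	int *polls_left = ctx;

	if (*polls_left > 0) {
		(*polls_left)--;
		return -EINVAL;
	}
	return 0;
}
```

With a task that stays in the window for three polls, the model succeeds
on the fourth attempt; more forked tasks in that window mean more retries,
but not a permanent failure.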

So I think this question is related to the ORC unwinder. Could I ask
whether you have a strategy or plan to avoid this problem?

thanks,

Wang ShaoBo