linux-kernel - Re: [PATCH 2/3] livepatch: send a fake signal to all blocking tasks


Message-ID: <alpine.LSU.2.20.1705182011410.17317@pobox.suse.cz>
Date:   Thu, 18 May 2017 20:14:39 +0200 (CEST)
From:   Miroslav Benes <mbenes@...e.cz>
To:     Oleg Nesterov <oleg@...hat.com>
cc:     jpoimboe@...hat.com, jeyu@...hat.com, jikos@...nel.org,
        pmladek@...e.com, live-patching@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] livepatch: send a fake signal to all blocking
 tasks

On Thu, 18 May 2017, Oleg Nesterov wrote:

> I didn't see other patches in series, not sure I understand...

There is nothing relevant to this patch, I think. I did not want to bother
you with it.

> On 05/18, Miroslav Benes wrote:
> >
> > The very safe marking is done in entry.S on syscall and
> > interrupt/exception exit paths, and in a stack checking functions of
> > livepatch.  TIF_PATCH_PENDING is cleared and the next
> > recalc_sigpending() drops TIF_SIGPENDING.
> 
> Confused. The task can't return from do_signal() if signal_pending() is
> true, thus it will spin forever if klp_patch_pending(current) is true.
> "forever" means until something else clears TIF_PATCH_PENDING, of course.
>
> exit_to_usermode_loop() calls do_signal(), then klp_update_patch_state().
> So it won't be cleared here.

Ok, so maybe I misunderstand the code. I see the loop in
exit_to_usermode_loop() that processes ALLWORK_MASK. There we call
do_signal(), which goes to get_signal(). The infinite loop there is the
relevant part for us. We call dequeue_signal(). There, if I am not
mistaken, __dequeue_signal() returns 0 in our case, because there is no
real signal pending and thus nothing in the signal data structures.
recalc_sigpending() is called and TIF_SIGPENDING is preserved there (I
presume TIF_PATCH_PENDING is set). signr is zero, so dequeue_signal()
returns 0. Back in get_signal() the loop is broken and zero is returned.
Then do_signal() may or may not restart the syscall.

If not, we get back to exit_to_usermode_loop() and TIF_PATCH_PENDING is
cleared. Yes, it is true that TIF_SIGPENDING is still set and we get to
do_signal() once more. But that is the last time.
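
The flow described above can be sketched as a small userspace toy model.
This is not kernel code: every toy_* name is hypothetical, the flag values
are arbitrary, and the model assumes the reading given in this mail, namely
that recalc_sigpending() keeps TIF_SIGPENDING set while TIF_PATCH_PENDING
is set, and that do_signal() returns once nothing is dequeued. It does not
model syscall restart.

```c
#include <assert.h>

/* Toy thread flags; the bit values are arbitrary, not the kernel's. */
enum { TIF_SIGPENDING = 1 << 0, TIF_PATCH_PENDING = 1 << 1 };

struct toy_task {
    unsigned flags;
    int queued_signals;   /* real signals sitting in the queue */
    int do_signal_calls;  /* how many times toy_do_signal() ran */
};

/* Assumed behaviour with the patch applied: TIF_SIGPENDING stays set
 * while a real signal or a pending patch exists, else it is dropped. */
static void toy_recalc_sigpending(struct toy_task *t)
{
    if (t->queued_signals > 0 || (t->flags & TIF_PATCH_PENDING))
        t->flags |= TIF_SIGPENDING;
    else
        t->flags &= ~TIF_SIGPENDING;
}

/* Returns nonzero if a real signal was dequeued, 0 for the fake one. */
static int toy_dequeue_signal(struct toy_task *t)
{
    int signr = 0;

    if (t->queued_signals > 0) {
        t->queued_signals--;
        signr = 1;
    }
    toy_recalc_sigpending(t);
    return signr;
}

/* One pass: try to dequeue; with signr == 0 there is nothing to deliver. */
static void toy_do_signal(struct toy_task *t)
{
    t->do_signal_calls++;
    (void)toy_dequeue_signal(t);
}

static void toy_klp_update_patch_state(struct toy_task *t)
{
    t->flags &= ~TIF_PATCH_PENDING;
}

/* Same order as discussed: do_signal() first, then the patch state
 * update. Returns the number of loop iterations. */
static int toy_exit_to_usermode_loop(struct toy_task *t)
{
    int iterations = 0;

    while (t->flags & (TIF_SIGPENDING | TIF_PATCH_PENDING)) {
        iterations++;
        assert(iterations < 100);  /* guard against a livelock in the model */
        if (t->flags & TIF_SIGPENDING)
            toy_do_signal(t);
        if (t->flags & TIF_PATCH_PENDING)
            toy_klp_update_patch_state(t);
    }
    return iterations;
}
```

Starting with both flags set and an empty signal queue, the model runs the
loop twice: the first pass keeps TIF_SIGPENDING and clears
TIF_PATCH_PENDING, and the second pass lets the recalculation drop
TIF_SIGPENDING. Whether the real get_signal() path behaves this way is
exactly the open question in this thread.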

If the syscall is restarted, it may be different. I have to think about
this one. But...

> Even if you change the order, this won't help unless I missed something,
> TIF_PATCH_PENDING can be set when this task has already entered do_signal().

...I think it could be solved with this anyway. And of course it should 
solve the double call to do_signal() I described above.

Damn, I fixed exactly this in SLES a year or so ago, and there is a note
that I did the same in the proposed version for upstream. It must have
fallen through the cracks.

So, am I wrong somewhere? It could be anywhere, because it is quite 
confusing.

Regards,
Miroslav