Date:   Mon, 9 Jan 2017 13:50:19 +0100 (CET)
From:   Miroslav Benes <mbenes@...e.cz>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
cc:     jeyu@...hat.com, jikos@...nel.org, pmladek@...e.com,
        corbet@....net, live-patching@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Documentation/livepatch: remove the limitation for
 schedule() patching

On Fri, 6 Jan 2017, Josh Poimboeuf wrote:

> On Fri, Jan 06, 2017 at 03:00:45PM +0100, Miroslav Benes wrote:
> > 
> > 2. reversion of the process does not work as expected. The kernel
> > crashes after the removal of the module. A task very likely slept in
> > schedule and was not migrated properly. It might be because of the races
> > in klp_reverse_transition() described by Petr, or might be somewhere
> > else. I'll look into it.
> 
> Hm, will be interesting to see the cause of this...

The absence of the patched schedule() on the stack was the cause. 
klp_try_switch_task() thus did not see it and happily migrated the task. 

The reason is funny. One cannot patch __schedule() (which is the function 
of interest) because of its notrace attribute, so all of its callers need 
to be processed instead. I tried to make my life easier and patched only 
schedule(). GCC then inlined the new __schedule() into the new schedule(). 
When I added the noinline attribute to the new __schedule(), everything was 
fine (because suddenly the new schedule() was on the stack as expected).
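
For illustration only, here is a minimal userspace sketch of the noinline 
trick. It is not the actual patch: the function names and the call counter 
are hypothetical, and the GCC attribute is spelled out where kernel code 
would use the noinline macro.

```c
/* Hypothetical counter so the effect of the call is observable. */
static int inner_calls;

/*
 * Stand-in for the replacement __schedule(). The noinline attribute asks
 * GCC to keep this as a separate function instead of folding its body
 * into the only caller.
 */
static __attribute__((noinline)) void new___schedule_sketch(void)
{
	inner_calls++;
	/* Empty asm: a documented GCC idiom to keep calls to a noinline
	 * function from being optimized away. */
	__asm__ volatile("");
}

/* Stand-in for the replacement schedule(), the function actually patched. */
void new_schedule_sketch(void)
{
	new___schedule_sketch();
}
```

Building this with and without the attribute and comparing the disassembly 
(for example with objdump -d) should show whether the inner function 
survives as a separate symbol.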

There is still one thing which I don't understand: why is __schedule() 
(patched or original) not on the stack? The actual "sleep" should happen 
in __switch_to_asm(), which is a C function now, and there is a call to 
__switch_to_asm() in __schedule(). __schedule() thus should be on the 
stack, shouldn't it? What am I missing? __switch_to_asm() pushes %rbp on 
the stack...

Miroslav
