linux-kernel - Re: pstore does not work under xen

Message-ID: <20190923154227.GA11201@dingwall.me.uk>
Date:   Mon, 23 Sep 2019 15:42:27 +0000
From:   James Dingwall <james@...gwall.me.uk>
To:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Cc:     Kees Cook <keescook@...omium.org>,
        Anton Vorontsov <anton@...msg.org>,
        Colin Cross <ccross@...roid.com>,
        Juergen Gross <jgross@...e.com>,
        "Luck, Tony" <tony.luck@...el.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>
Subject: Re: pstore does not work under xen

On Thu, Sep 19, 2019 at 12:37:40PM -0400, Boris Ostrovsky wrote:
> On 9/19/19 12:14 PM, James Dingwall wrote:
> > On Thu, Sep 19, 2019 at 03:51:33PM +0000, Luck, Tony wrote:
> >>> I have been investigating a regression in our environment where pstore 
> >>> (efi-pstore specifically but I suspect this would affect all 
> >>> implementations) no longer works after upgrading from a 4.4 to 5.0 
> >>> kernel when running under xen.  (This is an Ubuntu kernel but I don't 
> >>> think there are patches which affect this area.)
> >> I don't have any answer for this ... but want to throw out the idea that
> >> VMM systems could provide some hypercalls to guests to save/return
> >> some blob of memory (perhaps the "save" triggers automagically if the
> >> guest crashes?).
> >>
> >> That would provide a much better pstore back end than relying on emulation
> >> of EFI persistent variables (which have severe constraints on size, and don't
> >> support some pstore modes because you can't dynamically update EFI variables
> >> hundreds of times per second).
> >>
> > For clarification, this is a dom0 crash rather than an HVM guest with EFI.  I
> > should probably have also mentioned that the Xen version has changed from 4.8.4
> > to 4.11.2, in case its behaviour on detecting a crashed domain has changed.
> >
> > (For capturing guest crashes we have enabled xenconsole logging so the
> > hvc0 log is available in dom0.)
> 
> 
> Do you only see this difference between 4.4 and 5.0 when you crash via
> sysrq?
> 
> Because that's where things changed. On 4.4 we seem to be forcing an
> oops, which eventually calls kmsg_dump() and then panic. On 5.0 we call
> panic() directly from sysrq handler. And because Xen's panic notifier
> doesn't return we never get a chance to call kmsg_dump().
> 

Ok, I see that change in 8341f2f222d729688014ce8306727fdb9798d37e.  I
had only tested with sysrq-c before.  Using the NULL pointer
dereference module code at [1], a pstore record is generated as expected
when the module is loaded (panic_on_oops=1).

I have also tested swapping the order of the kmsg_dump() and
atomic_notifier_call_chain() calls in panic.c, and this also results in
a pstore record being created with sysrq-c.  I don't know whether that
would be an acceptable solution, though, since it may break behaviour
that other code depends on.
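
To illustrate why the ordering matters, here is a minimal userspace
sketch, not kernel code.  Every name in it (model_panic,
model_kmsg_dump, model_xen_panic_notifier) is hypothetical, and
longjmp() merely stands in for a notifier that never returns to its
caller.  It only models the control flow described above: if the
non-returning notifier runs first, the dump step is never reached.

```c
#include <setjmp.h>
#include <stdbool.h>

/* Userspace model only: none of these are the kernel's real functions. */

static jmp_buf machine_gone;   /* stands in for "control never comes back" */
static bool dump_written;      /* set when the model dump step runs */

/* Model of kmsg_dump(): the point where a pstore record would be written. */
static void model_kmsg_dump(void)
{
    dump_written = true;
}

/* Model of a panic notifier that does not return to its caller. */
static void model_xen_panic_notifier(void)
{
    longjmp(machine_gone, 1);
}

/*
 * Model of the tail of panic().
 * dump_first == false: notifier chain first, then dump (the order under discussion).
 * dump_first == true:  dump first, then notifier chain (the swapped order).
 * Returns true if the dump step ran before control was lost.
 */
bool model_panic(bool dump_first)
{
    dump_written = false;

    if (setjmp(machine_gone) == 0) {
        if (dump_first) {
            model_kmsg_dump();
            model_xen_panic_notifier();
        } else {
            model_xen_panic_notifier();
            model_kmsg_dump();   /* never reached in this model */
        }
    }
    return dump_written;
}
```

In this model, model_panic(false) reports that no dump was written and
model_panic(true) reports that one was, which matches what I see with
sysrq-c before and after the swap.  It says nothing about what other
notifiers might expect from the original ordering.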

James

[1] http://ubuntu.5.x6.nabble.com/How-To-Cause-An-Oops-td3681145.html