Message-ID: <20200723225745.GB32316@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>
Date:   Thu, 23 Jul 2020 22:57:45 +0000
From:   Anchal Agarwal <anchalag@...zon.com>
To:     Stefano Stabellini <sstabellini@...nel.org>
CC:     Boris Ostrovsky <boris.ostrovsky@...cle.com>, <tglx@...utronix.de>,
        <mingo@...hat.com>, <bp@...en8.de>, <hpa@...or.com>,
        <x86@...nel.org>, <jgross@...e.com>, <linux-pm@...r.kernel.org>,
        <linux-mm@...ck.org>, <kamatam@...zon.com>,
        <konrad.wilk@...cle.com>, <roger.pau@...rix.com>,
        <axboe@...nel.dk>, <davem@...emloft.net>, <rjw@...ysocki.net>,
        <len.brown@...el.com>, <pavel@....cz>, <peterz@...radead.org>,
        <eduval@...zon.com>, <sblbir@...zon.com>,
        <xen-devel@...ts.xenproject.org>, <vkuznets@...hat.com>,
        <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <dwmw@...zon.co.uk>, <benh@...nel.crashing.org>
Subject: Re: [PATCH v2 01/11] xen/manage: keep track of the on-going suspend mode

On Wed, Jul 22, 2020 at 04:49:16PM -0700, Stefano Stabellini wrote:
> On Wed, 22 Jul 2020, Anchal Agarwal wrote:
> > On Tue, Jul 21, 2020 at 05:18:34PM -0700, Stefano Stabellini wrote:
> > > On Tue, 21 Jul 2020, Boris Ostrovsky wrote:
> > > > >>>>>> +static int xen_setup_pm_notifier(void)
> > > > >>>>>> +{
> > > > >>>>>> +     if (!xen_hvm_domain())
> > > > >>>>>> +             return -ENODEV;
> > > > >>>>>>
> > > > >>>>>> I forgot --- what did we decide about non-x86 (i.e. ARM)?
> > > > > >>>>> It would be great to support that; however, it's out of
> > > > >>>>> scope for this patch set.
> > > > >>>>> I’ll be happy to discuss it separately.
> > > > >>>>
> > > > >>>> I wasn't implying that this *should* work on ARM but rather whether this
> > > > >>>> will break ARM somehow (because xen_hvm_domain() is true there).
> > > > >>>>
> > > > >>>>
> > > > >>> Ok makes sense. TBH, I haven't tested this part of code on ARM and the series
> > > > > >>> was only meant to support x86 guest hibernation.
> > > > >>> Moreover, this notifier is there to distinguish between 2 PM
> > > > >>> events PM SUSPEND and PM hibernation. Now since we only care about PM
> > > > >>> HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state.
> > > > >>> However, I may have to fix other patches in the series where this check may
> > > > >>> appear and cater it only for x86 right?
> > > > >>
> > > > >>
> > > > >> I don't know what would happen if ARM guest tries to handle hibernation
> > > > >> callbacks. The only ones that you are introducing are in block and net
> > > > >> fronts and that's arch-independent.
> > > > >>
> > > > >>
> > > > >> You do add a bunch of x86-specific code though (syscore ops), would
> > > > >> something similar be needed for ARM?
> > > > >>
> > > > >>
> > > > > I don't expect this to work out of the box on ARM. To start with something
> > > > > similar will be needed for ARM too.
> > > > > We may still want to keep the driver code as-is.
> > > > >
> > > > > I understand the concern here wrt ARM, however, currently the support is only
> > > > > proposed for x86 guests here and similar work could be carried out for ARM.
> > > > > Also, if regular hibernation works correctly on arm, then all is needed is to
> > > > > fix Xen side of things.
> > > > >
> > > > > I am not sure what could be done to achieve any assurances on arm side as far as
> > > > > this series is concerned.
> > >
> > > Just to clarify: new features don't need to work on ARM or cause any
> > > > additional effort from you to make them work on ARM. The patch series only
> > > needs not to break existing code paths (on ARM and any other platforms).
> > > It should also not make it overly difficult to implement the ARM side of
> > > things (if there is one) at some point in the future.
> > >
> > > FYI drivers/xen/manage.c is compiled and working on ARM today, however
> > > Xen suspend/resume is not supported. I don't know for sure if
> > > guest-initiated hibernation works because I have not tested it.
> > >
> > >
> > >
> > > > If you are not sure what the effects are (or sure that it won't work) on
> > > > ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e.
> > > >
> > > >
> > > > if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain())
> > > >       return -ENODEV;
> > >
> > > That is a good principle to have and thanks for suggesting it. However,
> > > in this specific case there is nothing in this patch that doesn't work
> > > on ARM. From an ARM perspective I think we should enable it and
> > > &xen_pm_notifier_block should be registered.
> > >
> > > This question is for Boris: I think we decided to get rid of the notifier
> > > in V3, as all we need to check is the SHUTDOWN_SUSPEND state, which sounds plausible
> > > to me. So this check may go away. It may still be needed for syscore_ops
> > > callback registration.
> > > Given that all guests are HVM guests on ARM, it should work fine as is.
> > >
> > >
> > > I gave a quick look at the rest of the series and everything looks fine
> > > > to me from an ARM perspective. I cannot imagine that the new freeze,
> > > thaw, and restore callbacks for net and block are going to cause any
> > > trouble on ARM. The two main x86-specific functions are
> > > xen_syscore_suspend/resume and they look trivial to implement on ARM (in
> > > the sense that they are likely going to look exactly the same.)
> > >
> > Yes but for now since things are not tested I will put this
> > !IS_ENABLED(CONFIG_X86) on syscore_ops calls registration part just to be safe
> > and not break anything.
> > >
> > > One question for Anchal: what's going to happen if you trigger a
> > > hibernation, you have the new callbacks, but you are missing
> > > xen_syscore_suspend/resume?
> > >
> > > Is it any worse than not having the new freeze, thaw and restore
> > > callbacks at all and try to do a hibernation?
> > If callbacks are not there, I don't expect hibernation to work correctly.
> > These callbacks take care of Xen primitives like shared_info_page,
> > grant table, sched clock, and runstate time, which are important for saving the correct
> > state of the guest and bringing it back up. Other patches in the series add all
> > the logic to these syscore callbacks. Freeze/thaw/restore are just there at the driver
> > level.
> 
> I meant the other way around :-)  Let me rephrase the question.
> 
> Do you think that implementing freeze/thaw/restore at the driver level
> without having xen_syscore_suspend/resume can potentially make things
> worse compared to not having freeze/thaw/restore at the driver level at
> all?
I don't expect it to work in either case. The system may end up in a
different state depending on whether you register them or not. Hibernation does
not work properly, at least for domU instances, without these changes on x86,
and I am assuming the same for ARM.

If you do not register freeze/thaw/restore callbacks for ARM, then on
invocation of xenbus_dev_suspend, the default suspend/resume callbacks
will be called for each driver, and since there is no code to save domU's
Xen primitives state (syscore_ops), hibernation will either fail or demand a reboot.
I do not have a setup to test the current state of ARM's hibernation.

If you only register freeze/thaw/restore and no syscore_ops, it will again fail.
Since I do not have an ARM setup running, I quickly ran a similar test on x86.
It may not be an apples-to-apples comparison, but the instance failed to resume,
or rather got stuck showing a huge jump in time, and required a reboot.

Now, if this doesn't happen currently when you trigger hibernation on ARM domU
instances, or if the system is still alive when you trigger hibernation in a Xen guest,
then not registering the callbacks may be a better idea. In that case, maybe
I need to add an arch-specific check when registering the freeze/thaw/restore handlers.
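
A rough user-space sketch of what such an arch-specific check could look like,
reusing the IS_ENABLED(CONFIG_X86) || xen_hvm_domain() guard Boris suggested
earlier in the thread. Everything here is a hypothetical stand-in for
illustration: guest_env, registration and register_hibernation_support() are
not kernel APIs, and the two booleans only model IS_ENABLED(CONFIG_X86) and
xen_hvm_domain(). It is not code from the series.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the guest environment (not a kernel structure). */
struct guest_env {
	bool is_x86;        /* models IS_ENABLED(CONFIG_X86) */
	bool is_hvm_domain; /* models xen_hvm_domain() */
};

/* Hypothetical record of which hibernation hooks ended up registered. */
struct registration {
	bool driver_pm_callbacks; /* freeze/thaw/restore in block and net fronts */
	bool syscore_ops;         /* xen_syscore_suspend/resume */
};

/*
 * Register both sets of hooks together, or neither. On anything that is
 * not an x86 HVM guest, nothing is registered and -ENODEV is returned,
 * so the existing default suspend/resume paths stay untouched.
 */
static int register_hibernation_support(const struct guest_env *env,
					struct registration *reg)
{
	reg->driver_pm_callbacks = false;
	reg->syscore_ops = false;

	if (!env->is_x86 || !env->is_hvm_domain)
		return -ENODEV;

	reg->driver_pm_callbacks = true;
	reg->syscore_ops = true;
	return 0;
}
```

The point of the sketch is only that the driver-level callbacks and the
syscore_ops are gated by the same condition, so an untested architecture never
ends up with one set registered without the other.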

Hope that answers your question.

Thanks,
Anchal
