linux-kernel - RE: [PATCH 16/30] drivers/hv/vmbus, video/hyperv_fb: Untangle and refactor Hyper-V panic notifiers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <PH0PR21MB3025A46643EEDA6B14AC1ECED7C09@PH0PR21MB3025.namprd21.prod.outlook.com>
Date:   Tue, 3 May 2022 18:13:52 +0000
From:   "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To:     "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "bhe@...hat.com" <bhe@...hat.com>,
        "pmladek@...e.com" <pmladek@...e.com>,
        "kexec@...ts.infradead.org" <kexec@...ts.infradead.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "bcm-kernel-feedback-list@...adcom.com" 
        <bcm-kernel-feedback-list@...adcom.com>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "linux-alpha@...r.kernel.org" <linux-alpha@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-leds@...r.kernel.org" <linux-leds@...r.kernel.org>,
        "linux-mips@...r.kernel.org" <linux-mips@...r.kernel.org>,
        "linux-parisc@...r.kernel.org" <linux-parisc@...r.kernel.org>,
        "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        "linux-remoteproc@...r.kernel.org" <linux-remoteproc@...r.kernel.org>,
        "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
        "linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>,
        "linux-um@...ts.infradead.org" <linux-um@...ts.infradead.org>,
        "linux-xtensa@...ux-xtensa.org" <linux-xtensa@...ux-xtensa.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "openipmi-developer@...ts.sourceforge.net" 
        <openipmi-developer@...ts.sourceforge.net>,
        "rcu@...r.kernel.org" <rcu@...r.kernel.org>,
        "sparclinux@...r.kernel.org" <sparclinux@...r.kernel.org>,
        "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "kernel-dev@...lia.com" <kernel-dev@...lia.com>,
        "kernel@...ccoli.net" <kernel@...ccoli.net>,
        "halves@...onical.com" <halves@...onical.com>,
        "fabiomirmar@...il.com" <fabiomirmar@...il.com>,
        "alejandro.j.jimenez@...cle.com" <alejandro.j.jimenez@...cle.com>,
        "andriy.shevchenko@...ux.intel.com" 
        <andriy.shevchenko@...ux.intel.com>,
        "arnd@...db.de" <arnd@...db.de>, "bp@...en8.de" <bp@...en8.de>,
        "corbet@....net" <corbet@....net>,
        "d.hatayama@...fujitsu.com" <d.hatayama@...fujitsu.com>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "dyoung@...hat.com" <dyoung@...hat.com>,
        "feng.tang@...el.com" <feng.tang@...el.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "hidehiro.kawai.ez@...achi.com" <hidehiro.kawai.ez@...achi.com>,
        "jgross@...e.com" <jgross@...e.com>,
        "john.ogness@...utronix.de" <john.ogness@...utronix.de>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "luto@...nel.org" <luto@...nel.org>,
        "mhiramat@...nel.org" <mhiramat@...nel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "paulmck@...nel.org" <paulmck@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "senozhatsky@...omium.org" <senozhatsky@...omium.org>,
        "stern@...land.harvard.edu" <stern@...land.harvard.edu>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "vgoyal@...hat.com" <vgoyal@...hat.com>,
        vkuznets <vkuznets@...hat.com>,
        "will@...nel.org" <will@...nel.org>,
        Andrea Parri <parri.andrea@...il.com>,
        Dexuan Cui <decui@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        KY Srinivasan <kys@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Tianyu Lan <Tianyu.Lan@...rosoft.com>,
        Wei Liu <wei.liu@...nel.org>
Subject: RE: [PATCH 16/30] drivers/hv/vmbus, video/hyperv_fb: Untangle and
 refactor Hyper-V panic notifiers

From: Guilherme G. Piccoli <gpiccoli@...lia.com> Sent: Friday, April 29, 2022 3:35 PM
> 
> Hi Michael, first of all thanks for the great review, much appreciated.
> Some comments inline below:
> 
> On 29/04/2022 14:16, Michael Kelley (LINUX) wrote:
> > [...]
> >> hypervisor I/O completion), so we postpone that to run late. But more
> >> relevant: this *same* vmbus unloading happens in the crash_shutdown()
> >> handler, so if kdump is set, we can safely skip this panic notifier and
> >> defer such clean-up to the kexec crash handler.
> >
> > While the last sentence is true for Hyper-V on x86/x64, it's not true for
> > Hyper-V on ARM64.  x86/x64 has the 'machine_ops' data structure
> > with the ability to provide a custom crash_shutdown() function, which
> > Hyper-V does in the form of hv_machine_crash_shutdown().  But ARM64
> > has no mechanism to provide such a custom function that will eventually
> > do the needed vmbus_initiate_unload() before running kdump.
> >
> > I'm not immediately sure what the best solution is for ARM64.  At this
> > point, I'm just pointing out the problem and will think about the tradeoffs
> > for various possible solutions.  Please do the same yourself. :-)
> >
> 
> Oh, you're totally right! I just assumed ARM64 would the the same, my
> bad. Just to propose some alternatives, so you/others can also discuss
> here and we can reach a consensus about the trade-offs:
> 
> (a) We could forget about this change, and always do the clean-up here,
> not relying in machine_crash_shutdown().
> Pro: really simple, behaves the same as it is doing currently.
> Con: less elegant/concise, doesn't allow arm64 customization.
> 
> (b) Add a way to allow ARM64 customization of shutdown crash handler.
> Pro: matches x86, more customizable, improves arm64 arch code.
> Con: A tad more complex.
> 
> Also, a question that came-up: if ARM64 has no way of calling special
> crash shutdown handler, how can you execute hv_stimer_cleanup() and
> hv_synic_disable_regs() there? Or are they not required in ARM64?
> 

My suggestion is to do (a) for now.  I suspect (b) could be a more
extended discussion and I wouldn't want your patch set to get held
up on that discussion.  I don't know what the sense of the ARM64
maintainers would be toward (b).  They have tried to avoid picking
up code warts like have accumulated on the x86/x64 side over the
years, and I agree with that effort.  But as more and varied
hypervisors become available for ARM64, it seems like a framework
for supporting a custom shutdown handler may become necessary.
But that could take a little time.

You are right about hv_stimer_cleanup() and hv_synic_disable_regs().
We are not running these when a panic occurs on ARM64, and we
should be, though the risk is small.   We will pursue (b) and add
these additional cleanups as part of that.  But again, I would suggest
doing (a) for now, and we will switch back to your solution once
(b) is in place.

> 
> >>
> >> (c) There is also a Hyper-V framebuffer panic notifier, which relies in
> >> doing a vmbus operation that demands a valid connection. So, we must
> >> order this notifier with the panic notifier from vmbus_drv.c, in order to
> >> guarantee that the framebuffer code executes before the vmbus connection
> >> is unloaded.
> >
> > Patch 21 of this set puts the Hyper-V FB panic notifier on the pre_reboot
> > notifier list, which means it won't execute before the VMbus connection
> > unload in the case of kdump.   This notifier is making sure that Hyper-V
> > is notified about the last updates made to the frame buffer before the
> > panic, so maybe it needs to be put on the hypervisor notifier list.  It
> > sends a message to Hyper-V over its existing VMbus channel, but it
> > does not wait for a reply.  It does, however, obtain a spin lock on the
> > ring buffer used to communicate with Hyper-V.   Unless someone has
> > a better suggestion, I'm inclined to take the risk of blocking on that
> > spin lock.
> 
> The logic behind that was: when kdump is set, we'd skip the vmbus
> disconnect on notifiers, deferring that to crash_shutdown(), logic this
> one refuted in the above discussion on ARM64 (one more Pro argument to
> the idea of refactoring aarch64 code to allow a custom crash shutdown
> handler heh). But you're right, for the default level 2, we skip the
> pre_reboot notifiers on kdump, effectively skipping this notifier.
> 
> Some ideas of what we can do here:
> 
> I) we could change the framebuffer notifier to rely on trylocks, instead
> of risking a lockup scenario, and with that, we can execute it before
> the vmbus disconnect in the hypervisor list;

I think we have to do this approach for now.

> 
> II) we ignore the hypervisor notifier in case of kdump _by default_, and
> if the users don't want that, they can always set the panic notifier
> level to 4 and run all notifiers prior to kdump; would that be terrible
> you think? Kdump users might don't care about the framebuffer...
> 
> III) we go with approach (b) above and refactor arm64 code to allow the
> custom crash handler on kdump time, then [with point (I) above] the
> logic proposed in this series is still valid - seems more and more the
> most correct/complete solution.

But even when/if we get approach (b) implemented, having the
framebuffer notifier on the pre_reboot list is still too late with the
default of panic_notifier_level = 2.  The kdump path will reset the
VMbus connection and then the framebuffer notifier won't work.

> 
> In any case, I guess we should avoid workarounds if possible and do the
> things the best way we can, to encompass all (or almost all) the
> possible scenarios and don't force things on users (like enforcing panic
> notifier level 4 for Hyper-V or something like this...)
> 
> More feedback from you / Hyper-V folks is pretty welcome about this.
> 
> 
> >
> >> [...]
> > The "Fixes:" tags imply that these changes should be backported to older
> > longterm kernel versions, which I don't think is the case.  There is a
> > dependency on Patch 14 of your series where PANIC_NOTIFIER is
> > introduced.
> >
> 
> Oh, this was more related with archeology of the kernel. When I'm
> investigating stuff, I really want to understand why code was added and
> that usually require some time git blaming stuff, so having that pronto
> in the commit message is a bonus.
> 
> But of course we don't need to use the Fixes tag for that, easy to only
> mention it in the text. A secondary benefit by using this tag is to
> indicate this is a _real fix_ to some code, and not an improvement, but
> as you say, I agree we shouldn't backport it to previous releases having
> or not the Fixes tag (AFAIK it's not mandatory to backport stuff with
> Fixes tag).
> 
> 
> >> [...]
> >> + * intrincated is the relation of this notifier with Hyper-V framebuffer
> >
> > s/intrincated/intricate/
> 
> Thanks, fixed in V2!
> 
> 
> >
> >> [...]
> >> +static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val,
> >>  			      void *args)
> >> +{
> >> +	if (!kexec_crash_loaded())
> >
> > I'm not clear on the purpose of this condition.  I think it means
> > we will skip the vmbus_initiate_unload() if a panic occurs in the
> > kdump kernel.  Is there a reason a panic in the kdump kernel
> > should be treated differently?  Or am I misunderstanding?
> 
> This is really related with the point discussed in the top of this
> response - I assumed both ARM64/x86_64 would behave the same and
> disconnect the vmbus through the custom crash handler when kdump is set,
> so worth skipping it here in the notifier. But that's not true for ARM64
> as you pointed, so this guard against kexec is really part of the
> decision/discussion on what to do with ARM64 heh

But note that vmbus_initiate_unload() already has a guard built-in.
If the intent of this test is just as a guard against running twice,
then it isn't needed.

> 
> Cheers!